BITS routinely performs maintenance on Broad systems and technologies. BITS Quarterly Maintenance windows tend to affect a wider range of systems than routine maintenance and may disrupt a variety of daily activities. This checklist helps Analytics users identify the steps to take before, during, and after a BITS Maintenance window so that reporting remains available to the relevant stakeholders.
Before
- Subscribe to BITS notifications https://intranet.broadinstitute.org/node/5215/
- Determine who in Analytics will be available during/after the shutdown
- Review the BITS list of expected database and application outages during the window
- Back up services/databases as applicable
- Edit the Announcement Banner on GPInfo Jira and LabOps Jira to show the expected shutdown time, for example:
<h3 ><span style="color:#ff0000;"><font size="5"><p><strong>GPInfo Jira will be shut down starting 9pm on 6/7/2019 in preparation for BITS Maintenance Window on 6/8/2019. Service will be unavailable until 6/9/2019 in the morning.</strong></p></font></span></h3><h4>Email <a href="mailto:analytics@broadinstitute.org">GP Analytics</a> if you have problems or questions.</h4>
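The banner text changes only in its dates from one maintenance window to the next, so it can be generated rather than edited by hand. The following is a minimal sketch, not an existing Analytics tool: the function names `fmt` and `build_banner` and the default shutdown time are hypothetical, and the markup is a simplified, well-formed variant of the banner shown above.

```python
from datetime import date
from html import escape


def fmt(d: date) -> str:
    """Format a date as M/D/YYYY without zero padding, e.g. 6/7/2019."""
    return f"{d.month}/{d.day}/{d.year}"


def build_banner(service: str, shutdown: date, maintenance: date,
                 restored: date, shutdown_time: str = "9pm") -> str:
    """Return banner HTML for one Jira instance (hypothetical helper)."""
    message = (
        f"{escape(service)} will be shut down starting {escape(shutdown_time)} "
        f"on {fmt(shutdown)} in preparation for BITS Maintenance Window "
        f"on {fmt(maintenance)}. Service will be unavailable until "
        f"{fmt(restored)} in the morning."
    )
    # Every tag opened here is closed, so the banner cannot leak styling
    # into the rest of the Jira page.
    return (
        '<h3><span style="color:#ff0000;"><font size="5">'
        f"<strong>{message}</strong></font></span></h3>"
        '<h4>Email <a href="mailto:analytics@broadinstitute.org">GP Analytics</a> '
        "if you have problems or questions.</h4>"
    )
```

Calling `build_banner("GPInfo Jira", date(2019, 6, 7), date(2019, 6, 8), date(2019, 6, 9))` reproduces the wording of the example banner; the same call with `"LabOps Jira"` gives the text for the second instance.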
During
- Monitor BITS communications via email and the BITS Slack channel #bits-quarterly-maint
After
- Verify that production, development, and test systems are working (Erik)
- Check for SEQPROD, GAP_PROD, and JIRADWH disruptions in the Tableau Server Background Tasks for Extracts (if extracts are red/broken, troubleshoot the connections)
- Troubleshoot the connections from Tableau Desktop or SQL Nav (the problem may be in tnsnames.ora, or the listener may have been brought up incorrectly)
- If there’s a problem, email the DBA team (dba@broadinstitute.org)
- Verify that the Analytics Web Server is working after the Maintenance window closes: http://analytics:8090/etl_runs
- If the Analytics Web Server shows failed runs or seems broken, restart the Web Server and restart Spark following this documentation: https://broadinstitute.atlassian.net/wiki/spaces/AN/pages/666763449/ANALYTICS+Unix+server+-+useful+tools+and+commands
- Check reporting-errors emails for Heartbeat errors to see if other issues exist, and to verify that all issues have been resolved
- Check DependencyVisualizer for broken connections, if applicable
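The first reachability checks in the list above can be scripted as a quick smoke test before anyone opens Tableau or the web UI. This is a minimal sketch under stated assumptions: the URL comes from the checklist, while the function names, the timeouts, and any database host and port passed to `port_open` are hypothetical and not part of the Analytics tooling. A passing check only shows that a service answers, not that extracts or ETL runs succeeded.

```python
import http.client
import socket
import urllib.error
import urllib.request

# URL taken from the checklist; timeouts below are assumed values.
ETL_RUNS_URL = "http://analytics:8090/etl_runs"


def http_ok(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with an HTTP 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Covers HTTP errors, DNS failures, refused connections,
        # timeouts, and malformed URLs.
        return False


def port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_checks() -> dict:
    """Run the web server check and return a name -> bool summary."""
    return {"analytics web server": http_ok(ETL_RUNS_URL)}
```

Running `print(run_checks())` from a machine on the Broad network gives a quick pass/fail summary. `port_open` can be pointed at a database listener's host and port (taken from your own tnsnames.ora entries) to tell a listener that is down apart from a client-side configuration problem.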