Analytics Group Meeting Notes

Future topics

  • Review and reconsider Tableau Server subscription

Vacations: (please add to GP Infx calendar)

  • MM - Tentative dates Nov 22,30  Dec 7,14,21
  • CG - 2024 Dec 6, 13, 19-20

Unassigned RPT tickets: Link to filter in Jira


2023-07-13

  • Monitor error and warning emails - how to capture errors that need immediate attention.
  • Jira admin flash training - issues like
    Error rendering macro 'jira' : Unable to locate Jira server for this macro. It may be due to Application Link configuration.

2023-07-13

  • Tableau server - negotiated new license model. We keep current configuration - 8 core server with unlimited number of Explorers. Still would like to clean up the server from inactive users and generally, host only GP stuff and users.
  • Covid ETLs and extracts were stopped. We would need to retire some reports and/or replace them with new static boards. Waiting for heads up from Heather.
  • Quality System revamp - new requests for reporting on new operational metrics are on hold.
  • Quarterly goals were just posted. Still to figure out which ones will need our attention. Watch out for Clinical BGE, DRAGEN for NVX, Clinical WGS, Walkup data to buckets (replace Walkup portal with SIDR)

2023-06-22

  • Tableau server - may need to switch the license mode.
  • Retire Covid reporting, development of static reports if needed.
  • Revamping quality reporting - each Lab team is reviewing their quality metrics. Expected incoming requests.
  • Start to review ODI-related projects and changes where we'll need to step in.

2023-02-09

  • Tableau server user clean up.
  • Organizing users and GP projects on Tableau Cloud

2022-12-01

  • Broad Retreat 12/12- 12/13
  • Vacation time around holidays
  • Next year's goals and projects
    • Jira upgrade. Need to learn how to manage workflows in Cloud Jira
    • Financial reports migration to Tableau Cloud - Sky Watch, ZDTR.
    • Software Development Life Cycle (SDLC) - need to document our processes at a minimum. 
  • Tableau Server minor version upgrade - done ? How about the latest reported vulnerability ?
  • eMERGE reporting request

2022-11-03

  • Broad Retreat 12/12- 12/13
  • Vacation time around holidays
  • Cross training of WGS team member starting Nov 21
  • Tableau Server minor version upgrade - done ?
  • eMERGE reporting request
  • Tableau Cloud - make sure we move SkyWatch to cloud completely. Discussion for the rest of the content - on prem vs. cloud. Have to quantify the effort for migration and cost of Tableau cloud usage.
  • Covid extracts refresh less frequently, consult with Covid Lab

2022-09-08

  • Microbial Seq on Element - update from the first meetings (Chris)
  • Tableau Server minor version upgrade - schedule ?
  • Tableau migrations assessment - progress from last week ? To do?
  • Dynamic Work Design workshop - highly recommended

2022-08-25


  • JIRA tickets clean up - clean up your own list first and then Unassigned.
  • Jira migration to Cloud - has to happen within a year
  • Tableau 
    • Legacy License renewed
    • Explore migrating Financial reports to Tableau Online
    • Assess migration to Tableau Online of all GP reporting
    • Assess what / how to switch to subscription licensing model if we're forced to - clean up Tableau server, classify users. Assess financial cost mostly.
  • Analytics presentation at All Hands -  9/12.

2022-06-16


  • Tableau server upgrade - dev is done; prod TS - when, what RAM.
  • Quarterly Maintenance Window (QMW) Jun 25 6am - 10:30PM. Includes Disaster Recovery Site Testing. TBD whether DR for GP will be done and if we need to be involved.
  • Analytics presentation at All Hands TBD - 8/29 or 9/12.

2022-05-26


  • Tableau server upgrade.
  • GP priorities 
    • PacBio manual scale up- new version of SMRT Link would need new reports or new metrics to be added to the existing report; some Lab work gets automated (manual transfers, QC), will need to track downtime of instruments in Jira (stay tuned).
    • Blended Exome/Low Pass WGS product - new messages, new product goal.
  • Analytics presentation at All Hands.
  • Tableau conference TC22 - feedback.

2022-04-21


  • Organizational - vacations, Tableau conference
  • GP priorities 
    • To be launched - Twist TCap, PE-CGS, JBX
    • Current quarter - PacBio manual scale up, Blended Exome/Low Pass WGS product
  • Training Chau - peer work for Jira and Analytics requests.

2020-02-18


  • Deidre's email - was lab hourly load the right report? - All set
  • Addition of "previous_results" to beacon_samples changes - Done

2020-02-17


  • Last agenda items before Covid 
  • Reassign or Close Zach/Kristen/Rafael's tickets - done.
  • Structure/Issue Matrix demo (AB)
  • All of Us / Mayo - done. Analytics tools capture all data.
    • sample IDs for PDOs will be 10 digits, e.g. 3456789210, not SM-12345
    • Does Analytics assume an SM-XXXX format, e.g. for BSP/RackFinder? Where else? Will affected Tableau or ETL code be used for All of Us? (BQMS report)
    • Sample metadata in Mercury DW (GPLIM-6760)


2020-02-27

  • How will we finish Tableau upgrades? Notes from January:
    • goal is to upgrade tableau-dev and tableau to Tableau Server 2019 in January 2020
      • InfoSec gave us an extension until February 14 but let's beat that
      • remaining blocker is custom view relative date ranges
  • Quick intro into http://analytics:8090/DependencyVisualizer (Nasko) / isLatest (Mariela) / PDO Star ETL broke 2 weeks ago - explanation (MM)
  • Work with LIMS to get sample metadata - not in BSP anymore (MM)
  • AOU starting with genotyping. Not extracting here; stuff comes in as DNA.
  • RQC and SAWs working fine (pipeline hiccup with LSIDs)

(earlier in month) 

  • Dragen DEMO - status monitoring (Nasko 15min)

2020-01-09

  • TS2019.4 has several open bugs that are problematic:
    • Live CSV file unions can break, Tableau has reproduced, workaround will be hourly extracts (Cloud Queues report) 
    • MKD is affected by a blending problem (update, Christina has worked around this, MKD now in QA)
    • Custom view "relative date" goes awry when refreshed from TS prod to TS beta: all units get switched to days (e.g. 6mo becomes 6d, 20h becomes 20d)
      • KC will create a plan to check all our custom views in case the bug isn't fixed by the time we upgrade
      • KC will add that plan to the upgrade test checklist (and formalize that checklist of workbook functions to check)
  • Review of LIMS Quarterly goals:
    • Discussion of what we need to do for datamart planning, where ETLs will occur (especially for DRAGEN metrics)
  • KC & CG will document the user-facing DM refresh tool and we can test it and publish it this quarter.
    • Now has google auth (RPT-5789)
    • Now includes RGHQS refresh (RPT-5791)
  • Let's finish the "where's my data" tool & dependency visualizer, publish them this Q (most work is done already)
  • MM is following the hybrid LCSET (mayo barcodes). May need to change PK in  DM but awaits analysis first. Expects more info by next mtg.
  • Outstanding question: what kind of IDs will sample aliquots get in Mercury when they are Mayo-derived?
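
For the custom view check that KC is planning, a minimal Python sketch could compare each view's relative date setting between TS prod and TS beta. It assumes the settings can be exported as short strings such as 6mo or 20h; that string format and the function names are hypothetical, not part of any Tableau API.

```python
import re

# Hypothetical compact form of a relative date filter: a count plus a unit suffix.
_RELATIVE_DATE = re.compile(r"^(\d+)\s*(h|d|w|mo|q|y)$")


def parse_relative_date(text):
    """Split a string like '6mo' into (6, 'mo'); raise ValueError if malformed."""
    match = _RELATIVE_DATE.match(text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized relative date: {text!r}")
    return int(match.group(1)), match.group(2)


def find_unit_changes(prod_views, beta_views):
    """Return names of custom views whose relative date differs between servers.

    Both arguments map a custom view name to its relative date string.
    Views that are missing from beta are reported as well.
    """
    changed = []
    for name, prod_value in sorted(prod_views.items()):
        beta_value = beta_views.get(name)
        if beta_value is None:
            changed.append(name)
        elif parse_relative_date(prod_value) != parse_relative_date(beta_value):
            changed.append(name)
    return changed
```

Given the bug as described (6mo becoming 6d), any view whose unit changed would show up in the returned list and could be fixed by hand after the upgrade.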

2019-12-05

  • Demo of Nasko's RunEtl tool. The tool is out for internal testing. Still in discussion which other ETLs are to be automated and exposed to end users.
  • AoU - new sample IDs are 10-digit strings. A new version of Rack Finder is needed and will be built by Mercury. Our group has to figure out what we need for samples received in Mercury, and has to find if and where we parse the existing SM ID or where we prepend IDs with "SM-".
  • Switching Tableau test and Tableau-beta on Dec 12. Production TS upgrade is aimed for the end of Jan 2020.
  • Demo of Tableau Catalogue - a replacement of audit tools from Interworks. General impression is that the views are too flat and make information hard to find.
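
As a starting point for auditing code that assumes the SM- format, here is a small Python sketch that tells the two ID shapes apart. It assumes legacy IDs look like "SM-" followed by letters or digits and that AoU IDs are exactly 10 digits, as described in the notes; the function names are illustrative only and do not come from any existing Analytics code.

```python
import re

SM_ID = re.compile(r"^SM-[A-Z0-9]+$")   # assumed legacy shape, e.g. SM-12345
AOU_ID = re.compile(r"^\d{10}$")        # 10-digit string, e.g. 3456789210


def classify_sample_id(sample_id):
    """Return 'sm', 'aou', or 'unknown' for a sample ID string."""
    value = sample_id.strip()
    if SM_ID.match(value):
        return "sm"
    if AOU_ID.match(value):
        return "aou"
    return "unknown"


def display_id(sample_id):
    """Prepend 'SM-' only to bare legacy IDs, never to 10-digit AoU IDs."""
    value = sample_id.strip()
    if classify_sample_id(value) != "unknown":
        return value
    return "SM-" + value
```

Any place that currently prepends "SM-" unconditionally would turn 3456789210 into SM-3456789210, which is the case a check like this is meant to catch.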

2019-11-21

  • New SWE between 320/105, Tom's hiring in LC but it's a tough salary, so we're low on lab support
  • Retiring CRSP PolicyStat. HIPAA training needed for most of us.
  • DRAGEN metrics comparison: Mariela made a GDoc explaining the current "important" analytics metrics for comparison. Jim's looking at it for consistency. We're currently on track.
  • Walkup v2: Stability, Automation (tickets, etc), submissions by non-Broadies. Release starting this week (today for Stability).
  • TC19 review: KC, ZL, CG will summarize sessions worth watching online (and worth skipping)
  • We're not too far from upgrading to Tableau 2019.4 (lots of new features, plus Tableau extensions), so we need to spot-check Tableau-Test to look for version bugs
  • ETLs are now SQUIDLESS!!! (MM brought delicious muffins)
  • Is GapAutocall ( http://gapws:8080/ws/project_management/get_autocall ) still used for getting data? KC will confirm that it's not.
  • FP LOD Scores in Tableau – who needs them?
  • Changes in weekly Quality Reporting: Old QNow board is now QR board. Goal is to connect the different processes. 
  • Goals review

2019-11-07

  • FP LOD: Won't be added to Mercury DWH.  Analytics serves up scores from Pipeline tables anyway.
  • Jira 8: Looking good. Both GPInfo and LabOps will be updated at the same time.
  • VVP Volume QC: Datasource is a CSV file union.  Machine status logic in review.
  • RPT backlog was reviewed.

2019-10-31

  • AoU updates: Mercury will create PDOs and lab batches, e.g. ARRAY and LCSET tickets
  • AoU milestones: IDE application to FDA (est. Dec), then Array processing (est. March), then Genome processing (est. May)
  • Squid messaging getting turned off this week. No new lab samples being processed through Squid.
  • DRAGEN metrics being compared with Picard, also generating list of Picard metrics that may not have DRAGEN equivalents for lab
  • Nasko's CSV→DB service now can handle key/value pairs and column renaming. Rafael will work on testing with RPT-5642.
  • Rafael working on 10X/SingleCell with John Walsh
  • Mariela finishing online class, chasing down metrics, thinking about the self-service tool, and working on the Picard metrics list
  • Amy working on Jira 8 upgrades and decommissioning the CELL project
  • Christina owes Andrew some RQC metrics
  • Zach working on a combined QS+SAP revenue report during transition to all-SAP
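
To illustrate what handling key/value pairs and column renaming can mean for a CSV load, here is a rough Python sketch. It is not Nasko's service: the input layout (an ID column plus key and value columns) and the function name are assumptions made for the example.

```python
import csv
import io


def pivot_key_values(csv_text, id_column, rename=None):
    """Turn key/value rows into one dict per record, renaming columns on the way.

    csv_text must have a header containing id_column, 'key', and 'value'.
    rename maps source key names to destination column names.
    """
    rename = rename or {}
    records = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = records.setdefault(row[id_column], {id_column: row[id_column]})
        column = rename.get(row["key"], row["key"])
        record[column] = row["value"]
    return list(records.values())
```

Each returned dict corresponds to one row that could be inserted into a destination table, with keys that were not listed in rename passed through unchanged.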

2019-10-17

  • AoU updates: Plan is back to auto-creating PDOs
  • Dragen Metrics will be on a special page in Mercury. Tina can look and confirm that they look good, but this would be tedious. She'll probably use Tableau for regular metrics and Mercury for Dragen metrics. We may want to autopush all of them, but: The only way to save money is turning on Dragen and turning off cloud picard pipeline. Button Tina can use for random runs to "Dragen them" to compare to picard (using "DRAGEN metrics TEST"). 
  • Need to ensure Jim, John Walsh, et al are clear on the difference between sample & library metrics, and that lab operations may require library view
  • K to pull together summary of extractions reporting: business impact of not having reporting, what is the plan for transferring other extraction processes to mercury, etc
  • Nasko made some improvements to D3 dependency visualizer
  • tableau-ts is almost all set

2019-10-10

  • QNow updates
    • Illumina reagents - CAPA at Illumina (https://labopsjira.broadinstitute.org/browse/BQMS-1692)
    • Tom Howd is piloting a preplating pico protocol - Mariela confirming metrics
    • Mercury 1.1.7 released & other AoU stuff:
      • Released new feature for sample receipt
      • NOT making SM IDs but instead using mashup of collab sample id + collab pt id 
      • future talks about not using LCSET/PDOs? 
      • AoU samples will be going to dragen, not pipeline
  • Sql Navigator and Sql Developer installed on new Tableau-ts server
  • RS - poster for softeng instead of Broad retreat, updates to kit building report due to finding out current datasource does not represent work done, monitoring 10X development
  • AB - continuing Bravo Jira/messaging study, monitoring 10X development with RS, release CLF JQLTDE when Tableau-ts is ready, next step for Jira upgrade is to get it on dev environment to test upgrades
  • KC - Working on TS upgrade final checks, pico/vvp fix and integration, kit building updates with RS
  • ZL - QS move to SAP. Big query/skywatch talk at softeng in 2wks, floor map updates to broadmap 
  • CG - Sara is happy with downloads from PDO Tracker where CG implemented a URL action to deliver fewer dimensions. Also debugged a rename of illumina agg data source - group needs improvement at putting github comments in views.
  • AM - monitoring BSP_sample where we're seeing "unable to extend temp segment" errors and waiting for a DBA response

2019-10-03

  • Public announcement of Broad using DRAGEN: Illumina/Broad collaboration announcement (DRAGEN) and a blog post with more info on the software side of the collaboration
  • Amy and Erik are working on Jira migration enforced by BITS
  • RedHat upgrade from 6 to 7 - Nasko said our servers run 7 so we shouldn't be affected
  • Terra data repo presentation at Q Now board. All data will be in the cloud. Question remains how it is going to be made visible to Analytics
  • Nasko obtained 10 licenses for SQL Nav. Everyone to decide which tool they will use - SQL Nav or SQL Developer
  • Replacement of tableau-ts-beta with tableau-ts is almost done (KC)
  • Nasko released a new ETL (flat file loader) to load data from a file into a DB table. It is generic and can be used for any new service that needs that. In addition filePusher service can be setup to listen for new files in a location and call flatFileLoader to load the data into DB
  • Help wanted to explore new features in Tableau Server 2019.3.X vs. 2019.4 beta
  • Reconstruction of DataOps Library. Any new requests / suggestions are welcome
  • Rafael is preparing a poster for Broad Retreat

2019-09-26

  • Mariela & Nasko meeting with Yossi tomorrow (on PCA or ML for risk calculation and outlier detection)
  • Setting quarterly goals (carry several forward, add more)  Analytics FY2020 Q2 Goals (Oct-Dec)
  • Brainstorming for Sara G: PDO Tracker sample metrics download includes many columns they do not need and must remove every time a collaborator needs metrics.  Macros have been in use but occasional additions to the view break them.  Sara would be interested in developing/publishing/maintaining a new view strictly for her team.  She is also open to other ideas, like Google Sheet manipulations or Visual Basic, however she would prefer the route with the smallest learning curve.
    • New view in Tableau – no extra skills required
    • Standard Tableau export sent through UNIX (cat cut etc) – Unix skills required
    • SQL curl script without Tableau – very basic command line skills required
    • Standard Tableau download, opened in Desktop – Tableau Desktop skills & license required
    • Tableau Prep!?!? (we have 50 TP licenses)
    • URL action in Standard Tableau report to call the CURL command
  • Mob review 
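
One of the options brainstormed for Sara G's team, post-processing a standard Tableau export, can be sketched in a few lines of Python. This is only an illustration: the column names in the usage note are made up, and the function is not an existing tool.

```python
import csv
import io


def keep_columns(csv_text, wanted):
    """Return CSV text containing only the wanted columns, in the given order.

    Columns are selected by header name rather than position, so the result
    stays the same when new columns are added to the view.
    Raises KeyError if a wanted column is missing from the export.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [name for name in wanted if name not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=wanted, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()
```

Calling keep_columns(export_text, ["Sample", "Coverage"]) on a download with extra columns would return just those two. Selecting by name is what avoids the breakage the macros suffered when columns were added to the view.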

2019-09-19

  • Example of using Tableau copy/paste to clone tricky data connections and views (ZL: adding exomes to Cloud Queues report)
  • Rafael updated Analytics FY2020 Q1 Goals (Jul-Sept) to fix broken Jira links.
  • Kristen to create new Q doc for Q2.
  • Nasko added parameter options to txtExtract curl service. Documentation needs work to clarify how the SQL would be created and how the curl command would be altered.
  • "Mob" session will focus on TAG (translational analysis group) development (ichor, duplex consensus, smartSeq, etc)
  • When troubleshooting is needed on complex things that only Nasko or Mariela know, they will invite the group to watch over their shoulder in Arctic or Library. (Assuming it's not critically urgent). This will help us provide proper support to GP when they're away, so they don't need to be involved.

2019-09-12

  • Update Quarterly Goals document for current quarter and whatever moves to the next
  • BITS Maintenance window 09/15 - check Analytics tools and services on 09/16
  • What's next for Analytics Visualizer - equip it with a direction of links (RPT-5317)
  • Talk to Yossi about PCA methods before he leaves on sabbatical
  • BSP DM to continue populating seamlessly by Mercury. Bring this to Mercury team's radar
  • TAG team no longer uploads metrics to Big Query; they have settled on keeping the metrics in a file. Remove the Big Query goal from the board
  • Low PF on HiSeq X still a fire
  • DRAGEN is working on 1 sequencer
  • CRSP portal is inside the firewall
  • SAP release coming, Mercury team will have more bandwidth to address Mercury-related requests
  • AoU will be a priority for LIMS team - sample receive, download metadata from cloud, accession, metadata validation, PDO + LCSET/ARRAY creation. DRAGEN analysis immediately after sequencing, Mercury displays metrics, user OKs sending data to the cloud (no pipeline for cloud or RQC analysis). Finally, Mercury writes the metrics into DB. Need to find out which DB - Mercury or somewhere accessible by Analytics for reporting.

2019-09-05

  • Amy updating instrument QC reports (VVP, Artel, LeakTest)
  • Amy met with Emily Chambers from Cancer group (e.g. DepMap) to look at Jira/Google/Tableau integration
  • Amy to work with Rafael when he returns on Jira 8
  • Christina provided top-off report for Jim to estimate DRAGEN resources
  • Discussed Maura's JMP presentation at quality review (was recorded)
    • JMP has powerful stats when used properly, e.g. by Maura
    • questionable whether others in lab can make sense of them
    • still want Andrew Bernier to present the Tableau alternative
    • and we can help a JMP power user connect to Oracle (like Linley did)

2019-08-29

  • GPInfo Jira moved back into the firewall. Changed links to Jira
  • AB working on Lab Instrument QC reporting with KC. Also on reviewing Jira upgrade.
  • Jira 8 updates:
    • UI changes
    • Batch emails
  • KC working on VVP reporting. Challenges on linking lab_metric and lab_event to show the right data. Will continue work on Extractions and Plating reporting when she returns.
  • CG working on trending report for Jim.
  • ZL working on Tableau testing on new server. AB’s csv unions have a slower performance in the new release. Also working on turning off alerts before switching the old server. Tableau Beta has a couple of bugs that need to be resolved.

2019-08-22

  • ZL testing Tableau beta (v2019.3) on Windows 2016 server (from Windows 2008) for upgrade planning
    • We will need to create a new server from scratch on Windows 2016
  • ZL demonstrated Comments function on Tableau Server, reviewed Tableau's 3 levels of aggregation (standard, window, LOD)
    https://10az.online.tableau.com/#/site/broad/views/aggbook/mindateaggregations
  • We could start reviewing new features on Tableau versions 2018.2, 2018.3, 2019.1, 2019.2 and 2019.3
  • KC demonstrated solution for complex CCLF query (creating a static table that contains old metadata so it can be exposed in their reports without breaking our queries)

2019-08-15

  • Lab changed library construction protocols, which messed up how LC exomes appear in reports. MM had to refresh 67 flowcells to solve the issue (SUPPORT-5659).
  • Reviewed unassigned RPT tickets.
  • ZL testing Tableau Server 2019.3.
  • AB to meet with Erik and RAS to discuss Jira Server 8 upgrade.

2019-08-08

  • Reviewed ETL Troubleshooting with AM
    • Most frequent violation in the last months: is_latest violations. Resiliency has been built to push is_latest bad samples out of the ETL and into a separate table for further troubleshooting, automatic PO ticket creation in place.
      • Tip for identifying is_latest violations from reportingerrors: Search on Gmail inbox for the RP or PDO that PDMs give us to track whether or not there was any violation report
    • DataChecker agent: Looks for different conditions for violations, sends automatic email on reportingerrors
      • Errors are logged in tables called err$_table/datamart_name
      • Resolution_timestamp needs to be updated once there is a resolution to stop Data Checker emails
      • RAS to build a Tableau report using table data from DataChecker.
    • ArraysQCProd agent
      • When runs take longer, new runs fail (are skipped)
      • If runs take too long (~5 hours?), Spark may need to be restarted, or run can be killed on Spark Master page
    • Organic Run
      • Data comes from tables that John Walsh creates/maintains
    • AutoPush service
      • If needed, can be killed on Unix crontabs
      • Code is stored in GitHub repository
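
The is_latest resiliency idea (push bad samples out of the ETL into a separate table) can be illustrated with a short Python sketch. The notes do not define a violation precisely, so the rule used here is an assumption: a key is in violation when it does not have exactly one row flagged is_latest. The field names are hypothetical too.

```python
from collections import defaultdict


def split_is_latest_violations(rows, key="sample"):
    """Separate rows into (clean, quarantined).

    Assumption: a violation means a key does not have exactly one row
    with is_latest set. All rows for a violating key are quarantined.
    """
    latest_counts = defaultdict(int)
    for row in rows:
        latest_counts[row[key]] += 1 if row["is_latest"] else 0
    clean, quarantined = [], []
    for row in rows:
        target = clean if latest_counts[row[key]] == 1 else quarantined
        target.append(row)
    return clean, quarantined
```

The clean rows would continue through the ETL, while the quarantined rows would go to a troubleshooting table and could trigger the automatic ticket.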

2019-08-01

  • Analytics Q1 Goals review
    • ULP & Duplex Consensus calling report (CG) - Done
    • Positive control tracking (MMRF) – On hold until further notice
    • Pico & VVP reporting (Mercury) – Waiting for LIMS work
    • Extractions reporting (Mercury) – Waiting for LIMS work
    • Plating Reporting (Mercury) (KC/MM) – KC and MM meeting about this
    • Missing data tool (“where’s my data?”) – MM working on this on RPT-5560, will work on linking to RPT-4927
    • ETL resiliency (AM) – added to QNow board
    • SAP sales reporting (ZL) – working on report, tricky for GP given different processes in place for billing
    • Broad West reporting (KC) - Done
    • Lab instrument QC – Waiting for LIMS work
    • New exome workflows in Jira (AB) – in process
    • Merge BigQuery data from TAG (TAG) – Done in the context of ULP & Duplex consensus calling report
    • Report for “where’s the money?” (similar to BaseLoad) (ZL/CG)
  • Expected LIMS work may be delayed, QA/testing priority will be given to launching SAP during August
  • AM has been working on autoPush to be more configurable. Working on features for ETL Resiliency with MM (currently applies to violations of is_latest, PicardAggregation, AggregationQC etls)
  • RAS presented new Jira board, Analytics/LIMS Board. Made to ensure that we are all on the same page and clear on our requests of GPLIM team/monitor issues that are on hold/in backlog that may no longer be necessary.
  • Artel QC "show latest machine status" demo (AB)

2019-07-25

  • Issues with Tableau Server. ZL restarted server and TServer computer.
  • QNow notes:
    • Roof repair in Caribbean starting 8/5, will last 6-8 weeks. People from Caribbean will be sitting in Dead Sea Room, Cafeteria and 320C west wing.
    • Blood biopsy samples fire: collaborators are not sending sample spreadsheets (metadata) before sending samples. Samples need to be processed as they come in, and lab has had to create temporary “fake” metadata.
    • Library Construction identified bad well, will want to evaluate Roche adapters in reporting now that they’ve been using for a while
  • Meeting about BSP Freezer data being moved to Mercury Storage. This will affect SRS as we will not have access to Mercury data for Tableau reporting (data would be accessed through Mercury GUI instead of Tableau GUI)
  • SAP release of Mercury starting QA. Expected to not have QuoteServer quotes in a couple of months.
  • Working on building resiliency on ETLs. AM is testing a recent Oracle feature that can be included in ETLs: in the event of a failed record, it places the record in a separate table for further troubleshooting and continues the original query.
  • AB working on Artel QC reporting, PacBio issue type creation, working on preliminary research for Jira version upgrade with RAS.
  • ZL showed ZDTR 2020 report for GP Billing (upgraded version of ZDTR 2017). Use of highlight tables, views as filters. Also Genome Aggregation Queues and Durations view in Cloud Queues report.
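
The ETL resiliency pattern discussed in this meeting (divert a failed record to a separate table and keep loading) can be sketched in Python with an in-memory SQLite database. This only illustrates the pattern; it is not the Oracle feature AM is testing, and the table and column names are hypothetical.

```python
import sqlite3


def load_with_error_table(conn, records):
    """Insert (sample, coverage) records; divert failures to metrics_err.

    Returns (loaded_count, failed_count). A bad record does not stop the load.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS metrics ("
                 "sample TEXT PRIMARY KEY, coverage REAL NOT NULL)")
    conn.execute("CREATE TABLE IF NOT EXISTS metrics_err ("
                 "sample TEXT, coverage TEXT, error TEXT)")
    loaded = failed = 0
    for sample, coverage in records:
        try:
            conn.execute("INSERT INTO metrics VALUES (?, ?)", (sample, coverage))
            loaded += 1
        except sqlite3.IntegrityError as exc:
            # Keep the rejected record and the reason for later troubleshooting.
            conn.execute("INSERT INTO metrics_err VALUES (?, ?, ?)",
                         (sample, str(coverage), str(exc)))
            failed += 1
    conn.commit()
    return loaded, failed
```

In this sketch a duplicate sample or a missing coverage value lands in metrics_err with the database's error message, and the remaining records still load.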

2019-06-27

  • Notes from QNow:
    • Discussed fires. ZL raised the Walk up server change as a fire.
    • Google Pipelines PAPI also raised as a fire
  • QPlanning:
    • Analytics Q1 Goals (priorities discussed in QPlanning)
      • ULP & Duplex Consensus calling report (CG)
      • Positive control tracking (MMRF)
      • Pico & VVP reporting (Mercury)
      • Extractions reporting (Mercury)
      • Plating Reporting (Mercury) (KC/MM)
      • Missing data tool (“where’s my data?”)
      • SAP sales reporting
      • Broad West reporting
      • Lab instrument QC
      • New exome workflows in Jira (AB)
      • Merge BigQuery data from TAG (TAG)
      • Report for “where’s the money?” (similar to BaseLoad) (ZL/CG)
    • Discussion of Green Team priorities. Prioritizing launching EDDY (replacement for E9)
    • Jon Thompson brought up pipeline timestamps messing with LIMS data as well.
  • AB working on 10X 3’ hashing workflow design before creating ticket type in Jira. Also added ‘Shearing Complete’ and ‘Post PCR Cleanup’ steps in several LCSET workflows.
  • MM focused on Extractions and Pico reporting data.
  • RAS developing views for NovaSeq Peltier logs, DependencyVisualizer improvements.
  • ZL was able to make improvements to the Cromwell API, consequently improving CromWatch performance. Also working on BigQuery reporting (billing data from Google).

2019-06-06

  • Review PRISM vial requests Tableau visualization using FreezerPro data from Oracle
  • Meeting with Green Team on Monday 6/3
    • No updates on any ongoing tickets (cleaning is_latest, timestamp issues, CRSP split)
    • AM and MM worked on CRSP split to include WGS metric that needs to be calculated for NovaSeq reporting
    • Green Team mentioned they’re developing an E9 submissions system
    • New metrics from Arrays will be included; AM will need to include them in Analytics ETLs from the Cloud.
  • Notes on QNow:
    • NovaSeqs have been very productive, becoming a bottleneck for GP Operations
    • Infinium failures high again
    • Mercury team still working on Fingerprint store
    • Collaborator Portal now available
    • BITS Outage 6/8
  • CG working on change documentation for RQC with AM. Will work on making test plan with Marcia and Susan.
  • Consider Tableau Prep opportunities for 2019.3 using Tableau Prep Conductor, e.g. https://tableau.broadinstitute.org/#/views/audit_20190212/Multipleconnectiontypespersheet
  • reporting-errors emails – some emails come from AM's dev environment. We can ignore emails whose subject includes a HOST other than "HOST: analytics".
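
The rule for ignoring dev-environment reporting-errors emails could be automated with a filter along these lines. This is a sketch that assumes the subject carries the host as "HOST: name"; the exact subject format beyond that marker, and the function name, are assumptions.

```python
import re

_HOST = re.compile(r"HOST:\s*([\w.-]+)")


def should_ignore(subject, production_host="analytics"):
    """True when the subject names a HOST other than the production one.

    Subjects without any HOST marker are kept (not ignored).
    """
    match = _HOST.search(subject)
    return match is not None and match.group(1) != production_host
```

For example, a subject containing "HOST: analytics-dev" would be ignored, while one containing "HOST: analytics" or no HOST at all would still be read.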

2019-05-30

  • Update on AoU:
    • Still working on IRB and FDA/IDE issue.
    • This will delay start of AoU
  • AB, ZL and KC met with PRISM group. Need to keep track of vial stocks and requests to pull samples from their systems. They use FreezerPro (commercial LIMS). Still evaluating what type of collaboration/project work we will do.
  • Meeting with Green Team on Monday 6/3. KC will review if we have anything pressing to discuss or if Green Team would like to present plans, or else reschedule.
    • DSP holding next Quarterly Planning meeting on 6/13-14.
  • CG working on tools documentation for Mocha (mostly Autopush, potential need to include Top-off). Needs to review an issue with exomes stuck in RQC. For liquid biopsy, there is a request to add vessel concentration data into Mercury (SUPPORT-5408).
  • AM thinking of building an ETL to query from Mercury UDS as a data source for reporting, depending on Analytics’ needs and potential strain to Mercury.
  • AB shared BQMS Lookup report with Tom M and Betty, will work next on VVP report improvements. Will follow up with LC team on need to update workflow for Exomes.
  • MM working on preparing is_latest report. Discussed new proposal for Mercury DWH changes with KC, reviewing what pending requests for Mercury should be included/simplified/prioritized.
  • RAS working on Troubleshooting tool for PMs, Tableau report dependencies for DependencyVisualizer, understanding MOC Stool Controls report request.
  • BITS Maintenance Window 6/8

2019-05-23

  • Notes from QNow:
    • GP won a contract from Goldfinch Bio (75,000 exomes, 5,000 genomes) – still confidential
    • LIMS has been working on releasing Fingerprint Store (Pipeline cannot use it until Firecloud issues are resolved) and Collaborator Portal (BITS working on InfoSec configuration)
    • Bottlenecks behind and after LC (scaled up to 10 plates a day)
  • Notes from QPlanning:
    • MMRF (Multiple Myeloma Research Foundation) project being mapped out this week on Dead Sea board
    • MOC group needs data analyst, lack of one may affect the usability of the data they’re getting.
  • Review of FY2019 Q4 goals
    • Updates to RQC data sources and reviewing/testing scripts for autopush (CG, AM)
    • CG working on top-off tool, RQC2 updates ready. Also working on SAW.
    • Documentation/troubleshooting guide of metadata changes/broken data for PMs (RAS will work on this)
    • Need to separate ETLs for SEQ_PROD and CRSP_PROD (AM)
  • AB testing BQMS Lookup tool on Tableau, working on Automation reporting changes
  • RAS worked with AB on reviewing changes on SGE workflow with LC team and LIMS team.

2019-05-09

  • Notes from QNow:
    • MKD reporting needs improvement so that uploads from the team are uploaded to the right path
    • Mercury implemented a 2nd instance for web service requests from Pipeline Cloud.
  • Notes from QPlanning:
    • Evaluation of a piece of equipment called Lunatic (from Unchained) for Pico processing
    • Planning for Town Hall meeting next Thursday before our Spring retreat
  • AM working on script for issue with past/future dates in Pipeline aggregations data (will potentially create a “wiggling dates detective” agent to flag these inconsistencies). AM has also made progress on Broad West ETL building for reporting. KC collecting user requirements for this reporting as well.
  • CG working on “stuck in RQC” reporting. Also working on PDO disassociation issues where PDOs do not appear on top-off when requested.
  • RAS investigated JIRA Service Desk (SUPPORT project) comment bug, Infinium Process Views reporting improvements, learning to copy production Jira into Dev instances. Looking into Microbial reporting request for controls. Will work on upgrading Jira Service Desk in the next week.
  • AB working on BQMS lookup reporting. Scott mentioned that during a CLIA visit this week, they felt confused by the Artel QC report. Teaching RAS to copy Jira instances during the past week.
  • KC and ZL met with BSP team looking to buy Tableau licenses. They’re looking to visualize Terra usage data.

2019-05-02

  • Notes from QNow board:
    • Fire with Mercury release and Walkup LIMS error messaging
    • Pico moving to Mercury
  • We need to evaluate how to maintain data from analytics.bsp_sample & bsp.analytics_sample DMs
  • Meeting with the Green team on Tuesday 4/30.
  • ZL working on financial ZDTR reporting on Tableau. Published Expense Explorer report this week.
  • AM to follow up with Erik about BITS Outage planning/registering non-Tableau tools on Icinga.
  • At last week's AoU call it was announced that we could be receiving some test plates as early as the week of 5/5. Goal: work through our receipt and accessioning process and report back any issues.
  • MM has been investigating mixed-PDO vessels in Mercury. Release has been included in Mercury, data backfills are being done. MM still needs to review if anything else needs to be backfilled.

2019-04-25

  • Notes from QNow board:
    • Infinium report issues (GPLIM-6212)
    • Initiative to move sample storage from BSP to Mercury (RPT-5407)
      • Need to take into consideration BSP is not only used for Sample/Extractions reporting
      • Question to think about: If there is data that users can already access through Mercury User-Defined search, should we really replicate Tableau reports with Mercury as a data source?
      • This will be a good opportunity for us to prioritize what new reporting to develop/maintain.
    • Auto-billing Exomes per PDO (to support planned increased scale in Exomes)
  • AM working on DWH “Metrics” web service, which will absorb services such as Illumina Summary, Base calling metrics (Broad West), and the Peltier logs service.
  • CG mainly focused on switching non-clinical over to new SAW service.
  • MM focusing on project to merge Walkup and production metrics into Analytics DMs. Troubleshooting an issue with dates on Metrics tables being recorded in UTC. Progress in resetting is_latest flags for aggregations. Following Mercury ticket about mixed PDOs.
  • RAS working on JIRA BQMS and Germline Exome workflow improvements, following up on Infinium Process Views bug, metadata changes.
  • ZL working on QuoteMaster report for SAP revenue reporting. Ongoing work on SkyWatch

2019-04-11

  • New adapters are being discussed, which may impact calibration & pooling.  MM will check in with Tom.
  • CAPA (Corrective Action & Preventive Action) - this is for audits, essentially an escalated BQMS ticket. We should think about how to use it for software problems. This may help our issues with the pipeline interactions.
  • Is_latest errors: MM has a document describing the effects. Short version of the business impact: aggregation metrics affect billing, selection of the correct BAM file, and possibly other things.
    • We don't know how they're setting the is_latest flag.
  • Mercury FP store will finally kill GAP. It's unclear whether analytics will be involved. Currently, pipeline calculates LOD but doesn't store how it was calculated (what was used).
  • Walk-up: JW is curling walk-up metrics to analytics DM so we can use these in the future. Maybe we can start merging walk-up metrics with regular metrics? (Source is in organic_run.) We'll want to ensure that we can also differentiate BroadWest.
    • It's undetermined how we'll handle accessing BroadWest data; GCP Slack channels had some chatter about this kind of thing.
  • Peltier logs are now getting curled into a new DM.
  • PDOSTAR manual refresh: we may start seeing a warning; don't interpret it as an error.
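
One way to surface the is_latest errors noted above is to check that every aggregation key has exactly one row flagged as latest. This is only a sketch: the column names (project, sample, data_type, version, is_latest) and the sample rows are hypothetical and do not reflect the actual metrics schema or how the pipeline sets the flag.

```python
from collections import Counter

def find_is_latest_violations(rows):
    """Return {key: latest_count} for keys that do not have exactly one is_latest row."""
    all_keys = set()
    latest_counts = Counter()
    for row in rows:
        # Hypothetical aggregation key; the real key may include other columns.
        key = (row["project"], row["sample"], row["data_type"])
        all_keys.add(key)
        if row["is_latest"]:
            latest_counts[key] += 1
    return {key: latest_counts[key] for key in sorted(all_keys) if latest_counts[key] != 1}

# Illustrative rows only.
rows = [
    {"project": "P1", "sample": "S1", "data_type": "Exome", "version": 1, "is_latest": False},
    {"project": "P1", "sample": "S1", "data_type": "Exome", "version": 2, "is_latest": True},
    {"project": "P1", "sample": "S2", "data_type": "Exome", "version": 1, "is_latest": True},
    {"project": "P1", "sample": "S2", "data_type": "Exome", "version": 2, "is_latest": True},
    {"project": "P1", "sample": "S3", "data_type": "WGS", "version": 1, "is_latest": False},
]
violations = find_is_latest_violations(rows)
for key, count in violations.items():
    print(key, "has", count, "rows flagged is_latest")
```

A count of 0 means no version is marked latest (reports would drop the sample); a count above 1 means downstream joins could pick the wrong BAM or double-count for billing.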

2019-04-04

  • Notes from Quarterly Planning meeting (4/3)
    • AoU delayed to November
      • FDA requiring an IDE medical device authorization for both Genomes and Exomes processes
      • IRB initial consent forms inadequate, unable to do anything with received samples until forms are updated and reprocessed for every participant.
    • Doubling of scale in Exomes may stress informatics. Scale-up from 20 plates a week to 52/week is expected to be reached in July. This will increase the importance of RapidQC autopush (let's call it "Cloud autopush" to reduce confusion)
    • Bayer collaboration with Broad/CDot to develop a molecule for EGFR mutation detection (lung cancer). Project will require blood biopsy/Twist exomes custom panel work.
    • Push on 10X and Microbial scale growth before AoU work
  • Reviewed FY2019 Q4 goals https://broadinstitute.atlassian.net/wiki/spaces/AN/pages/780337225/Analytics+FY2019+Q4+Goals+Apr-Jun
  • 1/3 of Broad already using SAP for quotes instead of QuoteServer. GP expected to transition into SAP by April. Virtually the rest of Broad by new fiscal year.
  • ZL and MM met with Chris and Phil from Green team to discuss is_latest and data type open issues. No definitive resolution yet, interest in having general ongoing check-in meetings with Analytics. However, there's a bit of a plan, MM thinks we'll be okay once they fix the bug.
  • Nasko to change the language on IS_LATEST cop (detective, wasted time, etc)

2019-03-28

  • Still no resolution to open Pipeline PO tickets. This should escalate to QNow board as a fire as it is affecting PDO data. It will be useful to have a blacklist report available to view what samples are caught in open tickets.
    • AM could produce a flat file with DependencyVisualizer data that could be used for building this report
  • Quality issue with MKD reporting, because of BITS moving files that were hard-coded into Tableau report. Issue solved.
  • No final details yet on Sequencing data loss issue.
  • Notes from 3/27 Planning meeting:
    • It may be helpful to let GP know who in the Analytics team should be included in conversations regarding different projects.

2019-03-21

  • TAG meeting with Green Team (03/14):
    • Meeting helpful to get a better understanding of both teams’ work. Need to escalate tickets by sending emails to managers instead of comments on tickets.
    • Interest in the Green Team to review business impact of pipeline issues.
    • Waiting for meetings to be set up with Green Team technical members to discuss issues with open PO tickets.
  • DSP 2-day planning meeting:
    • ZL and KC attended.
    • Presentation of different DSP projects that different teams are working on.
    • The number of DSP stakeholders has increased from only GP to a number of different organizations, consortia, etc.
  • GP Planning meeting:
    • Data Gen Council goals:
      • Reduce storage costs by 80%
      • Better customer experience
      • Samples to ramp up to 300,000 samples/year

2019-03-14

  • Issues with BITS update:
    • MM noticed Oracle DB did not come up properly on Sunday. Tableau glitch associated with this. Fixed by Monday.
    • AB noticed Analytics ETL Server down on Sunday.
    • May need to come up with a plan for next disruptive BITS update.
  • QNow board:
    • Pipeline still fixing array board.
    • Lab still working on improving HiseqX and NovaSeq performance.
    • Janki mentioned that Pipelines in a Month project will be renamed/changed into Scalable Pipelines.
    • Pipeline services had turbulence post-BITS update
  • Products meeting:
    • All of Us being registered as a medical device project
    • Waiting for response from UK Bio Bank

DataGen planning meeting:

  • Mention of cutting some goals in the planning board by 50% for next Q

AM developed a script that will help restart this and the Spark service if Analytics ETL Server goes down again. Presented new page with Unix scripts for restarting Spark and AnalyticsWebServer: "ANALYTICS" Unix server - useful tools and commands.

KC met with Jon Thompson, noticed some of the information needed for Pico does not exist in DWH. Through these conversations noticed there may be an opportunity in leveraging conversations between LIMS and Analytics on data availability. MM having similar issues with projects such as Exomes deemed “complete” but still requiring backend systems work. Need to implement a formalized process where projects are not signed off as “complete” until all of these issues are resolved.

  • AB designed swimlane diagrams to visualize interactions between Lab, LIMS, Analytics, and QA stakeholders. Having issues of this type with Automation.
  • AB also mentioned the PRP project type (Product Realization Process) as a way to follow up on signing off projects as "complete" from the LIMS/Analytics perspectives.
  • Potentially add a “Needed in DWH” field on JIRA for new requests

ZL presented update on RackFinder report

KC to create meeting to review workshop on innovation, Ladder of inference, and high-quality advocacy & inquiry

2019-03-07

Notes from QNow/Planning meetings:

  • HiSeqX and PF going down. NovaSeq yields are not as high as other Illumina customers. Lab looking to increase yields. Reason could be the adapters the lab is using.
  • Fingerprint Store is still stuck. It is being exposed outside of the Broad firewall and seems to require InfoSec approval.
  • Collaborator portal blocked as well.
  • Halfway through reprocessing Exomes in the Cloud
  • IBM project with Broad: Mostly DSP using machine learning for cardiovascular disease understanding, may mean more clinical genomes/exomes for GP

Reviewed ZDTR report ZL has been working on. Dynamic ways to visualize P&L data.

AB has been working on general instrument/automation team report improvements.

RAS set up 10x single cell workflow. Working on general JIRA configuration tickets, list of Custom SQL data sources for Dependency Visualizer, started work on GP Analytics Confluence pages revamp effort.

CG working on Liquid Biopsy Samples reporting and RQC2 improvements. Looking to start work on RQC3 to include SAWS.

KC attended Women in Data Science conference, HR workshop on Innovation/productive conversations. Looking to present ~20 min on the topic to the team.

AM working on performance issues on Cromwell viewer and improvements on DependencyVisualizer to include JIRA Cloud PO tickets. Sandbox available for TAG team to create dependencies.

MM dealing with consequences of is_latest flag issues. Created JIRA dashboard to visualize Pipeline and LIMS dependent issues.

2019-02-28

KC will get rid of the Tableau 6 archive in gp-reports/Tableau. We should all keep removing old stuff.

Goal: New functionality to show whether samples have related BQMS tickets. RPT-5278 is an example

Nasko and Charlotte made progress toward Charlotte's team handling is_latest issues before they leak to analytics

Nasko presentation on JqlTableauExtract service/DependencyVisualizer http://analytics:8090/DependencyVisualizer

  • (example inspiration: RPT-5310, which has a tag in the description making it show in the D3 output)
  • technical skills required: D3 (template from web), editing/adapting javascript, wrote scala to interrogate Jira descriptions, scala to check ETLs & their statuses
  • Datachecker can auto-create Jira tickets with the description tag if datachecker encounters a problem.
  • Suggestions for the tool may be added to the associated Jira ticket (link not rendered).

2019-02-21

Discussion of Pipeline issues with is_latest flag and data_type changes (PO-16120) in metrics.aggregation.

  • Ideas/Potential solutions:
    • Prepare a written summary of how/what these issues affect
    • Implement a Data Checker query to verify potential primary key breaking due to data_type changes
    • Thinking about workaround processes we could implement internally (such as new ETL queries)
    • Preparing a Tableau or JIRA report showing open Pipeline (and other) issues and what reports may be affected by these issues
    • Come to Pipeline Office Hours next Wednesday and present how our issues affect other operations
  • Twist Exome updates/delays: Lab issues, Pipeline issue where Exomes couldn’t be pushed into RQC
  • KC proposed identifying a primary/secondary Analytics contact person for each “capability area” in GP Planning Meetings so that the team is kept in the loop on changes/improvements that we may need to work on/be aware of.
  • AB was told by Erik that there’s an issue with JIRA and timeout sessions for CLIA compliance. They are currently looking into this issue. VVP update went into production this week. Noticed an issue in which BSP could not create tickets due to permission updates in the SUPPORT/Service Desk. Also working on GAPREQ Cut requests workflow request and on writing documentation on group alerts on Tableau.
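
The Data Checker idea above (verifying primary keys after data_type changes) could look roughly like the query below. This is a sketch under assumptions: the in-memory SQLite table stands in for metrics.aggregation, and the assumed key (project, sample, version) and the rows are hypothetical, not the real schema.

```python
import sqlite3

# Hypothetical stand-in for metrics.aggregation; real columns may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE aggregation (project TEXT, sample TEXT, version INTEGER, data_type TEXT)"
)
conn.executemany(
    "INSERT INTO aggregation VALUES (?, ?, ?, ?)",
    [
        ("P1", "S1", 1, "Exome"),
        ("P1", "S1", 1, "WGS"),  # same assumed key, data_type changed upstream
        ("P1", "S2", 1, "Exome"),
    ],
)

# Flag assumed-key values that now map to more than one data_type,
# which would break a datamart keyed on (project, sample, version).
CHECK_SQL = """
SELECT project, sample, version, COUNT(DISTINCT data_type) AS n_types
FROM aggregation
GROUP BY project, sample, version
HAVING COUNT(DISTINCT data_type) > 1
ORDER BY project, sample, version
"""
collisions = conn.execute(CHECK_SQL).fetchall()
for project, sample, version, n_types in collisions:
    print(f"{project}/{sample} v{version}: {n_types} data types")
```

An empty result would mean no key collisions; any returned row is a candidate for an alert or an auto-created ticket.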

2019-02-14

  • New version of Mercury went out on 2/13. Fix for Germline Exome workflow.
  • ZL working with Andy Hollinger on report comparing Nexome and Germline Exome performance. Made a demo of the view. Germline Exome seems to have greater coverage at a lower input of bases.

ZL updated the ZDTR report to include a viz that shows expenses by CO accounts, GL accounts and materials.

AM working on request to retrieve metrics for Walk Up Sequencing reporting.

KC meeting with Heather Jenkins (from DSP) about GP/DSP interactions.

CG reviewing report for old exome samples reprocessing in GCP that needs automatic refreshing. Also working on Selection Pool Creation and RNA Top Off tools.

RAS working on Pico Process Controls QC report, Infinium Process Views report improvement.

MM working on Pipeline BAM files is_latest flag issue. Also waiting on Mercury new version for RGHQC reporting.

Tableau Tip: Accidentally published without tabbed views? Easy fix: https://tableau.broadinstitute.org/#/workbooks/12925/details

Reminder about Friday Feb 15 info design talk (auditorium): https://intranet.broadinstitute.org/node/6919

2019-02-07

BITS considering utilizing Tableau for their reporting (reporting, outlier detection, alerting). DWH stored on Google Cloud. New BITS information security Broadie (Will Hedglon) using trial license.

Germline Exomes queue growing. Workflow working fine. Improvements/Fixes that are being made include working on missing messaging on Mercury and Picard metrics.

Discussion on new Admin toolbox add-on for JIRA. Useful for ease in replicating workflows and transitions, performance management on add-ons that are being used.

AB: Rolled out Microbe LCSET, need to add this on BQMS reporting. Working with RAS on GPInfo JIRA Service Desk redesign. Working with MKD team to remove MKD project from CRSP JIRA (scoping stage). VVP update on hold.

AM working on Samples Analytics workflow service with CG. Realized they may have a truncation problem; testing where and how data is being truncated by using the reports’ URLs. Issue with extracting data from JIRA text files; will demo JqlTableauExtract service next week.

CG working on ULP WGS report. Testing new Selection Pool Creation and RNA Top Off tools.

KC working on republishing Infinium reports, and reviewing bugs on some data calculations. Published ARRAYS_MERCLOUD as a published datasource since 2 reports are using this source. Working on Pico process QC reporting on Tableau.

RAS working with AB on GP Info JIRA Service Desk redesign, and on Kit Building & Receiving report improvements. Rolled out Customer Feedback BQMS issue type on LOJ this week.

MM added new Seq technology into PDO STAR 5. Also worked on Germline Exomes reporting. Worked on renaming samples ticket in which renaming occurred in RQC.

RAS and AB presented a demo for the Admin Toolbox add-on.

Design control requirements for using Tableau clinically or for All of Us (notes from Jan 11 meeting with Betty)

  • document and validate which data sources are used
  • document and validate any calculations and constants
  • provide an audit log
    • PSR table for sample workflow requests
    • Tableau Server log
  • restrict client IPs to non-lab computers as they don't have screen locks
  • testing
    • continuous automatic regression testing
    • manual testing for big changes

2019-01-31

  • QNow board:
    • Announcement of 40-50% price cuts for whole genomes and exomes effective 2/1
      • Cheaper to sequence using NovaSeq
      • Twist exomes process more efficient
      • Sequencing volumes going up
    • 4 platforms at Broad are using SAP for quoting instead of QuoteServer, 4 more platforms will be using it by next month, the rest of Broad will use SAP after that
    • Tom Howd announced hardware box for variant calling on sequencing. Hardware solution may work well short-term as a workaround until DSP solution is available.
  • AB to do Admin Toolbox for Jira demo next week
  • MM has been tracking Twist exome set, working on PDO STAR requests. Analytics reports ready for NovaSeq S1. Helped Nexomes team troubleshoot low outputs reporting.
  • AB working on VVP reporting updates. Met with Seq team looking to understand how to improve Jira boards.
  • RAS working on Kit Building and Receiving Monthly Totals view, debugging Grand Total calculations on Infinium Process Views, Service Desk revamp on Jira
  • CG working on preparing reports to support sample analytics workflow service changes. Also on requests for aggregations and others.
  • KC working on array data review.

Recent instances in which reports fell through the cracks due to turnover and delays: (arrays PDO review, GAPREQ tube return).

AM working on Sample Analytics workflow service. Including custom fields. Incremented singleton feature. Fluidigm credential fixed for access. Tiger issue last night where 3-4 ETLs failed. One of them included PDO STAR failure involving aggregation particle (of 28+ characters). Issues with BspSample ETL, asked BITS for help.

ZL doing outreach, working with the Development office (fundraising) to help on their Tableau reporting. Also working on mapping reporting for Broad African project.

2019-01-24

QNow board

  • Infinium rates going back up again
  • Twist pilot scheduled to happen this week.
  • HiSeq %PF dropping. Unsure if it may be due to a lower volume of genomes being sequenced.

CG has been working on RQC reporting and Exomes. Also updating Liquid Biopsy report.

RAS working on Service Desk revamp (RPT-5153), Infinium Abandoned Samples report, Confluence edits.

AB working on Tableau Alerts, revamping GPLIM workflow, new Microbe and Germline Exome (Twist) issue types on Jira.

AM working on Sample Analytics workflow service. No new features will be added to old version, but new version rollout will require careful work with ZL & CG.

There have been a LOT of changes in BSP_SAMPLE DMs, causing occasional 70-minute ETL runs. Unclear why so many samples get updated.

KC reviewing information on Sample Intake v1. Ran Audit Software on Tuesday night; will want to review which data sources have unused columns in order to optimize Server. First impression is a lack of consistency in naming data sources and columns; it may be necessary to come up with a standard naming convention.

Array data review - KC to talk with Jim and Mike. The plan is unclear, re: tracking the gesture to show that data have been reviewed.

2019-01-17

Fire: wrong GSA chips were used.  Lab is looking into a better way to pull chips.

Tom says they're trying to scale exomes to 20 plates per week. That's a lot to run through RapidQC. 

  • Twist exomes are the new exomes, they're going on NovaSeq and getting one read group. This may preclude the need for RapidQC human intervention at all at some point, though the RapidQC pipeline will still be critical for supplying metrics.
  • We should monitor the pooling penalty on the new exomes.

There's been some difficulty getting on the same page with Green Team about what metrics are important. Frustrating. We don't want another round of "this horrible fire could have been prevented by having the metrics people said we could scrap"

Mercury currently doesn't have a place for collaborator to enter the sample's original material type. Tina has to ask Maegan for "was this saliva"

KC to set up meeting with John W

  • Amy: VVP to XL20 counts
  • Mariela: sample attributes, especially those going into bsp_sample DM?
  • KC: initiation, Pico rework, etc.

BTUG was really solid yesterday. rank_percentile, extensions, etc.

Andrew B solidly using Tableau Desktop now. MM guiding him as needed.

v2 calibration being used for hiseq, no calibration being used for novaseq. Also, clinical and research are going together. However, only one script used by lab, so we need an automated way to determine which calibration factor to use on each sample in a plate based on clin/res and nova/hi.
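
A minimal sketch of the automated selection described above, assuming each sample on a plate carries a clinical/research class and a sequencer family. The HiSeq and NovaSeq entries follow the note (v2 for HiSeq, none for NovaSeq); the clinical/research split, field names, and sample IDs are hypothetical placeholders.

```python
from typing import Optional

# Calibration label per (sample class, sequencer family). Whether clinical
# and research should differ is an open question, so the identical values
# here are placeholders.
CALIBRATION = {
    ("clinical", "hiseq"): "v2",
    ("research", "hiseq"): "v2",
    ("clinical", "novaseq"): None,
    ("research", "novaseq"): None,
}

def pick_calibration(sample_class: str, sequencer: str) -> Optional[str]:
    """Return the calibration label for one sample, or None for no calibration."""
    key = (sample_class.strip().lower(), sequencer.strip().lower())
    if key not in CALIBRATION:
        raise ValueError(f"No calibration rule for {key}")
    return CALIBRATION[key]

def calibrations_for_plate(plate):
    """Map each sample ID on a plate to its calibration label."""
    return {
        s["sample_id"]: pick_calibration(s["sample_class"], s["sequencer"])
        for s in plate
    }

# Illustrative mixed plate.
plate = [
    {"sample_id": "SM-1", "sample_class": "Clinical", "sequencer": "HiSeq"},
    {"sample_id": "SM-2", "sample_class": "Research", "sequencer": "NovaSeq"},
]
result = calibrations_for_plate(plate)
print(result)
```

Raising on an unknown combination, rather than defaulting to no calibration, keeps a new sequencer type from silently going through uncalibrated.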

ZL learned PanCan is a custom panel. MM will learn more.

BITS updating Broad Crowd. Erik will update ours to stay in sync.

Chris working on ULP stuff

KL made a ticket about missing extractions info for liquid biopsies. There's a problem with our DM and it's in two systems.

Nasko: Zombie hunter has been struggling because the BITS script that reports the username got changed. Nasko adjusted our script and it worked again, identifying Picard runaways.

Nasko: new workflow JMSqueue updates in place 

2019-01-10

Reviewed demos of using views as legends on Tableau. Please share feedback with ZL if any:

  • SAP quotes view
  • Broad Space view

Quarterly planning meeting notes (1/9):

  • Reviewed planning priorities for Analytics (on Dead Sea Room board). Need to reach out to potential priority requestors/gather more information to confirm expectations from Analytics group:
    • Pico process controls
      • KC will look into existing tickets with RAS, open a new ticket if necessary.
    • VVP volume check QC
      • AB working on this report.
    • Stool extraction process
      • Potentially related to Katie Larkin’s ticket RPT-5124. KC will verify information, CG and RAS to assist as necessary.
    • Array data review
      • CG and KC to review how Mercury data looks compared to Tableau.
    • Sample intake v1
      • KC working on this committee with Robb, Jon Thompson. Consists of receiving samples and upfront Pico in Mercury. MM wants to be kept in the loop.
    • PanCan 2.0
      • ZL to learn more information from Justin in a meeting this afternoon. PanCan consists of a new cancer-screening panel. May require Custom Selection panels.
  • QNow board being re-filled during this week with this quarter's info.
  • AB working on Artel QC and VVP Volume QC reports, showing in a meeting this afternoon
  • RAS working on reviewing Confluence page organization for the Analytics space. Will follow up with KC on Analytics priorities.
  • MM working on analyzing pools on NovaSeqs. Failure on PDO refresh during the Holiday break, fixed and refreshed yesterday. Refreshed risk data report. Will also be helping Andrew set up Tableau Desktop.
  • KC working on reviewing priorities this week.
  • CG working on Liquid Biopsy reporting and RQC2, getting ready for Exomes.
  • AM working on new samples analytics service. Will be tied to PDO / PDO Sample data types.
  • Janice Mann will no longer be supporting 320C BITS stop 4-5 days/week. We will need to monitor any issues that happen so we can report back to BITS, or report any good ideas.

2019-01-03

  • We'll be able to make a Q1 impact with the QA arrays project. KC to work with Nasko to set priorities for this and the CountMeIn project
  • Nasko built a new CSV extract tool for Mike Wilson in MacArthur lab, which is really powerful and generic. Allows non-users to extract data from our datamarts without a direct connection. End user will need to prepare a list of target input and output fields, and then analytics can create a specialized SQL with associated ETL name, to produce their output.
    • Future: we will need to assess the risk of having our webservice tools like this one available to any users
  •  Chris continues work on the tricky table calc to batch samples into chunks to create a limited URL.
  • There was another Oracle meltdown (backups broke) over the break. 
  • LIMS team is pushing really hard on the CRSP portal, which is slow because testing needs to be outside the firewall?
  • Trying to get the ability to add an aggregation particle UI to Mercury, to reduce the developer intervention required.
  • Amy trying to work with Robb, Betty, Wendy. They want to reduce SUPPORT desk burden: new features vs bugs vs tasks. But what is realistic? It's still murky.
  • We're still straddling confluence on-prem and cloud. It's frustrating at times. It's lowest priority on Erik's list, though, so there's no timeline for migrating all spaces to cloud. Amy will look into putting a banner on our old space. 
  • MKD plate tracking is the last remaining piece of retiring CRSP JIRA.
  • MM almost ready to roll out the calibration factors for novaseq pooling.
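
The batching logic behind the limited-URL work mentioned above can be sketched outside Tableau as follows. This is not the table calc itself, only an illustration of the idea; the base URL, the comma-separated parameter format, and the length limit are assumptions.

```python
from urllib.parse import quote

def batch_ids_for_url(base_url, sample_ids, max_len=2000):
    """Group sample IDs so each comma-joined, URL-encoded batch keeps the URL within max_len.

    A single ID that is too long on its own still gets its own batch.
    """
    batches, current = [], []
    for sid in sample_ids:
        candidate = current + [sid]
        url = base_url + quote(",".join(candidate))
        if current and len(url) > max_len:
            # Adding this ID would overflow the URL: close the batch, start a new one.
            batches.append(current)
            current = [sid]
        else:
            current = candidate
    if current:
        batches.append(current)
    return batches

# Hypothetical endpoint and IDs, with a small limit to make the batching visible.
base = "https://example.org/report?samples="
ids = [f"SM-{i}" for i in range(5)]
for batch in batch_ids_for_url(base, ids, max_len=50):
    print(base + quote(",".join(batch)))
```

Measuring the encoded URL matters because each comma expands to three characters once percent-encoded.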

New pages in Analytics space:

2018-12-13

QNow Board

  • NovaSeq pipeline working properly again.
  • Twist Exomes having potentially good results. Justin Abreu still working on adding more indices in order to achieve better results.
  • RapidQC Exome metrics in process.
  • Squid functions on Mercury, lab needs to process Nexomes through Mercury for testing.

Tableau 2018.1.6 upgrade successful. No bugs or side effects so far.

  • Tableau beta on version 2018.1.7. A couple of bugs seen; will keep monitoring them before updating production.

Count Me In group seeking more Tableau visualization

AM developed a script for the MacArthur lab project KC has been working on with Michael Wilson. KC to meet with Michael during retreat to show the solution as a viable tool instead of using Tableau.

MM working on %PF fire. Also on an issue with missing LS-IDs. Now working on calibration factors for NovaSeq genomes.

AB working on documentation for pending pages. Wanted to demo new version of ArtelQC report but there was a VVP fire this morning.

RAS working on documentation updates. Will work with CG on revamping Tableau office hours.

CG having issues with table calcs on RQC2.

KC part of the Soft Eng retreat committee.

AM working on extracts .csv ETL. Also on Cromwell troubleshooting. Met with the developer from CMI to discuss work together. Working on implementing new authenticator protocol OAuth2.

2018-12-06

QNow Board

  • Submissions fire has been resolved (submissions to DBGAP are one sample at a time)
  • Infinium fail rate has decreased after implementing 1:1 mentor/mentee work with Infinium experts shadowing
  • NovaSeq bug to become the first BQMS CAPA (Corrective Action/Preventive Action) escalated issue
  • Yossi is scientific owner of pipeline
  • Sample intake and storage for AoU will use Mercury for Pico Queue. KC working with Robb & others on specs.
  • Sequencer Equivalence (MM) released, being shown to Illumina
  • AoU will be all Arrays. GP will need a report similar to RQC2 for rapid triage QC of Arrays, providing Sample Status. It would also need to include policies and technical documentation (batch them and mark some for review).
    • Additional need to match Sample reporting with potential BQMS deviation ticket involvement (alert if any given sample has been part of a BQMS ticket). Data review process for All of Us may fall on Analytics.
  • Tableau Server upgrade:
    • Ready to upgrade Tableau Server Dev this week, Production next week

      Version comparison (version, key features, extract format, admin tool):

      • 2018.1 (tableau, tableau-dev, tableau-beta)
        • Key features: Hyper extracts; Viz in Tooltip; Hierarchy filtering; Step and jump lines
        • Extract format: hyper
        • Admin tool: tabadmin
      • 2018.2
        • Key features: Tableau Services Manager (tsm) (Upgrade steps: external link); Nested sorting; Dashboard extensions
        • Admin tool: tsm
      • 2018.3
        • Key features: Set actions; Dashboard navigation buttons
      • 2019.1
        • Key features: Tableau Prep Conductor; Ask Data (natural language processing)
        • Extract format: no new legacy text or excel connections (https://community.tableau.com/docs/DOC-22075)
      • 2019.2
        • Extract format: no support for existing legacy text or excel connections

2018-11-29

QNow Board updates:

  • BITS Maintenance Window on Saturday 12/1
  • Betty Woolf reporting several BQMS tickets caused by human error
  • Reminder to RSVP for Holiday Party Thursday 12/13, 4:30 pm, (deadline: 11/30/18)

BTUG Meeting at athenahealth this afternoon

Krypton to retire in 2019 – need to use \gp-reports\TABLEAU_Files

Tableau is reviewing a bug in which embedded extracts refreshed during the day do not update when using Refresh. As a workaround, reloading the page is the best way to refresh data.

Tableau Server upgrades:

  • ZL and AM working on TabCompare QA testing between different Server versions
  • Team to review Tableau 10.5 and 2018.1 new features in preparation for upgrade

AM working with BITS to develop a script to stop and restart Spark services. Having the ability to use Docker container in Win10.

CG working on a Liquid Biopsy ULP report request. Also working on updates for RQC2.

RAS working with KC on ARRAYS_MERCLOUD datasource improvement, including additional Infinium Process Views calculations, working with AB on including CAPA and Audit Findings as issue types into BQMS, refreshing LOJDev. Working on checking Metadata Changes for consent withdrawal (SUPPORT-4493 and SUPPORT-4705)

AB reviewing including Customer Feedback (COM) issue type into BQMS. Working with Dave on VVP reporting. Reviewing use of the protocol field in preparation for LCSET types and workflow tracking meeting requested by Tom Howd for next week.

MM working on finalizing the Relative Performance of Sequencers report (Seq Instrument Performance). Also working on cDNA request to review Pico QC reporting, and new Genome Pool Calibration on NovaSeqs.

ZL reviewing financials such as Cloud spending and alerting teams that are overspending, also reviewing quote and revenue tracking reporting in order to stop using Quote Server.

2018-11-15

QNow Board

  • Holiday Party Thursday 12/13, 4:30 pm, RSVP by 11/30/18
  • Fires: Library Metrics cleared, Infinium Array Chip fire
  • Malaria aggregations not working at the moment, PACBio being brought to GP
  • New CLF report almost done
  • Fingerprinting API service live last night – Fingerprint Store
  • Killing SQUID and Collaboration portal -> “Ready for QA” status could take up to 3 months
    • Collaboration portal is outside firewall, so requires BITS in QA process.
  • We should start porting legacy text/excel connections (e.g. Zach's stuff, Broadmap, CustomSQL,etc)

Amy demonstrated functionality of Viz in Tooltip for ArtelQC report

2018-11-08

KC and RAS scheduling conversations with product owners grid about data needs/requirements we need to be aware of for our reports (BSP/Mercury data source)

MM working on LCSET performance report comparing Hiseqs vs. Novaseqs. Reviewing Spark Tiger Scala code.

What is the best way to involve GP in Tableau use?

  • Revamp Tableau Office Hours? Including food?
    • “Come and watch us improve X report live”
  • Meet with individuals to understand needs?
  • AB worked on updates to CLF resolution. On BQMS, including CAPA and Audit Finding issue types. MKD project type still active on CRSP Jira, looking into it before “killing” instance.

RAS working with KC on Ribo Green Quants report with Extractions group. On hold until we gain a better understanding on where/how to extract the needed data from Mercury in order to build the report to user specifications. Working on Tableau updates project/new features on JIRA.

2018-11-01

AM able to make progress on Spark Tiger, made improvements on ORSP Agent performance. Next step will be to improve RAQC performance.

Overall impressions from TC18:

  • Maybe create a high-level dashboard “How are we doing?” – akin to morning walkthrough
  • Potentially schedule time to watch TC18 keynotes together
  • Considerations about data security on Tableau for finances and HR data
  • Set actions very interesting – look into how to incorporate them
  • Most sessions already available on TC18 website

Considerations about Tableau Server upgrades:

  • User experience
  • Formal/informal testing of workbooks
  • New features on newer versions (10.5, 2018.1).
  • Goal: Upgrade to 2018.1 by end of year.

Admin group at Broad interested in a separate site for Tableau Server.

Notes from QNow Board:

  • Fire about Genomes and Exomes library data being joined incorrectly
  • Green Team reported that Exomes in the Cloud may be close to being brought to the pipeline
  • Squid retirement reported ready for QA
  • ULP metrics report – CG to meet with KC and RAS

2018-10-18

Notes from QNow Board:

  • Extractions ULP report ticket feasibility being discussed
  • Exomes in the Cloud: We need some Picard metrics, ZL and MM will be working on discussions for explaining these requirements with Steve, Justin and Maura and incorporating these requirements into the pipeline.
  • Updates from AoU: Illumina selected for Arrays. Genotyping samples may be shifted to March 2019, we will have more time to prepare for AoU and process existing inventory for TopMed.
  • Once external libraries are on Mercury (expected to happen 10/18 evening), Squid may be retired. Expecting to configure FP in order to retire GAP as well.
  • Discussion about MOCHA as a replacement for PolicyStat.

Mercury User-Defined Search training today at 1:00 pm with Jon Thompson

  • Purpose: Learn how to help users understand in which cases they may use Mercury’s UDS for reporting.
  • Will also help to understand key BSP attributes from UDS in order to request ETL to be done from Mercury

There may be a need to reconfigure and stagger scheduled extract jobs on Tableau Server
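
As a planning aid for the staggering idea above, start times can be spread evenly across a window before entering them as Tableau Server schedules. This is a sketch only: the extract names, window start, and spacing are illustrative assumptions, and the schedules themselves would still be configured in Tableau Server.

```python
from datetime import datetime, timedelta

def stagger_jobs(job_names, window_start, spacing_minutes):
    """Assign each job a start time, spaced spacing_minutes apart, in list order."""
    return {
        name: window_start + timedelta(minutes=i * spacing_minutes)
        for i, name in enumerate(job_names)
    }

# Illustrative extract names and window; not the actual schedule.
jobs = ["PDO STAR", "RQC2", "BSP Samples"]
schedule = stagger_jobs(jobs, datetime(2018, 10, 19, 1, 0), 20)
for name, start in schedule.items():
    print(f"{start:%H:%M}  {name}")
```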

Review of Oct-Dec 2018 quarterly goals: Think about what goals to add. Current structure may not continue after Q2.

What are potential standard components or labels for ticket follow up that could be implemented on Jira?

AM reported the following Spark issues:

  • Spark and current Tiger can't cooperate freely since Spark is stuck with Scala 2.11 (Tiger is on 2.12). Command-line inter-communication is being evaluated - it allows each one to be in its own JVM/Scala container.
  • Orsp agent takes an unreasonably long time (15 min) to complete. It's not a blocking issue (it runs once a day), but not knowing what is happening is not good.
  • Orsp: change the WebServices layer (switching from Dispatch to AkkaHTTP)
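
The command-line inter-communication idea mentioned above can be sketched generically: one program starts the other as a separate process and exchanges data over stdin/stdout, so each side keeps its own runtime. The example below is a Python stand-in, not the Spark/Tiger setup itself; the child program, the JSON message shape and the `call_other_runtime` helper are all assumptions for illustration.

```python
import json
import subprocess
import sys

# Stand-in for the "other side": a separate process that reads JSON on stdin
# and writes JSON on stdout. In the real setup this would be a JVM program.
CHILD_CODE = (
    "import json, sys\n"
    "request = json.load(sys.stdin)\n"
    "json.dump({'echo': request['job'], 'ok': True}, sys.stdout)\n"
)

def call_other_runtime(payload):
    """Run the child as its own process and exchange JSON over stdin/stdout."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)

response = call_other_runtime({"job": "refresh"})
print(response)
```

The trade-off is process start-up cost on every call, which may matter little for jobs that run once a day.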

AM also reported that lessons and tricks learned would put us in position to attack more complex tasks (like MetricsFromCloud ETL if we need it)

KC working on updates for CLF

CG working on RQC2 and ULP report.

RAS: BQMS Beta Dashboard, shadowing AB on CAPA ticket configuration on LOJ. Extraction Seq Metrics & Extractions Ribo Green Quantification reports updates with KC.

AB working on updates for Artel report and Automation reports. Working with Nasko on Scala scripting, learned to do flatMaps.

MM working on Exomes in the Cloud. Starting Sequencers validation and NovaSeq run times.

2018-10-11

AoU project expected to last 5 years. There may be changes on the Analytics Group work to support other areas in Broad.

High volume of Genomes (expected for the next 3 months), low volume of Exomes

Broad West contract pending.

Subscriptions enabled on Tableau Server (800x600)

  • They work on any view

Processes/reporting that rely on BSP data

  • Pico TATs
  • Pico Quants
  • Freezer locations / SRS
  • Fingerprinting
  • Analytics BSP Samples reporting
  • Kits tracking (building, receipt metrics)
  • Lab Operations

MM making updates on clinical reporting.

KC working on updates for CLF.

RAS will work on scheduling time for Mercury User-defined search, working on BQMS reporting with AB, Wendy and Betty

CG following up on project with Steven and Tina.

AM had issues with RAQC credentialing. Zombie Checker working well. Working on Spark Tiger issue with Scala.

2018-10-04

Review of Tableau Prep use presented by Mike Dunphy.

  • Count Me In project.

Think about enabling self-service subscriptions on Tableau Server.

  • May require designing special versions of visualizations that are email-friendly.

No Strategy Board meeting this week. Town Hall at 2:30 pm today with All of Us celebration.

CG requested new computer, expecting to be delivered next week.

AM found issue with RAs QC, found another issue with Spark being stuck with Scala 2.11. This will affect usage of Tiger3 features.

AB working on making updates to drive Automation changes. VVPs won’t be analyzed through Artel.

RAS worked on finishing Infinium Process Views chip heat maps and data source changes with KC, updating beta BQMS report to be reviewed with AB and Wendy.

MM published estimated end dates report for NovaSeq. RGHQS extracts have been replaced (except for 2 extracts KC will be taking care of).

2018-09-27

ZL, MM and AM had a call with Green Team re: Exomes in the Cloud. Picard metrics may be included. Target for this may be October 15th.

AoU numbers for Year 1: 50,000 arrays and 15,000 genomes. TBD on arrays being Infinium or Axiom. Expected 3-5 year contract, contingent on Y1 performance.

  • Announcement party for GP on Thursday 10/4.
  • GP Outing may be pushed for later in the year or early next year. GP Holiday party may be larger to include both.
  • Analytics concerns: Will AoU affect killing Squid? What other ongoing projects may be put on hold to accommodate AoU?

LOJ update on 9/26. Security issues are being fixed. Existing scripts were causing issues with the update; these are also being fixed.

AM made updates to Sample workflow, including PDO, PDO Sample and Data Type.

AM also identified an issue with a zombie query on Tuesday: the query had been running for 7 hours. A query was designed for ZombieChecker to identify the user running a zombie query and send an automatic email. This will allow DB resources to be available for users.
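
As a rough illustration of the check described above, the sketch below flags sessions that have been running longer than a threshold and composes a notice for the owner. It is not the actual ZombieChecker: the session records, the 2-hour threshold and both helper functions are hypothetical, and a real version would read sessions from the database and send the email.

```python
from datetime import datetime, timedelta

def find_zombie_queries(sessions, now, max_runtime=timedelta(hours=2)):
    """Return sessions whose query has been running longer than max_runtime."""
    return [s for s in sessions if now - s["started"] > max_runtime]

def build_notice(session, now):
    """Compose the text of the automatic email to the query's owner."""
    hours = (now - session["started"]).total_seconds() / 3600
    return (
        f"To: {session['user']}\n"
        f"Your query (session {session['session_id']}) has been running "
        f"for {hours:.1f} hours. Please cancel it if it is no longer needed."
    )

# Hypothetical session data for illustration only.
now = datetime(2018, 9, 25, 16, 0)
sessions = [
    {"session_id": 101, "user": "user_a", "started": datetime(2018, 9, 25, 9, 0)},
    {"session_id": 102, "user": "user_b", "started": datetime(2018, 9, 25, 15, 30)},
]
zombies = find_zombie_queries(sessions, now)
for z in zombies:
    print(build_notice(z, now))
```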

AM working on issues with Cromwell server that occur when more than 20-30 projects are running in parallel.

KC helping Rafael with ongoing Tableau projects, working with MM on RGHQS. No more blockers with CLF, High Throughput project should be done with one additional report. Waiting for answers from CLF team.

CG finding that RGHQS2 published extract data source is failing consistently. PDO Start 2 taken out of the server. Met with Steve and Tina about RQC2, looking to have more functionality built into it. MM helping with adding Sample Coverage Normalized to RQC2 data mart.

RAS working on improvements for Sequencing Performance Metrics and Walk Up Run Metrics Dashboard with KC; also on BQMS report updates with AB. Supported CLF Jira updates (custom fields and validators to workflows). Looking to review potential upcoming Analytics projects gathered during onboarding with KC.

AB built Tower of Hanoi on Jira. Passed Advanced Jira workflows exam. Working on incoming BQMS updates on Jira for high-priority tickets. Waiting for Single Cell team for feedback on workflow design details.

MM working on end times for NovaSeq runs. Waiting for Steve’s feedback on Extracts report. Supporting Susanna on Post LC ticket. Also identified a Swap with Tina that had to be reviewed.

ZL worked on a couple of projects with the TAG group: ichorCNA (linking ULP and ichorCNA reports) and Single Cell (uploading data to Google Cloud SQL; cross-database joining between Google and LOJ may be possible).

KC to create an Onboarding page with RAS to account for information on RPT-4910 and other disparate sources.

AM proposing to have a “RIP” party for defunct server. Potential date 10/11 (given that there will be a GP Town Hall/Cocktail celebration on 10/4).

2018-09-20

GP received a letter of intent for AoU funding, specific percentage between arrays and genomes will be known in coming weeks

Phase 1 of Exomes in the Cloud will use Zamboni. There will still be a need for metric reconciliation with Picard metrics and Rapid QC.

AB working with RAS on improvements to BQMS report (including Instruments, failures by Instrument type, etc). Also working with AM on learning Scala for Artel report project.

CG meeting with Steve and Tina about RQC2.

AM found an issue on JiraWeb connector still connected to Tiger2. Switching services to Tiger 3. Thinking of a streamlined method to download metrics to DataMart.

MM has been working on replacing extracts with DataMarts. Preparing a list of metrics/requirements that we may need to do for Exomes in the Cloud.

2018-09-13

HiSeq team hiring additional personnel.

Green Team working on Exomes in the Cloud. We haven’t gotten feedback yet.

Positive comments on AoU, nothing specific until Monday 9/17.

MM has been migrating ETLs to Tiger3. Replacing RGHQS2 Full Extract with direct connection to SLXRE2_RGHQS. This may be published as a pass-through directly to the data mart.

AB working on updating BQMS report to accommodate CAPA deviation issue types from CRSP Jira. Meeting 9/14 to gather feedback. Also following up on LabOpsJira update date (email thread going on regarding different ideal times of day)

RAS working on Process and Validation Controls and Infinium Process Views update. Positive feedback from Extractions team, awaiting further potential feedback.

Updates on RQC2. CG included a provision on how to include saliva samples, working on further improvements. Having issues with Windows 10 update, will wait on AB’s troubleshooting with BITS.

AM confirmed that as of 9/12 all web services switched to Unix 7.6 server. Services working well except JQL for a couple of hours. Between 9/7 and 9/10, individual ETL and Spark cluster were also moved to the new server.

2018-09-06

Strategy Board meeting this week discussed a fire GPLIM-5752

Reporting-errors conversation to be held in a separate meeting

MM to review flowcell and library blacklisting for ETL

KC to review tickets RPT-4970, RPT-4971 and RPT-4972 with RAS

MM reviewing Illumina Output report for performance issues.

AM doing work on support of labels in the Jira DWH layer. Still working on TDE hyper extracts

KC mostly done with CLF, may publish a draft and wait for team feedback before making final changes.

RAS continued with onboarding meetings, will review potential analytics reporting that have come from these meetings with KC. Continued Jira Admin Support training and workflow management with AB.

CG to give Sam list of users requested.

AB reviewed Workflow XML with John Walsh and RAS

ZL noticing slow extract times on Tableau Server, may have a separate meeting to see how to stagger reporting updates.

2018-08-30

New date for AoU decision: 9/17

NovaSeq and HiSeq backlog from the past weeks resolved

VVP tips supply has run out

CRSP Somatic pipeline old, commitment to be updated

Mercury QA: Changes that may affect us: Fingerprinting may be done in Mercury instead of GAP; External libraries are being tested to be uploaded to Mercury.

Exomes in the Cloud pipeline may be live on October 1st.

  • We may want to implement test-driven development for upfront troubleshooting/debugging.
  • AM wants ID of a sample run as both exome and genome to test split API requests.

A lot of the TCGA samples are failing, “Banks pipeline” being implemented to solve these problems.

LabOpsJiraTest having performance issues, Erik working on solving. Automation changes are coming up from Betty and Wendy; AB will work on them and will need AM’s help.

Ongoing onboarding for RAS, meetings with GP team members. Pending review of extractions report with KC.

MM has been finalizing operational tickets. Issue with consent withdrawals refreshing flow cell data when all runs were marked as “cancelled.” Having a couple of issues with NovaSeqs, some runs have 0 duration or with an “undecided” status.

CG catching up, working on updates for RQC2.

AM update on new Unix server being built (RedHat v7), BITS says we’re unable to run hyper extracts and TDE at the same time.

ZL saw issue with hyper extracts of RGHQS on Tableau Server. Issues should be resolved before updating to Tableau Server 10.5. TSM and SSL issues must also be resolved before updating to Tableau 2018.2.

MM will look into necessity of RGHQS full extract.

2018-08-23

High amount of backlogs from Novaseqs has been reduced this week.

Walk-up issue on paired runs being misreported (SUPPORT-4431) – Assigned to Mariela

Green Team resuming Exomes in the Cloud, it’s assumed that if we win AllofUs in a couple of weeks, they may need to shift priority to Clinical Array Pipeline.

Attempt to bring Walk-Up to Broad West in September. Some runs will be initiated in San Francisco, it’s possible we may not get data from those runs (contingent on overview of Verily machines by John Walsh).

  • We would be interested internally in comparing metrics with those from Cambridge, in order to increase performance. Verily will run their own pipeline, we would ideally like to see their data for benchmarking purposes. Outcome TBD.

Trouble with 80X Genomes in the past, but with NovaSeqs they may be able to do them in the same library.

BITS trying to retire internal Stash server and migrate everyone to GitHub. LIMS already moving code into GitHub.

Mercury to become fingerprint server instead of GAP. This will allow the pipeline to check fingerprints against Mercury instead of GAP.

All of Us application very significant, we will find out if Broad will get it on September 7

BTUG next Wednesday, August 29 (2-5 pm). RSVP

GP LIMS Outing to Escape the Room on September 4 (12-5 pm). RSVP with Amy.

Nasko working on pushing new UNIX server with BITS to substitute current Windows server. Operating system has to be updated because Tableau extraction libraries require it. Nasko will have the team try different applications before migrating officially. Not sure on what to do with the current UNIX server yet.

Mariela will review an existing bug with QC metrics, and then catch up with recent tickets (such as SUPPORT-4431 and RPT-4904 with Nasko)

Rafael has continued attending team meetings within GP to become more familiar with processes, also attended a Lab tour earlier this week. He’s been shadowing Kristen on RPT-4498 and learning Jira functions with Amy.


2018-08-16

Visitor: Julia Nash (rising 9th grader)

Future: handling library & flowcell blacklists for ETL (let's also look at that Confluence page of fix recipes, since it's kind of disorganized)

New strat board format. We are in charge of finding ways to represent our work and show how it integrates

Novaseq backlog is pretty bad. Tammy feels stuck. Sheila suggested hiring more people and Tammy said she's been asking for that for 2 years.

Mercury LIMS: next big project is taking on Fluidigm FP. 

Mercury team is getting VarioSkan into Mercury but we're retiring them, replacing with Gemini plate readers anyway. Maybe switch also from picogreen to accublue?

CCLF we want to wind down, but 

Janki said Exomes in the Cloud is getting back into development. Also the AoU will require clinicalization of pipeline, and may use Affy instead of Infinium.

Nasko wants our feedback about handling the ETL hiccups while he was gone (which were quite minor, especially compared to 2015/6). Let's discuss next week.

KC making progress on CCLF, hoping to publish next week

Chris working with NovaSeq issues. Met with Tina for RapidQC. Working on Topoff tool, too, but we're kind of working on two versions at once.

Rafael is doing well with onboarding. Required trainings are complete, training/learning with resources is ongoing. He wants to meet individually with our various customers to gain clarity on their roles and how they connect to our work.

KC will work on getting UDS demo/training for our group from the LIMS team.

2018-08-09

Welcome Rafael!

Zach will kill the obsolete reporting group (google group)

RPT-4919 WGS events: This can be a topic for the Tuesday LIMS meeting: Can this be done in Mercury or do we really need it in Tableau?

What's the plan with plating in BSP vs Mercury? We don't want to build a new Seq Plating report if it's going to become obsolete soon.

Chris is working with users to negotiate which functions from RapidQC v1 are necessary in v2

Kristen finally published the extractions/seq metrics correlation report. The hard part was finding the relationships between extraction samples and downstream sequencing samples.

Strategy Board is getting another redesign. Meh.

Zach created the ichorCNA report for Junko & the TAG team

2018-08-02

Kristen promoted to principal analytics engineer

Christina demonstrated new RQC2 and TOT2 tools

Cloud Confluence: Discontinue use of Analytics old confluence.  Cloud Confluence has LucidCharts not Gliffy.

Amy's SUPPORT views are extremely informative  https://tableau.broadinstitute.org/#/workbooks/13110/views

NovaSeqs do not write to file per cycle so new algorithm will need to be developed for RPT-4904

2018-07-26

Strat Board review: no rush moving exomes to NovaSeq (waiting on new Twist process)

Axygen tips are getting really bad, so we're looking for a new supplier (Agilent is costly but reliable in the meantime)

Permissions: FireCloud or CRSP: Grantreader permissions have gone wild and it's costing us for all those downloads. They're revoking and tightening those up. 

14 samples were lost in extraction when an experienced user skipped a step. Let's look into ways to shake people out of monotony. Dynamic photos? CountMeIn photos saying THANKS thrown onto the Tableau screens?

Meeting with Mercury team yesterday: billing hotfix went out (Susanna noticed the frameshift, it was due to obscure Java problems)

Analytics may be the first test project to move from onprem Confluence to cloud confluence

We need to parse NovaSeq PF values from summary.csv. Let's meet today to do it together (find ETL, etc)

Amy helped ID some of the SUPPORT painpoints in JIRA to respond to JT's overloaded support backlog. Tableau report in demo status, very useful 

Amy looking into which JIRA gadgets & plugins are being used so we can upgrade to the optimal JIRA version

Chris topoff improvements are ready for testing (will demo here next week)

Chris looking at NovaSeq completion dates, JW looking into options

Kristen did a bunch of withdrawn consent stuff and it's painful. Unclear whether we're compliant, and even if we are, it's slow and tedious and not spelled out. 

Kristen current priorities: CCLF development and Extraction Seq metrics reports.

2018-07-05

Puzzle Room outing (Tues, Sept 4th)

Updated ETL recipes page, overview meeting scheduled for Monday (https://labopsconfluence.broadinstitute.org/pages/viewpage.action?spaceKey=AN&title=Analytics+ETL+Framework)

Amy: Wendy wants a way to correlate BQMS tickets with downstream data.

All Of Us application is complete

Quarterly goals review

Nasko developed a way to make hyper extracts (hyper TDE files). It has requirements we don't have on analytics server, so we'll build a new server that's ready for hyper.  If the new server is ready before we're ready to switch to Tableau 10.5, Nasko will enable it to also handle old TDE files too.

MM new risk data: tested ETL and spotchecking. Needs to follow up with users about the changes she's seeing in rates etc.

Aggregation particle for new test FCs is being checked by pipeline (Jon hard-coded the particle into a couple of lanes). We need to ensure that the API continues to work (and when we'll need this new column) (MM). How is the aggregation name incorporating the particle info? How will this affect our ETL?

Moving pooling calculator to Tiger3 but there's some performance issue so far.

KC creating documentation for PMs for metadata changes (new process flow and FAQ)

ZL working with Junko (TAG team). Maura is working in Tableau. 

2018-06-28

Tuesday's BTUG was lovely in its sparseness and sobriety. Kinesis might be helpful if we need to test more things the way Chris is doing for MKD & TopoffTool etc. Next one is in August.

RapidQC & Topoff tools: how many samples can be in the URL before it reaches a limit?
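
The answer depends on the length limit of whatever sits in the request path (browser, proxy, server), which these notes do not state. Given a limit, a rough count can be worked out as in the sketch below. The base URL, the `samples` parameter name, the sample ID format and the limit are all placeholders, not the real RapidQC or Topoff URL scheme.

```python
from urllib.parse import urlencode

def max_samples_in_url(base_url, sample_ids, max_url_length, param="samples"):
    """Count how many leading sample IDs fit before the URL exceeds the limit."""
    fitted = 0
    for n in range(1, len(sample_ids) + 1):
        # urlencode percent-encodes the commas, which adds to the length.
        query = urlencode({param: ",".join(sample_ids[:n])})
        if len(f"{base_url}?{query}") > max_url_length:
            break
        fitted = n
    return fitted

# Placeholder IDs and limit; substitute the real URL and the limit to be tested.
ids = [f"SM-{i:05d}" for i in range(1, 501)]
count = max_samples_in_url("https://example.org/rapidqc", ids, 2000)
print(count)
```

If the count turns out to be too low for real batches, sending the sample list in a request body instead of the URL would sidestep the limit.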

Lots of infinium coming in. What's the story with GPLIM-5540 (Mercury receipt instead of BSP)??? How do we handle transitions to Mercury in an ongoing manner? (extractions, freezer storage, receiving, QC scores, etc)

LCSET splitting is still on the table

GOING THROUGH THE TICKET BACKLOG

2018-06-21

LCSET pooling (Justin, the workflow is in development, possible new object, there's still no clarity on what will work for lab, LIMS, and analytics.) Goal is efficiency, handling potential future demands. PCR Plus example. Physically rearraying plates. Special selection of libraries. Biggest risk is that things get screwy and untrackable. We may be going back to the Squid shopping cart model. The modern equivalent of that is the Bucket, but that's not all sussed out.

There's discussion about extending Tableau tools (e.g. Rackfinder) to cover Mercury freezers. Pushback: why even need an SRS team (Jim) if we can simply avoid shattering sample groupings? Pushback: don't incorporate non-LIMS tools for clinical samples. AllOfUs would be large-scale, at which point we might need it.

Products team: Erik is out this week and our build/release process is stuck without him. He has tried to document, but we aren't there yet. 

QA team big push to surge through the QA backlog to get it to zero. 

New procedures to refresh Analytics DMs: Tiger3 ETL Shell Commands (refreshes, etc.)

More agents migrated to Tiger3: ETLDashboard, Rghqs, LimsAncestryEvents, JiraTransitions. Plan is to decommission the Windows version altogether. 

New LCSET DM in SEQPROD: analytics.lcsets (refreshed every 4h), to avoid cross-DB joins to JIRA DB

Metadata change plan: 2-phase deployment. LIMS/Analytics/PO teams are all on board. 

  • Phase 1: SUPPORT ticket continues, LIMS does changes in LIMS, close SUPPORT ticket, PM makes PO ticket. Analytics only involved if there's a problem, with a new SUPPORT ticket.
  • Phase 2: No more support ticket, PM does changes in LIMS and makes PO ticket. Analytics & LIMS only involved if there's a problem, with a new SUPPORT ticket.

Squid death: slow progress. aggregation particle functionality being delayed by AllOfUs

Tableau Server update to 10.4 MONDAY NIGHT JUNE 25

  • tableau-dev is running 10.4, look for discrepancies in most used plus most critical reports
  • need to wait for window when MKD software not being used
  • requires formal testing of MKD and informal testing of helper apps like RQC and TOT

2018-06-14

SoftEng retreat was a success. Interesting disagreement about the cost of a genome.

Tableau 10.4 is on track

Tableau is moving to a subscription pricing model instead of license buying (TS). This gets them more money in the long run. We do see a speed reduction when all 4 cores are working, so we may want to look into options, trying to get the most out of our great deal. Enterprise pricing is now $400K, but that's going away. (We're currently paying only 30K/year)

New Metadata Change plan

We're ditching the 2 monitors in the other building. Need to remove the screenshots from TS/batchscripts

We need to fix the Tableau sign-up page on go/tableau.

Rising 9th grader coming for a week in August to shadow Kristen and see what we do here

PDO_STAR performance improvements (mariela). Want to remove dimensions. Removed unnecessary joins. 

Conference planning (TCx4 and ASx2)

2018-05-31

Affymetrix machine (Genetitan) being brought in by Verily.

Fat panda is definitely not happening.

Amy went over extract vs. data source filters: extract filters exclude rows, which makes for smaller extracts, while data source filters don't actually exclude rows, so the extract stays large. Reference: Tableau extract filters vs. data source filters (order of operations)

Amy went over unioning Excel files that the lab saves in TABLEAU_FILES/DNAQC (BP* batches for FIQC positive control reporting). Reference: Tableau multi-file data connector

Nasko: Cromwell viewer no longer having out of memory errors

What are we going to present at the softeng retreat? Nasko to pick something interesting that shows what our group does

TS 10.4 production update testing in progress - check workbooks

Kristen's new desktop -  ongoing woes

2018-05-17

Last group meeting with Rachel (sad) 

Tableau alerts now in production on Tableau Server (TS), beta testing Winnow alerts using Tableau Online

Nasko might make a CromWatch poster for Softeng but has a Google training conflict for June 11

Still need to upgrade production TS to 10.4 by end of June (MKD testing required, any other that must be documented?)

Plan to start adding hundreds of users to TS via active directory group import - need a way to keep the Google group current

Zach to discuss puzzle room outing with Jim

Amy working on parsing BQMS tickets to determine number of affected samples per PI and PM using BSP data

WalkUp metrics have been affected by the CLIA permission lockdown even for non-CLIA runs

2018-05-10

Chris working on liquid biopsy reporting. Also WGS booking strategy. 

RapidQC metrics are in same table as WGS, and reports are retrofitted. Tina's excited about going to a more graphical interface.

Nasko dealt with changes to John Walsh's Illumina summary file & OS (Linux to Windows). Exposed another web service so JW can run a curl to upload the file to analytics server.

Tiger3: biggest thing is Cats (functional framework in Scala)

Zach's been working on cromwell viewer. Also Shades@ wonders if we can assess our own research for diversity. 

Mariela GCP fundamentals class was good. New aggregator. Testing metadata changes automation queries, look good. Estimating workload for Katie Larkin's RNAQC request. 

Amy helping Justin & Shelly JIRA-track their testing panels for MRD. Alert systems in Tableau. 

Kristen: CLF, Infinium, CMI.

Rachel: seq plating and heatmap development. Now focused on documentation before she joins DSP.

CromWatch development with alerts

2018-05-03

Cromwell API - overcome "1-project-per-request" limitation. Can consider looking into samples instead of projects. We can filter by status.

  • When pasting into Tableau, we should verify that Tableau isn't doing anything weird.
  • Implemented in JIRA DB because the JSON_TABLE piece needs Oracle12
  • Currently only on Nasko's computer, until DaveG whitelists the analytics VM
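
While the API only returns one project per request, one workaround is to loop over projects, issue one request each, and merge the results, filtering by status along the way. The sketch below assumes this shape: `fetch_project` stands in for the real call, and the project names, workflow IDs and status values are made up for illustration.

```python
def collect_workflows(projects, fetch_project, statuses=None):
    """Issue one request per project and merge the results into one list."""
    merged = []
    for project in projects:
        for workflow in fetch_project(project):
            # Keep everything when no status filter is given.
            if statuses is None or workflow["status"] in statuses:
                merged.append({"project": project, **workflow})
    return merged

# Stand-in for the real API call; returns canned data per project.
FAKE_RESPONSES = {
    "P1": [{"id": "wf-1", "status": "Succeeded"}, {"id": "wf-2", "status": "Failed"}],
    "P2": [{"id": "wf-3", "status": "Running"}],
}

def fake_fetch(project):
    return FAKE_RESPONSES.get(project, [])

failed_or_running = collect_workflows(["P1", "P2"], fake_fetch, {"Failed", "Running"})
print(failed_or_running)
```

Passing the fetch function in keeps the merging logic testable without network access; the cost is still one request per project.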

Enable Tableau Server alerts using either method below?

  • Tableau alerts
  • Winnow alerts - Zach had a call with them; they're excited to give us beta access, but we'll need to test it on Tableau Online because of firewalls

2018-04-19

Amy wants to add JIRA office hours to the Tableau office hours

Pipeline and Analytics will agree on a plan (for allowing custom aggregations), and then Mercury will implement changes, pass that out to the pipeline API, and then those new custom projects will show up in aggregations DMs. We'll need to ensure it doesn't mess up our tables.

Workflows in the cloud are currently in both cromwell and zamboni. So far, Nasko & Zach are only able to get one project at a time from the Cromwell API, so that work is on hold until we find out more.

Genome LC process is still totally messed up (2 fires: 2-plate LC is so bad compared to 1-plate LC, and covaris is still a mess). This may affect our throughput.

Making an offer for a new QA person – looks promising.

Went through the backlog (unassigned tickets)

2018-04-12

Mariela: Dynamic filtering by user defined expression in Metrics Plate View viz

Some RNA LCSETs moving from Squid to Mercury – does this affect us?

  • Mariela to check messages to confirm we get the correct library types
  • Zach checked with Susanna on RNA aggregations - they won't move RNA products or projects that require custom G projects

Sample tracking help for Count Me In - we may want to extend that to the Center for Mendelian Genetics, who may also come to us.

Quarterly goals on the Strategy board (which ones are up there, how to keep it updated)

Eliot has begun learning Phil's Metadata change process and is horrified.

MDCM Updates: Nasko & Rachel have built a tool to check for support tix, look at their linked PO tix, and then find the zamboni workflows associated with it to see their status. 

Workday self-evaluations need to be initiated this week

Pipeline working on a custom aggregation option for Mercury to share with Analytics next week

2018-04-04

Quarterly goals review.

SoftEng retreat content planning (posters?) (Confluence doc for planning HERE)

  • Retreat date: June 11th
  • From Moh's email on March 16:
    • We hope you will find something to present that is inspiring to your fellow engineers. Here are a couple of suggestions about the kinds of work we'd love to see highlighted via a poster:

      • Useful to other engineers: A tool, library, application, documentation, etc. that other engineers use or find valuable.
      • Developer ergonomics: Ease of ramp up, good documentation, high productivity, ease of deployment.
      • Usability: Easy to learn or powerful experience for end users.
      • Testing: Software is reliable, has good test data, easy to add tests, etc.
      • Security: Architecture is secure, security is tested, etc.
      • It's just cool for some reason.
  • OUR IDEAS FOR TOPICS:
    • TIGER
    • oracle automatic version control
    • MDCM
    • broadmap, 
    • operations monitoring, all the various systems we pull together, tableau vizzes
    • JIRA ETL
    • JQL TDE
    • zombiechecker
    • log monitoring
    • JSON cromwell stuff 
  • Nasko: Cromwell, JSON and beyond. Get the data wherever it might be (15 min)
  • Project tracking report(s) discussion

2018-03-29

  • Quarterly goals updates
  • Six Themes of Q4 GP:
    • Simplify Exome LC to reduce costs
    • Kill Squid, GAP, Firehose, and other legacy systems
    • Exomes in the Cloud
    • All of Us (plans for how to make arrays clinical, clinical genomes 1.0)
    • How to deal with NovaSeq migration even if they're lower quality than HiSeq
    • Collaborator Portal (replace CRSP Portal)
  • Our "new" RapidQC design will support Exomes in the Cloud
    • All existing tools that support push to cloud need to be updated with DATA TYPE to disambiguate HS/WGS requests
  • If the Exome LC changes, we'll need to respond to that in several ways
  • Let's all think about SoftEng presentations
    • Viz screenshots?
    • Tiger? Auto-backup of Oracle?
  • Count Me In sample tracking: Integration of DSM with BSP? and/or downstream Seq Info? (Mike Dunphy to meet with KC/RB)

2018-03-22

  • Announcements from Quality meeting (Desmet)
    • Preparing for Clinical Genome and Clinical Arrays
    • New Exome Content - new technology offered by Twist Bioscience
    • Scaling up NovaSeqs
  • Zach met with KT about moving research exomes to the cloud
    • She acknowledged Zamboni's usefulness and that Cromwell's lack of tracking will impact users and mostly fall back on Green Team
    • Some RapidQC metrics will be available but we'll have to determine how they compare.
    • RapidQC is super time-consuming for Tina, so how are we going to do this for all the exomes, realistically? (somehow we need to automate it!)
    • She understands our fear of the complication of "is latest" flag across multiple locations, so she hopes they can figure something out
    • when zamboni goes away, they won't be able to find the metrics to put in the table
    • the whole project might get delayed due to functional equivalency (but there's no estimate yet)
  • Strategy Board updates 
    • (yay, Tableau audit is complete, Tim was pleased)
    • pooling calculator may become less necessary due to the changes with looking 1.0 lanes in advance
    • novaseq is still not meeting our product goals (currently primarily used in walkup), so it's not yet ready for primetime
    • malaria auditor was highly impressed by the lab processes and quality
  • Eliot is going to start working on a multipart project to improve repatienting: 1) BSP validation automation 2) automating BSP changes 3) creating the PO ticket automatically with the PM as reporter

2018-03-15

  • Brainstorm: Need new name for GrantView (sad)
    • grant administrators → research fund managers
    • FundsWatch? FundView? → FundWatch
  • Christina's been working here for 19 YEARS!!!
  • Plan to upgrade Tableau Server to 10.4 or 10.5 before June 30, 2018 (SAP extract keys for 10.3 expire then)
  • How to kill an active Tableau Server refresh process without stopping server (Zach, RPT-4699)
  • Is anything missing from this summary of audit improvements? Please take a look.
  • Testing Concatenation in Published DS: Initial test shows reduction of refresh time.
  • We're going to remove empty projects on Tableau Server (e.g. CMap, GTEx, PRISM)
  • Can we standardize column naming conventions somehow? It'll be awful to try to fix this, but let's find a standard for going forward. Think about it. (https://tableau.broadinstitute.org/#/views/audit_20180308/RenamingDifferences)
  • Tiger3: automatic version control from Oracle into GitHub. SO COOL. Every day, the script checks the appointed areas of Oracle for changes. If anything is changed, it automatically gets the DDL and uploads this into GitHub. It lacks comments, but you can add comments via a manually-run script if the comment is important. Maybe we can also add comments within the oracle view (e.g. RPT ticket)
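
The daily change check described for Tiger3 can be illustrated with a small sketch: fingerprint each object's DDL, compare against the previous run, and treat anything new or different as needing a commit. This is not the Tiger3 code. The object names and DDL strings are invented, and retrieving DDL from Oracle and pushing to GitHub are left out.

```python
import hashlib

def ddl_fingerprint(ddl_text):
    """Hash the DDL so two versions can be compared cheaply."""
    return hashlib.sha256(ddl_text.encode("utf-8")).hexdigest()

def changed_objects(current_ddl, previous_fingerprints):
    """Return names of objects that are new or whose DDL differs from last run."""
    return sorted(
        name
        for name, ddl in current_ddl.items()
        if previous_fingerprints.get(name) != ddl_fingerprint(ddl)
    )

# Hypothetical object names and DDL text, for illustration only.
yesterday = {"VIEW_A": "CREATE VIEW view_a AS SELECT 1 FROM dual"}
previous = {name: ddl_fingerprint(ddl) for name, ddl in yesterday.items()}
today = {
    "VIEW_A": "CREATE VIEW view_a AS SELECT 2 FROM dual",
    "VIEW_B": "CREATE VIEW view_b AS SELECT 1 FROM dual",
}
to_commit = changed_objects(today, previous)
print(to_commit)
```

In practice, committing every object's DDL and letting Git work out what changed would achieve the same result without stored fingerprints.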

2018-03-01

  • Tableau REGEXP expressions (Zach): SSF revenue analysis FY17 (SSF Revenue Dashboard)
  • PDO Tracker custom views: Broke when Zach renamed one of the views used on the main dashboard, which was the basis for the metrics select.

BTUG last night in Burlington: Point made about difference between flashy viz to grab attention and boring viz that has info for real business user. DataPrep demos: Maestro (simple, still in beta) and Alteryx (complex) 

Kinesis CI automated testing for Tableau (Zach): his neighbor at BTUG is a big fan. The video is a bit much, but there may be opportunity for us to use it in testing server upgrades.

Zach told Jim we were worried about ExomesOnTheCloud. We should reach out to GreenTeam to ensure we know what is happening when.

Tina's email: Associating Affected Sample (downstream data) to related BQMS events upstream – there was a meeting about this. It's too unregulated right now. Simplest way may be to go up to the root and interrogate its whole tree for BQMS-related tickets, and let the user figure it out. Jim asked the lab to come up with a broader vision & scope before we build anything.
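
The "go up to the root and interrogate its whole tree" idea can be sketched as below. This is only an illustration under assumed inputs: the three lookup dictionaries (`parent_of`, `children_of`, `tickets_of`) are hypothetical stand-ins for whatever lineage and ticket data would actually be queried.

```python
from collections import deque


def bqms_tickets_in_tree(sample_id, parent_of, children_of, tickets_of):
    """Return {sample: [ticket, ...]} for every sample in sample_id's tree.

    parent_of:   dict child sample -> parent sample (roots are absent)
    children_of: dict parent sample -> list of child samples
    tickets_of:  dict sample -> list of BQMS ticket keys
    """
    # Step 1: climb from the affected sample to its root.
    root = sample_id
    seen = {root}
    while root in parent_of:
        nxt = parent_of[root]
        if nxt in seen:  # guard against a malformed, cyclic lineage
            break
        seen.add(nxt)
        root = nxt

    # Step 2: walk the whole tree under the root and collect tickets.
    found = {}
    visited = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node in visited:
            continue
        visited.add(node)
        if tickets_of.get(node):
            found[node] = list(tickets_of[node])
        queue.extend(children_of.get(node, []))
    return found
```

The result deliberately includes tickets on sibling branches, which matches the "let the user figure it out" approach rather than trying to decide which events are truly upstream.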

This ticket is getting stuck in backlog, but we do want to make these changes. Will pass around to fill in the table to clarify what's planned.

Sam's ticket: We can't do this. No bandwidth, and we're already doing a lot of extra work for Stanley. BUT we need to tell him ASAP so they can rework their workflow using PDO Tracker instead of PDOSS2.

2018-02-22

News about metrics on the cloud - Rachel heard rumors that Exomes on the Cloud might be on hold, but we have no confirmation. There is a ticket to document how aggregations work to help Mercury, which might be helpful for us as well (DSDEGP-2071), and a Pipeline FAQs document (https://docs.google.com/document/d/12c704P9oVgAYWdyJEBSV8v92XLwqDMxNhMhZ6ckFP4E/edit#heading=h.k3vhl63kcqzd)

Verily sabbatical - no word from Jordan yet

REGEXP functions are very powerful but should be used with caution, as they may hurt performance, especially when used in filters.

Tina's ticket (billed samples missing billing date) was the result of an ETL glitch, which was fixed immediately.

Tina's email: Associating Affected Samples to related BQMS Tickets - Amy started an early assessment

  • AB: ~30% of tickets don't actually have samples listed, so we'll have to consider how to treat a ticket that has a blank Sample ID field and (say) "Samples affected = 2" but is linked to a 96-sample LCSET.

Zach presented Grant View report to Stanley Center and other groups. They loved it. Trying to push back on implementing different versions of it for every group regardless of its size.

Rachel reported that miscommunication between requestors, Analytics, and Ops sometimes gets in the way of quicker resolution of re-patienting requests.

2018-02-15

Don't simply say we can do something in JIRA/Tableau if Mercury might be the better place!

Verily: Jordan Sullivan will be coming Mar 7-9

We won't go into REGEXP today, but they're really powerful in Tableau

tableau-users mailing list: 500 users. ~100 occasional cloudspend only. ~100 stanley ctr. CountMeIn will be huge. don't use it to share info beyond "server is down" etc

We'll meet today to discuss Exomes On The Cloud

Metadata changes: We now have the info to support a decision to lump this work into 1day/week or whatever is reasonable for our workload. KC&RB will meet with Phil & Robb on Tuesday to review what we found during assessment of the past year's tickets.

Version Control: discussed the plan. RPT-4517. Giant datadump of all: Functions, procedures, packages, views. Later, the script will copy just the changed items. Nasko wants to do this via the script for Oracle & TigerETL, and he'll do so by the end of the quarter. Rachel will add her own scripts to a repo manually.

2018-02-08

March1: We've told Betty we'll work on our documentation - Mostly falls on Christina for RapidQC & topoffs. 

BITS outage Saturday shouldn't disrupt things much. Nasko will check things on Sunday.

CRSP Portal is way busted. Looked at reports and they were messed up, but ppl thought it was bc it was test data. So test data in dev server is old & messy. Someone needs to figure out how to clone/scramble the real data into dev (since real data can't be shown to testers). Pipeline changed names, but Phil's report expected old names ("percent metric" vs "metric percent"). 

Nasko: released first draft of Tiger v3 & Picard aggregator agent and so far it looks good. Needs Mariela's help testing it. Will give presentation soon.

Rachel's almost done pulling together data about the true burden of metadata change tickets

Chris is improving RapidQC performance (with help from Mariela). An index on designation.

Amy can get help from Scott & Dennis to test JIRA workflow changes against Mercury auto-creating tickets

2018-02-01

Nobody took notes

2018-01-25

HiSeq X fleet split: Six of the worst performing machines (non-loaner, non-clinical) have been identified and finance is coordinating their trade. Let's create this view over time (Normalized PF vs Run Start Date). 

Mercury release (1.84) last night (2 months in the works): Cloud BAMs, SAP, etc. Next Mercury release will be Array-focused.

Group wants more standardized/precise LCSET protocol names. Mariela suggests looking at the LCSET type names, too, though changing those could wreak havoc for Zach & Amy.

Might we want to try Kanplan to manage our backlog? https://www.atlassian.com/agile/kanban/kanplan

Nasko working on Tiger 3, using Picard heavy aggregator ETL as the pilot, because if that one works, all should work. With new tiger, everyone should be able to push ETLs manually for specific LCSETs, libraries, etc (not just by MM & AM)

Amy working on the product naming. SRS agile board mgmt: performance problems for extraction requests. Jim wants to get user-level issue security in place to allow Verily users to see only their relevant issues. Perhaps similar to Illumina whitelisted IP addresses, if that also works from outside the firewall.

Mariela added metrics to PDOSTAR5. Extended some ETLs to handle calibration version (pooling). DCFM missing dates in SeqStar: data needed to be refreshed. Extending risk success rates to allow splitting genomes between PCRfree & PCRPlus.

Rachel almost done with the LOJ_ISSUE_LCSET datasource improvement. Metadata changes. Time estimate of metadata changes to support getting Phil's time building better infrastructure. CCLF priorities should be coming today so we can start that work. Soon: transition to Git. Crux: 2-factor identification is needed.

KC audit

Zach working on the coordination with the pipeline, hoping they'll do more than they've already agreed to do. Live demo today of grantview & 4 more in Feb. using cloudJIRA and tableau.

2018-01-18

BACKLOG REVIEW - closed out and assigned a few tickets.

Decided on a label for tix related to quarterly goals (2018Q3).  Also will add links to Q3 GDoc to link goals to the relevant RPT tix.

Notes in addition to changes made on existing tickets:

  • By the end of the quarter, the "feasibility assessment" tickets will be new quarterly goals OR will be closed as "won't fix" so they don't linger forever.
  • We need to ask the Mercury Team what's realistic for situations in which data get added to mercury but analytics can't access it. Should the original request (from the lab?) include info about whether they'll want the info in Tableau, so Mercury can include that development in the original release? (e.g. CLIA reagents, RPT-3760) SCHEDULE SEPARATE MEETING?
  • RPT-3082: Linking BSP QC metrics to Seq metrics: ask Mariela about Risk Assessment stuff, in case that logic can be extrapolated to handle this ticket's needs too.
  • RPT-4248: Talk with Tom. How important is this? Is it worth our development time to automate parsing of excel sheets? Who are the stakeholders?
  • RPT-4484: Walkup updates - Rachel and Chris can work together on this
  • RPT-4498: This is part of the massive extractions LIMS battle. We may not see resolution on this battle for years, but we need to look at whether efforts here will be wasted or worthwhile.
  • RPT-4570: PDO_SEQ_STAR null DCFM: We need an offline discussion to figure out if/how to do these refreshes. Chris has related tix in her backlog? SCHEDULE SEPARATE MEETING.
  • SUPPORT-3703: HiSeq Utilization, Seq mode: Mariela has some kind of related script for this?
  • RPT-4602: This brought up the idea that we should look again at our "best practices" regarding LOJ_ISSUE views in Oracle. SCHEDULE SEPARATE MEETING.
  • Nasko's backlog: If any of us have things in there that we really need from him, bring them to his attention bc otherwise he's focused on Scala 12. (e.g. KC, RPT-3917?)
  • Kristen & Rachel need help overall with understanding which datasources to use when. PDO_SEQ_STAR vs PDO_STAR5AUX, ILLUMINA2, SLXRE, SLXRE2, etc.
  • Documentation of datasources (e.g. update of Analytics ETL Framework Pandemonium image) will be helpful for our own team, and will also apply to CLIA documentation.
  • Zach wants us to get closer to a zero-backlog, so we should push back on creating tickets unless we know we can work on them. We should not be afraid to close tickets as "won't fix."

2018-01-11

Strategy board:

  • New Admin: Lily
  • Babyseq is complete
  • GATK release is a big deal, with Intel
  • We should go to Broadway talks
  • Some RNA samples were left out overnight
  • Index plates contamination due to plate mfg problems, so we'll mix via Bravo for now
  • FC yield variability: "fleet split" should lead to trading out bad machines first w/ Illumina
  • VVP robot is expensive but getting tested for plating
  • Topoffs and lane variation are still getting tested (Tom), but this may lead to more RapidQC metrics (aggregated mean covg) being required in the pooling calculator

Mariela will email Sam D about whether the VVP testing was done on malaria plating, which had issues in Mercury w/ duplicate aliquots not getting recognized. 

There's a big push by Betty to get clinical genomes (HG38, CRAMs: nope. HG19, FASTQ: nope. Firecloud WDL redeliver the HG38 CRAMs, etc) working by March 1. Analytics Tableau & JIRA tools that are critical for this need to be documented and tested.

Pipeline wants to move exomes AND EXOME METRICS to the cloud. Mariela is working on documenting our current tools to help argue for whatever we'll need.

Pipeline had an issue with custom & regular bait mixing up custom & regular sequencing, so aggregations are still a little funky.

Nasko has been working on switching to Scala12, and has come up with improvements to the ETL world, which he'll tell us about in a couple of weeks.

While ETLs are still on Windows machine, Nasko added a troubleshooting page:  ANALYTICS-ETL Web Server, Troubleshooting (Nasko)

Performance reminders: TODAY() or DATETRUNC instead of NOW() (RPT-3250); ELSEIF instead of ELSE IF; Concatenated fields: KC will make a table with ownership (RPT-4050)

Mariela dealing with broken ETLs: Pipeline causing problems with not correctly doing manual changes (e.g. deleting BAMs & changing flags)

Rachel working on quantifying metadata changes to help plan requirements (e.g. BSP changes), Project Tracking workbooks

Chris working on clinical genomes & topoff tools.

Amy looking at Cloud Confluence

2017-12-14

Zach's OSR spending report

Quarterly goals review - EVERYONE SHOULD REVIEW THIS DOCUMENT https://docs.google.com/document/d/1oEAzJIoQcwD5Dq2EMJregQK54wgA4Q6B9tlCWffc7bE/edit

  • E9=Epsilon9. CLIO (metadata mgr) replaces BASS and the new API that E9 provides to Mercury looks up the files (IS THERE A MAPPING OF THIS?)
  • We need to start documenting how BQMS tools are used (Pooling Calculator documentation is a good example)

Strat board: 

  • More commercial work: Genentech was really happy, Roche Basel wants to use us, etc.
  • Pool group contamination: foil wasn't sealed very well, and some users weren't vortexing them (and getting lower contam)
  • Cloud vs RQC pipeline contamination protocols differ???

2017-11-30

Mismatching PDO statuses between JIRA & Mercury: https://tableau.broadinstitute.org/#/workbooks/12914/views

  • Zach suggests filtering to a recent dataset, finding mismatches, and digging from there to find out why. We may need to lock down JIRA abilities. If people are hacking for some reason, we should work with them to find a solution, explaining why it's causing problems.

Let's discuss the comment thread on this ticket.
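
A rough sketch of the "filter to a recent dataset, find mismatches" step, assuming the statuses from each system have already been pulled into dictionaries keyed by PDO. The function name, the `created_on` lookup, and the example status values in the test data are all hypothetical.

```python
from datetime import date


def recent_status_mismatches(jira_status, mercury_status, created_on, since):
    """Return (pdo, jira, mercury) tuples for recent PDOs whose statuses differ.

    A PDO that is missing from one system is reported with None on that side.
    """
    mismatches = []
    for pdo in sorted(jira_status.keys() | mercury_status.keys()):
        created = created_on.get(pdo)
        if created is None or created < since:
            continue  # skip PDOs that are old or have no known creation date
        jira = jira_status.get(pdo)
        mercury = mercury_status.get(pdo)
        if jira != mercury:
            mismatches.append((pdo, jira, mercury))
    return mismatches
```

Starting from a short, recent list like this keeps the digging manageable: each mismatch can be traced by hand to see whether it came from a manual JIRA edit or from a sync problem.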

Side note: Call it the "Mercury Run API" instead of "Pipeline API"

Exomes on the cloud: Time for input! https://docs.google.com/document/d/1g8EmPjOZl-DzHlypXeOjKHzI4ff1LvzBiigDbZTy1Cs/edit?ts=5a207a86#

  • There are multiple places that could affect users. Zach is lobbying for this and says they're aware that we're not happy with several things (no GUI, what's replacing Zamboni, stretch goal of reanalysis, metrics live in the cloud but our tools are on-prem)
  • we'll continue the conversation and we'll try to find a way to keep each other in the loop (Slack room?)

OSR (Office of Sponsored Research) grant demo including row-level security in Tableau (Zach): 

  • PIs are very protective of their grant info. Their own SSRA? system filters access based on their username. We can leverage Tableau user-based permissions to filter data. 
  • Zach built a new permissions system using a Google Sheet (later this can be a database) listing artificial grant groups, people groups, and permissions. Adding new rows to these changes who can see what. This is used as a secondary datasource (SAP as primary). This has a lot of potential. Similar to topoff tool permission-based actions.
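
The three-table permissions idea can be sketched as a simple lookup. This is an illustration only: the table shapes, group names, and the `visible_grants` function are assumptions, and in the real report the equivalent join happens inside Tableau against the signed-in username.

```python
def visible_grants(username, people_groups, grant_groups, permissions):
    """Return the set of grants a user may see.

    people_groups: dict people-group name -> set of usernames
    grant_groups:  dict grant-group name -> set of grant IDs
    permissions:   iterable of (people-group, grant-group) pairs,
                   one pair per row of the permissions sheet
    """
    user_groups = {
        group for group, members in people_groups.items() if username in members
    }
    grants = set()
    for people_group, grant_group in permissions:
        if people_group in user_groups:
            grants.update(grant_groups.get(grant_group, ()))
    return grants
```

Because access is driven entirely by rows in the three tables, granting or revoking visibility means adding or deleting a row rather than republishing the workbook.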

Rachel's new Sequencing Performance Metrics report is a huge success!

We'll be moving to github soon....

2017-11-30

The number of genomes that are going to get pushed to the cloud, along with everything else, is going to overload the cloud. There's a big meeting with BITS folks today to look at how to handle that.

Tight technical project between Broad/Intel is messy but getting straightened out

Zach looking at FC variability across machines. Exomes don't seem to be as affected as genomes (PF bases, which is what gets into aggregations). One thought: if we split the fleet, and put exomes on the worst machines, maybe we won't have to top off genomes as much. 

TOPMED meeting in Washington this week: ~20K genomes should start showing up in January. Too many reports on TS: let's look at them.

Amy: lots of stuff on hold, waiting for others. JIRA service reports were busted because users were changing mapping-target summaries. Fixed that.

Visualizing QA Bottlenecks & Priorities: Mercury developers have created features that aren't yet in QA, so creating a JIRA status for "Ready for QA" allows us to see the real backlog and get reprioritized for QA (similar to our effort 4-5 years ago in the lab that showed that we needed to increase seq capacity). 

Mariela: Finished optimizing Picard Metrics views: much faster (1min to 20sec). Uses blending too (doesn't create new rows) (different than cross-DB join). Jumped back on risk success rates. Pooling script on Bravo deck now sends a diff msg if using calibrated or regular pooling. But this was creating a diff library type that Mariela wasn't looking for.

Now we can look at using Google Sheets for Capacity Planner. (Iris Fung has started creating Cancer program reports using Google Sheets. But she's had problems with header row changes.)

Chris: ongoing discussion with PMs & arrays lab about tracking remaining stock (AOT). 80X WGS product (Tina) in RapidQC - no more proactive topoffs for the 2 libraries. No longer trying to trick the system.

Rachel: repatienting galore. MDCM & Tableau viz (workflow) improvements. Trying to wrap up the new Seq report & a few audit updates.

Kristen: helping Rachel, Dealing with Infinium, More Audit stuff & workbook improvements.

Nasko: ORSP is moving along, Dennis moving to the new tables. Rackfinder code review looks good.

2017-11-16

Strat board: Genome index calibration ongoing testing: Looking better, but some indices still best/worst

  • Clinical genomes bump other genomes off the plate, and get flagged... MikeD said pools were terrible, but the calculator works great. 
  • CLF gets an FTE from GPLIMS. What's our usage like? We need to estimate that.

Web edit permissions on tableau-dev and -beta (RPT-4502): We're going to leave it off. To use it for debugging, turn it on, log out and back in, then use it.

Reviewing SUPPORT vs. RPT tickets:

  • SUPPORT tickets that can be resolved with routine procedures (e.g. refresh) should stay as SUPPORT tickets
  • SUPPORT tickets that require significant new development should be linked, cloned, or converted to an RPT ticket
  • All SUPPORT tickets should be closed in 1-3 days depending on the urgency (metadata changes take longer but should take priority once they're ready for us)

Mariela has been working on Illumina Picard Metrics: Live cross-DB join. Removed extra datasources by adding new fields to RGHQS. Switched from facade view to non-facade. Hid unused fields.

SLXRE_RGHQS_TARGETS has timestamps for when the refresh was requested, when RGHQS was populated, when metadata was populated, and whether we (not pipeline) blacklisted.

Rachel is willing to do a basic Git tutorial and to talk with Eric about the effort to move from Stash to GitHub.

2017-11-09

Zach's FC crusade has had him following the process in the lab. It's really complex. Strip tube creation, cBots, clinical vs non & BQMS tix, etc. Eye-opening.

T10: Selectable text in tooltips is NOT default in new workbooks built from scratch in TD10. Default is for the dimension field value to be a highlight action. Measures are default selectable. To change this: in the tooltip menu window, uncheck the "Allow selection by category" box in Desktop>Worksheet>Tooltip... Workbooks originally published in T9 are default selectable b/c the box is default unchecked.

Web edit options are risky (RPT-4502). Can we set tableau-dev & -beta to automatically enable web authoring each day?

RPT-4495 Zach helping Tim get more FP metrics

Kristen wants to help Steve incorporate best practices to improve performance and reduce clutter

Tableau-beta with hyper shows marked improvements to refresh times.

Rachel making progress with the new Sequencing workbook. Weird Oracle trunc error on % adapter field.

2017-11-02

Zach shows a good LOD vs Table Calc, where the LOD allows you to alter the way the Grand Total is calculated. Customizable totals. 

  • Also, you can HIDE specific totals!!!

Zach shows Pivoting to transform data without altering the raw data

Flowcell output variability is terrible. Zach's going to dig into it.

Nasko built a DataPump tiger agent for Sam Bryant to get his data from MySQL DB to Oracle.

Chris started using color by measure (legend specialty) to show out-of-spec per measure in a crosstab. https://tableau.broadinstitute.org/#/views/WGSProjectTracking/MetricsforClinicalSeqComplete

Chris used cross-DB joins: https://tableau.broadinstitute.org/#/workbooks/12550/views (she'll look into whether it requires extracts, and if not, what the compute hit is)

  • Chris confirmed that Tableau uses an extract for the cross-join and that it takes a few minutes, so live is not feasible, but it's fine if extracts are okay.

2017-10-26

It's no-meeting week, so we focused on what needs to be done.

Went through our backlog, but nobody has time to work on anything new (e.g. seq plating, RNAQC). Made a couple of comments requesting info.

Discussed KC's latest admin reports.

Discussed metadata changes: meeting tomorrow to discuss automation options with the pipeline team.

2017-10-19

       We're all here!

ETLs were all busted. Tina restarted the analytics-etl server last night at 5pm (with Nasko's approval). Unfortunately, Nasko had also changed his PW. JQL ones: webservice wasn't running this morning because of Tina's restart. The JQL service isn't on heartbeat. If webservice on analytics-etl goes down, the JQL service goes down, and the Sample Workflow goes down (YIKES). (This won't be so bad on UNIX). Nasko will document what is necessary to manually restart the service. Unrelated to the webservice: ConnectionPoolChecker was still bad at 10am. This looks at how many connections to Oracle there are (because DaveG scolded us at one point and we started monitoring).
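
The cascade described above (webservice down, so the JQL service goes down, so the Sample Workflow goes down) is a transitive dependency problem. The sketch below shows one way to list everything affected by a single failure; the service names and the `depends_on` mapping are illustrative assumptions, not an inventory of the real analytics-etl setup.

```python
def affected_by(failed, depends_on):
    """Return every service that directly or transitively depends on `failed`.

    depends_on: dict service -> set of services it needs in order to run
    """
    affected = set()
    frontier = [failed]
    while frontier:
        current = frontier.pop()
        for service, deps in depends_on.items():
            if current in deps and service not in affected:
                affected.add(service)
                frontier.append(service)
    return affected
```

A map like this could also drive which services a heartbeat monitor should watch: anything with dependents but no heartbeat, like the JQL service here, is a silent single point of failure.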

Basecalling has been down? For how long? Why? Mariela will check with KT.

Tableau Server 10.3 observations

  • start page settings - the new default Kristen set and user-set ones are super buggy. We reported it to Tableau.
  • web browser back button messes up filter selections (Zach, RPT-4449). If an action takes users to another tab, it may be best to add a button to the second tab to bring them back to the original tab.
  • may want to re-enable web authoring for authorized users. This is definitely the direction Tableau development is going, according to the Labs session KC had at TC17.

Tableau performance tips

  • Use Performance Recorder (either in Desktop or on Server) (on TD, start recording before you open the twb)
  • Design performance into your dashboard from the beginning
    • simplify dashboard layouts
    • avoid lots of filters, especially dependent or calculated filters
    • hide unused fields, even in extracts
    • materialize or extract Custom SQL
    • read Alan Eldridge's whitepaper
    • watch Alan's TC17 talk: 
  • Install Logshark and/or TabMon to check Server load
  • New/upcoming Tableau improvements:
    • Hyper engine: Tableau bought a German company, it's supposed to be really fast (released this year in 10.5)
    • Project Maestro (visual manipulation of the data) (maybe like a simple, cheap version of Alteryx)
    • Dashboard extensions: setting a DB value ("closing the loop") via Tableau

2017-10-12

       Tableau conference - ZL, KC, CG out.

  • Nasko - still working on push to cloud messages in anticipation of the pipeline's switch to HTTPs.
    • Also working on ORSP webservice, but is struggling with lack of unique identifiers available
  • Mariela - updated datasource behind Gtex samples report. There was a problem with ranking that was causing the wrong sample to be chosen, Mariela introduced new feature to look at plated sample (if available) first before ranking. 
  • Amy - Artel report improvements - introduced conditional tolerance setting (depending on 2ul or 50ul run) while also allowing custom set tolerance if desired.
    • Need to make a report for visual leak checks. John Walsh just got these messaged.
    • Working on making same change to 17 distinct passage transition updates. Can't consolidate since transitions all have different destinations. Working in XML is much faster than UI.
  • Rachel - Continuing to work with Andrew/seq team to get relevant reports in Tableau for quality meeting slides.
    • Working with Amy, Betty and Wendy to port CRSP JIRA deviation tickets into Labopsjira - need to spec requirements
    • Productionizing Steve's reports
    • Suggested classes or workshops for JIRA users to manage dashboards/filters. Also info sessions to see how they use the information - many people want to download data from JIRA, how do they use these?
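
The GTEx fix in Mariela's update above (prefer the plated sample when one exists, and only then fall back to ranking) can be sketched as follows. The field names `sample_id`, `rank`, and `plated`, and the convention that a lower rank is better, are assumptions made for illustration.

```python
def choose_sample(candidates):
    """Pick one sample ID from candidates, preferring plated samples.

    candidates: list of dicts with keys 'sample_id', 'rank' (lower is
    better, by assumption), and 'plated' (bool).
    Returns None when there are no candidates.
    """
    if not candidates:
        return None
    plated = [c for c in candidates if c["plated"]]
    # Rank only among plated samples when any exist; otherwise rank them all.
    pool = plated or candidates
    return min(pool, key=lambda c: c["rank"])["sample_id"]
```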

2017-10-05

  • MKD: New Lab Director replacing Tom Mullen, and we'll need to do more printouts next week, so we may need to scramble to get all the PDF footers changed/tested. Temporarily, Heidi Rehm will be filling in again. 

Greg did MKD & WalkUp testing in his sabbatical experiment. 

Tableau 10.4 is released. 10.5 is in beta and has a new engine (Hyper, from a German company), which could improve extracts, but Zach's initial test showed it took the same time and created a 3x larger extract file.

Strategy Board - no major updates

  • Calibrated genome pooling factors need adjustment – Zdeb & JoWalsh are on it.
  • Eric Lander's going to be there next week

BQMS - Rachel's almost there with all the changes. Kristen helped polish the Tableau report and the team liked it.

ActiveMQ - pushing out Tuesday: requires SSL communication. The pipeline team is going to continue to check the old queue temporarily in case something hasn't switched over correctly, but Nasko doesn't want to rely on that. 

  • Currently 3 people (Tina, Steve, Tammy) doing the push-to-cloud, so we'll need to temporarily disable the Tableau action by messing with the group permission. Ideally Tina would save a sample to test the changes by running it right after the switch.

Nasko altered ETL timeframes to handle latency, and another ETL to check every morning to ensure that the arrays_qc table on-prem matches the cloud version.
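
A minimal sketch of what a morning consistency check between the on-prem and cloud copies of a table might compute, assuming both sides have been read into lists of row dictionaries. The function name, the key column, and the row shape are hypothetical; the real ETL may well compare counts or checksums in the database instead.

```python
def compare_tables(on_prem_rows, cloud_rows, key):
    """Compare two row sets by key and report where they disagree."""
    on_prem = {row[key]: row for row in on_prem_rows}
    cloud = {row[key]: row for row in cloud_rows}
    shared = on_prem.keys() & cloud.keys()
    return {
        "missing_in_cloud": sorted(on_prem.keys() - cloud.keys()),
        "missing_on_prem": sorted(cloud.keys() - on_prem.keys()),
        "different": sorted(k for k in shared if on_prem[k] != cloud[k]),
    }
```

If all three lists come back empty, the two copies match for the rows that were read; anything else gives a short list of keys to investigate.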

Moving from the Windows machine to Unix has been going smoothly, but some PL/SQL Oracle job ETLs are still annoying, so Nasko is building a new method to move the schedules for these to Unix. The code and execution of them is still in Oracle, but the scheduling is in Unix. The Unix cron jobs page will point to where in Oracle to find them, and they'll show on the ETL Dashboard.

RapidQC - something's weird with aggregations. It's serious.

Different read structures BY LANE?! doesn't seem feasible.

Mariela making a diagram for repatienting etc troubleshooting.

Kristen's computer is finally good again. Let's get tableau-ts-beta set up with all the right software (e.g. SQL navigator) so we have a bolstered useful backup.

2017-09-21

Tableau 10 join calculations (Zach)

Ruthless culling of Tableau Server has begun – Sheila seemed quite excited about 100 workbooks getting deleted.

2017-09-07

FP backlog due to Harvey - Rachel's new report will help them prioritize when Sigma is back online.

BPP hitting cloud JIRA & AOT hitting our JIRA, sometimes failed late at night. 

Nasko wants us to use JQL extracts instead of Oracle for accessing JIRA data, whenever possible.

Things are very siloed in the LIMS world. Also, we need a new QA tester to replace Marcia. Possible lab intern to take that on, at least temporarily.

Nasko tuning up our SPARK cluster. It's not very robust (yet?) but when it's having problems (affecting its 2 agents), it doesn't affect the Tiger agents running on the same machine.

BITS is again paring down VMs. Which VMs do we need for the next quarter?

Kristen on vacation much of the rest of the month, so let's plan to roll out Tableau10 Sept 18-20

Mariela: RapidQC CRSP version was deleting some research FC rows but not inserting them again, so she's been going back and filling in the holes.

Mariela can go back to working on her normal stuff now that all the buggy problems are off her plate. (smile)

Amy doing a lot of small tasks. Also building BQMS JIRA (this project is built to CRSP standards) and we'll need a new report for it (KC wants Rachel to build it). Made Artel updates too. Made JIRA changes for Walkup updates.

2017-08-31

We are all sick but Nasko. 

BQMS should be ready to roll to production next week.

Artel QC volume check will roll out to production Wednesday.

Robb told us there may be PHI being captured in CCLF JIRA, which may affect everything we're doing with them. 

GPInfoJIRA was upgraded the other day. Shouldn't be any noticeable changes.

Erik is out Fridays from now on.

Rachel's been helping with BQMS & CCLF.

Metadata changes scripts are on Stash. 

Mariela has found that some metrics get lost from the HQS datamarts. Looking into it. SCARY.

The assessment of on-risk samples is on hold until the HQS problem is solved.

Nasko's building a SPARK cluster on our UNIX server. Unix ETL Goals: Get all ETLs in one place. Get away from painful Windows schedule. In UNIX it's one page, easier. Avoid password problems. Reduce machine restarts and such to provide more stable environment. 

analytics ~ $ sudo -u analytics EDITOR=nano crontab -e


htop


2017-08-24

Tableau alerts: maybe we do want to enable these right away upon upgrading to 10.3?

Tableau desktop licenses just got refreshed. May need to refresh these in TD (Go to Help, Manage Product Keys, Select the row, Hit Refresh)

Why Illumina ETL has been intermittently failing to process runs (Summary.txt parser, involving John Walsh). Encryption requires random number. JDBC driver creates encrypted cnxn to Oracle DB and then the script runs. BUT the randomness requires server activity, which is low at midnight, so the driver/JVM hangs/waits, and then the script fails. Nasko added a parameter to tell the JVM to go ahead even if the random number isn't generated. Chaos and randomness can be a good thing (smile)

Fargo outage Sept 1-5, no access to /seq/fargo_picard_aggregation (doesn't affect us)

Ruthless culling of Tableau Server has begun. EVERYONE: Go through Archive and delete your reports, back up offline what you want to keep.

Improved Tableau admin report: https://tableau.broadinstitute.org/#/views/Tableaureportusagestats/Tableaureportusage

  • Jira ticket: UPDATED

2017-08-17

Q30s are still all over the place, and this is happening to Illumina customers around the world.

Lab process for exomes & genomes is changing to hyperprep. Same LCSET type. ALL NEW PRODUCTS. LCSET protocol field should help ID them.

Tableau development is needed for Blood Biopsy & for getting NovaSeq into Walkup reports – MM will look into it

Amy's been on the Artel report for Scott & Dave, adding parameterized tolerances

Amy's been on the GP Process Dev group piloting Google Groups as JIRA users. This could be really useful for us!

LOJ upgrade coming soon (on LOJtest tomorrow)

MM dealing with lots of supporty tix (missing data, refreshes, etc)

Rachel helped Erik with some API updates

Repatienting – lots of effort on the improvements. Full time this week working on actual tix. 

TeamsID: folks need to get access

New Unix role account 'analytics' (sudo -u analytics) + dedicated Unix server 'analytics' (ssh analytics). This will make it easier to share administration (e.g. when Nasko's on vaca)

Zach's been working on CRSP stuff. 

A few new Tableau users we need to keep an eye on: Sam Bryant (Stanley Ctr), Iris Fung (Cancer).

Zach will do a Tableau101 BITS thing for development 

2017-08-10

Nasko is talking with GreenTeam about SSL. We're ready to switch when they say it's time.

Nasko is looking into enabling our ETLs in UNIX.

Chris is looking at CRSP logins & DMs for standardization (analytics DMs, but blacklist info requires CRSPReporting user). Mariela knows a lot about the DBLink etc, but she pushed that to Nasko. Nasko made new Oracle user called Analytics ETL, used exclusively by ETLs, and that user alone will have a DBLink to the CRSP DMs. Hopefully this addresses JT's concerns.

Chris wants to consolidate the two Picard Metrics workbooks

Steve's WGS metrics report is really hard for Chris to tweak, but she's working on it. Maybe an action filter will provide the designated library?

Kristen working on new CCLF features, Infinium improvements that may require Mercury changes.

Rachel & Kristen working on repatientings, RNAQC.

Rachel developing scripts for Erik to help automate things for him.

Rachel & Amy working on BQMS workflow (tracking problems). Wendy going on vacation, so it may go on hold for a bit.

MM: Mercury bug: Mismatch between Event Fact & Run API saying what's the plated sample ("Pipeline API" may be actually what it's called)

R1/R2 metrics for intensity needed in lane agg tables

Amy heard from Cole that they may need help communicating between the walkup lab and the Regev lab.

New password manager TeamsID

2017-08-03

Clinical genomes: Running alongside regular genomes (Project Vitruvio). They should have the same criteria for completion, so we hope to use the same reports for both, rather than creating new reports for clinical.

Stacey is working on the new TopMed proposal, hoping to get as much of the $53M as possible for the next few years.

Q30 on HiSeq is still a problem, and nobody knows why. We still want read-based Q30 from the pipeline, even though it's two more columns for our datasources.

RPT-4080: Novaseq for Walk-up? Missing metrics... need to migrate the report to the new datamart.

DB Links to Mercury Production and CRSP DB: JonT was alarmed by these and wants to know more (bc CRSP access). The CRSP data is only metrics & blacklist info, no sensitive metadata. 

  • Note: CP ID is what, exactly? 
  • Nasko suggests JonT create a new user to lock it down
  • Mariela suggests telling him which tables we use and getting our access from seq20 limited to those tables

ETL agents are being updated and standardized beautifully. 

In the future, tool owners should do password changes before vacations, rather than during, to avoid errors (especially the silent ones).

  • Nasko might find a way to run ETLs from a UNIX cron job that wouldn't be sensitive to password changes.
  • Zach and Kristen both also have password-dependent tools.

Metadata changes: JIRA ticket development in progress, SQL-generation script is working, PO-ticket-generation script is drafted.

  • let's double-check that root sample is enough for us and we don't need PDO sample in the CSV request submission file

Updated Mariela & Nasko on the changes that happened when they were away (LIMS reorg, DSP→GP switch, bsp_sample update problems)

2017-07-27

Leber Unplugged: Zach had a great vacation and stayed offline.

Kristen will deal with the heartbeat agent after the meeting and manually push the ETLs

Unassigned tix & related convos: 

  • RPT-4221: Blood Biopsy. Does Mercury have any insight to this stuff? Can the viz be there instead? 
  • RPT-4172: needs mariela. 
  • RPT-3082: KC link it to current extractions report ticket and ribogreen 
  • Linley's stuff is still showing as in-progress, but is now unassigned. KC will put them on hold.

Mercury 1.8 had a 15-day TaT. Awesome.

Mercury 1.81 has 18 issues and counting.

New Walkup LIMS was released. Walkup 1.6 is on its way. Mercury may build something to talk to Walkup to help transition their statuses and create PDOs (standalone web app at the kiosks)

2017-07-20

Artel QC: Scott wonders if we can build a time-based improvement to our Artel Tableau, but there's a vendor software that works for that, and it's quite sophisticated.

Erik upgraded LOJ to JIRA7.4, and Amy's testing it. (big improvement: the 7.3 introduction of people-can-change-workflows is permission-based)

Clinical genome batch tracking – hashing it out with Tim to determine what's changing. Goal: when samples complete SRS, plating team needs to be able to prioritize clinical samples.

Chris knows a nice blog page about LOD calcs

Chris has been swamped with bugs. 

  • BSP Seq Plating datamart: RNASeq & some Exomes: PDO & PDO SMIDs aren't showing up as associated with the plating IDs until they get associated with a seq run. Status goes straight from ordered to sequenced. (Is this related to SUPPORT-3153???)
  • External product vs internal products: pricing difference - a FC is showing both part numbers, but it won't refresh. Some other weird problem with multiple batches associated. (raw vs mapped product #)
  • Lots of maintenance tasks preventing from getting to things like Stanley Center quotemaster

Re-patienting progress: 

  • We're making a CSV template to simplify/standardize PM requests, with all necessary info. 
  • We're going to make a new JIRA ticket type/workflow to cover all the LIMS/Analytics work
  • we're working with the pipeline team to simplify our watching of PO tickets. 
  • Rachel built an awesome script using that CSV as input, creates all the SQL queries we need to check our datamarts.
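
The CSV-to-SQL idea above could look roughly like the sketch below. This is not Rachel's script: the CSV columns (`sample_id`, `new_patient_id`) and the datamart table and column names are hypothetical placeholders chosen only for illustration.

```python
import csv
import io

# Hypothetical datamart tables and the column holding the sample ID in each.
DATAMART_TABLES = {
    "analytics.bsp_sample": "sample_id",
    "analytics.pdo_samples": "pdo_sample_id",
}

def build_check_queries(csv_text):
    """Return one SELECT per (CSV row, datamart table) for checking a request."""
    queries = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        sample_id = row["sample_id"].strip()
        # Escape single quotes so the generated literal stays valid SQL.
        literal = sample_id.replace("'", "''")
        for table, column in DATAMART_TABLES.items():
            queries.append(f"SELECT * FROM {table} WHERE {column} = '{literal}'")
    return queries

# Made-up request file contents, standing in for the PM-submitted CSV.
example_csv = "sample_id,new_patient_id\nSM-AAAA1,PT-0001\nSM-BBBB2,PT-0002\n"
queries = build_check_queries(example_csv)
for q in queries:
    print(q)
```

Generating the queries as plain text keeps them reviewable before anyone runs them against a datamart.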

Infinium: new reports are all live, the lab likes it, Quality Reporting summary, etc

New extractions datamart is ready, reports will be live soon.

Rachel brought donuts, and they're delicious, but I'd trade them for Nasko, Mariela, and Zach.

2017-07-13

So many weird things

CLF JIRA data not updating in published datasource even though the extract appears to have refreshed successfully. Amy saw something like this once with JIRA data for Sadiya.

Chris has been dealing with this for Andrew Bernier... Is there a problem with the caching? 

Walkup is no longer a problem. 

KC will get more info from John Walsh about the cron error he's seeing for the Summary.csv parser.

BSP_SAMPLE and bsp.ANALYTICS_SAMPLE tables have a new index on parent_sample_id and it turned a stalling zombie query into a 15-second query.
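
To illustrate why an index on parent_sample_id helps, here is a small sketch using SQLite as a stand-in for Oracle (an assumption made only so the example is self-contained). The table layout, index name, and data are invented; the point is just that the plan switches from a full scan to an index lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical version of the table; the real one has many more fields.
conn.execute("CREATE TABLE bsp_sample (sample_id TEXT, parent_sample_id TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO bsp_sample VALUES (?, ?, ?)",
    [(f"SM-{i}", f"SM-{i // 10}", "OK") for i in range(1000)],
)

def plan_for_children(connection, parent_id):
    """Return the query-plan detail text for a lookup by parent_sample_id."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT sample_id, status FROM bsp_sample WHERE parent_sample_id = ?",
        (parent_id,),
    ).fetchall()
    # The last column of each plan row is the human-readable detail string.
    return " ".join(row[-1] for row in rows)

plan_before = plan_for_children(conn, "SM-5")  # no index yet: table scan
conn.execute("CREATE INDEX idx_bsp_sample_parent ON bsp_sample (parent_sample_id)")
plan_after = plan_for_children(conn, "SM-5")   # index exists: index search

print(plan_before)
print(plan_after)
```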

Those tables are not updating for all fields... e.g. 

Speaking of the power of indexing: JIRA: commenting or otherwise editing/refreshing/updating anything changes the index and triggers the DB to reindex. If someone can't see a ticket on their dashboard, but can navigate to it, that's a symptom of an index problem.

Amy waiting for feedback on Clinical Genome stuff

Amy working on getting positive control for fragment integrity QC. 

Rachel's SQL creation script is ready for testing!

2017-07-06

Mercury 1.79 was released.

New CRSP portal has been released.

In an attempt to simplify GPLIMS releases: two new JIRA LIMS ticket fields: testing strategy (filled in by developer) & link to epic

Amy & Erik are working on a new JIRA upgrade plan. We may need to push it off until Nasko is back to ensure proper schema changes etc

BITS shutdown Sat July 22. We may need to do some checks on Sunday

  • we may need to get in touch with Nasko about the analytics-etl VM and to ensure that either we know what to check or he is on-call to check it for us

Amy has some stuff on hold (eg BQMS)

Rachel is waiting on Jim to find a bucket log example before we can create a system to 

  • Broad negotiated a massive discount with Google Cloud (95% discount, costing us half a cent instead of ten cents per GB egress charges)

Products Team Meeting notes – links to Mercury release-page – links to GPLIM Release list of issues 

2017-06-29

DataOps Library (Kristen)

Sample Coverage complexity follow-up (Zach/Christina)

Taking on the array_process_flow table/ETL from Merc team: August

1.79 is going out tonight!

ETL Illumina raw metrics off their Analyzer Studio software into ILLUMINA_READ datamart (Nasko)

  • This pulls read-specific Q30 metrics (pulls all including walkup, but doesn't pull from CRSP runs)

GPINFOJIRA has been updated

2017-06-22

ADP is finally going to its grave! Workday is coming!

Broad HR assessed going to unlimited vacation, but found that unlimited vacation leads to anxiety and people actually take LESS vacation.

Strategy Board

  • 10X process machine (Pippin Prep) was replaced and it's working now.
  • major Q30 drop

Quarterly goals document is pretty much set. If we change anything, let Zach know ASAP

Tableau debugging: RPT-4182 PDO Tracker: datasource suddenly seemed slow whenever Sample Coverage was being displayed.

  • turns out we were wrong wrong wrong about calculated fields in published datasources. They're not fully resolved until run time.
  • we're hacking the data to customize conditional formatting.
  • Layered calculation with 150-piece IF statement pulling a 150-piece IF statement field. Fixed it by reducing size with OR statement.
  • This field has been the main speed problem for 4 years, and caused optimization efforts elsewhere, but fixing this fixed a lot, so those efforts were unnecessary.
  • Hopefully if there are other complex fields like this, we can find them using Interworks Audit software.

Cleared out some backlog tickets as a group

  • Stanley Center has their own BI specialist now, so we're closing RPT-3661 as won't fix. They'll have to find another way.

2017-06-15

Strat Board

  • Seq Q30 problems confounding (or causing?) the pool-test problem. Forest for trees?
  • Overhead pricing is going down for Broad Genomics, so GP prices have to go down
  • Clinical Genome (aka Project Vitruvio): want to run it through all the same production tools, meaning that we need to document our tools better (e.g. pooling calculator, RapidQC, topoff tool, maybe more, especially anything that's up-front). If the tool doesn't work properly, and it delays getting delivery of data, it could affect patients. This may encompass input for our tools highlighting bugs in output of other systems, like Mercury
  • Novaseq is not ready yet
  • Blood Biopsy is currently being tested in Mercury. Basics for it are in Merc 1.79

Quarterly goals: everyone should double-check it and plan for next quarter. Don't be overly ambitious, as we don't want to cause disappointment.  

Consent withdrawal: Kanderka continued seeing in Tableau the path to a BAM file that had been deleted. We need to communicate with PMs 1) that BAM path is constructed in Tableau and may not be "real" and 2) there are contingencies like the Billed=True Friday refresh timing that may lead to lingering BAM path.

SAP extracts (not others) are not refreshing: still problematic in Server 10.3 (Zach, RPT-4157)

ZombieChecker updates:

  • see historical performance records for a given query from weeks/months ago (instead of ETL-dashboard's 7-day limit)
  • new SQLMonitor - how to see Oracle's insights about a running query
  • note: zamboni query that looks duplicated is actually different bc ":1" & ":2" represent parameters

2017-06-08

BSP zombie queries are bringing down BSP/GAP. Looks like a frustrated user repeatedly attempting a complex user-defined search. Is that user being taught, beyond a broad "come to us" message?

pipeline team doesn't want our analytics.  BUT dave G. may have put in a block to prevent >200 picard requests at once.

hyperprep problems in mercury (incomplete) may have now been fixed.

Novaseq is starting. Mariela will check Tammy's FC suffix guide. Amy has done FCT work. Walkup uses files. Let's check on the impact and our parsing tools.

Chris is dealing with an erroneous product goal (target) which affects DCFM etc.

Mariela looking at risk assessment & success rates

Amy might take on the task of writing a script for FCT picard triage.

CCLF is quite heavy right now. We may need to become the PO on this, if Robb doesn't have time.

Zach's looking at FC yields. They're bad. Way too many top-offs. Avg yields are declining, and production lanes are performing far worse than pool test lanes.

2017-06-01

SEQPROD was down this morning and there are a lot of picard connections happening again. Is this what brought it down? 

Strategy board:

  • Same fires. No unusual/critical updates for our team.

We will look at the ETL for potentially taking on the ARRAY_PROCESS_FLOW table from Mercury.

We need to check with Betty and BSP about patient consent withdrawals. MM says there was a problem with marking things "Deleted" or something... 

2017-05-25

Strategy board:

  • new Fargo File System (for large scale data) is not performing as expected. Could slow down sequencing

Pooling Calculator seems very promising to users, but copy/paste into Mercury is still involved, which may lead to errors. A web-service call is being considered to avoid copy/paste.

SRS project being used by interns - may need more help.

RE-whatever, Kristen demo-ed a tool showing the substantial impact (in terms of # JIRA comments and people involved). Resources being wasted hopefully should be brought to leadership attention so that clear "time/money/future-features-being-developed" tradeoffs are enabled.

SampleTracking tool (Chris) shows some unusual behavior where a sample suddenly jumps a few steps ahead.

RQC-Metrics issues are being worked on (Mariela, Kathleen)

Amy is giving in to the Dark Side - switched 2 Scala ETL agents just this week and craves more

CodeAssistant (Nasko) now allows less painful filling in of desired JIRA fields. "Web-service" vs "DB" based interfaces are offered. DB one also provides extra info about "project" and example JIRA issues where given custom field is being used.

2017-05-18

Tableau parameter actions (https://tableau.broadinstitute.org/#/views/ParamActions/ActionDashboard)

2017-05-11

Strategy board: 

  • PDOTracker is the new fave report. 
  • Primary lanes, tight pools, great. BUT there are still some topoffs, so there's improvement needed there to avoid busting the savings.
    • Close to turning this on for all genomes.
  • New SOPs for cloud data delivery
  • Blood biopsies: the biggest challenge is 2 Root IDs for a single sample. 
  • New Arrays chip types getting added to Mercury
  • Extractions/SRS: SRS team will now be in charge of XTR samples.
  • Lane-level metrics: ummmm... something about pulling from RapidQC if it's not in the other place?
  • Erik has created an SOP (AS-145) for how to unlock JIRA when there's a lock-down/CAPTCHA problem

2017-04-13

Strategy board: we can take this chance to make our work more visible. Zach will present pooling calculator and a few other items. Kristen can discuss Infinium.

  • Let's not be "the end of the process"

Quality reporting: Project "Simba" (seriously, that's what they called it). Looking for volunteers to come up with training & best practices for African labs to do sample collection/extraction and send the samples to GP for sequencing.

Empty room in our hallway: let's do something with that. 

port old training Twiki to Google Site or Confluence page before April 28 – Amy looking into Confluence options for porting

Tableau 10: Let's make the move this summer. Everyone should start testing it.

GTEx: they use BSP aliquot ID to track their samples, but the 1% of samples that fail tracking have become a problem for them. 

  • One problem is incorrect mapping of sample to PDO in uploaded form
  • MM assessing whether we should switch to a new datasource, and then we'll tell Ellen the decision.

Infinium table (array_process_flow table in mercury): LIMS is questioning why we need this info in this format for Infinium when we don't have it that way for sequencing. The concern is the ongoing maintenance of such a complex table. We should always keep in mind whether there are simpler ways (e.g. JIRA) to get this type of data (in this case, there isn't, but perhaps we can help maintain it?)

Sharing our data: 

  • We've allowed Sampath access to the CCLF view in reporting@seqbldr. Nasko wants to send him info to ensure he understands the potential impact on our production JIRA.
  • Analyst working with Tiffany: using tabcmd
  • If we have someone who needs only JIRA data, the TDE service could be a great option.

MM has been working on Illumina Picard Metrics performance. Also on email/maintenance. Also on GTEx. Also on Parallel Programming course & Data Analytics w/SPARC.

  • Need to correlate pre-sequencing risk factors (mining past Merc tables for transient riskiness) with sequencing success. Assess effects of risk and determine predictability of success.

Amy's trying to help the Dev team with JIRA: DEV JIRA isn't really Development stuff anymore. They want more oversight on the DEV JIRA. BQMS JIRA will help.

2017-04-06

Strat board more focused on 320 work

Get Tableau 10 (probably 10.3) in production by summer.

  • Tableau-beta now on 10.2.1 as of today
  • 10.2.1 includes fix for legacy data connections to Excel incorrectly showing the Tableau refresh date when it should show the file update time.

Zach showed how to use LAST() as a quick table calc filter 

Zach showed an example of using RAWSQL_STR in a calc. field to use Oracle instead of Tableau for string concatenations. Time reduced from 30s to 1s! 

2017-03-30

Strategy board is being redesigned, supposedly with more focus back on 320C work

BSP support will be really minimal for a while, with Damien gone and Phil out

Renaming etc is an issue. We don't want to become a service-only organization. 

Let's close the unassigned tickets that aren't relevant (e.g. analyze access logs for Google buckets, maybe some SCBB help)

GTEx: Do we need to extend the plating ETL for a small selection of cases? plating ETL, aliquot ID, rerunning LC/seq (linking aggs to diff aliquots from same stock)

  • it's on us to explain the reliability of the BSP aliquot IDs, dealing with whole-tube-handoffs, etc. before the meeting 
  • we either fix the ETL or we figure out why it's too expensive to fix it and explain that to Ellen Gelfand & Kristin Ardlie

Rachel will be starting June 12 – let's collect info to share with her

2017-03-23

Several fires still smoldering at the Strategy board meeting. 

  • Infinium looks better: is this bc cloud? or is it comparing apples to apples and Row H is okay in GAP now?
  • RNAQC ribogreen is critical
  • FP: reagents we have are problematic lots
  • Pipeline is down?! Functional Equivalency Pipeline (compare our old data, baylor data, etc): Baylor says it doesn't look good, so there's a bug in the pipeline? Halted analysis.

If we do repatienting or reprojecting, ensure that the JIRA ticket has labels (repatienting, reprojecting) so we can track the resources going toward these and begin costing it to charge for that work.

  • renaming: collaborator sample ID etc – label these, too (same label?)

Niall starting to look for computational biologists (Translational Analytics Group) (trying to make a more direct impact faster)

Reviewing quarterly goals – we're close. 

2017-03-16

Snow day Tuesday, no strat board updates.

Google Cloud Platform 

  • column-optimized (instead of the traditional row-optimized database like Oracle). slower inserts, faster analytics.
  • Tableau bought Hyper (German company) two years ago, silently integrating that into the data engine in Tableau, it's supposedly both column- and row-optimized?
  • New arrays work is going into a Cloud SQL DB
  • We're using BigQuery for our billing data

BILLING DATA: Past two years of billing data required much work from Lukas (e.g. collapsing repeated fields, aka labels), which led to 3-5 day latency, resulting in a BigQuery table. Now, BigQuery will allow us to get billing data every 6 hours.

  • codeAssistant, new features
    • built-in data preview (cheaper than running the whole thing to check the output – only returns the first 50,100,500 rows.)
    • regex tricks (how to deal with hidden characters, regexTester – linked from code asst page)
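
The data-preview idea (return only the first N rows instead of running the whole extraction) can be sketched as below. This is an illustration of the general technique, not codeAssistant's implementation; `expensive_rows` is a hypothetical stand-in for a costly extraction.

```python
from itertools import islice

def preview(rows, limit=50):
    """Return at most `limit` rows without consuming the rest of the stream."""
    return list(islice(rows, limit))

def expensive_rows():
    """Hypothetical stand-in for a costly extraction that yields rows lazily."""
    n = 0
    while True:  # effectively unbounded; safe only because preview() stops early
        n += 1
        yield {"row": n}

first_rows = preview(expensive_rows(), limit=50)
print(len(first_rows), first_rows[0], first_rows[-1])
```

Because the source is lazy, the cost of the preview scales with the limit rather than with the full result size.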

2017-03-09

  • STRAT BOARD UPDATE: 
    • Q30 still a problem.
    • Chemagen is acting up again.
    • Still seeing row H problems in infinium. May be one bad scanner.
    • For every dollar we spend (e.g. on reagents), we have to give fifty cents to Broad for overhead. We're approaching this problem from a few perspectives.
    • Cloud (genomes, arrays) almost there. Arrays are being compared (Ben Neale) old vs new pipeline.

Repatienting has been happening a lot lately. How can we cost this? It's all on the engineering side, just like patient consent withdrawal. Let's start tagging and counting all the relevant tickets to start gauging the real cost. Let's stop the boiled froggy effect.

Lots of unassigned open tickets again... let's try to crank through some!

JiraExtractEngine improvements

  • WDCDB: delivering large datasets (tens of thousands) - "streaming" technique
  • extract structured data from JIRA - parseHtmlTable operation
  • decode a Normalized Url
  • early JQL validation in codeAssistant
  • expose hidden non-printable characters(LF, CR, etc...)
  • split String into List with regular expressions
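
Two of the items above (exposing hidden characters and regex-based splitting) are easy to illustrate. The sketch below is written in Python for illustration only and does not reflect how JiraExtractEngine implements them; the marker strings and the default delimiter pattern are assumptions.

```python
import re

# Visible stand-ins for a few common non-printable characters (assumed markers).
VISIBLE = {"\n": "<LF>", "\r": "<CR>", "\t": "<TAB>", "\u00a0": "<NBSP>"}

def expose_hidden(text):
    """Replace hidden characters with visible markers so they show up in a preview."""
    return "".join(VISIBLE.get(ch, ch) for ch in text)

def split_to_list(text, pattern=r"[\r\n,;]+"):
    """Split a delimited string into a list of trimmed, non-empty items."""
    return [item.strip() for item in re.split(pattern, text) if item.strip()]

raw = "SM-1\r\nSM-2\n SM-3,SM-4"
print(expose_hidden(raw))
print(split_to_list(raw))
```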

Amy would like feedback on the 'waiting for client' status (RPT-3758: Implement "waiting for client" status in RPT JIRA workflows, in progress). It is testable in gpdevjira and ready to be implemented in production. She'll implement it on the RPT & SUPPORT projects.

2017-03-02

New Req is open (2995)

Outage or something may have affected various things weirdly in lingering ways? Nobody knew, might be coincidence, ~2/18... various login problems and such. Restarts of JMS, DBs, etc seem to have fixed it. Might be the reason for our ETL failures?

Clint Beilman is going to be working with Dave Gregoire as more Oracle support, among other things. Yay!

SeqMinder access? RPT-3959

  • This is all set now, it was due to a publishing error.
  • Permissions for Tableau projects – admin control vs publisher control 
  • when someone publishes, they can't alter the permissions, they're just controlled via the server interface and locked by project

Storage for some things (CLIA-compatible) may be moving to Mercury soon, so we need to be part of that convo 

  • blood biopsy extractions process in mercury/bsp (typical convo about whether to build CLIA-compatible for all samples or have separate tools)
  • this could be okay, we'd just need to alter SRS, but it should work.

updating confluence: where to make a note when we find problems (e.g. broken links)??

Difference between WGS & RAW metrics? Not documented anywhere. Revive our plan to document metrics?

  • picard glossary exists. maybe a new confluence page linking to that and any other definitions locations?

New project for SCBB? Yes, let's say we'll do it, but not prioritize it higher than some of our critical GP stuff (e.g. extractions)

  • let's keep track (roughly) of how much time we're spending on external projects.

we need to purge our old tickets and unassign things from our personal to-do lists

BITS outage this saturday 

dave bernick: devops/security: there's going to be web traffic monitoring & some blocking. Also abandoning Sophos and replacing with something better. 

Still beta-testing PDO tracker. 

Tableau User Group Meeting: people agree that the best way to train users is 1:1, but that's expensive. Still, maybe worthwhile (e.g. for PDO Tracker)

2017-02-16

LCSET control page works in Mercury/Lab/LCSET/Controls: rack scan (no longer need to fill out BSP WRs)

Amy's working with JonT & JWalsh 

Mercury usability on bucket page (doug gobron in lab, tom & jessica?)

  • 80x genomes for NCI using 2 aliquots per stock ID! This messes with our sample counts & messes with billing (0.5 per sample)
  • now it looks like there are two identical LCSETs in Mercury
  • GPLIM-4636 sort of handles this awfulness
  • bug in Mercury??? 
  • and still need to plan how to do topoffs & reworks in Mercury

AOT dashboard showing Tableau/JIRA integration and comment searching

  • This takes the place of RPT-3423!
  • Searching comments and showing counts of how many samples "should" be allowed

Zach shows similarity to PDOTracker sample finder in Tableau combined with the "last comment" column in JIRA UI

New Req is published! Hopefully we'll get some great candidates...

2017-02-09

No meeting week

2017-02-02

Yossi & KT say the missing read-level metrics exist, but there's no response on the ticket. 

  • There's a communication gap somewhere. Who doesn't understand what?
  • PO ticket shows our needs. The green team might need to write a new script? Tim wants them. Maura wants them. But where are they? How do we get to a point where we are getting them?!

Compliance AND quality: Betty is now in charge of all compliance AND quality for EVERYTHING in BG. Kara is assisting.

  • Compliance: adherence to outside standards. 
  • Quality: adherence to our own standards. This is our focus. 

Arrays are close to going live, but not quite there yet.

Tableau Server 10.1 allows cut and paste of tooltip fields!

CodeAssistant in action - PRISM project

2017-01-26

General BG updates

  • genomes are moving to the cloud. There's still a ticket requesting read group metrics.  PO-____
  • Infinium minor fire still in troubleshooting. 
  • the pooling calibration plan is going to be implemented across the board
  • maura testing a new batch of molecular barcodes – is this toward novaseq?
  • Getting rid of Squid: 
    • Tom Howd showed Jon T what they're doing in squid that they can't do in Mercury
    • Mercury User Defined Search is the answer for pretty much everything. 
    • Tom & Zach will summarize the notes and get a demo of UDS 
  • Whoa nelly, Mercury is getting overloaded? there's a 20K sample PDO, and not all the data got into the DWH, and it takes Merc 1h to load so they can press a button that takes them to another 
  • PDO dashboard (Zach/Christina): new demo prototype (replacing PDO Sequencing Samples 2)
    • Zach did a dashboard "the way Tableau has recommended" (filters by action, fixed size, etc.). SUPERFAST!!

2017-01-19

  • Strat Board: no meeting week: Feb 6
    • moving genomes to cloud shouldn't happen until picard is ready (some read group metrics still missing from 6mo ago, e.g. Q30, R1/R2 phasing) 
    • ticket to reanalyze 44 genomes for Stanley using RapidQC – this'll make some metrics disappear??? Chris & Mariela will check this.
    • WGS in Mercury soon (when we get some genomes in). This'll expose any problems in Mercury UI, in data access, etc.
    • KT goal on submissions this Q: cloud submissions to NCBI
    • Arrays: LIMS & reporting on track.
    • Exome cost reductions on lab: later reduce sequencing, but currently focus on faster LC & fewer materials
    • Axygen tips to replace Agilent tips: that'll save $250K/year?
    • Broad genome prices are really high, comparatively, but incorporate things people don't recognize as missing from competitors. Higher quality, more data, storage, delivery, BAM files, etc.
    • Compute & storage is way overbudget
  • JIRA Extraction Engine - Code Assistant makes it much easier to create JQL URL for JIRA API (as presented on 2017-01-05)
    • note that if you want to do further transformations (e.g. turning a carriage-return list to a list), you'll need to include extra code in the Code Assistant fields
    • can add things like output=XML at the end of the URL to see JIRA output
    • works across all instances of JIRA (cloud, on-prem, etc)
    • We need to update all our tools using this! 
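
For reference, building a JQL search URL by hand looks roughly like the sketch below. It targets JIRA's standard REST search resource (`/rest/api/2/search` with `jql`, `fields`, and `maxResults`), not the JIRA Extraction Engine's own URL syntax, which these notes do not spell out. The host name and the JQL are made up.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_search_url(base_url, jql, fields, max_results=1000):
    """Build a JIRA REST search URL with the JQL safely URL-encoded."""
    query = urlencode({
        "jql": jql,
        "fields": ",".join(fields),
        "maxResults": max_results,
    })
    return f"{base_url.rstrip('/')}/rest/api/2/search?{query}"

url = build_search_url(
    "https://jira.example.org",           # hypothetical host
    'project = RPT AND status = "Open"',  # example JQL
    ["key", "summary", "assignee"],
)
print(url)

# Round-trip check: decoding the query string gives back the original JQL.
decoded = parse_qs(urlsplit(url).query)
print(decoded["jql"][0])
```

Encoding the JQL with `urlencode` is the part that is easy to get wrong by hand (spaces, quotes, `=`), which is presumably much of what a code assistant saves.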

SeqMinder updates - moving the aggregation to the view instead of in the calculated field - made it much faster by not pre-aggregating (40 sec to 7 sec)

2017-01-12

Strat Board updates: Run-through of what group does what in DSDE/KDUX, Run-through of NovaSeq

Kristen will attempt a reorg of Confluence pages, trying not to break any existing internal links

Participant consent-withdrawal: Mariela will add a few notes to either the Slack room or a new JIRA ticket or GoogleDoc 

ZL trick w Tableau & Excel, using parameters and calc fields for a scatter plot with user-selectable metrics on both axes

Jonathan Drummey sent a tableau community forum user to us for help with Tableau/JIRA interfacing. Amy used LOD calcs to solve the problem.

2017-01-05

New JIRA Extraction framework (Nasko) Jira Extraction API v2 (Tableau Edition)

  • Request, API, adaptor
    • 1) Request can be articulated in different ways (e.g. URL, ETL etc)
    • 2) new tool produces a scala stream output. It's client-agnostic. 
    • 3) Scala output is adapted to desired format by one of a set of adaptors, determined by how the request was sent.
  • Fields can be manipulated as in v1, but w/ different syntax: Existing API queries will need to be ported and their syntax adapted.
    • Aqua: starting field. Dotted line: assumed type. Straight line: specified transformation. Grey field: final options
    • All fields are exported in string format, so non-strings require syntax specs. 
    • New option:TO:BITS 
    • New option: strip HTML (though html for links can be useful)
    • APIv3 idea: auto-transform to the original JIRA field formats
  • Still have the thousand-record input limit from JIRA
  • Zach will test it on Skylab (tick), then we can plan a total port-over
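
One common way to live with a per-request record cap is to page through results. The sketch below assumes the thousand-record limit means JIRA returns at most 1,000 issues per request, and uses the `startAt`/`maxResults` windowing style of JIRA's REST search. `fake_fetch` is a stand-in for a real call, and none of this is taken from the new framework itself.

```python
def page_windows(total, page_size=1000):
    """Yield (start_at, max_results) pairs that cover `total` records in pages."""
    for start_at in range(0, total, page_size):
        yield start_at, min(page_size, total - start_at)

def fetch_all(fetch_page, total, page_size=1000):
    """Collect every record by calling fetch_page(start_at, max_results) per window."""
    records = []
    for start_at, max_results in page_windows(total, page_size):
        records.extend(fetch_page(start_at, max_results))
    return records

# Stand-in for a real JIRA call: serves slices of a fake 2,500-issue result set.
fake_issues = [f"RPT-{n}" for n in range(1, 2501)]

def fake_fetch(start_at, max_results):
    return fake_issues[start_at:start_at + max_results]

windows = list(page_windows(len(fake_issues)))
all_issues = fetch_all(fake_fetch, len(fake_issues))
print(windows)
print(len(all_issues))
```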

2016-12-15

  • code review for arrays: Nasko will help look at the query. Chris will help look at whether a published datasource might help.
  • consent withdrawal: New slack room pt-consent-withdrawal. Betty likes the idea of this conversation.
  • TIGER ETL can be used to turn arrays code views into tables once they're solidified
  • SAP changes (out of quoteserver in a few months). Easier pricing will make it easier to work with companies on projects that are more insulated from NIH changes.

2016-12-08

  • Offices
  • Sample removal of consent: owning it, doing it, whole tree, etc. (SUPPORT-2345) 
    • Maybe a new ticket type with all the groups in the workflow? Let's get comfortable with the process. What tickets cover it? (RPT-3551)
    • How do other seq centers deal with this? Let's ask Betty.
    • Slack group for figuring it out?
    • KC will pull together preliminary info
  • Tableau Server VM migration from Somerville cluster to Boston cluster: really complex and ugly, but Tina has been helping a lot. 
    • Friday 4pm. Turn it off, unlicense it, move it, license it on the new hardware.
    • TS-beta seems better now – maybe we'll see an overall performance improvement?
    • lab/tableau is going away. New user charles/tableau_auth
      • Need to make this change everywhere.
      • Tableau SAP situation is complex too, so we had to make changes there.
    • Next we'll push for more RAM so we don't run out on Wednesdays
  • Job families: Analytics Engineering & Software Engineering job families
  • Field trip: Amy & Zach went to FC land. Tammy uses many Gdocs for designations. Still major room for improvement here.
  • Nasko building yet another JIRA interface (more powerful, easier to use – explode etc)

Zach: Dave Zdeb: preliminary results on pooling are good - we might reduce lanes and therefore reagent costs (which are higher than salaries).

2016-12-01

Amy helping seq lab with designations (to get their work started quicker). For ExEx FCs Tammy prints the JIRA ticket and lab manually types the ticket key to get Hamilton messaged correctly. Amy added a barcode for scanning that onto the Hamilton. 

  • Next step might be identifying the new most problematic step in the workflow. Zach & Amy can go watch (this happens every day). They may need more scanners.
  • The recent FC swap problem was complex and theoretically could happen again (?)

JIRA ran out of DB connections with low memory this morning: lots of groups were doing bulk ticket transitions (SRS many consolidation tix, etc). Erik restarted (no DaveG DB intervention required). 

KC working on arrays again

Yesterday's career brainstorming session was just awful

2016-11-23

TC16 interesting talks

  • Zach: Florida Hospital (for RFID tracking, rigorous process implementation, process improvement and result - more than Tableau details)
  • Zach: two developer portal programs by/for Tableau 10: Scout (assessing workbook complexity, usability, performance) & Auditor (performance)

Amy: using context filters allowed correct cascading/relevant-only filters in VolumeCheck2.0 (but there's still possible optimization using action filters)

JIRA/JSON Tableau reader: Amy was able to load JSON. Chris also succeeded 

JQL-SQL plugin for Tableau 10.1, installed on jira-dev: https://marketplace.atlassian.com/plugins/com.kintosot.jira.jdbc4jql/server/overview

Nasko year-end goals: make Tiger leaner (functional programming improvements); cloud tables for ArraysQC in good shape in staging DB

Amy year-end goals: FC tracking workflow (mercury designations too) - easier lab management: linear barcode scanner plugin dev with jowalsh?

Chris year-end goals: genomes & CRSP stuff. Ideally: replace PDOSeqSamples2 with something brilliant

KDUX/ScottSutherland may want Tableau advice for PMI 

KC year-end goals: arrays & Interworks server audit

Interworks audit: Lots of room for improvement on TS!

  • unused/outdated datasources
  • Steve has too many unused worksheets & datasources hidden in his workbooks

2016-11-17

No meeting: too many people out.

2016-11-10

No meeting: Tableau conference

2016-11-03

No meeting: "Quiet week."

2016-10-27

Arrays: we're on track. Still need a bit from Mercury team. Still planning a few things. We can expect real data & final validation in about a month.

FP: Jim says we'll have a decision of sorts by the beginning of next quarter. JonT will build either a Fat Panda workflow or a rebuild of GAP for Fluidigm.

Mariela's busy with user requests and such.

Amy's an admin on cloud JIRA without permission to act as an admin at this point.

2016-10-20

FP maybe working again. Same story: Not under control.

Merc user meeting yesterday: Dennis, Eliot, Dave, John all presented new features for the next 2 releases (Marcia testing all of it)

  • SAP stuff maybe in the next release
  • The big push is Arrays, then SAP (two major things at once again here)

Mariela demo: preferring Tableau pass-through functions over LOD, using the NFC LOD Custom Query in the Daily FC Tracker as the performance example

  • CustomSQL is slow: RGHQS + Metadata + JIRA (CustomQueryLOD: 165sec)
  • N Flowcells Pass Through calc field: RAWSQLAGG_INT("count (distinct %1) over (partition by source, lcset)", [ Flowcell ])
  • How to recognize the need for a pass-through function in Tableau: if you're using a LOD function that's making things slow.
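For readers less used to window functions, here is a minimal Python sketch of what the pass-through expression above asks Oracle to compute: each row gets the number of distinct flowcells within its (source, lcset) partition, and no rows are collapsed. The row layout and sample values are hypothetical and only illustrate the logic.

```python
from collections import defaultdict


def flowcells_per_partition(rows):
    """Attach to every row the distinct flowcell count of its (source, lcset) partition.

    Mirrors: count(distinct flowcell) over (partition by source, lcset).
    """
    distinct = defaultdict(set)
    for row in rows:
        distinct[(row["source"], row["lcset"])].add(row["flowcell"])
    # Rows are preserved one-for-one; only a new column is added.
    return [
        {**row, "n_flowcells": len(distinct[(row["source"], row["lcset"])])}
        for row in rows
    ]


# Hypothetical sample rows (not real LCSET or flowcell names).
rows = [
    {"source": "A", "lcset": "LCSET-1", "flowcell": "FC1"},
    {"source": "A", "lcset": "LCSET-1", "flowcell": "FC2"},
    {"source": "A", "lcset": "LCSET-1", "flowcell": "FC1"},
    {"source": "B", "lcset": "LCSET-2", "flowcell": "FC3"},
]
result = flowcells_per_partition(rows)
```

Because the row count is unchanged, the report keeps its original grain, which is the same thing the LOD calculation was providing.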

Amy Atlassian round-up: 

  • Cloud-only developments for a few things (more users seem to be on cloud these days)
  • Life-science group (Illumina, Eli Lilly, etc.) Rapid boards
  • A guy from Eli Lilly was really impressed by our roboview etc, triage & time-tracking
  • Emphasizing more personnel in JIRA; opening more admin stuff to others in 7.3 (e.g. custom field development), but we'll need to ensure that this doesn't become a 'too many cooks in the kitchen' situation

Science primer was good, worth watching the video.

Strategy board re-planning continues

Nasko cloned BSP.ANALYTICS_SAMPLE to ANALYTICS.BSP_SAMPLE (no latency)

2016-10-13

Sarah Calvo course looks interesting but is WAY oversubscribed

  • FP on Fire again!!! 
    • Fluidigm is coming in to help.
    • Fat Panda optimism is unhindered
  • MEGA Chip maybe fixed but there's a problem with the data not showing updated (new autocall not showing in reports/LIMS reports)
  • ARRAYS
    • Green Team "done" end October
    • We'll need a plan for data crosscheck etc.
  • DATABASE PROBLEMS
  • LabOpsJiraTest schema didn't get copied to dev correctly, there's a bunch of stuff we have to deal with

Ancestry.com (we do limited genome stuff for them) is so happy with our work, they'll keep sending to us

"Secretly rolling in a bunch of new loaner sequencers" (TimD) (secret capacity)

  • 150/day more?
  • Do we have the analysis capacity for this? We've had some days of near-zero in cloud analysis?

Thurs Nov 3 BTUG here

Mariela will be on health leave Nov 14 - New Year

Chris MKD updates with Hayley

Chris optimizing Stanley Center sample mgmt code using Mariela suggestions. Lots of eyes. Case/Control parsing is tricky

  • Discuss with Mercury team 

Whenever possible, we want to spend our energy fixing fundamental workflow problems rather than creating error reports (e.g. designations, "enabling" problems)

  • Mexico idea: PDOs that haven't quite been closed out and what samples need to get hurried
  • Reinventing the wheel

Nasko fixed Spark agent for arrays.

Mariela working with Tina for PDOs with missing metrics (pipeline agg metrics not getting copied to local DB from cloud via pipeline team)

  • This is another place where we're creating a method to find holes!

Mariela genomes in mercury: pool test flag now in QA

Mariela ExEx plating now in PDOSTAR

KC all arrays all the time

2016-10-06

Fingerprinting situation:

  • Fire #2 fixed again, but wasn’t clear why (just like Fire#1) (so probably not for long!)
  • Fat Panda isn't working at all right now, so they aren't yet setting up the automation

WGS team waiting for a Mercury fix and a RapidQC fix before submitting another LCSET

Big revenue month (Sheila: “we’re doing really well financially”) (who is "we"???)

Arrays: moving forward nicely. We'll add Analytics schema to SEQDEV3, and we should ask Dave to include that in the Prod/Dev backups.

  • last night the spark agent in the cloud DB got stalled, so Nasko will fix it.

Amy demoed "Create and Link multiple tickets" feature she used in the CCLF JIRA using a new feature in the C&L plug-in

Zombie checker is now checking JIRA & BSP in addition to SEQPROD

2016-09-29

This week's Mercury presentation:

  • JT presented Squid vs Mercury
    • Squid cared about: samples & libraries
    • Mercury cares about: Events & plastic
    • Squid used kiosk: tracked GSSR, started workflow
    • Mercury: no kiosk, no GSSR, workflows are not stringent
  • User-defined search is deliberate, has parallels to SQL (group by, where, etc)
    • The first time is tricky to set up, but searches can be saved
  • FP is failing again. Ugh.
  • Brian's moving to DSDE workbench, we'll try to get someone else in the products group
    • He'll finish OSRP before going (office of subject research protection.... internal IRB) (prevent samples from getting sequenced without the right permissions)
  • SPARK: Apache Spark framework (Nasko)
    • Returns data in RDD format (pseudo-table dataset) ("Resilient Distributed Dataset") which are kept in memory
    • RDDs from various sources (JSON, Excel, etc) can be joined as with SQL, and SPARK takes care of the underlying hard work
    • SPARK is similar to Hadoop but came after, supposedly faster than Hadoop/MapReduce (MapReduce writes intermediate results to disk, while SPARK keeps them in memory), but the two can work together or separately
    • https://databricks.com/spark/about (we're using Spark SQL+DataFrames, Spark Core API, possibly more, e.g. Streaming, in the future)
    • spark.apache.org/docs/1.2.0/cluster-overview.html 
    • Ted Sharpe will be speaking on Spark at SoftEng soon.
    • Running on analytics-etl VM, new agent ArraysQcAgent shows on ETL dashboard.

2016-09-15

  • Quarterly goals: we pretty much rocked it. Next Q: Arrays, SAP, Genome Pooling, (CCLF?)
  • HR improvements (new system)

Correlating Google use costs for DSDE (blend of Google BigQuery spend data (Lukas CSV email upload) & Oracle agg counts from metrics table, blended --by day: risky, requires scaffolding/padding with null agg dates; --bymonth: smoother but requires end-date curation in case data isn't in both yet)
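A minimal Python sketch of the by-day scaffolding idea, assuming both sources have already been reduced to one value per day. The function name, dictionaries and numbers are hypothetical. It builds one row per calendar day across both series and pads missing values with None, so a day present in only one source is not silently dropped by the blend.

```python
from datetime import date, timedelta


def scaffold_by_day(spend_by_day, agg_by_day):
    """Return (day, spend, agg_count) for every calendar day in the combined range.

    Days missing from either source are padded with None.
    """
    all_days = set(spend_by_day) | set(agg_by_day)
    if not all_days:
        return []
    day, last = min(all_days), max(all_days)
    rows = []
    while day <= last:
        rows.append((day, spend_by_day.get(day), agg_by_day.get(day)))
        day += timedelta(days=1)
    return rows


# Hypothetical sample data: spend in dollars, agg counts per day.
spend = {date(2016, 9, 1): 120.0, date(2016, 9, 3): 80.0}
aggs = {date(2016, 9, 2): 14, date(2016, 9, 3): 9}
scaffolded = scaffold_by_day(spend, aggs)
```

The by-month variant would group on (year, month) instead. It needs no padding, but it does need the end-date check mentioned above.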

Tableau RAWSQL functions: QPCR, Library Quantitation report. To QPCR data, added SLXRE_LIBRARY_LCSET table for metadata. Using aggregate function that sends an oracle function to Oracle. Benefit: you preserve your primary key, but can return more granular data per row to allow users to dig in and see what PDOs/samples/etc underlie each row.

Arrays: 

  • KC working on the join between cloud and mercury, but need Mercury to be further along first. 
  • Currently using Uname/PW from George. Need to switch cloud access to SSL and eventually use other credentials.

JIRA upgrade tonight to 7.1

2016-09-08

Published datasource in name of workbook, let's publish all the control workbooks on TS so we can all access them

Strategy board revamping (new ideas from Mexico are going to improve the world)

Fat Panda: Leadership keeps saying "next quarter" despite a lack of clear project outline and very minimal proof that the process works. Kristen and Jonna are going to work with Cole (who is working on a scale-up plan) to ensure that there's a list of everyone who will need to be involved, from lab to processing/pipeline (whether that's green team or JonT) to analytics & reporting. 

Arrays out of GAP: 

  • Nasko working on getting data from cloud to our local DB, mirroring the 3 tables in the cloud, using Spark framework
  • ANALYTICS@SEQPROD
  • Kristen continuing to try to understand the PKs and the current data in Mercury

2016-09-01

Tableau Jira interfaces (Nasko)

Genome pilot set through Mercury update (Mariela)

Zombie queries notification from Aug 31 (Mariela)

Mercury Products Group

  • Releasing WalkUp improvements soon
  • Array Batches capability, using Mercury and BSP to split big orders into plates
  • Cloud data - need to test if getting metrics from the cloud is feasible (Google API + (undetermined) credentials)

Prism Group is using cloud JIRA and is interested in using Tableau to access their JIRA data

JIRA upgrades are ongoing.  Next will be 7.0 then 7.1...

JIRA has been locking up more frequently recently.  Continue to notice and report lock ups.

The Genome Team may provide an update on new pooling protocol progress in the next few days.

2016-08-25

FP: Steve's workaround is working. Desmet said in Strat board that Fat Panda should be happening next Quarter (we'll see). KC meeting with Jonna for a lab update.

Let's add Robb to the list of people who know that if collaborator info gets changed in BSP/Mercury, it may need to be manually changed in analytics DM. (re: SUPPORT-2028)

Nasko: WebDataConnectors + Oracle12 DB = zero-latency WDCs

  • www.lucidchart.com/documents/edit/

2016-08-18

Big news: We just created the FIRST LCSET GENOME IN MERCURY!!! 

  • LCSET-9712 (first of 40 LCSETs for this PDO)
  • PDO-9168
  • Squid BSP plating WR-30349
  • There doesn't look to be a GAPREQ ticket for this yet, unless it's set up very differently than standard ones. Let's check that.

JIRA locked up and needed a restart, which may have caused the captcha problem locking users out and/or the Tableau freeze

  • same user (squid superuser) is used between mercury/jira/squid. Locking that user creates huge problems.
  • We still don't understand the cascade of problems. Eric & Amy looking into it.

STRAT BOARD

  • Mega array is still a blocker. 
  • FP is up again, but w/ 3wk backlog. We're really stuck and can't help (still no data access).

Tableau 10 is available. Installed on beta.

Pooling calculator notes: if you include the filter card, it's SLOW. If the filter's there but not visible, it's fast.

ZL demo RapidQC to CLIA team (they're interested in using Tableau to set things). We don't have a dev system so he won't click anything.

  • Use some kind of rapidQC environment, OR in PDOSeqSamples2, the Cloud Agg Transfer File tab (file attachment). We'll insist they use this before we do any more such work.

KC looking at getting heavy production reports out of demos & test: Custom View of heavy use Demos reports

Nasko checking out Apache Spark

2016-08-11

STRAT BOARD UPDATES

  • FP still a fire. KC needs help finding/fixing GAP_FLUIDIGM_PLATE ETL
  • We need to test reports: Do genome LCSETs that went through the new Mercury 1.73 work like those in Squid?
  • SUPPORT tix: we need to be faster. 
  • Google Cloud data: Infinium probably will be in the cloud. How will we get this? Mercury needs it local too. Google Cloud SQL = MySQL 5.7
    • we or LIMS ETL group will pull it down. Nasko & JonT should discuss the plan. Our two ETL structures are very different.
    • we don't want them moving existing seq metrics stuff up to the cloud, and need to ensure they're clear on that. It'd break our systems.
  • Tableau10 roadshow next week, then BTUG the following week (General Assembly)
  • GitHub vs Stash: GitHub public server w/ private account (used by DSDE) has pros & cons. 
  • ZL: CloudSpend new report showing costs of cloud access. CSV file into GoogleBigQuery table (even though that's overkill) (even though Google PW access requires Google ID, which needs to change every 3 months... Nasko suggests using secret token) (Hollinger email?)

2016-08-04

  • STRAT BOARD UPDATES
    • Tatiana new admin with Anna
    • MEGA chip gender still a problem
    • Google Cloud notes: big call set over the weekend spurred a call from Google that we might take down the whole cloud with too much inter-node communication.
    • Another data limit: Daniel MacArthur huge VCF across tons of patients: 16Tb file output (not possible) (4-5Tb is max on earth) so broke it into 2800 chunks. 
    • Infinium in the pipeline is coming along nicely.
  • Pooling: MiSeq wasn't predictive of HiSeqX, so when they do a (blind) pool correction, they just go for it with that and then top off the undercovered ones. Leads to excess sequencing and delays. Pooling calc was altered and may have more improvements soon. Huge improvements.

Log parser: Nasko helping BSP look at BSP logs. LOJ logs look to have helped us ID a bad plugin

ZombieChecker now shows full SQL of any stuck run. Mariela suggests forwarding Zamboni zombies to the Zamboni group, if possible.

BTUG July 28 was good: Anthony showed a web connector method that uses PostgreSQL & R. Maybe we could use it. Then the sponsors talked about their "data lake" software.

ETL troubleshooting: let's document the troubleshooting process, even if it doesn't allow non-specialists to fix it, just locating the problems is important

Tableau overload on Wednesday afternoons: Zach will get BITS to help.

Marc Monnar is leaving to go to Foundation. Sanger asked Amy how we do some testing of JIRA as LIMS. 

Mercury release Mon/Tues, then LCSET genome data review will be Mariela's focus

Chris working on CRSP Portal dates, topoff tool stuff, PDOSTAR2 replacement

Nasko working on web connectors for Tableau. Are there discussions online about the downsides?

KC working on Infinium

2016-07-28

STRAT BOARD UPDATES

  • MEGA chip on pregnant Peruvian women: 83% of them showed up as male. MEGA chip rare SNPs may be messing up our custom algorithm, which clearly needs to be tweaked. This'll delay future MEGA chips too?
  • Broad Truck JIRA is up and running.
  • First post-Dinsmore CRSP Release: Success. CRSP Portal & DB changes 
  • Mercury close to 1.73 release. That'll allow genomes (at low scale) to go through. Mariela will review QC. 
  • Mercury is for tracking, but there's very little search/find/view. JT's putting that on Analytics.
  • Squid plan: dead in 6 months because of FISMA

Storage: Local (e.g. Isilon), Cloud, Edge (Fargo)

  • Fargo costs are extremely high. Unclear what access is happening. Maybe TCGA & other smaller projects? Fargo shouldn't be stuff we need to look at, but clearly we're accessing it too much.
  • Tableau: huge tables (60K rows) failed, so Zach set an "at least" size filter. (BASS)

SAP data GP Billing report

  • Breakdown of all monthly expenses per cost object/group
  • BW data format
  • SAP data in Tableau is weird and constrained.
  • Currently we only have SAP drivers on one Tableau machine
  • Now we can see daily updates to costs/budgets instead of 5 days after the end of the month

KC: Kanban board created dates and flags are pretty great

AB: Illumina wants to get their fingers in our JIRA data (FC fails, service reports, uptime/downtime, etc.  Similar to what we're doing with HiSeq utilization reports)

AB: Web data connectors in Tableau: need to be added to actual server machines (tableau.broadinst.org, tableau-dev.bi.org) which may be a problem if we want users to do their own imports.

CG: Stanley Ctr work ongoing with Diane et al, including sample status/progress, and new TV display. 

CG: DCFM bug discovered & fixed.

2016-07-07

Mercury team to release 1.72 next week(?) and 1.73 end of July(?)

  • Genomes v1: Let's get more info.
  • Pilot data will happen & we can compare Mercury genome vs. Squid genome.
  • We'll communicate that there won't be any pilot data debugging until Mariela's back (Aug1)

Mariela & Nasko reviewed troubleshooting

  • Links (confluence, etl-dash, MM googledoc, \\analytics-etl)
  • Phone list for true emergencies

June 29 sporadic refresh errors for PDO_STAR2 and Production PICO_REBORN. Still watching them. BITS tools (Zerto) don't show VM as overly busy, but Windows tool does.

AndyH email this week: Cloud usage

One-off reports and their correctness: Takeda example (KC)

  • We don't always know when users are basing decisions on half-cooked reports, when we're still waiting for feedback or have published a rough draft.
  • Let's always get users to commit to giving feedback before we begin the work
  • We're agile and not always QA'ing each demo thoroughly, relying on users for some of that
  • Publish in Analytics Test and give extra permissions to just one person until it's solid
  • Add a text banner to the top of demo reports? "Demo: Test only-- Before basing decisions on this, double-check with Analytics"

Troubleshooting user issues related to DM data (Mariela)

T9 rocks! Facilitating Infinium PDM reporting using LOD expressions, & simplifying dashboards by removing extra worksheet for top-of-table totals (KC)

JiraWebConnector: HTML/JSON issues with PRISM project in the cloud (Nasko). Meet with Jennifer to discuss her needs.

JiraWebConnector: importing auto-refreshable connectors on the server

2016-06-30

Demo of Password Manager Pro (Scott)

  • "open connection" allows access via PWMgr rather than giving users the actual PW
  • you can share with individual users OR pre-described groups of users
  • As a default, when sharing a PW with people, they'll try to look at it and that'll trigger a request for access (even though you've already shared it). BUT you can set Access Control to auto-approve those requests.
  • Erik maintains a place for certificates for environments. We probably won't need to use that feature.
  • We really just need the basics. Zach will try setting up some simple PW saves.

Next week: Nasko and Mariela will run through handling errors so that we're confident when they go away for 3 weeks.

New engineer in JT's group: David.

LabOpsJira is now handled in our log parsing report. Zach's report link: https://tableau.broadinstitute.org/#/views/LogAnalyzer/LIMSLogInspector

Oracle support remains very tenuous. No backup or backup plan. Dave Gregoire only. There's a pushback on getting a 2nd DBA, even though if Oracle goes down, GP is totally hosed. Millions of dollars would be lost per week of downtime!

TS is pushing memory limits. BITS isn't very helpful on this, but may need to add more memory to the TS computer. Perhaps we can also still reduce the load in other ways.

2016-06-23

Tableau tricks of the day (Christina, Zach)

  • "How many LCSETs has this been in?" (parsing/counting a CSV field list of LCSETs)
    • SPLIT is a new function in TD9 but doesn't lead to counting.
    • An elegant hack: LEN([CSVfield]) - LEN(REPLACE([CSVfield],",",""))
  • Plotting numbers greater than GB (e.g. TB) isn't possible in "Number(custom)" format, but is possible in "custom"
    • It's also possible to format different kinds of numbers differently.
  • Sorting & collapsing with & without timestamps in Mercury Logs Report
  • Color blindness: we have a user with color blindness that isn't red-green.
  • zombieCheckerAgent, "Running SQLs" page (Nasko)
    • Gives nice insight to stuck queries. Nasko will give us another tool to give total insight to the actual query that's stuck. 
    • Nice complement to performance recorder.
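A small Python sketch of the comma-counting hack from the LCSET bullet above. Note that LEN(field) - LEN(REPLACE(field, ",", "")) yields the number of commas, so the item count is that plus one for a non-empty field. The function name and sample strings are hypothetical.

```python
def count_csv_items(csv_field):
    """Count items in a comma-separated string by comparing lengths.

    Same idea as LEN([CSVfield]) - LEN(REPLACE([CSVfield], ",", "")),
    plus one because N items are separated by N-1 commas.
    """
    if not csv_field:
        return 0  # empty or missing field means no items
    commas = len(csv_field) - len(csv_field.replace(",", ""))
    return commas + 1


# Hypothetical sample values.
three = count_csv_items("LCSET-1,LCSET-2,LCSET-3")
one = count_csv_items("LCSET-1")
none = count_csv_items("")
```

This assumes the field has no trailing comma and no empty entries. If either can occur, the count will be off by the number of empty slots.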

2016-06-16

Discussion about the 12 principles of Agile Software

JiraWebConnector: authentication issues, pls check RPT-3608 (Nasko). Some bugs have been fixed. Password situation is manageable now, using "trusted computers". Need to test with extract refreshes to be sure it works as intended.

Quarterly goals: let's each look at the document to see if we're missing anything in the list

2016-06-09

Innovation Lab: JiraWebConnector, access our Cloud JIRA (Nasko): http://analytics-etl:8090/JiraWebConnector.html

  • Before: we had to write Windows batch files and encode special characters to run the JQL for TDEs, and we didn't have access to the cloud JIRA
  • JIRAWebConnector was painful to build, but allows connection to all (new account) (we're still working on adding CRSP JIRA access as well)
  • From within Tableau, connect to data using the web connection, and hit Nasko's URL.
  • For auto-refresh, connect to data using the "specialized link" URL from Nasko's page (replace the connection link with the new specialized one).
  • We need to test the extract refresh capabilities to see if it works the same as other Tableau extracts.
  • Amy will push this to the JIRA master suite.
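For context on the "encode special characters" pain point above, here is a minimal Python sketch of URL-encoding a JQL query with the standard library. The host name is a placeholder, and the /rest/api/2/search path with jql and fields parameters is assumed from JIRA's REST API; check it against the JIRA version in use. This is not how JiraWebConnector itself is implemented.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical base URL; substitute the real JIRA host.
BASE_URL = "https://jira.example.org/rest/api/2/search"


def build_search_url(jql, fields=("key", "status"), base_url=BASE_URL):
    """Return a search URL with the JQL and field list safely percent-encoded."""
    query = urlencode({"jql": jql, "fields": ",".join(fields)})
    return f"{base_url}?{query}"


def jql_from_url(url):
    """Decode the JQL back out of a URL (useful for checking the encoding)."""
    return parse_qs(urlsplit(url).query)["jql"][0]


# Illustrative query only.
sample_jql = 'project = RPT AND status = "Open"'
url = build_search_url(sample_jql)
```

Letting urlencode handle spaces, quotes and equals signs avoids the hand-escaping that batch files required.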

switch Tableau reports from facade views (e.g. Illumina2_Lane_HQS) to tables (e.g. cognos.SLXRE2_Lane_HQS) (Mariela)

  • Table doesn't have indexes or pretty names, but is MUCH faster.
  • This is a good time to reconsider how to rename fields in Tableau to standardize across reports (so the same metric is called the same thing in different reports)
    • Maybe we could have a reference document online with the DB names and the ideal custom names
  • Stanley Center wants a Tableau display now, Zach asked Eric Jones to handle the hardware setup before Analytics gets involved at all

Do we have anything that will break on 6/25 due to BITS maintenance?

  • Yes. It's a Saturday, so won't be a big deal, but maybe we'll send an email with a heads-up

What Tableau workbooks shouldn't get moved to TD9 (due to possibly requiring sharing by folks who aren't on TD9, like Steve?)

  • CCLF (shared development with Paula)

Moving to TD9 and training users

  • Everyone's upgraded except Paula (CCLF) so they can just informally come to us with questions.

2016-06-02

Nasko: GPInfoJira is enabled for both SQL and JqlTableauExtract

Summer vacation planning (Nasko & Mariela July 11-29, Zach July 9-23, Kristen May 23-June 6, CG June 27-July 6 & Aug 22-23, Aug 30) 

Patient IDs removed from Analytics tools - update and issues. Replacing multiple collaborator sample IDs with one string turned out to be a problem for aggregation metrics, because collaborator sample ID is part of the unique key in Agg DMs. Mariela to contact BSP team about renaming multiple collaborator samples from the same batch to the same name - shouldn't that be a problem/precedent?

Rapid QC snapshot to be offered for the weekly email - Christina.

Code reviews are encouraged.

New fonts used on TS10 for controls (check boxes and radio buttons) - spot check reports on TS10.

Tooltips being far from the mark will be resolved in TS 9.3.3. Some issues were resolved, though, after opening the report in TD9 and re-publishing

2016-05-26

Ways to improve performance on CRSP Product Performance report - 

  • Contributors to slowness - quick filters, filters on calculated fields, blends to live data (JIRA in this case, but it might be necessary). 
  • Receipt Date filters turned out to be a big culprit; just showing the quick filter slows the report down.
  • Suggestions to improve performance - change the Receipt Date quick filter to "Browse Periods" if this would work for users, push the date filter to the Custom SQL, add an index to the underlying table (in Mercury DWH), review the custom SQL for performance improvements - there were 2 very similar sub-queries

Presenting Tableau Web Data Connector for Cloud JIRA - Amy

  • Use Tableau add-on on Cloud JIRA to generate a URL with user's credentials, paste it into Tableau Web Data Connector and write your own JQL to assemble the data source. The result is a data source with all system and custom fields from JIRA
  • Web Connector builds extracts, an easy and simple way to get JIRA data, but can't add new features to it
  • Nasko to test  the JQL Extractor to connect to Cloud JIRA
  • Issues with Web Connector - Amy had some connection issues, no indication what the problem was, it's expensive ($500/month)

Zach found that a couple of refreshes were failing on Tableau-dev due to low memory. Expected to improve when memory is increased (trying to add RAM on 5/31)

Tooltips in TS9 - Christina reported she found more reports with tooltips being too far from the mouse pointer. Fast moving tooltips can be avoided by a setting in TD - in Edit Tooltip switch the setting from "Instant" to "Hover". (Update, Tableau Tech Support says the runaway tooltips will be fixed in Server 9.3.3)

2016-05-19

Upgrade to TS9 went smoothly except for possible problems with gp-reports alias for screenshots.

Firecloud opportunity from Jim: to work with Marc Monnar/Amy consulting (but we don't want Amy to get too embroiled)

Employee input forms situation was weird this year, some people didn't see the due date

Nasko: New code changes caused silent failures; affected the heartbeat agent among several other failing agents. ETL monitor showed no red lines, just missing lines. Moving forward, when new code is implemented, part of the implementation will be checking that heartbeat is working afterward. 

  • This has serious implications, because the lab is relying on Tableau to send samples to the cloud, so if we miss a day, we may never be able to recover (due to cloud's finite amount of processing space). Nasko will rewrite the heartbeat agent. 
  • Too many accounts: will move all agents to a single account (Nasko's, at least for now) for single point of failure (more obvious)
  • Possibly there's a new cron package for windows? That may be available now. Or SoftEng can help.
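A minimal Python sketch of the "missing lines, not red lines" check described above, assuming the monitor has a list of agents that should be reporting and a last-heartbeat timestamp for each. Agent names, the 30-minute threshold and the data layout are all hypothetical; this is not the actual heartbeat agent.

```python
from datetime import datetime, timedelta


def find_silent_agents(expected_agents, last_seen, now, max_age=timedelta(minutes=30)):
    """Return agents that have never reported or whose last heartbeat is too old.

    An agent that fails silently writes no error row at all, so the check
    starts from the expected list instead of from whatever rows exist.
    """
    silent = []
    for agent in expected_agents:
        seen = last_seen.get(agent)
        if seen is None or now - seen > max_age:
            silent.append(agent)
    return silent


# Hypothetical sample data.
now = datetime(2016, 5, 19, 12, 0)
last_seen = {
    "agent_a": now - timedelta(minutes=5),  # healthy
    "agent_b": now - timedelta(hours=2),    # stale
}
silent = find_silent_agents(["agent_a", "agent_b", "agent_c"], last_seen, now)
```

Running a check like this after each deployment would cover the "confirm heartbeat is working afterward" step.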

Mariela: free-form comments, Mercury, pipeline API

  • free-form abilities allow encoding lots of info into the LCSET names

KC: extractions & infinium complex queries with multiple layers of views/tables: we'll revisit the options (if any) after Kristen's vacation

Chris: MKD, RNA topoffs, Stanley Ctr

Oracle 12c upgrade 

2016-05-12

MKD TS9 approval is in ping-pong between Chris & Betty. Almost there.

TS9 training in progress.

TS9 bug: elusive tooltips (sometimes can't get cursor to tooltip without tooltip disappearing first)

  • we'll need to spotcheck this and should mention it at trainings ("please help us find the views that fail on this count!")

We've decided to use the auto-renaming of fields from datasources in T9 (maybe we can eventually ditch our facade views in Oracle?)

Expired accounts (former employees): CRSP broke this weekend because Dinsmore's Google Docs were deleted (or something). Also, Qing's account was down, causing halt of an agent triggered in Windows Task Scheduler, and it wasn't in the heartbeat monitor.

GPLIM-4132 (cycle time)

Elusive tooltips (https://tableau-beta.broadinstitute.org/authoring/Cloud-BoundWGSTopOffTool/SentToRework)

  • acknowledged as a Tableau bug (Case# 02081419) but probably not pervasive

2016-05-05

Amy's new computer: struggling with Oracle set-up. We may need to improve our instructions.

DSP Town Hall: didn't go over well with some people. There doesn't seem to be much of a feeling of "sameness" between DSDE, KDUx, & us.

Chris needs CRSP document approval (maybe from Amy?)

Sinead wants Tableau training: KC will check with her today to see if she needs more than the current TS9 plan.

PDO sample metrics (Chris) still need work, but still trying to avoid messing up the 100+ custom views. Soon perhaps begin building PDO Status 3 from scratch?

This week, ZL & KC will upgrade TSdev to TS9.3.

Topoff tool interacts with Nasko's new table tool to allow Tableau to ALTER the DB!!! WOW!!!

  • Magical new topoff tool uses URL-based messaging to alter the DB tables.

Amy showed how the BITS Data Center Agile Board uses swimlanes & columns to represent physical hardware locations. Interesting use of JIRA.

PDOSTAR5 ETL runtimes: Mariela did major digging and discovered that Oracle randomly ran an advisor that changed the explain plan for this ETL, bringing its runtime back down to where it had been before its creep-up. But we don't have control over that advisor running, so there doesn't seem to be much we can do with this info proactively.

LOD calculations are powerful and fairly simple. Zach showed how to solve the "challenge problem" with LOD calcs.

2016-04-21

BTUG Burlington last week was mostly a good networking event. Next one Fenway Park! Mon May 23. Sox metrics guy, Jonathan Drummey.

Qing: all's well with handoffs

KC: lots of Infinium. Further reduce extracts, first assessment of Autocall DM vs GAP looks good, still looking into "fail button" in LIMS. 

CG: Topoff people: developing topoff tool for cloud. Speedier but basic tool for pool groups. They're heavily involved in the development.

Premature aggs are being reduced. Much improved today.

AM: Tigerized GAPAutocall. You don't have to try so hard anymore to figure out what it's done. Fluidigm will remain chickenish for now.

MM: working with JohnSaccoccio on Mercury DMs. Getting there. PDOSTAR5 ETL is very slow. Crept from 8 min to 20 min. Still don't know why.

AB: VolCheck v2. Maybe uses new device. Python, digging into HTML files. Looking to do XML output instead of HTML if possible. MMonnar/StanleyCtr: they're building a cloud JIRA for embryonic stem cells.

MM: discovered duplicate record in slxre_library_lcset table. Table was too forgiving on entries, making downstream ETL work harder. Implemented stricter PK 

Jim found a project: tracking reagents. Vasilia uses GoogleSheet for that, but in general, we don't have info on what reagent lots went onto which instruments/FCs, etc. 

TS9 main holdups: MKD (need 2 days of testing time with Marcia & Susan, who are busy on Mercury). 

Let's upgrade tableau-dev to TS9 and start teaching!

2016-04-14

Strategy board notes:

  • 15 GTEx samples (RNA) lost permanently due to bad drop-off procedures. May need a new JIRA project.
  • Lab contamination in exomes still a problem
  • Pipeline is still backed up, can't possibly catch up, so we're stopping sequencing to catch up (will also stop aggregating samples that will need another agg later)
  • Stopping sequencing may lead to major lab group downtime. Can we take advantage of that for Tableau training?

eMerge LMM will increase/test capacity in CRSP/Mercury (100-200 samples/wk)

Monthly Mercury users meetings are happening 

MAJOR underfunding in genomes. PMs won't be able to do that anymore once Scott's new tool is in place to block underfunded orders.

MKD: Policystat format change when downloaded? 

Volume check: Good on analytics side, bad on lab side. Scott's looking into improved calibration. Linley doesn't think it's likely it'll get working, and that a simple visual volume check has proven to be more reliable.

Infinium Control Dashboard: Mariela suggests a change in report format (e.g. not viewing red/green separately) to allow better indexing, faster queries, and maybe no extract in Tableau. KC will ask the team if there's a reason besides legacy that they want the current format.

New row for RNAQC: Yossi/Lisa/CancerTeam

2016-04-07

multi-key sorting in Tableau including nested LOD expressions (Zach)

Nasko, ETL ninja (Infinium Dashboard): extract Control Intensities from binary .gtc file (15 min) 

2016-03-31

QY, AM, CG, MM, ZL, KC

ZL Cancer Program Tumor Portal Gene Screen prototype 

  • creating color-shape custom shape legend using manually-created PNG files
  • Jitter calc field in Tableau using "RANDOM()" calc.
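A short Python sketch of the jitter idea from the list above: each mark gets a small random offset so overlapping points spread out. The function name, width and sample values are hypothetical. The Tableau version would use the RANDOM() calc noted above in place of Python's random module.

```python
import random


def jitter(values, width=0.4, seed=None):
    """Return values shifted by a uniform random offset in [-width/2, +width/2]."""
    rng = random.Random(seed)  # seeded so a plot can be reproduced
    half = width / 2
    return [v + rng.uniform(-half, half) for v in values]


# Hypothetical example: five marks that would otherwise overlap at x = 1.
base = [1, 1, 1, 1, 1]
spread = jitter(base, width=0.4, seed=0)
```

Keeping the width well below the spacing between categories stops jittered marks from drifting into a neighbouring column.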

ZL PanCan Mutation Viewer on Tableau Public (toward patients sharing their own mutation data publicly)

LOD calcs: keep in mind the order of operations (after context filters, before dimension filters)

Vizzes for the weekly Friday GP email: Let's keep an eye out for FIRE-related reports (either that they are using or SHOULD be using): Orphans

We did a great job clearing out obsolete reports & reducing extract refreshes

Mitigate data downloads from Tableau (https://tableau.broadinstitute.org/views/ServerUsageDataExports/DataDownloads)

  • Some are to allow comments, but can we make that a function in JIRA that shows up in Tableau?
  • Mariela will talk about reports in Quality meetings to reduce wasted JMP-related rebuilding of existing reports

TS9 favorites: we need to decide what to use as the default home page for all users. 

  • Can we implement "recommended" tags? Should that be on managers? 

MM: ancestry/QC datamart updates

2016-03-24

QY, AM, CG, MM, ZL, KC, AB

Moving to TS9: reorganizing, renaming projects, removing extra reports

Spreading Qing's work around: KC infinium controls, CG/AB Fluidigm access array

Reducing extracts (load on TS)

Mariela 

  • Closing non-RPT tickets GPLIMs tickets with code testing (as QA: should do QA as Susan does)
  • waiting for next Mercury release. New key datamart at sample level (Mercury counterpart to Squid DM. Will be handled by Mercury team)
  • New plexity data for PDOSTAR2
  • Pooling calculator extension

Kristen

  • Clearing out TS

Chris

  • Scrambling. Behind in making tickets
  • RapidQC metadata slows everything down. Working on incremental vs full refresh, need to check Tableau's extract tool for insert vs updates
  • Stanley Ctr (draft for debugging & cohort rpt): They need support so they can give us samples for sequencing
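
A minimal sketch of the incremental vs. full refresh distinction raised for the RapidQC metadata, in plain Python rather than Tableau's extract tooling. The row shape (`id`, `updated`, `coverage`) and both helper functions are assumptions for illustration; the real metadata columns are not described in these notes.

```python
from datetime import datetime

def full_refresh(source_rows):
    """Rebuild the extract from scratch: every source row is copied."""
    return {row["id"]: row for row in source_rows}

def incremental_refresh(extract, source_rows, last_refresh):
    """Apply only rows changed since last_refresh.

    Returns (inserted, updated) counts so the two cases can be told apart.
    """
    inserted = updated = 0
    for row in source_rows:
        if row["updated"] <= last_refresh:
            continue  # unchanged since the previous refresh
        if row["id"] in extract:
            updated += 1
        else:
            inserted += 1
        extract[row["id"]] = row
    return inserted, updated

# Hypothetical sample rows.
source = [
    {"id": "S1", "updated": datetime(2016, 3, 1), "coverage": 30},
    {"id": "S2", "updated": datetime(2016, 3, 20), "coverage": 12},
    {"id": "S3", "updated": datetime(2016, 3, 22), "coverage": 8},
]
extract = full_refresh(source[:2])  # S1 and S2 were extracted earlier
counts = incremental_refresh(extract, source, datetime(2016, 3, 15))
print(counts)
```

Here S2 counts as an update and S3 as an insert. Whether Tableau's extract tool handles the update case, and not only inserts, is the open question noted above.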

Nasko

  • improvements for Zach's vLAB statements (added WITH clause to UNION)
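
The general idea of putting a shared subquery in a WITH clause ahead of a union can be sketched as follows. This is an illustration only: the `lab_event` table, its columns, and the sample rows are hypothetical, SQLite stands in for Oracle, and the actual vLAB statements are not reproduced in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lab_event (sample TEXT, workflow TEXT, status TEXT);
    INSERT INTO lab_event VALUES
        ('SM-1', 'exome',  'done'),
        ('SM-2', 'exome',  'open'),
        ('SM-3', 'genome', 'open');
""")

# The shared subquery is written once in the WITH clause and reused by
# both branches of the UNION instead of being repeated in each branch.
query = """
    WITH open_events AS (
        SELECT sample, workflow FROM lab_event WHERE status = 'open'
    )
    SELECT 'exome' AS branch, sample FROM open_events WHERE workflow = 'exome'
    UNION ALL
    SELECT 'genome' AS branch, sample FROM open_events WHERE workflow = 'genome'
    ORDER BY sample
"""
rows = conn.execute(query).fetchall()
conn.close()
print(rows)
```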

Qing

  • walkup is almost set. Did presentation for Mercury team. MKD, infinium control dashboard all set.
  • vbronze is the VM running some scripts (incl. infinium control dashboard). GTC viewer. perl script.
  • access array also perl script. Probably both (running on vbronze) could be tigerized?

Amy

  • testing JIRA7. Erik wants to test moving all JIRA to postgres instead of Oracle (but Erik would be sole support for postgres). But probably we're too heavily invested in Oracle for LabOpsJIRA
  • More CRSP work. Autotransition testing (into "effectiveness check")

2016-03-17

Qing is moving to DC. April 22 is her last day.

  • Walkup LIMS: Mercury Product Team (php app)
  • MDK CRSP: Christina/Zach
  • Volume Check Machine Checker (ETL, email RPTs, stats)
  • Infinium dashboard, Fluidigm Access Array

Tableau 9.3: 

  • final beta has major engine changes. Intermittent speed problems. Too many extracts? Can we get rid of some?
  • KC making training materials

KC to change project names using Seq Test to validate that it works easily

Strategy board: WGS Contamination. Going to turn on dual indexing again, so orphan rates will go up. Orphan report is not getting used.

No bandwidth to catch up on processing.

GOALS: what's realistic to complete by June?

Tom Mullen: LMM Sanger Seq E-Merge project partner

Mariela: Troubleshooting bad user entry data. Tina sample plexity. Mercury Library/Sample data. FC designations are problematic everywhere (incl JIRA black box between LCSET & FCT) but will be better in Mercury. Major room for process improvement here! (CRSP/ExEx in same LCSET? complicated)

Chris: Trouble hacking PDOSTAR2, trying to improve it without breaking custom views etc.

Amy volume checker. DevJIRA on JIRA7.

Qing: busy documenting her work for handoffs.

2016-03-10

QY, AM, CG, MM, ZL, KC, AB

Splunk (Amy): Phil looking at using Splunk in BSP to catch exceptions in real time. Amy looking to use on JIRA. Tableau has a Splunk connector too. Chris Dwan's group uses a different SW for this purpose.

KC working on TS9. Everyone needs to confirm that their reports look good with latest beta.

Mariela working with Mercury (DWH testing for release next week).

Helping Tom with pooling calculator.

Nasko working on JIRA links function.

Nasko giving a talk March 30 to SoftEng group (functional programming)

Qing Clinical Report hassles, MKD.

2016-03-03

QY,AM,CG,MM, ZL

Tableau Desktop 9.X visual join window (Zach)

  • multiple CSV files in a directory, or sheets in an Excel file, or tables in a database

Oracle 12 testing update (Mariela)

Tableau replication of TumorPortal.org using LOD expressions (Zach)

2016-02-25

QY,AM,CG,KC,MM,AB

Kristen: Tableau Beta upgraded to 9.3beta3: 

  • ZL: tableau-beta updated to 9.3b3, includes content discovery enhancements (tooltips and lists show view usage when searching, sort by relevancy)
  • With view counts, will it be possible to exclude ourselves?

Nasko:

  • old bsp sample datamart decommissioning - new one is running. Mariela will try to close her related tickets very soon.
  • It is also possible to update the JQLTDE service to work with more than 1000 results, but that isn't a requirement at this point. The record limit is due to XML being stored in memory during data return.
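
One common way around an in-memory result limit like this is to fetch results in pages. The sketch below is not the JQLTDE service's implementation: `fetch_page` is a stand-in for a single bounded service call, and the issue keys are generated test data.

```python
def fetch_page(all_issues, start_at, max_results):
    """Stand-in for one service call returning a bounded page of results."""
    return all_issues[start_at:start_at + max_results]

def iter_issues(all_issues, page_size=1000):
    """Yield issues page by page so only one page is held at a time."""
    start_at = 0
    while True:
        page = fetch_page(all_issues, start_at, page_size)
        if not page:
            break
        yield from page
        start_at += len(page)

# Hypothetical result set larger than the 1000-record limit.
fake_issues = [f"RPT-{n}" for n in range(1, 2501)]
collected = list(iter_issues(fake_issues))
print(len(collected))
```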

Amy: 

  • Updated JQLExtract page to have correct URLs. These were broken when converted to Spray.

Mariela:

  • Re: RPT-3402 - BSP plating DM doesn't always get the PDO for the seq plated sample, this most likely has to do with how the users create WR. James Lee helping to look into this. Mostly with Lasso and 16s, some nexomes as well.

All -  BSP should own/maintain plating ETL/datamart, but when? 

  • What would be required to use current DM toward operations reporting on plating? 
  • Is there only one plating-related datamart right now?

Analytics SUPPORT tickets... "Wait, so do we have to assign it? Close it and replace with a RPT ticket? Can we leave it unassigned until someone takes it on?"-kc(re: SUPPORT-1508)

  • Analytics SUPPORT tickets should be assigned within 24 hours and acted upon within 96 hours. If the work will take longer or is really a new feature an RPT ticket should be created and linked so the SUPPORT ticket can be resolved (ZL).
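
The 24-hour and 96-hour rule could be checked with something like the sketch below. It assumes both windows are measured from ticket creation, which the notes do not specify; the function name and the example timestamps are hypothetical.

```python
from datetime import datetime, timedelta

ASSIGN_WITHIN = timedelta(hours=24)
ACT_WITHIN = timedelta(hours=96)

def sla_status(created, assigned_at, first_action_at, now):
    """Return a list of SLA breaches for one SUPPORT ticket.

    A missing timestamp (None) means the step has not happened yet,
    so the elapsed time is measured up to `now`.
    """
    breaches = []
    if (assigned_at or now) - created > ASSIGN_WITHIN:
        breaches.append("assignment overdue")
    if (first_action_at or now) - created > ACT_WITHIN:
        breaches.append("action overdue")
    return breaches

created = datetime(2016, 2, 22, 9, 0)
now = datetime(2016, 2, 25, 9, 0)  # 72 hours after creation
unassigned = sla_status(created, None, None, now)
on_time = sla_status(created, created + timedelta(hours=3), None, now)
print(unassigned, on_time)
```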

2016-01-28

Gmail filters work well to triage GP Support JIRA Service Desk emails (Amy, Zach)

  • This will email us when someone uses Service Desk to create a SUPPORT ticket AND when someone directly creates a SUPPORT ticket within GPinfoJIRA.

culling unused reports (list of all workbooks with <=10 hits in 6 months, or change max hits here)

  • Please add notes like "keep this for RNA team" or "KC will remind lab to use this" or "can be merged with workbookXYZ"
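
The culling rule (10 or fewer hits in 6 months, with an adjustable maximum) amounts to a simple filter. In this sketch the workbook names and hit counts are made up, and `cull_candidates` is a hypothetical helper, not the report's actual logic.

```python
MAX_HITS = 10  # threshold from the notes; adjustable

def cull_candidates(hits_by_workbook, max_hits=MAX_HITS):
    """Return workbooks at or below max_hits, least-used first."""
    low = [(name, hits) for name, hits in hits_by_workbook.items() if hits <= max_hits]
    return sorted(low, key=lambda item: (item[1], item[0]))

# Hypothetical six-month hit counts per workbook.
six_month_hits = {"Workbook A": 0, "Workbook B": 10, "Workbook C": 11, "Workbook D": 3}
candidates = cull_candidates(six_month_hits)
print(candidates)
```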

Mercury Audit Trail feature is very powerful (see RPT-3355-- an extra, transient product "Materials" showed up only during a brief change in Mercury... which was a mistake during a hack required for alternate pricing, which is the reason for the SAP project)

Quick update on RapidQC (minor changes to the workflow)

Goal sharing


Selected 2016 goals

  • Amy: Mentor more developers in JIRA; Lab time
  • Christina: ETL/DWH work; Genome workflow support
  • Kristen: ETL/DWH work; Lab workflow improvements; Mentoring and management help
  • Mariela: Lab time; Scala ETL; Mentoring
  • Nasko: Softeng talk; Help with SAP/Billing design; Mentoring
  • Qing: Scala ETL; Peer reviews
  • Zach: Help VIZ group with goals; Work more with BIZ group

2016-01-21

tableau-beta now running Server 9.3b1 (custom view names are back!)

  • Still waiting for confirmation that this is the last update for a while

Broad's Tableau sales rep changed back to Peter Schmitz

Feb vacation planning

  • Only a couple of days with 2 people out

Oracle 12 testing

  • Nasko to look into what we needed to do for the last upgrade
  • Navdeep will be leaving, but may be able to help Squiddy tests first.

New QC person (Marsha) starting soon.

Reviewing Oracle DBs to drop obsolete ones and reassign non-GP ones (in case this can reduce GP's costs)

  • Document from Dave Gregoire

Support tickets: Amy's going to add Zach to the Support list. He'll test the Gmail filters. If it works, we'll all do the same.

RAPID QC

  • Review of RAPIDQC workflow: Starting in February?
    • Goal is to save money by avoiding unnecessary aggregations by only aggregating complete/sufficient datasets.
    • Meeting Monday to sort out the details.
    • Reference genome (HG19/HG38) (set at PDO level in Mercury?) determines whether basecalled data goes to RapidQC or Local Pipeline.
    • If RapidQC, can get queued for topoff (goes to new topoff tool) or cloud. This triage will be manual based on the 'projected covg' tool Chris built in Tableau. Nasko also built our private queues for tracking this.
  • HiSeq Metrics come from 3 places:
    • Aggregated metrics come from cloud (for RapidQC)
    • Aggregated metrics come from local pipeline (non-RapidQC)
    • ReadGroup metrics come directly from basecalling
    • Unclear where lane-level metrics are coming from!
  • We need to develop the new Topoff tool
    • Find all samples requiring topoffs
    • Avoid pooling samples with same barcode from separate previous sets
    • Pool samples that require similar amount of further sequencing 
  • BAM files etc. will be stored here for local analyses, but in some Cloud bucket for RapidQC analyses.
    • TBD: how customers will access those data if/when they need them
  • "Is latest" flag might be tricky for current testing version when aggs go through both pipelines
  • AB: Some JIRA-Mercury testing problems
  • MM: working on unifying Mercury/Squid data, need to reduce extract time on PDOSTAR, shadowing ExEx labwork (Haley)
  • CG: Stanley Ctr PM want new reports (billing, quotes, check with breilly)
  • KC: 
  • AM: Topoff/aggregation tool. BSP.analytics_sample had another backfill problem, Phil fixed it. Would like to build a less manual/back-and-forth dev tool. Updated to IntelliJ 15 & Scala 2.11
  • QY: volume check: if machine not checked for 2 weeks, team gets an email. Walkup VM & security changes. Will try to remove extra filter button step in Tableau 9.3
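
The three Topoff tool requirements listed above (find samples needing topoff, avoid barcode clashes, pool similar needs) can be sketched as a greedy grouping. Everything here is an assumption for illustration: the sample fields, the target coverage, the pool size, and the `max_spread` tolerance are hypothetical, and the barcode rule is simplified to "no two samples with the same barcode in one pool".

```python
def build_topoff_pools(samples, target_coverage, pool_size, max_spread):
    """Greedy sketch: sort by remaining need, then fill compatible pools."""
    needing = [
        {**s, "needed": target_coverage - s["coverage"]}
        for s in samples
        if s["coverage"] < target_coverage
    ]
    needing.sort(key=lambda s: s["needed"])  # similar needs end up adjacent

    pools = []
    for sample in needing:
        for pool in pools:
            barcodes = {member["barcode"] for member in pool}
            similar = sample["needed"] - pool[0]["needed"] <= max_spread
            if len(pool) < pool_size and sample["barcode"] not in barcodes and similar:
                pool.append(sample)
                break
        else:
            pools.append([sample])  # no compatible pool: start a new one
    return pools

# Hypothetical samples; coverage values are in arbitrary units.
samples = [
    {"id": "SM-1", "barcode": "A01", "coverage": 30},  # already at target
    {"id": "SM-2", "barcode": "A01", "coverage": 25},
    {"id": "SM-3", "barcode": "A01", "coverage": 24},  # clashes with SM-2
    {"id": "SM-4", "barcode": "B02", "coverage": 10},  # needs far more
    {"id": "SM-5", "barcode": "B02", "coverage": 26},
]
pools = build_topoff_pools(samples, target_coverage=30, pool_size=3, max_spread=5)
pool_ids = [[s["id"] for s in pool] for pool in pools]
print(pool_ids)
```

A greedy pass like this is easy to reason about but is not guaranteed to minimize the number of pools.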

2016-01-14

  • Zach wants us to work from home less.
  • Followup with Tableau support re: Server 8.1 problems from Nov/Dec 2015
    • new Egnyte site is live for large file transfers
    • Tableau working on better traceability between bugs and release notes
    • also the ability to submit a case and logs directly from Tableau Server
    • ultimate goal is a live log analyzer for predictive maintenance
  • Followup on vLAB multi-workflow performance tuning
    • removed the sort which was forcing an extra query per view in Tableau 9

      vLAB      Server 8.3    Server 9.2
      before    30 sec        40 sec
      after     30 sec        10 sec (!!)

    • also removed vestigial details in view that were complicating the queries but not helping users

  • Tableau 9.3 beta starts soon
  • How quickly will saving all TS/TD versions on our VM fill up its storage?
  • RAPIDQC (Chris): Send to agg & link of selected samples directly from the Tableau report. Sends to TigerETL, which sends to JMS Queue

From yesterday's meeting with Nasko:

  • BSP_SAMPLE_AGENT error - story of backfill.
  • Agent begins with a list of sample IDs and their tables' last updated data. This is what has been changed in there.
  • SQL Developer didn't show where the failure was, but IntelliJ had a silent failure. SQLNavigator showed a problem with the A260_280 (showing values like '186_120524')
  • Using multiple tools can help in troubleshooting.
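
A check like the following could flag malformed values such as '186_120524' before a backfill runs. It is a sketch under assumptions: the row layout and the `sample_id` field are hypothetical, and only the A260_280 column name and the bad value come from the notes.

```python
import re

# Plain decimal numbers only: optional sign, digits, optional fraction.
NUMERIC = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")

def non_numeric_values(rows, column):
    """Return (sample_id, value) pairs whose column value is not a plain number."""
    bad = []
    for row in rows:
        value = row.get(column)
        if value is None:
            continue  # nulls are a separate question; skip them here
        # A regex is used instead of float() because Python's float()
        # accepts digit-group underscores, so float('186_120524') parses.
        if not NUMERIC.fullmatch(str(value).strip()):
            bad.append((row["sample_id"], value))
    return bad

# Hypothetical rows; SM-2 carries the kind of value seen in SQLNavigator.
rows = [
    {"sample_id": "SM-1", "A260_280": "1.86"},
    {"sample_id": "SM-2", "A260_280": "186_120524"},
    {"sample_id": "SM-3", "A260_280": None},
]
bad = non_numeric_values(rows, "A260_280")
print(bad)
```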

2016-01-07

Have we heard back from Tableau Support about the 9.2 performance slow-down in vLAB?

  • Zach traced it to an extra (or different than 8) sort query that Tableau 9 is sending to the database, logs sent to Tableau on Jan 5
  • It was adding an extra query in serial in addition to the parallelized queries, which involves a heavy hidden sort making its own trip to the database.
  • Tableau Support is looking into that (they say it shouldn't be doing it).
  • Additionally, once there's a table calc, if you later add new dimensions to the view, the calc can break. Best to add those extra dimensions as attributes.
  • Also, there's a weird extra addition to the WHERE clause specifying 'LCSET-1' - 'LCSET-999' and we have no idea where that comes from.

Next BTUG at Amazon Robotics in North Reading on Wednesday Feb 3, then back to Broad on March 10 with Dan Murray from Interworks

Tableau 9.3 beta starts mid-January, release estimated for March

May we set all users initial default login page to NOT 'show only favorites' when we upgrade to 9? 

  • Yes. We should do this immediately upon release.

Review of unassigned RPT requests


2015-12-17

Congratulations, Zach!

Tableau Server 

  • Continued problems: went down 3x yesterday, but the 4pm call to Tableau solved it. Long-running queries on 8.1 hitting published datasources (which we're using more and more) cause the VizQL to fail. Tableau support figured it out using our 5GB files (transferred via Google Drive). UPGRADED TO 8.3: Seems better overall. Only downside so far is the subtlety of the selected view on a dash. We need to communicate this to users. 
  • With TS8.3 upgrade, Susan needs to retest for MKD
  • We used to upgrade point releases on Dev, then a week later on Prod. None of the release notes mentioned anything too special, so we stopped upgrading, but some of the improvements were not especially described in the release notes. We need to be ready to move to 9.2.

Performance recorder (ZL): In vLAB, found 30s in 8.1, but 37s in 9.2. It seems to be related to either the labels (%,+,X) or a table calc that shows up in tooltips. Tableau Support is working on it – no answer yet. 

  • We should be more vigilant about checking performance recorder during report development/design.

Cell phones are listed on doc sent to mercury-dev.

  • Chris made saliva-specific top-off tool that will probably be used next week.
  • Pediatric cancer clinical case trying to deliver data on Dec28

Retreat was decent, but many scientific talks were fairly inaccessible

New Process Deviations report is an attempt to make the report's purpose very clear, using questions as headers and minimal color to reduce distraction

Linley can continue doing some analytics work on Fridays

Construction is going to be tough.

2015-12-10

December vacation & retreat planning (all) – no problems. Decent coverage.

Using MIN(1) in Tableau to selectively highlight text table values (Qing): Very powerful in Volume checker.

Atlassian is going public tomorrow

Decisive Data (Tableau partner in Seattle) (KC)

  • For now, we'll try to find existing videos online instead of hiring them to create training materials for us.

Extractions datamart/views (KC)

  • Try a filter on live data in Tableau
  • Before another alternative: stored functions

Tableau 9.X test notes: 

  • Beta is on 9.2.0 (released this week)
  • project names and descriptions become more prominent (decide changes)
    • if changing names, Kristen's JQL publish script needs to change
  • get PostgreSQL DB working for Users and Groups report (Zach) (tick)
    • Need to turn on postgres, need to create a postgres user (RPT-3332)
  • test updated version of tabcmd get (Zach, RPT-3376) (tick)
  • test updated version of tabcmd publish (RPT-3378) (tick)
  • start practicing with Desktop 9.3 (all, no need for oracle.tdc)

  • yes we have individual licenses
  • test MKD (Qing, Susan)

  • clean up large custom views because 9.X server admin pages don't show them anymore (Zach) (tick)
  • update training materials (Christina, Kristen)
  • will lock down permissions per project, already the default (Zach)
    • need to confirm for datasources... standardize same as twbs?
  • backup and restore scripts path changes per previous notes (Zach)
  • CMAP group is going to join our server (in January?) once we've upgraded
  • block comments section on views (or at least disable it globally, see RPT-2706)

2015-12-03

New "Sprayed" Analytics-ETL web-server (Nasko). Why do we need it ? Presentation/Discussion  (45 minutes)

Performance improvements to vLAB Arrays (Zach) (15 minutes)

use Tableau 9 performance recorder (even for Tableau 8 workbooks)

remove two quick filters on calculated fields

remove blend used just for one tooltip field

switch group to case for status binning

replace two original quick filters with one new parameter

net result: 38 seconds > 14 seconds (T8), 22 seconds > 8 seconds (T9)
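
As a quick worked check of the net result, the timings above translate into percentage reductions and speed-up factors like this (the `improvement` helper is just for illustration):

```python
def improvement(before_s, after_s):
    """Return (percent reduction in load time, speed-up factor)."""
    reduction = 100 * (before_s - after_s) / before_s
    return round(reduction, 1), round(before_s / after_s, 2)

t8 = improvement(38, 14)  # Tableau 8 timings from the notes
t9 = improvement(22, 8)   # Tableau 9 timings from the notes
print(t8, t9)
```

Both versions come out at roughly a 63% reduction in load time, a bit under a 3x speed-up.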

2015-11-19

Nested sorting and ranking in Tableau (Zach)

When datasource doesn't clarify the PK, you can create your own using a string calc or select those fields & "combine field" in the right-click menu

Use (Index, Parameter, & Calc Field; parameter filter) OR Use (Table calc with ranking; direct quick filter)
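
Outside Tableau, the same two ideas (building a key from several fields, and ranking within a group) look roughly like the sketch below. The field names and rows are hypothetical, and ties receive distinct ranks here, which may differ from the ranking function chosen in the workbook.

```python
from itertools import groupby

def composite_key(row, fields, sep="|"):
    """Build a surrogate key by concatenating fields (like a string calc)."""
    return sep.join(str(row[f]) for f in fields)

def rank_within(rows, group_field, value_field):
    """Rank rows by value (descending) inside each group; rank starts at 1."""
    ordered = sorted(rows, key=lambda r: (r[group_field], -r[value_field]))
    ranked = []
    for _, members in groupby(ordered, key=lambda r: r[group_field]):
        for rank, row in enumerate(members, start=1):
            ranked.append({**row, "rank": rank})
    return ranked

# Hypothetical rows where neither field alone identifies a row.
rows = [
    {"project": "P1", "sample": "SM-1", "reads": 50},
    {"project": "P1", "sample": "SM-2", "reads": 90},
    {"project": "P2", "sample": "SM-1", "reads": 70},
]
keys = [composite_key(r, ["project", "sample"]) for r in rows]
ranked = rank_within(rows, "project", "reads")
top_per_project = [(r["project"], r["sample"]) for r in ranked if r["rank"] == 1]
print(keys, top_per_project)
```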

Tableau Server 9.2 (Zach) (official release is soon)

user interface (projects, toolbar)

  • Renaming our projects? Rethink the project breakdown?
  • Radically different new "view" toolbar: better labels, DL instead of export (needs instruction updates), Custom Views (no longer shows it on the view, but shows on URL), new UNDO/REDO (hooray) 

admin interface (user groups, data sources – usable now to clean up extra datasources)

Tableau Server problems

We're monitoring, restarting when it freezes, and working with Tableau Support and with BITS. If it freezes, please still tell us, then jump to dev, then jump back after the restart. Goal is to get back to zero failures.

Atlassian Summit highlights (Amy)

Nasko needs KC & ZL to test new webserver (8080 instead of 8090 for JQL)

JMS service? Chris, URL service, multiple values, Nasko parse, something something

2015-11-12

Tableau server has been misbehaving despite restarts of both the server and its VM. Zach has asked BITS to look into VM problems. Hopefully it's not related to the point release (which was to fix custom views). 

vLAB performance improvements using the lighter CASE method instead of the burdened GROUP method: ZL documentation here: Tableau Performance Tips

If you accidentally delete a workbook containing custom views, favorites, etc, they're gone. However, for the Custom Views, you can more-or-less recover them by going to tableau-dev THAT DAY, looking at the list of custom views, figuring out how each one was built, rebuilding them on prod server with the users' names, and emailing the users to let them know to change their bookmarks and keep an eye out for problems.

Connecting JMP and Oracle: It's easy enough on Windows machines. Linley found a 3rd party driver to allow Mac connections. It's $40. see RPT-3236. For some users who download, we may need to build a new Oracle view to simplify their connections (in lieu of reports with multiple/blended data sources). Chris will try to reproduce Jes's connection in JMP.

VNC licenses ($25 each) are stuck in BITS/purchasing for some reason, so until that's sorted, we can't get the new BSP monitors displaying screenshots.

Pico/ribo run date: new column change in bsp.analytics_sample goes live tonight, so anything using it needs to be updated.

2015-11-05

IPM and other reasons server is slow – Downloading underlying data is BAD for TS performance! We need to stop users from doing this, as the action is causing Server to overload and freeze up on other users.

Can we get users to connect directly to DB via JMP? Nasko will find the how-to document, Linley will test it and we'll update the documentation if necessary. Then we need to evangelize and help get each group set up to do this. Then we need to shut down IPM and other reports that are simply used as ETLs.

Linley: discussed JMP vs Tableau w/ seq groups. There are some groups who consistently download and recreate plots in JMP (ex. Exomes). We want users to avoid this, as it's hard on Server and a waste of the users' time.

We all need to read the performance documents Zach's been adding to Confluence (Tableau Performance Tips).

RPT-3217: BSP tests (BSP Views) Kristen set up several tests for BSP-based views. Mariela and Nasko may be able to add other non-view calls that rely on BSP.

RPT-3221: Excluded marks bug in all TS versions – unless we can reproduce this in the Superstore data, we probably won't get any traction from Tableau Support, as they'll assume it's our data causing the problem.

Upgrading to Tableau 9: Probably January/February? We all need to check out 9.2 and determine whether to move to 9.1 or 9.2 (e.g. Zach doesn't like the Custom Views changes in 9.2)

Another reason to move to T9: Fragile Table Calc vs. robust LOD expression for MKD multi-call support (Zach)

Details here: Example use of LOD expression vs. Table Calc

New "Analytics-ETL" web server is coming soon (Nasko)

new server uses SPRAY framework, which will allow faster implementation of new features.

2015-10-29

MKD: Powerful new use of Tableau Server functions

Welcome Linley!

TS 9.2 has major changes from 9.1. It's powerful, but has some problems.

Performance improvements:

Quick filters v filter actions

Excellent whitepaper (Designing Efficient Workbooks)

performance recorder: was effective in one report immediately

  • Get WB ready for recorder. save WB on blank page.
  • Start Perf Rec.
  • Do your questionable action (click on the slow dashboard)
  • When loaded, stop PerfRec.
  • View recording. It shows some of the info (more in 9 than 8)
  • View tabvprotosrv81.txt (this becomes JSON in v9)

In general, use "TODAY()" instead of "NOW()" to improve caching.
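
A simplified model of why this helps: if results are cached by the literal value in the query, a timestamp changes on every load while a date stays the same all day. The cache below is a toy stand-in and does not describe Tableau's internals; the load times are made up.

```python
from datetime import datetime

cache = {}
executions = 0

def run_query(filter_value):
    """Pretend query: results are cached by the literal filter value."""
    global executions
    if filter_value not in cache:
        executions += 1
        cache[filter_value] = f"rows as of {filter_value}"
    return cache[filter_value]

# Three dashboard loads on the same day.
loads = [
    datetime(2015, 10, 29, 9, 0, 5),
    datetime(2015, 10, 29, 9, 7, 41),
    datetime(2015, 10, 29, 14, 30, 0),
]

for moment in loads:
    run_query(moment)  # NOW()-style: a new value on every load
now_style_executions = executions

cache.clear()
executions = 0
for moment in loads:
    run_query(moment.date())  # TODAY()-style: same value all day
today_style_executions = executions
print(now_style_executions, today_style_executions)
```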

Mariela will test perf. on some of our views-over-tables (e.g. RGHQS overlay views w/ renamed column headers)

JOIN CULLING: Custom SQL doesn't allow for join culling.

We'll probably upgrade to 9.2 when its beta is solid.

BITS shutdown will be a bit gnarly.

Nasko went to a CLARITY LIMS conference

We have a test instance of Clarity LIMS (on cloud, Tableau in cloud, we're testing that connection)

NYGenome Ctr is a big showcase. Toby & Melissa are there, saying good things about Clarity LIMS

NYGenome Ctr switched in 2 months!!

PDOSTAR5 – Possible improvements (Mariela) to reduce the 1/2hr refresh.

Quarterly Report & CRSP Audit are both next week

2015-10-22

No meeting?

2015-10-15

Pipeline cluster being split into dedicated pods for basecalling, alignment, aggregation, etc. to ensure throughput

Picard to try processing some genomes in the cloud in November (all metrics will come back to seqprod)

New RapidQC system being prototyped for quick basecalling and estimation of aggregated coverage (see DSDEEPB-1583)

New Pico process in lab delayed until October 29 (neither LIMS nor Analytics was ready)

Salesforce forecast doesn't need to be integrated with program forecast at this time because platform PMs are using both

First MKD clinical samples arrived this week, assay starts next week, results should be ready right when Qing returns

Trouble with WGS and PDO datasources on Tableau Server since weekend (poor performance, excessive temp space use)

Zach, Christina, Kristen at Tableau Conference in Las Vegas next Monday-Thursday

Nasko at GenoLogics workshop in NYC next Monday-Tuesday

Amy working on process mining of JIRA transitions and LIMS messaging

Nasko working on Tableau 9.1 Web Data Connections, demo in a few weeks

2015-10-08

JQL TDE DB leaking (open connections)
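
The notes do not say where the leak was, but a common cause is a connection that is never closed when a query fails. The sketch below shows the usual guard in Python, with SQLite standing in for the real database; the counting subclass exists only to make the open/close balance visible.

```python
import sqlite3
from contextlib import closing

opened = 0
closed = 0

class CountingConnection(sqlite3.Connection):
    """Connection subclass that counts close() calls for the demo."""
    def close(self):
        global closed
        closed += 1
        super().close()

def connect():
    global opened
    opened += 1
    return sqlite3.connect(":memory:", factory=CountingConnection)

def run_query_safely(sql):
    # closing() guarantees close() runs even if the query raises.
    with closing(connect()) as conn:
        return conn.execute(sql).fetchall()

results = [run_query_safely("SELECT 1") for _ in range(3)]
try:
    run_query_safely("SELECT * FROM missing_table")
except sqlite3.OperationalError:
    pass  # the failing query still released its connection
print(opened, closed)
```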

MKD samples are on their way, finally

Pipeline stopped aggregations to do basecalling. Pods, rewiring, it's all still lurching from one backlog to another. For a pull system, we need better insight to pipeline tasks, but they'd rather us not be so hands-on.

Need to convert LCSets into pipeline "core hours"

DSDE are all scrambling to get FireCloud ready, then they'll move the GP pipeline to FireCloud. That has implications for the pull system (unlimited compute)

MM: 24-plex: live this week. Problem: pool test is production run type. JohnWalsh can add a flag to table and/or JIRA? (Pooltest vs corrected)

MM working with Kathleen on deletion of BAMs

Waiting for bsp.analytics_sample additional fields

RPT-2547 Extraction QC is getting there

AB adding swimlanes in SRS

Crazy robots: issues with downed machines & communication

DCFM DM is up and current

Nasko new webserver (SPRAY) would be less buggy and allow a Tableau web connector

New connection for poolchecker agent

2015-10-01

Demo: Level of detail expression in LOJ SRS RNA tickets (Tableau 9.1) allows correct summing of samples in tickets with multiple steps

{FIXED [KEY] : MAX([Number of Samples])}

potentially powerful, but complicated (as with table calcs, as with 9.1 webservice access)

Nasko requests that we maintain a small level of disgust when using this method, as it is a workaround for incorrect PK datasource problems
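
To see why the FIXED expression gives the correct sum, here is the same calculation in plain Python. The ticket keys, step names, and counts are hypothetical; only the field names `KEY` and `Number of Samples` come from the expression above.

```python
# One row per ticket step, so the sample count repeats for multi-step tickets.
rows = [
    {"KEY": "SRS-1", "step": "pico",    "Number of Samples": 96},
    {"KEY": "SRS-1", "step": "shear",   "Number of Samples": 96},
    {"KEY": "SRS-1", "step": "library", "Number of Samples": 96},
    {"KEY": "SRS-2", "step": "pico",    "Number of Samples": 24},
]

# A plain SUM counts each ticket's samples once per step.
naive_total = sum(r["Number of Samples"] for r in rows)

# Equivalent of {FIXED [KEY] : MAX([Number of Samples])}: one value per KEY.
max_per_key = {}
for r in rows:
    key = r["KEY"]
    max_per_key[key] = max(max_per_key.get(key, 0), r["Number of Samples"])
fixed_total = sum(max_per_key.values())
print(naive_total, fixed_total)
```

The naive sum counts the first ticket three times; collapsing to one value per key first avoids that, at the cost of hiding the underlying key problem in the datasource.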

Stratboard: new pipeline computers to handle sequencing data increase (including new sequencers). More dedicated to basecalling etc. Smaller section dedicated to VCF.

Eventual goal: pull system. How this will work is unclear.

A few things are still waiting for bsp_analytics_sample table updated fields

DCFM validation: Chris has found several errors. Recreating and continuing.

MKD: Qing: Missing logo in PDF report

Mercury DW: Mariela: ongoing improvements and iterations

JIRA still runs out of memory once a month, requiring restart. Things have now been stable for a month.

PDOSTAR2 extract refreshes take longer every couple months. Chris to check on different server versions for diff performance. KC to check pico reborn for errors Zach saw.

2015-09-24

bsp.analytics_sample status update (KC,MM,CG,AM):  It's functional and now in use. We need to add more columns, and Phil may push back on some, but either way, Nasko will work with Phil to perform the new backfill without breaking the current connections.

Strat board analytics listings (Salesforce, etc) - they don't get mentioned, but mostly we're blocked on them, so Andy's weekly updates are over-simplified/unclear. We should be on track for them otherwise.

Linley sabbatical ideas brainstorming: 

do we need a desk/computer/monitor for her? KC to check.

best if she comes with ideas/data from her current process, to have an impact there directly.

we'll all keep thinking of the less-exciting tasks that may help round out her experience (informatics isn't all exciting/coding)

ticket backlog – what's our process for not letting these get lost? 

let's all go through individually and try to close as many as possible as 'won't fix' (with info about why for future reference/searching) and then discuss the remainders as a group

compile list of to-do's for JIRA upgrade (while it's fresh in our minds)

overwrite entire transitions table using a one-time update statement (Nasko to document this statement)

change hard-coded schemaname references in PIJ_ISSUE_GPLIM,CF_CATALOG view in JIRADWH, LOJ_ISSUE_RAPID_VIEW in reporting

repoint JIRADWH.ETL package to current schemaname

get new schema user password from Erik/Amy

testing phase: SRS is fragile and a good place to confirm functionality

test views should be created/updated during testing phase to mimic current views

extractions in Mercury/JIRA/tableau

old cognos views (bspss) can be dropped

KC to meet with Katie Monday, will pass along info on what she needs to monitor in JIRA & elsewhere

Amy will then keep working with Dennis on updating XTR tix

KC & Mariela to discuss the Mercury DWH extractions info with JT (are extractions in events_fact?)

old tables/views that we are 99% sure are droppable can be "snoozed" (renamed zzz_date_tablename), but dropping other tables should be avoided to allow future "past-data" analysis (e.g. how many genomes of a type did we sequence in 2012)
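
The "snooze" rename could be generated with a small helper like the one below. The date format (YYYYMMDD), the example table name, and the helper itself are assumptions; the notes only give the zzz_date_tablename pattern. Views are not covered here.

```python
from datetime import date

def snooze_statement(table_name, on_date):
    """Build a rename statement following the zzz_date_tablename pattern."""
    new_name = f"zzz_{on_date:%Y%m%d}_{table_name}"
    if len(new_name) > 30:
        # Older Oracle releases cap identifiers at 30 bytes.
        raise ValueError(f"{new_name} exceeds 30 characters")
    return f"ALTER TABLE {table_name} RENAME TO {new_name}"

# Hypothetical table name.
stmt = snooze_statement("pico_results", date(2015, 9, 24))
print(stmt)
```

The zzz_ prefix sorts snoozed objects to the end of a listing, and the embedded date shows how long each one has gone unused before it is finally dropped.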

Lots more genomes are coming! 20K NHLBI genomes in the next year (TopMed3)

This is good – in GP's sweet spot, no longer losing money on genomes (24plex,better pools).

To-do: optimize pipeline (ZL,Nasko,&SamN assessed the Tetris of this last week), help Tina with her problem of timing FP requests

Improve FP monitoring/reporting?

DCFM datamart (CG,AM)

new table, wrapped up in same ETL as PDOSTAR

Motivation is for BReilly/Billing

CG working on validation

Week of Oct 19: Only 3 of us on-site.

When Qing returns from vacation, we'll need to merge walkup -> SAP

2015-09-17

Qing vacation: MKD backup person = Zach

Qing has BioMass database password - shared with Zach

Login to tableau.broadinstitute.org as user LAB\tableau to access \\radon\mckd1_clia\ directory with metadata files

Backup of MKD workbook saved in \\gp-reports\tableau_backups\software\MKD\

Clearing out old Oracle views (specifically Pico) - Kristen

Linley sabbatical (last week of October - mid December) - Yeah!

LOJ almost at 100K issues! (99400 as of 9/15/15). Put out a fun infographic? (Issues per user, comments per user, most active etc?) - Amy

Tableau 9.1 released, got preview of some 9.2 features - Zach

Notes from 9/16 strat board meeting - Zach

Picard pipeline backed up (several weeks)

MKD launched! (now waiting for samples)

CRSP team reports new Tableau reports are helping them track and expedite samples

VolumeCheck launched! (need to confirm ongoing compliance)

WGS 24-plex is GP priority #1

Sonic is GP priority #2

GP Outing survey is MANDATORY, the questions help form balanced teams

2015-09-03

Qing's connection to MySQL DB is still wonky.

Chris: Bluejay is still happening, but at a slower pace, so she'll continue working on it. Zach has been talking with other infx & lab people to determine the right PDO breakdown to allow proper tracking of parallel sample status (IG Fit product). Chris will work on aggregations and metadata filled in, and later we can try to get it earlier in the process.

Nasko is going to do something about requests from the Mercury warehouse?

Mariela: Pooling Calculator agents were migrated to Tiger and were merged into one for pool tests and production runs. Runs every 15 minutes.

Mariela: Pooling calculator is open for CRSP, but still doesn't have ancestry information which is needed for pool corrections.

Mariela: working on Mercury ancestry info & QC datamart. 

KC: Pico tables are live, so the data will be coming live from BSP without XML-parsing ETLs in analytics. Further improvements will come soon with new sample datamart that Nasko is helping Phil to create.

KC: FP is still awful.

2015-08-13

Seating plans have changed drastically in our favor. We may keep all the offices on our hallway.

Project BlueJAY (mid-Sept): 600 SMs. Extract (allprep) + RNA & DNA seq. This will be lots of LIMS work (further delaying getting Infinium into Mercury)

Lab is apparently clicking through all errors, requiring major fix-ups. User error.

We need to stop showing unimportant warnings.

Extractions and work data will (longterm) be tracked in Mercury, not BSP LIMS. Currently no reporting on the up-front Mercury extractions, and joins to those samples are breaking reports (Nasko's been working on it).

Amy's working on Product Realization Process (in JIRA)

GP Management system is broken regarding the transition of people from one group to another. Years of tool development, info, & knowledge gets lost.

Zach met with Mike Wilson about ExEx & pooling & reporting. 

New understanding that there needs to be more tools training, that tools need to be added to SOPs.

Reinstate office hours? 

Chris working with genomes team.

Volume checker

Nasko's been helping Qing with Tiger ETL 

Qing can let Susan test the volume check report when it is ready

Austin found problem with warped plates and is working on finding 'flatter plates'

There's discussion about adding manual-run button for ETL (vs more aggressive ETL schedule)

Eric wants to get rid of COLLAB JIRA (push it back onto Broad, Marc Monnar).

https://tableau.broadinstitute.org/views/AccessToWorkbooks6moData/Workbookuse

2015-08-06

A contingent is working really hard to convince Sheila et al that open seating is a bad idea for us.

Passwords: Broad/Tableau/Mercury/JIRA are on Active Directory. BSP/GAP/Portals are on a separate PW. Squid is separate but uses AD as a secondary? Unsure.

passwords, cached PWs & required restarts, WiFi cached PW, certificates, ETLs 

ETLs: (can we create a shared account like TS LAB\Tableau for these task schedulers?) 

We want to push some of this back on BITS for Wiki documentation. 

Qing had a problem with Task Scheduler on the analytics-etl server. Not only do tasks need "run whether user is logged in or not" checked, we also need to check "don't save PW" for non-filesystem-related ETLs. 
However, if the scheduled ETL task needs to open files on a restricted server (like "bragg"), Task Scheduler still has to store the password in order to access that server. Every time we change the password, we need to store the new password in Task Scheduler.

BTUG 8/25: 125 registrants within the first hour after the registration email. 60% of people tend to show up (according to Tableau)

Qing's tricks on MKD:

In Tableau, people try to sort a row level column, and it doesn't work. Qing created a new table called "header" that's just the headers, and then hid the headers in the data table.

  • This also allows the header to stay put even when you scroll.

Using a hidden column, you can sort on a separate column to avoid the aggregated row level "not sorting" problem.

Using a CSV link on a dashboard selects the first view, so you can prefix the name of that view something like "A_" to force it to be first, even if it's visually hidden.

In the URL, if you check "Allow Multiple Values" and "URL Encode Data Values" and then add 'FILTERVALUES(Plate ID)' instead of 'Plate ID', the link passes all of the filtered Plate ID values.

  • Can we use this when wishing for a multi-select parameter? Filter on datasource 1, pass the results from that through FILTERVALUES... can we pass the filter to second datasource?
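A minimal sketch of what a multi-value, URL-encoded link could look like, assuming the receiving view accepts a comma-separated filter in its query string. The function name, the example.org URL, and the plate IDs are hypothetical; only the field name 'Plate ID' comes from the notes.

```python
from urllib.parse import quote


def build_filter_url(base_url, field, values):
    """Return base_url with one query parameter holding all values.

    Assumption: the receiving view reads `Field=v1,v2,...` from the URL.
    """
    # Percent-encode the field name and every value (spaces become %20).
    encoded_field = quote(field, safe="")
    encoded_values = ",".join(quote(str(v), safe="") for v in values)
    return f"{base_url}?{encoded_field}={encoded_values}"


if __name__ == "__main__":
    # Hypothetical workbook URL and plate IDs, for illustration only.
    print(build_filter_url("https://example.org/views/Workbook/Sheet",
                           "Plate ID", ["PLATE 1", "PLATE2"]))
```

Whether the same encoded list can be handed to a second datasource is still the open question from the bullet above; the sketch only covers building the link.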

Dave Gregoire is on vacation for 2 weeks. He's our only DBA on Oracle...

2015-07-30

The process for changing the LAB/Tableau password has been updated in JIRA & Confluence. Recent downtime for Excel-based reports was due to an incomplete PW change (change required TS restart to allow a proper krypton mount, in addition to the backup & restore powershell scripts). 

We're upgraded to 8.3 on Tableau-dev. Tableau has extended support of 8.1 until May 2016, so it's less urgent to move to 8.3 on Prod.

PDOs: Until now, some PDOs contained cell-based SMIDs. Once extracted, we replace those SMIDs in the PDO with the DNA/RNA SMIDs. This requires more complex reporting to provide data back to collaborators using the SMIDs they expect. We want to leave the cell-based SMIDs in PDOs, but we need to determine what will break on the Analytics side (if anything). As of 2 weeks ago, all the CRSPy samples show up in standard Tableau reports, so we'll use a CRSP cell-based PDO SMID to test our analytics when leaving the cell-based SMID in the PDO.

DSDE team organization is a bit wonky right now, transitioning to different priorities and sprints & scrums. Prometheus & Epimetheus are now on the back burner, with the Cloud Pilot (FireCloud) on the front burner. November: Beta testing of FireCloud. March-Sept: Open, live, public use of FireCloud. GP DSDE support is shakier now as a result.

CLF: We need to find a solution for dealing with the non-GP relationship and time spent. We may need to go deeper as consultants before pulling back, but we probably can't sustain the current methods long-term.

BSP is taking on the BSP sample table maintenance and will also be moving the Pico CLOB/XML to a table format. This is awesome.

2015-07-23

Tableau upgrades: We will probably update Tableau-Dev to 8.3, but remain on TD8.1 & TS8.1 for production. 9.1 beta will be ready soon for testing.

BTUG planned for August. There's a corporate guest speaker, but the message isn't sales-oriented, so we'll take the cookie money.

Automation: (sorry, I missed this note)

KC to move archive of TABLEAU_Files to Tableau Backups directory

Errors: More JIRA tix will help document the required actions. We can spend some time running through specific emails and their resolutions. Overall, warnings can be ignored most of the time, and errors can be ignored unless they recur.
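The triage rule above (ignore warnings, ignore errors unless they recur) could be sketched roughly as follows. This assumes the alert emails have already been reduced to (level, message) pairs; that input shape, the function name, and the repeat threshold are all hypothetical.

```python
from collections import Counter


def errors_needing_attention(alerts, min_repeats=2):
    """Return error messages that occurred at least min_repeats times.

    alerts: iterable of (level, message) pairs taken from alert emails.
    Warnings are skipped entirely; one-off errors are skipped too.
    """
    counts = Counter(msg for level, msg in alerts if level.upper() == "ERROR")
    return sorted(msg for msg, n in counts.items() if n >= min_repeats)


if __name__ == "__main__":
    sample = [
        ("WARNING", "extract refresh slow"),
        ("ERROR", "ETL A failed"),
        ("ERROR", "ETL B failed"),
        ("ERROR", "ETL A failed"),
    ]
    print(errors_needing_attention(sample))
```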

Amy JIRA CLF: How much work do we want to be doing for non-GP customers? TBD

2015-07-16

No meeting. ZL vacation.

2015-07-09

No meeting. ZL vacation.

2015-07-02

Good news: Promotions went through!

Training was successful.

Marketing team: Want to determine ROI on marketing efforts. The genomic website has GA & ActOn, which spit scored leads into Salesforce; some leads become opportunities. SF also gets direct opps. Do our marketing campaigns correlate to opps? Tableau allows such correlation. Hope for better forecasting too, with weighted leads/opps (weight based on progress).

Amy & Eric went over JIRA support info, updated Confluence, etc. 

Collab JIRA: need it or can we retire it? Replace it with cloud JIRA? without losing searchability etc.

Mariela wrapping up QC metrics (pico)

CRSP metrics that aren't databased

artifacts metrics are released, being used by Maura/Dev team

Chris has dealt with some bugs. Nasko added agg project for CRSP samples. Chris to list tools that await blank tables from DSDE.

WGS tools: we may need to get people together to get everyone on the same page. Then we can take down the old tool?

Kristen working on RNA freeze-thaw data (ID f-t events, correlate with downstream metrics)

Fingerprinting is 'on fire'.

Nasko updating PDOSTAR5. PLSQL improvements for sample_attributes_bsp, partly needed for repatiented samples

Mariela is part of team meeting regularly to improve repatienting workflow

Walk-up tiger is functioning well

Qing's up-to-date on CLIA training.

MKD project: software & workflow all set. Goal: receive samples within a month.

2015-06-25

No meeting. Tableau Training.

2015-06-18

Vacation calendar

Matter leaving, in addition to Jill Mesirov. DSDE Ben's last month also. Chris Dwan will be largely involved with our support.

Tableau 8.1 is off support (as of May 15)

we could upgrade TS to 8.3 and keep TD on 8.1.  That would give us 18 more months of support. 

Zach will email the flyer to Tableau-Users 

Kristen told Sheli about data refreshes

Kristen has new project looking at RNA freeze thaws

Mariela is trying to help the Infinium team troubleshoot, via new reports.

Infx plan is to decommission GAP LIMS and move those things to Mercury before Genomes get moved to Mercury.

Nasko continues to improve Tiger ETL. DSDE implementing artifact tables soon.

Chris working on WGS Top-off tool and date coverage first met. Mariela will help?

Working on training course topics/material

Laurie moving to Dev 

Amy working on a couple things in JIRA (for Robb, filters,...)

Zach helping with marked LCSET indicators for Wendy. Also with cap color indicators (red & blue standard from SQM)... currently seems risky!

Qing helped Susan with MKD project testing.

Qing needs to spend several days updating her CLIA training

There's a new trick to export CSV: Kriebel csv export button as a self-referential link: The Greatest Tableau Tip EVER: Exporting CSV made simple!

2015-06-11

Wendy wants a Process Deviation report.

FP locations? How is Picard currently looking up FPs? 2 tix on this... 

Data delay definitions for Sheli and others? Underlying workflow, exposable timestamps, etc.

Talk at GP community meeting?

Kristen to sit with Sheli

This afternoon: MKD review to launch next week

Computers shut-down this weekend, and then Eric on vacation

Tableau Training to be on June 25. Chris & Zach to handle sign-ups with Nathan's help. 

2015-06-04

Performance reviews are done.  Meetings with Zach will be done by Monday.  Official notifications should arrive within a few weeks.

BTUG on 6/3 in Burlington featured HealthDataViz, a consulting company.  Key points to ponder: How can we get people to the report they need efficiently?  How can we market our tools more effectively?

Broad Internal Tableau Training Course on 6/25 before GP@3.  Nathan feels the work day is cut short anyway and attendance will not be hindered.  He is asking lab managers to list people who should attend.

Tableau Customer Showcase on 6/2.  The City of Boston and the Mayor are now beginning to use Tableau.

CRSP data has been added to datamarts using a temporary workaround until DSDE can add requested tables to CRSPPROD. Data can be exposed in Research reports.  Avoid creating new reports for now.

MCKD update

JIRA update

Process deviation tracking request from Wendy may be similar to what Kristen has created using the DEV tickets.

2015-05-21

Pooling is problematic. PMs think pooling improves quality; lab thinks it improves quantity. Not factoring in a pooling penalty. Not requanting samples before repooling. Pool test on MiSeq doesn't translate to the HiSeqX the way it does to HiSeq, for some reason: Possibly because of Nextera, which was never run on regular HiSeq.

Sequenom cloud is down, so we (and all Sequenom customers) are affected, because apparently all Sequenom data gets processed on that cloud. Surprise!

Qing helped Kristen with a calculated field parser for Tableau. Damien helped Kristen with the ruby script, and Qing created a macro(?) in Excel to polish the XLS output. This may help us stay optimized & organized across many TWBs with many calculated fields.

Zach "warms" views for the screenshot TVs, and this reduces Tableau load times for those reports. We might find this useful for speeding up other reports, which we could add to his "warm" list.
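A rough sketch of what a "warm" list could look like: request each view once ahead of time so the server has it cached before anyone opens it. This assumes the views can be fetched without an interactive login; the function name and the injectable `fetch` argument are hypothetical and are not how Zach's warming is actually implemented.

```python
import time
from urllib.request import urlopen


def warm_views(view_urls, fetch=None):
    """Request each view once; return {url: seconds, or None on failure}."""
    if fetch is None:
        # Default fetcher: plain HTTP GET that reads the whole response.
        def fetch(url):
            with urlopen(url, timeout=120) as response:
                response.read()

    timings = {}
    for url in view_urls:
        start = time.monotonic()
        try:
            fetch(url)
            timings[url] = time.monotonic() - start
        except Exception:
            # A failed warm-up should not stop the remaining views.
            timings[url] = None
    return timings
```

Adding another report to the warm list would then just mean appending its URL to `view_urls`.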

Kristen should switch the pico reports to a 15m extract schedule, because the load times are super slow right now.

Many files in TABLEAU_Files have been moved to the ARCHIVE folder. Everyone should check through there to ensure that none of that is still needed in T_Files, or move things to other directories. KC will deal with getting rid of the BSP/GAP files.

Nasko optimized his scala-killing code to avoid killing functional ETLs at the same time.

Chris requested a room for Tableau training. We should create a bulletin with Nathan's help, and hang it & post it as a screenshot on the Tableau monitors around the building.

2015-05-14

Picard is fixed (backfill overwriting workflow date)

Large insert Nexome product has increased coverage

MKD deadline of June 1 – looks like we'll hit it

HiSeq4000 is coming (along with loaner Xs)

Howie is leaving... GP Infx is getting sparse!

Excel macro to send XLS to collaborators with rearranged/renamed columns (PDM/PM preferences)

BSP team will soon help parse the Pico Work Task Data XML to help with Pico ETL problems

Chris DCFM & Refresh time on server (PDOSTAR2)

Nasko's new Heartbeat agent sends error if there's a silent failure. New ETL agents will need to be added to the heartbeat list. We may be able to employ a post-ETL TDE script at some point.
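The heartbeat idea can be sketched as a staleness check: any agent on the list with no recent success gets flagged, even if it never raised an error. This is only an illustration, not Nasko's implementation; the function name, agent names, and time windows are hypothetical.

```python
from datetime import datetime, timedelta


def silent_agents(last_success, max_silence, now):
    """Return names of listed agents with no success inside their window.

    last_success: {agent_name: datetime of last successful run, or None}
    max_silence:  {agent_name: timedelta the agent may go without a success}
    Only agents present in max_silence (the heartbeat list) are checked.
    """
    stale = []
    for name, limit in max_silence.items():
        seen = last_success.get(name)
        if seen is None or now - seen > limit:
            stale.append(name)
    return sorted(stale)


if __name__ == "__main__":
    now = datetime(2015, 5, 14, 12, 0)
    last = {"pico": datetime(2015, 5, 14, 11, 30),
            "tiger": datetime(2015, 5, 14, 6, 0)}
    limits = {"pico": timedelta(hours=1), "tiger": timedelta(hours=2)}
    print(silent_agents(last, limits, now))
```

Because only listed agents are checked, a new ETL agent stays unmonitored until it is added to the list, which matches the note above.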

Amy's working on a new CRSP LCSET type.

2015-05-07

Jon Thompson is totally overwhelmed by Squid fixes, preventing him from getting further on Mercury.

CRSP is now being pulled into our DMs. The same tool hits both non-CRSP and CRSP places separately, and can fail separately without preventing the other data from getting added. Because it's the same tool, there's only one thing to edit when necessary.
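A small sketch of that pattern: one loader run against each source in turn, with a failure in one source recorded rather than allowed to block the other. The function name, source names, and `load` callable are hypothetical stand-ins, not the actual tool.

```python
def run_for_each_source(sources, load):
    """Run load(connection_info) per source; failures stay isolated.

    sources: {name: connection_info}
    load:    callable that loads one source and returns a row count
    """
    results = {}
    for name, connection_info in sources.items():
        try:
            results[name] = ("ok", load(connection_info))
        except Exception as exc:
            # Record the failure and keep going with the other sources.
            results[name] = ("failed", str(exc))
    return results
```

Since both sources share one `load`, a fix to the loader applies to CRSP and non-CRSP alike.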

2015-04-30

In Tableau, "?:embed=yes" plus guest access being on allows external viewers to change to PDF view using Qing's new javascript button (automates PDF download)

Service desk: Can we automate FAQ knowledgebase? E.g. "where are x,y,z metrics?"

Chris to take lead on Tableau Server training

Kristen to take lead on clearing out TABLEAU_Files

2015-04-23


2015-04-16


2015-04-09

SUPPORT tix: they can share them with the analytics mailing list when they're aimed at us

GAP Pheno attr table: KC to make a list of what attributes we might need for any/all reports

TS Project names: they don't really make sense anymore (Sequencing operations vs bsp vs gap). To rename and consolidate, we'll just have to get the default permissions right and then coordinate the cached permissions in desktop. Not urgent, but might be time to do this soon (or at least test what's needed)

Analytics += Business Systems: 

Billing, can we automate some of this with ETL? 

Integration of SAP, Salesforce, Mercury, WalkUp, QuoteServer, Product/Price Model

Sales team (Don Skifter on the phone with Pharma companies, helped by Mike Saylor & Kristina Tracy)

New Tigerizations look great

"Unmanaged delta" means manually run ETL ("managed" is run by automated agent)

Backfill is nearly caught up (1 day behind?)

Time to update "how to handle errors" info in Confluence

2015-04-02


2015-03-05

Attendees

Christina Raymond Gearin, Amy Biasella, Zach Leber, Mariela Mihaleva, Atanas Mihalev, Kristen Connolly, Qing Yu

Discussion items

Time | Item | Who | Notes

5 min | BTUG March 3 | Zach | Thanks all for helping host a great event
http://community.tableau.com/thread/153337

5 min | Fire Board | Zach | Many lab fires impacting production, no good way for analytics to help

5 min | Updates | Zach | Product Delta-Queue graph gaining traction
https://tableau.broadinstitute.org/views/PDOSampleDelta-Queue/ProductQueuesandTimes

5 min | Updates | Qing | MCKD1 design docs and software completed (first version)
WalkUp metrics report released
Adding flowcell designations to WalkUp LIMS

5 min | Updates | Nasko | Finishing new ETL framework, being used by K&M to inject JIRA comments for Pico tix
Will present in next 1-2 weeks, goal is for others to use it

5 min | Updates | Amy | Received request to build a JIRA for DFCI (encouraging them to do it themselves)
LCSET Idle Time report almost ready, need to see what kind of delays it reveals

5 min | Updates | Christina | PDO_QUEUE work to support Delta-Queue report
Several re-patienting updates

5 min | Updates | Kristen | Production pico report is working well
Plating pico data too difficult to work with for now
Also working on report for Cell Line Factory


2015-02-26

Attendees

Christina Gearin, Amy Biasella, Zach Leber, Mariela Mihaleva, Atanas Mihalev, Kristen Connolly, Qing Yu

Discussion items

Time | Item | Who | Notes

10 min | BTUG March 3 | Zach |
- please register in advance
https://www.eventbrite.com/e/boston-tableau-user-group-registration-15656171029
- need help with greeting table
- need help taking pictures of event

10 min | ExEx Sample Tracking (RPT-2753) | Zach | like the old HQS reports and the current Squid tracking page, e.g.
http://squid-ui.broadinstitute.org:8000/squid/app?page=lco/LcSetSamples&service=external&wrid=41790
- Using Autodoc in Squid to track everything in one doc
- Need Pooling Calculator

5 min | ExEx Flowcell Tracking (RPT-2752) | Zach | please review, maybe we already have something close

10 min | XML from BSP | Mariela | CLOB file is parsed using PL/SQL

5 min | LC Product Performance Report too slow | Mariela |

1 min | Atlassian User Group | Amy | Immediately after BTUG

2 min | JIRA about to become faster and more consistent performing | Zach | Using local (VM) storage of files

5 min | DB Upgrade | Zach | Tableau is showing new error connecting for Zach

Action items


Item | Who | Action

BTUG March 3 | ?, CG, AB, KC, ? |
- please register in advance
https://www.eventbrite.com/e/boston-tableau-user-group-registration-15656171029
- Greeting table, name tags, pens √ CG
- Pics of lobby reception, talks √ KC & AB
- Expert help (bring laptops) ?

ExEx Sample Tracking (RPT-2753) | ? | TAG take a look to see if we can help

ExEx Flowcell Tracking (RPT-2752) | MM |
- They could use a (refreshed once or twice daily) view to quickly spot underperforming flowcells in process
- Filter current report to show only ExEx flowcells √ MM

XML from BSP | MM, ? | Continue collaboration with James and Damien for encouraging ownership and future planning ?

LC Product Performance Report too slow | MM | Review filters, perhaps use extract √ MM

DB Upgrade | ? | Other TAG members please try to connect in Tableau

Optimize extracts in Tableau for best performance | CG | PDO_STAR2


2015-02-12

Attendees

Kristen Connolly, Zach Leber, Christina Gearin, Qing Yu, Amy Biasella, Mariela Mihaleva

Discussion items

Time | Item | Who | Notes

| Infinium SRS | ZL | Wendy said Infinium pull box has been running empty. Maybe because SRS team got used to half-capacity? Or because SRS team lost a member?

| Decentralized pico | ZL | Brian Sogoloff is involved with pico. Apparently we're physically decentralizing plating pico

| WGS pooling | ZL | still not live

| snow | all | Mariela just arrived in snowpants. Commutes are 3h.

| Coverage metrics | all | Covg metrics are still manually calculated. 3 versions: old pipeline (lowest), analytics-assisted intermediate (highest), new pipeline (middle)

| Delta-queue | ZL | Thinking about better ways to visualize queue, incoming work, completed work, (throughput), backlog, WIP age, etc.
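For the Delta-queue item, the relationship between incoming work, completed work, and backlog can be sketched as a running total. This is a minimal illustration with hypothetical names and made-up counts, not the calculation behind the existing report.

```python
def queue_history(incoming, completed, starting_backlog=0):
    """Return one dict per period with incoming, completed, and backlog.

    backlog[t] = backlog[t-1] + incoming[t] - completed[t]
    """
    if len(incoming) != len(completed):
        raise ValueError("incoming and completed must cover the same periods")
    history = []
    backlog = starting_backlog
    for period, (came_in, done) in enumerate(zip(incoming, completed)):
        backlog += came_in - done  # the "delta" for this period
        history.append({"period": period, "incoming": came_in,
                        "completed": done, "backlog": backlog})
    return history


if __name__ == "__main__":
    for row in queue_history([5, 3, 0], [2, 4, 1], starting_backlog=10):
        print(row)
```

Plotting the three series from `history` on one chart gives the incoming/completed/backlog view discussed here.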

2015-02-05

Attendees

Kristen Connolly

Discussion items

Time | Item | Who | Notes

| snow | ZL | We can't recover from snow days, so we'll be putting people up in hotels to keep sequencers running.

| VMs | KC | Kristen will get SQLNavigator added to Tableau-TS-beta computer to help with working from home on snow days.

| Matlock | ZL/QY | Project Matlock launching April 1. Qing is doing compliance training.

| DSDE/KDUX | ZL | Prometheus/DSDE/Epimetheus (customer-facing portal for Prometheus): Anthony & Scott (Sutherland?) are looking into what to use in that portal for showing customers. Zach to meet with Anthony. KDUX is a new group at Broad that's the opposite of DSDE. It's a small group initiative (Scott Sutherland, Zim, ??)

| ELNs | ZL | CORE electronic lab notebooks (ELN) are being discussed. CORE is where Stalker now works.

| Genologics | ZL | Zach is also talking with Genologics today.

| Stanley Ctr scatterplot | ZL | Daniel MacArthur, Brett Thomas, and others have created a scatterplot online on Amazon that cobbles spreadsheets together and is getting a lot of attention, but we might be able to assist in improving.

| scala | AM | Nasko loves scala so much.

| Raw WGS metrics | MM | The RAW WGS metrics go live today, BUT we'd like Picard to backfill actual calculations for all HiSeqX WGS instead of using zeros as placeholders to allow inner joins.

| backlog | MM | Mariela has time to help with some of the things in Kristen's backlog.

| TDE "last updated" | ZL | SkyLAB on Tableau Online has published extract timestamp problems, timezone problems, etc.
How can we create clear visibility on "data last updated" for TDE-based reports?

| BTUG | ZL | Next BTUG meeting has been moved (again), now scheduled for March 3, and hopefully there won't be another blizzard.

2015-01-29

Attendees

Kristen Connolly (Unlicensed)L, KC, AM, MM, QY, AB, CG

Discussion items

Time | Item | Who | Notes

-- | JIRA upgrade | | Upgrading SEQBLDR went well, but there were some plugin issues as usual (specifically SRSPlus agile board problems related to link queries)

| Tableau9 beta | | ZL Installed TS9 & TD9 on tableau beta. Plenty of problems, especially with TS (which gets very little testing).
Totally reimagined UI on TS9 will be problematic
Terrible admin pages on TS9

| Project Matlock | ZL, QY | Jim wants Qing to help Brendan with "Project Matlock", which looks into the MCKD1 gene on sequenom.
Want to get data out of the sequenom DB without going through GAP DB. Throw it into excel or Tableau for simple math.
New CRSP Lab product.

| Genomes system hackball | ZL, CG | Whole Genomes have a couple new products to handle custom pricing, quotes/contracts.
Tina has been trying to deal with the different billing for different contracts, and created 2 new products accordingly. Using Mercury's project tracker is very slow. She changes the product description in Mercury, slowly downloads data, changes it back, downloads the rest of the data, puts them together... It works, but is dangerous and fragile. It also splits WGs downstream, so Chris has to create workarounds to put all the WGs back together for the rest of the process to view.
Mercury team is looking into what they can do to help

2015-01-22

Attendees

Everyone in Analytics

Discussion items

Time

Item

Who

Notes

5m

Infinium Reagents

qing

Infinium fire required reagent tracking. Howie helped Qing find it in DB. They pulled 2months of data and MikeD is happy. Wendy may want this turned into live view over a longer time period.

5m

Pico WR

kc

New Pico WR is rolled out into production. Still developing process monitoring report, but PM report is done. At some point in the future, we may want to speak with BSP LIMS team about creating table output format for these data to avoid XML-table ETLs.

5m

Doumont

zl

Great talk last Tuesday, will share notes. Plan presentations from the perspective of the audience, providing them with info to allow them to make real life changes.

5m

Product Throughput Report & queue views

zl

it's number-heavy, table form. May not be ideal, but works for now. The queue size is the leading indicator of our process & throughput. We should go back to our old method of viewing queue changes over time (incoming, completed, and backlog on one plot). For PDOSTAR2, a published datasource, we can't do custom SQL off it, so we'll have to go back a level to do the custom SQL. Can the T8 label improvements help make data digestible at a glance? Can we use extracts to speed up data refreshes, since most of these won't be needed live?

5m

skylab

zl

It's working. Complex underlying commands. Showed to marketing group.

5m

BTUG

all

Zach will set up early. Nathan is coordinating drinks, no snacks. Doctor meet-up table? Flip-board sign? Balloon IDs? Qing will coordinate nametags & pens. If we can be there as experts and sign-ins, great. 180 people registered!

5m

Picard analysis times

zl

Steve has a plot that probably covers this base, though we need to change the axis to continuous discrete to prevent week 1 from showing before the previous year's week 52. New WGS metrics coming soon. Zach will make RPT ticket.

5m

Tableau 9 beta

zl

It's just released (TD & TS). We don't have a plan for testing the beta yet. Doesn't look to have any major improvements relevant to us.
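The week-ordering problem noted under Picard analysis times (week 1 showing before the previous year's week 52) usually comes from sorting on week number alone. A small sketch of sorting on year and week together, using ISO weeks; the function name and sample dates are hypothetical, and this does not describe how Steve's plot is built.

```python
from datetime import date


def ordered_week_labels(dates):
    """Return unique ISO week labels sorted by (ISO year, ISO week)."""
    # isocalendar() yields (ISO year, ISO week, weekday); keep the first two.
    keys = {d.isocalendar()[:2] for d in dates}
    return [f"{year}-W{week:02d}" for year, week in sorted(keys)]


if __name__ == "__main__":
    sample = [date(2015, 1, 5), date(2014, 12, 22), date(2014, 12, 29)]
    print(ordered_week_labels(sample))
```

Sorting on the week number alone would put weeks 1 and 2 of the new year ahead of week 52 of the old one; carrying the year in the sort key keeps the axis chronological.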

2015-01-08

5m

GA4GH

All

Discussed MPG talks about the Global Alliance for Genomics and Health

Broad/DSDE is leading the effort to develop APIs for data sharing within and between centers

5m

Aggregation Refresh Requests

Mariela

Most refresh requests are no longer necessary, some leftover refreshes still happening for fingerprints, goal is total automation

Problem with control samples (showing as zero LOD) is fixed

It might be helpful to document (even at a very high level) the steps that need to happen (re-agg vs refresh, triggers, etc.)

10m

Mid-year reviews

All

Software Engineering and Analytics Engineering job families distributed

Kristen promoted to Senior Analytics Engineer

Mid-year reviews with rest of group to be scheduled

5m

Strategy Board Highlights

Zach

Pipeline stopped over break except for Buick, lots of catching up to do, things are late, don't know when we'll be caught up

Infinium failures still very high, production at half-capacity

BSP developers focusing on Pico release for Jan 21, this is the only facet of this release. Analytics is almost ready to switch reports to LIMS data from JIRA data

8-plex pooling for genomes coming in late January

5m

JIRA Service Desk

All

Customers can use JIRA Service Desk to log SUPPORT tickets (both bugs and feature requests)

Zach would like SUPPORT tickets tagged for Analytics to notify analytics@broad

Would like to avoid exposing analytics@broad to other SUPPORT tickets

5m

Other

All

Qing and Kathleen have begun discussing pipeline options for WalkUp

JIRA upgrade is still on the horizon


2014-12-18

Discussion items

Time

Item

Who

Notes

10m

Scala

Nasko

Switched SQL library to ScalikeJDBC (from Querulous), updated from Scala 2.9 to 2.11

Should present to softeng in 2015 Q1 (ETL documentation and architecture, plus show examples of Java vs. Scala vs. Scala Functional)

20m

Internal Communications

All

Need more proactive alerts from lab before making process changes, e.g. pico daughter tubes, machine modes

Need more proactive information from other LIMS developers, e.g. 12-digit barcodes

Need to ask Andy Hollinger to keep Analytics posted and remind others to do the same

Closer relationships with users and developers makes it more likely they will give us information

10m

Updates

All

SeqProd VM upgrade on 12/16 went well, new database is faster, but one query is badly broken (http://analytics-etl:8090/etl_runs/analytics.etl.DeckEventAgent$)

Buick Team re-hybing all the production samples with new bait, will be working 8a-8p over the break, please spot check e-mail

Productopia meeting cancelled forever, will update product metrics on white board instead

2014-12-11

Discussion items

Time

Item

Who

Notes

30m

Mercury LIMS Analytics

All

Compile notes on current limitations of analytics for Mercury LIMS, e.g. library ancestry and naming. Mariela to organize and meet with Jon Thompson to discuss the implications of moving a Core product (e.g. Genomes) from Squid to Mercury in 2015

20m

Updates

All

Nasko standardizing his Scala frameworks to make support and development easier

Christina adding special handling for Merck PDOs into WGS Topoff Tool

Mariela updating Crazy Robots service dashboard, ETL to get base call metrics into agg tables, new report to correlate Seq metrics with QC metrics, SeqProd-VM performance testing

Amy working with lab on big overhaul to standard LCSET workflow in JIRA - filters, rapid boards, and Tableau will need updates. Request and comments are in a linked JIRA ticket (the link did not render).

Zach working on RoboView kiosks and BroadMap Mobile

2014-12-04

Attendees

Zach Leber

Qing Yu

Atanas Mihalev

Christina Gearin

Mariela Mihaleva

Discussion items

Time | Item | Who | Notes

30 minutes | WalkUp demo | Qing | Demo of newest interface including batch submissions
Summary posted at Walk Up web interface, database, IRB/OSAP report

5 minutes | updates | Qing | Infinium barcodes now 11 digits, database field limit updated

5 minutes | updates | Nasko | Confirming pico ancestry still okay with latest lab process using backup instead of primary stock
Developing ETL Delta Object in functional Scala

5 minutes | updates | Christina | Need special handling of Mean Coverage for Merck PDOs
Fixing barcode mismatches for de-indexed WGS aggregations

5 minutes | updates | Mariela | confirming Tableau reports are working for Nexomes (Pooling Calculator, LC Performance)
added tab to Crazy Robots report showing number of tickets closed
https://tableau.broadinstitute.org/views/CrazyRobotsServiceRecordsLiquidHandlers/ServiceTicketsThroughput

5 minutes | updates | Zach | staging RoboView kiosks for use in labs
testing BroadMap Mobile on iPhones



2014-11-20

Attendees


Zach Leber

Mariela Mihaleva

Qing Yu

Christina Raymond Gearin

Atanas Mihalev

Amy Biasella

Kristen Connolly

Discussion items

Time

Item

Who

Notes

10 mins

Broad Retreat review

Qing/Kristen

Qing honored with Excellence in Engineering award for WalkUp work

Best retreat yet (good talks and good space)

More of Analytics should go next year

20 mins

JIRA custom field catalog service

Nasko

New service to catalog custom field types

Automatically generates the SQL needed for Oracle views

Available on the Confluence website: Jira DWH

5 mins

Pico review

Kristen

BSP will make it much easier to retrieve Pico status per sample

5 mins

SRS+ agile board

Amy

Adds swimlanes by SRS bucket and includes pending column to show parent tickets not yet marked for consolidation

Current implementation uses linkedIssuesInQuery() function which can be very slow - considering alternatives until LOJ is upgraded to JIRA 6.3/Agile 6.6 in December.

5 mins

SeqBldr upgrade

Zach

Will be upgraded from Oracle 11.1 to 11.2 on December 18. This hosts all Atlassian apps. Prior to December 18, LOJ will be upgraded per above. And prior to that, LOJT will be upgraded, which we can use to test our view code.

2014-11-06

Attendees


Zach Leber

Mariela Mihaleva

Qing Yu

Christina Gearin

Atanas Mihalev

Amy Biasella

Goals

meeting management (Kristen, 15 mins)

work management (Zach, 5 minutes)

last week’s topics (all, 10 minutes)

switch to Mercury HTTPS (Nasko, 5 minutes)

general updates (all, 10 minutes)


Discussion items

Time

Item

Who

Notes

15 min

Meeting management

Kristen

Overview of Goals, Roles and Process

  • Create an agenda before a meeting so you don't waste your and others' time

  • Roles: Facilitator, Presenters, Scribe, Timekeeper

  • What do you want people to Know, Believe, Feel, Do

5 min

Work management

Zach

more than 3 in process is probably too many

we need to eliminate our backlog – try to keep them small, punt the other issues so we can respond faster to new important opportunities

Now or never decisions – trust ourselves and our ideas

the platform is falling over; let's not follow suit. Keep credibility with our customers, so when we say "won't fix" they'll accept it

ACTION: Everyone needs to go through their lists

10 min

Last week's topics

all

implemented new filtering via yossi/kylee. Chris putting it in reports, platform is trying to sell it. Indefinitely far away right now with current filres.

action: get Qing's stuff on stash

kristen will come around individually to get rest of JIRA views sussed out

5 min

Switch to Mercury

Nasko

squid, gap, bsp, everything is changing to https

5-6 places overall that need alteration for the big Mercury switch on Monday

nasko's fix is single line, so on monday (by noon) he should be done with the change.

ETLs are running on a schedule, so if he fixes the problem during the right window, no ETL should have problems.

we can continue to ignore the cert in production. chances are small that anyone will pretend to be mercury

10 min

General updates

all

amy and zach going to dana farber for JIRA today.

do we want to add nasko's ETL to the list of JIRA-related consulting stuff?

how should we handle the refresh problems? probably going to continue having swaps...

probably release Walk Up batch submission version this week. Access array metrics per target table is created, and the updated Tableau report is published. Wendy's group and Jonna used it a lot before. However, Teni's group has not started to use it.

chris to make a comparison tool, kristen to help with seq plating data

CLF in testing. CR working great.
