Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 71 Next »

In some occasions, the analytics team may need to take specific actions to check or update changes in the datamarts. Examples in which this may happen may include the following:

  • Consent withdrawal
    • Participants may voluntarily request to withdraw their participation from specific ongoing research studies or other projects. Regulations indicate that any existing metadata must be updated to reflect such withdrawal request. These changes may not be readily available until the right steps are followed to update the metadata on the datamarts.
  • Fingerprinting and gender related changes
    • Requests for fingerprinting reanalyses and gender related changes may occur for several reasons such as missing FP assays, sample swaps, and collaborator requests. These changes may not refresh automatically in the datamarts for which refresh pushes may be necessary. 

The main steps a user may need to follow in these occasions includes checking metadata changes and pushing datamart refreshes. The processes explained below are meant to help the user take step-by-step actions if and when required. A user may not need to take the steps in all processes.


Table of Contents

1. Checking Metadata Changes

Download the CSV or XLS file from the SUPPORT ticket (they should be interchangeable). Copy the distinct "Old" Collaborator Sample IDs to paste them into the SQL queries below.

1.1. Check whether PDO_STAR is updated

  1. Log in to REPORTING@SEQPROD on SQL Navigator and run the following query:

    SELECT DISTINCT s.bsp_collaborator_sample_id, 
    				s.collaborator_participant_id, 
    				s.aggregation_project, 
    				s.last_aggregation_date, 
    				s.external_sample_id, 
    				s.pdo_name, 
    				s.pdo_sample_id, 
    				s.sample_delivery_status, 
    				s.sample_is_billed
    FROM cognos.pdo_star5 s 
    WHERE s.bsp_collaborator_sample_id IN ('oldcollabSMID1','oldcollabSMID2')
    
    
  2. If no records are returned, then no further action is needed (datamarts have already been refreshed)
  3. If records are returned, then either there are supposed to be some samples remaining (this is less frequent) or datamarts need refreshing.
  4. Before refreshing PDO STAR, first check to see if SLXRE_READGROUP_METADATA is updated (section 1.2 below).
  5. If readgroup metadata is updated, then the PDO STAR datamart needs a refresh (see sections 2.1-2.3 below). Note the PDO_NAME of any records returned.

1.2. Check whether SLXRE_READGROUP_METADATA is updated

  1. Log in to REPORTING@SEQPROD on SQL Navigator and run the following query:

    SELECT DISTINCT a.collaborator_sample_id,
    				a.sample_id bsp_sample_id,
    				a.project, a.research_project_id,
    				a.product_order_key,
    				a.product_order_sample,
    				a.collaborator_participant_id,
    				a.flowcell_barcode,
    				a.run_name
    FROM cognos.slxre_readgroup_metadata a
    WHERE a.collaborator_sample_id  IN ('oldcollabSMID1','oldcollabSMID2')
  2. If no records are returned, then no further action is needed (datamarts have already been refreshed)
  3. If records are returned, then the datamart needs a refresh (section 2.4 below). Note the FLOWCELL_BARCODE of any records returned.

 1.3. Check whether there's updated data to be added

  1.  Check for metrics in DB with the following queries:

    SELECT *
    FROM analytics.bsp_sample
    WHERE  collaborator_sample_id IN 'oldcollabSMID1, oldcollabSMID2'
    ;
    SELECT *
    FROM metrics.basecalling_analysis ba 
    WHERE ba.flowcell_barcode IN ('HTH3KBBXX')
    AND ba.metrics_type IN ('Indexed', 'Unindexed')
    ;
    SELECT *
    FROM metrics.picard_analysis ba 
    WHERE ba.flowcell_barcode IN ('HTH3KBBXX')
    AND ba.metrics_type IN ('Indexed', 'Unindexed')
    ;
    --Understand when flowcells were updated in Analytics DM
    SELECT *
    FROM cognos.slxre_rghqs_targets t 
    WHERE t.flowcell_barcode = 'HTH3KBBXX'
    ;

1.4. Check aggregations

  1.  Check for aggregations in DB with the following queries:
SELECT *
FROM cognos.slxre2_pagg_sample a 
WHERE a.SAMPLE IN ('UNKNOWN_SubmittedAs_33433.wb','UNKNOWN_SubmittedAs_33514.wb')
;
SELECT a.sample_ids plated_samples, a.individual_name, a.*
FROM cognos.slxre2_pagg_sample a 
WHERE a.SAMPLE IN ('33433.wb','33514.wb')
;
-- check whether the same read group is aggregated under old and new collab sample id 
SELECT is_latest, rg.flowcell_barcode, rg.lane, rg.molecular_barcode_name, count(*),
concat_string_csv(DISTINCT a.SAMPLE )
FROM metrics.aggregation a,
metrics.aggregation_read_group rg  
WHERE a.SAMPLE IN ('UNKNOWN_SubmittedAs_33433.wb','33433.wb')
 AND a.is_latest <> 0
 AND a.id = rg.aggregation_id
GROUP BY rg.flowcell_barcode, rg.lane, rg.molecular_barcode_name, is_latest

NOTE: is_latest should be 2 for anything with consent withdrawn (2 signifies BAM is deleted)

1.5. Check whether Sample datamarts have been refreshed

  1. If refreshes aren't showing updated data, it may be because the sample ETL is down. If LIMS team has done their part, but the bsp.analytics_sample and analytics.bsp_sample DMs are not updated, check the ETL agent here: http://analytics:8090/etl_runs?status=Succeeded&status=Failed&status=Running&status=Abandoned&agent=analytics.tiger.agents.BspSample&action=Refresh

2. PDO STAR, RGHQS, and Read Group Metadata Datamart Refresh Push

Reference

Review Tiger3 ETL Shell Commands (refreshes, etc.) for more details on this section

PDO STAR automatically refreshes samples that are not billed every two hours, and does a full refresh of 1-year data on Friday mornings at 2:55 am. If metadata change requests are for orders that are not billed or are less than a year old, a manual push may not be necessary. However, step 2.4 below should be followed regardless to check whether the metadata changes were made in PDO STAR.

2.1. Option 1 - Refresh PDOSTAR & RGHQS using RunEtl tool

  1. Instructions here

2.2. Option 2 - Manual Refresh in PDO STAR using a query (alternate method using "Old" Collaborator Sample IDs)

For this process, you can bypass section 2.1 and use the "Old" Collaborator Sample IDs obtained in section 1.

  1. Log in to your favorite SSH tool (e.g. SecureCRT)
    1. connect to "analytics" server
    2. enter the following: /home/unix/analytics/TigerETL3/runEtlShell.sh
    3. paste & run the following etl-command once, pasting the "Old" Collaborator Sample IDs in the query below

      import analytics.tiger.agents.PdoStar._
      AnalyticsEtlDB("analytics.tiger.agents.PdoStar.fromSQL", collSamplesToPdoSamples(List("oldcollabSMID1","oldcollabSMID2")) flatMap DataSetEtl() flatMap pipeline)

      NOTE: These are supposed to be two lines (the first ends with an underscore). Once you hit enter, the command takes a little while. You'll know it's complete when you get the scala prompt again.

2.3. Option 3 - Manual Refresh in PDO STAR using a query (using PDO_NAMEs)

In order to follow this process, you need to obtain the PDO_NAME of any records returned in section 1.1.

  1. Log in to your favorite SSH tool (e.g. SecureCRT)
    1. connect to "analytics" server (command: ssh analytics)
    2. enter the following: /home/unix/analytics/TigerETL3/runEtlShell.sh
    3. paste & run the etl-command separately for each PDO, changing PDO-12480 to the relevant PDO returned above.

      AnalyticsEtlDB("analytics.tiger.agents.PdoStar.fromSQL", deltaFromSql("""SELECT DISTINCT a.pdo_name||','||a.pdo_sample_id FROM cognos.pdo_star5 a WHERE a.pdo_name = 'PDO-12480'""") flatMap DataSetEtl() flatMap analytics.tiger.agents.PdoStar.pipeline)
    4. The command takes a little while. You'll know it's complete when you get the scala prompt again

2.4. Reviewing that PDO STAR did refresh

  1. Rerun the PDO_STAR query above. This refresh is immediate, so you can run the query right away.
  2. If it returns nothing, the DM is all set.
  3. If it still returns records, ensure that that is unexpected behavior and try the refresh again. (occasionally, there are other samples that need to remain with the "old" IDs, as is usually the case in sample swaps)
    1. If it still returns unexpected records, talk with Nasko.

2.5. Refreshing the ReadgroupMetadata DM in SEQPROD Database

Reference

Review Analytics ETL Framework for more details on this section

  1. Log in to COGNOS@SEQPROD on SQL Navigator
  2. Request a refresh for RunMercuryAgent by issuing the following query:

    BEGIN
    cognos.data_warehouse_api.request_refresh(P_FLOWCELL_BARCODE=>'XYZ', P_REQUESTED_BY=>'JohnDoe', P_COMMENTS=>'Test', P_DEFERRAL_ANALYSIS_DATE=>null);
    END;

    Note: No COMMIT command needed after this particular call - it's implicitly done for you. Just replace the XYZ with a flowcell barcode. Leave all other parameters there and unchanged.

    If you have multiple flowcells, you may run the query as follows, adding additional lines after BEGIN:

    BEGIN
    cognos.data_warehouse_api.request_refresh(P_FLOWCELL_BARCODE=>'XYZ1', P_REQUESTED_BY=>'JohnDoe', P_COMMENTS=>'Test', P_DEFERRAL_ANALYSIS_DATE=>null);
    cognos.data_warehouse_api.request_refresh(P_FLOWCELL_BARCODE=>'XYZ2', P_REQUESTED_BY=>'JohnDoe', P_COMMENTS=>'Test', P_DEFERRAL_ANALYSIS_DATE=>null);
    cognos.data_warehouse_api.request_refresh(P_FLOWCELL_BARCODE=>'XYZ3', P_REQUESTED_BY=>'JohnDoe', P_COMMENTS=>'Test', P_DEFERRAL_ANALYSIS_DATE=>null);
    ...
    cognos.data_warehouse_api.request_refresh(P_FLOWCELL_BARCODE=>'XYZX', P_REQUESTED_BY=>'JohnDoe', P_COMMENTS=>'Test', P_DEFERRAL_ANALYSIS_DATE=>null);
    END;
    
    

2.5. Wait for RunMercury agent and recheck.

The RunMercury agent is run on a schedule. After the next run, rerun the SELECT SQL clause in step 1.2 to ensure that the records have been refreshed and are not returned with the old collab sample id.

http://analytics:8090/etl_runs?status=Succeeded&status=Failed&status=Running&status=Abandoned&status=Halted&agent=analytics.tiger.agents.RunMercury&action=Refresh


3. Related pages

Performing metadata changes

Requesting sample metadata changes

Tiger3 ETL Shell Commands (refreshes, etc.)


  • No labels