In some occasions, the analytics team may need to take specific actions to check or update changes in the datamarts. Examples in which this may happen may include the following:
The main steps a user may need to follow in these occasions includes checking metadata changes and pushing datamart refreshes. The processes explained below are meant to help the user take step-by-step actions if and when required. A user may not need to take the steps in all processes.
Table of Contents
Download the CSV or XLS file from the SUPPORT ticket (they should be interchangeable). Copy the distinct "Old" Collaborator Sample IDs to paste them into the SQL queries below.
Log in to REPORTING@SEQPROD on SQL Navigator and run the following query:
SELECT DISTINCT s.bsp_collaborator_sample_id, s.collaborator_participant_id, s.aggregation_project, s.last_aggregation_date, s.external_sample_id, s.pdo_name, s.pdo_sample_id, s.sample_delivery_status, s.sample_is_billed FROM cognos.pdo_star5 s WHERE s.bsp_collaborator_sample_id IN ('oldcollabSMID1','oldcollabSMID2') |
Log in to REPORTING@SEQPROD on SQL Navigator and run the following query:
SELECT DISTINCT a.collaborator_sample_id, a.sample_id bsp_sample_id, a.project, a.research_project_id, a.product_order_key, a.product_order_sample, a.collaborator_participant_id, a.flowcell_barcode, a.run_name FROM cognos.slxre_readgroup_metadata a WHERE a.collaborator_sample_id IN ('oldcollabSMID1','oldcollabSMID2') |
2. If records are returned, then the datamart needs a refresh (section 2.4 below). Note the FLOWCELL_BARCODE of any records returned.
Check for metrics in DB with the following queries:
SELECT * FROM metrics.basecalling_analysis ba WHERE ba.flowcell_barcode IN ('HTH3KBBXX') AND ba.metrics_type IN ('Indexed', 'Unindexed') ; SELECT * FROM metrics.picard_analysis ba WHERE ba.flowcell_barcode IN ('HTH3KBBXX') AND ba.metrics_type IN ('Indexed', 'Unindexed') ; --Understand when flowcells were updated in Analytics DM SELECT * FROM slxre_rghqs_targets t WHERE t.flowcell_barcode = --'HTH3KBBXX' ; |
SELECT * FROM slxre2_pagg_sample a WHERE a.SAMPLE IN ('UNKNOWN_SubmittedAs_33433.wb','UNKNOWN_SubmittedAs_33514.wb') ; SELECT a.sample_ids plated_samples, a.individual_name, a.* FROM slxre2_pagg_sample a WHERE a.SAMPLE IN ('33433.wb','33514.wb') ; -- check whether the same read group is aggregated under old and new collab sample id SELECT rg.flowcell_barcode, rg.lane, rg.molecular_barcode_name, count(*), concat_string_csv(DISTINCT a.SAMPLE ) FROM metrics.aggregation a, metrics.aggregation_read_group rg WHERE a.SAMPLE IN ('UNKNOWN_SubmittedAs_33433.wb','33433.wb') AND a.is_latest <> 0 |
Review Tiger3 ETL Shell for more details on this section |
In order to follow this process, you need to obtain the PDO_NAME of any records returned in section 1.1. |
paste & run the etl-command separately for each PDO, changing PDO-12480 to the relevant PDO returned above.
AnalyticsEtlDB("analytics.tiger.agents.PdoStar.fromSQL", deltaFromSql("""SELECT DISTINCT a.pdo_name||','||a.pdo_sample_id FROM cognos.pdo_star5 a WHERE a.pdo_name = 'PDO-12480'""") flatMap DataSetEtl() flatMap analytics.tiger.agents.PdoStar.pipeline) |
The command takes a little while. You'll know it's complete when you get the scala prompt again
For this process, you can bypass section 2.1 and use the "Old" Collaborator Sample IDs obtained in section 1. |
paste & run the following etl-command once, pasting the "Old" Collaborator Sample IDs in the query below
import analytics.tiger.agents.PdoStar._ AnalyticsEtlDB("analytics.tiger.agents.PdoStar.fromSQL", collSamplesToPdoSamples(List({{"oldcollabSMID1"}}{{,"}}{{oldcollabSMID2"}})) flatMap DataSetEtl() flatMap pipeline) |
The command takes a little while. You'll know it's complete when you get the scala prompt again
Review Analytics ETL Framework for more details on this section |
Request a refresh for RunMercuryAgent by issuing the following query:
BEGIN cognos.data_warehouse_api.request_refresh(P_FLOWCELL_BARCODE=>'XYZ', P_REQUESTED_BY=>'JohnDoe', P_COMMENTS=>'Test', P_DEFERRAL_ANALYSIS_DATE=>null); END; |
If you have multiple flowcells, you may run the query as follows, adding additional lines after BEGIN:
BEGIN cognos.data_warehouse_api.request_refresh(P_FLOWCELL_BARCODE=>'XYZ1', P_REQUESTED_BY=>'JohnDoe', P_COMMENTS=>'Test', P_DEFERRAL_ANALYSIS_DATE=>null); cognos.data_warehouse_api.request_refresh(P_FLOWCELL_BARCODE=>'XYZ2', P_REQUESTED_BY=>'JohnDoe', P_COMMENTS=>'Test', P_DEFERRAL_ANALYSIS_DATE=>null); cognos.data_warehouse_api.request_refresh(P_FLOWCELL_BARCODE=>'XYZ3', P_REQUESTED_BY=>'JohnDoe', P_COMMENTS=>'Test', P_DEFERRAL_ANALYSIS_DATE=>null); ... cognos.data_warehouse_api.request_refresh(P_FLOWCELL_BARCODE=>'XYZX', P_REQUESTED_BY=>'JohnDoe', P_COMMENTS=>'Test', P_DEFERRAL_ANALYSIS_DATE=>null); END; |
Note: No COMMIT command needed after this particular call - it's implicitly done for you. Just replace the XYZ with a flowcell barcode. Leave all other parameters there and unchanged.
3. Wait for RunMercury agent and recheck.
The RunMercury agent is run on a schedule. After the next run, rerun the SELECT SQL clause in step 1 to ensure that the records have been refreshed and are not returned with the old collab sample id.
Requesting sample metadata changes