Determine the root cause of missing BAM files path
Request:SUPPORT-9123
My samples are missing BAM file paths in PDO Tracker report. This translates into analysis are missing in the data source (cognos.pdo_star5_aux). The BAM file path is constructed in the report based on the aggregation analysis.
Troubleshooting process.
Reproduce the issue in the report and then in SQL.
SELECT a.aggregation_project, a.analysis_id, a.data_type, a.runs_end, a.bsp_collaborator_sample_id, a.* FROM pdo_star5_aux a WHERE a.pdo_sample ='SM-JVCPT'
;
-- returns NULL for aggregation_project and analysis_id. This means the PDO Star ETL didn't pick the aggregation. This data comes from aggregation data mart (cognos.slxre2_pagg_sample)Check the aggregation data
SELECT * FROM slxre2_pagg_sample a WHERE a.SAMPLE ='microbial_pos_1159'; -- this is the bsp_collaborator_sample_id from Q1.
-- returns no records.Check whether there are aggregations in the raw data - where the pipeline records the metrics.
SELECT * FROM metrics.aggregation a WHERE a.SAMPLE ='microbial_pos_1159' -- this is the bsp_collaborator_sample_id from Q1.
-- returns no records. This means that the aggregation has not been run yet. It can be queued (but we don't have visibility to that ) or the sequenced read groups could have been blacklisted. Blacklisted read groups are not included in the aggregation.
Check for sequencing data. Set a run_date filter that should capture the runs
-- returns the read groups that the sample was sequenced on.
Check whether the read groups were blacklisted.
5.1. By looking up the blacklist table:
-- returns no records. Read groups are not blacklisted.
5.2. By joining the metadata and blacklisting tables
-- returns all sequenced read groups but the flowcell from blacklist table (b.flowcell_barcode) is NULL. This means that these read groups are not present in the blacklist table.
Turns out this sample was not aggregated and its read groups were not blacklisted.
A similar checks should be done for all samples in the request and make sure that the same explanation is valid for all of them. ( replace "a.pdo_sample ='SM-JVCPT'" with "a.pdo_sample in ('SM-JVCPT', 'SM-...', ...)" and similarly in other queries look up all samples in a list.
In this case the request to run aggregation should be passed to Pipeline Ops.
Usually, when they complete, the data is automatically being picked up in order - in aggregation DM and then in the PDO Star DM.