...
Code Block |
---|
SELECT a.run_name, a.cell_well, "etl.dataset", c."rowid",
c."Read Length (bp)" "Read Length (bp) RAW", -- for exploration purposes only
--REPLACE(c."Read Length (bp)", CHR(191), '>=') "Read Length (bp)", -- '≥' is UTF-8 e2 89 a5
DECODE(rawtohex(c."Read Length (bp)"),
'BF2030', '>= 0',
'BF20352C303030', '>= 5000',
'BF2031302C303030', '>= 10000',
'BF2031352C303030', '>= 15000',
'BF2032302C303030', '>= 20000',
'BF2032352C303030', '>= 25000',
'BF2033302C303030', '>= 30000',
'BF2033352C303030', '>= 35000',
'BF2034302C303030', '>= 40000',
rawtohex(c."Read Length (bp)") -- catch everything else
) "Read Length (bp)",
"Reads", "Reads (%)", "Yield (bp)", "Yield (%)"
FROM pacbio a,
json_table(DATA, '$[*]'
COLUMNS(
"etl.dataset" path '$."etl.dataset"',
NESTED PATH '$."etl.ccs2.hifi_length_summary"[*]' COLUMNS(
"rowid" PATH '$.rowid',
"Read Length (bp)" PATH '$."ccs2.hifi_length_summary.read_length"',
"Reads" NUMBER PATH '$."ccs2.hifi_length_summary.n_reads"',
"Reads (%)" NUMBER PATH '$."ccs2.hifi_length_summary.reads_pct"',
"Yield (bp)" NUMBER PATH '$."ccs2.hifi_length_summary.yield"',
"Yield (%)" NUMBER PATH '$."ccs2.hifi_length_summary.yield_pct"'
)
)) AS c
WHERE site_id=3 AND a.domain='CROMWELL/sl_dataset_reports/*/call-import_dataset_reports/execution/ccs.report.json*'
--AND rawtohex(c."Read Length (bp)") = 'BF2033302C303030' -- filter bucket >= 30000
AND a.run_name='r64020e_20220519_191246' AND a.cell_well='1_B01' |
...
Keep in mind that non-ASCII characters (like ‘≥’), although nicely rendered in Chrome, may have a variable-length byte representation, which is why Oracle’s rawtohex function is needed.
https://www.utf8-chartable.de/unicode-utf8-table.pl?start=8704&number=128&names=-
REPLACE(c."Read Length (bp)", CHR(191), '>=') "Read Length (bp)", -- '≥' is UTF-8 e2 89 a5
seems to do the trick, but the DECODE expression gives you more control.
UPDATE: an even less cryptic way to deal with non-ASCII characters is ASCIISTR:
Code Block |
---|
ASCIISTR("HiFi LenSum read_length") = '\00BF 10,000' |
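To make the hex keys of the DECODE expression less mysterious, here is a small Python sketch that decodes them outside the database. It is only an illustration, not part of the ETL: the helper names bucket_label and asciistr are made up, and asciistr only imitates Oracle’s ASCIISTR for characters in the Basic Multilingual Plane (it does not escape backslashes).

```python
def bucket_label(raw_hex: str) -> str:
    """Turn a rawtohex() value such as 'BF2031302C303030' into '>= 10000'.

    Anything that does not start with byte 0xBF followed by a space is
    returned unchanged, mirroring the "catch everything else" DECODE default.
    """
    raw = bytes.fromhex(raw_hex)
    if not raw.startswith(b"\xbf "):
        return raw_hex
    digits = raw[2:].decode("ascii").replace(",", "")  # "10,000" -> "10000"
    return f">= {digits}"


def asciistr(text: str) -> str:
    """Rough imitation of ASCIISTR: non-ASCII characters become \\XXXX."""
    return "".join(c if ord(c) < 128 else f"\\{ord(c):04X}" for c in text)


if __name__ == "__main__":
    for key in ("BF2030", "BF20352C303030", "BF2031302C303030"):
        print(key, "->", bucket_label(key))
    print(asciistr("\u00bf 10,000"))
    # The real "greater than or equal" sign takes three bytes in UTF-8.
    print("\u2265".encode("utf-8").hex())
```

The stored value therefore starts with the single byte BF (CHR(191), ‘¿’) rather than the three-byte UTF-8 sequence e2 89 a5, which is why comparing on the hex or ASCIISTR form is more robust than comparing on the rendered character.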
Metrics stored in “attributes” JSON-array
Other metrics are stored in the “attributes” JSON array (shown on the left side). A synthetic “etl.attributes” JSON object is added to allow more natural JSON extraction from the DB.
...
Code Block |
---|
SELECT a.run_name, a.cell_well, "etl.dataset", "HiFi Reads", "HiFi Yield (bp)", "HiFi Read Length (mean, bp)"
FROM pacbio a,
json_table(DATA, '$[*]'
COLUMNS(
"HiFi Reads" NUMBER PATH '$."etl.attributes"."ccs2.number_of_ccs_reads".value',
"HiFi Yield (bp)" NUMBER PATH '$."etl.attributes"."ccs2.total_number_of_ccs_bases".value',
"HiFi Read Length (mean, bp)" NUMBER PATH '$."etl.attributes"."ccs2.mean_ccs_readlength".value',
"etl.dataset" path '$."etl.dataset"'
)) AS c
WHERE site_id=3 AND a.domain='CROMWELL/sl_dataset_reports/*/call-import_dataset_reports/execution/ccs.report.json*'
AND a.run_name='r64020e_20220519_191246' AND a.cell_well='1_B01' |
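The idea behind the synthetic “etl.attributes” object can be sketched in Python as follows. This is an assumption about the shape of the data, not the ETL’s actual code: it supposes each entry of “attributes” carries an “id” and a “value”, which is what the JSON paths in the query above suggest. The function name and the sample values are made up.

```python
def add_etl_attributes(report: dict) -> dict:
    """Return a copy of the report with an "etl.attributes" object added.

    The object is keyed by attribute id, so a value can be addressed by name
    ($."etl.attributes"."<id>".value) instead of by array position.
    """
    enriched = dict(report)
    enriched["etl.attributes"] = {
        attr["id"]: attr for attr in report.get("attributes", [])
    }
    return enriched


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = {
        "attributes": [
            {"id": "ccs2.number_of_ccs_reads", "name": "HiFi Reads", "value": 42},
            {"id": "ccs2.mean_ccs_readlength", "name": "HiFi Read Length (mean, bp)", "value": 15000},
        ]
    }
    enriched = add_etl_attributes(sample)
    print(enriched["etl.attributes"]["ccs2.number_of_ccs_reads"]["value"])
```

With the array alone, a query would have to filter array elements by id; the keyed object lets json_table address each metric with a fixed path.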
...
The “superJSON” tool
Imagine you have a SMRTLink screen in front of you saying “Longest Subread N50: 21250” for a given run/cell. How can you find out which metrics file this number comes from?
Open the “superJSON” tool (all files are merged in there), expand all nodes and search for this exact number: https://analytics.broadinstitute.org/pacbioMetrics/3/r64386e_20220523_180557/4_D01/superjson
...
Code Block |
---|
SELECT a.run_name, a.cell_well, a.movie, c."raw_data_report.insert_n50"
FROM pacbio a,
json_table(DATA, '$[*]'
COLUMNS(
"raw_data_report.insert_n50" NUMBER PATH '$."etl.attributes"."raw_data_report.insert_n50".value'
)) AS c
WHERE site_id=3 AND a.domain='CROMWELL/sl_dataset_reports/*/call-import_dataset_reports/execution/raw_data.report.json'
AND a.run_name='r64386e_20220523_180557' AND a.cell_well='4_D01'
|
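The “search for this exact number” step can also be done programmatically once the superJSON is saved locally. The sketch below is a hypothetical helper, not part of the tool: it walks any parsed JSON document and yields a JSON-path-like string for every leaf equal to the target. The sample document is made up and only mimics the key names used in the query above.

```python
def find_value(node, target, path="$"):
    """Yield a path for every leaf in a parsed JSON document equal to target."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_value(value, target, f'{path}."{key}"')
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_value(value, target, f"{path}[{index}]")
    elif not isinstance(node, bool) and node == target:
        # bool is excluded because True == 1 in Python.
        yield path


if __name__ == "__main__":
    # Made-up stand-in for a superJSON document.
    superjson = {
        "raw_data.report.json": {
            "etl.attributes": {"raw_data_report.insert_n50": {"value": 21250}}
        },
        "ccs.report.json*": [{"etl.dataset": "demo", "n": 21251}],
    }
    for hit in find_value(superjson, 21250):
        print(hit)
```

The printed path tells you both the source file (the top-level key) and the attribute to put into the json_table PATH expression.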
Additionally, a couple of JSON documents are synthetically generated by the ETL at the “root” level. These may be useful for cross-referencing and can be viewed via the “root” super-JSON:
https://analytics.broadinstitute.org/pacbioMetrics/3/r64386e_20220523_180557/root/superjson
...
Per-barcode metrics are supported by converting multiple “consensusreadset.xml” files into JSON and then merging them into a single synthetic JSON array. Such arrays can be recognized by the trailing “*” at the end of the “domain” field.
...
For a given cell and domain, if the ETL comes across multiple files, it naturally merges them into a JSON array.
This logic is not sufficient, however, when only one barcode is registered per cell. A list of exempted file types (ccs.report.json) is therefore kept, instructing the ETL to always merge these into a JSON array regardless of the number of files.
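The merge rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ETL’s implementation: the names ALWAYS_ARRAY and merge_documents are made up, and it is assumed that every merged array gets the trailing “*” on its domain.

```python
# File types that are always stored as a JSON array, even for a single file.
ALWAYS_ARRAY = {"ccs.report.json"}


def merge_documents(domain: str, documents: list) -> tuple:
    """Return (stored_domain, payload) for all files found for one cell and domain."""
    if not documents:
        raise ValueError("at least one document is required")
    file_name = domain.rsplit("/", 1)[-1]
    if len(documents) > 1 or file_name in ALWAYS_ARRAY:
        # Merged case: the payload is an array and the domain gets a trailing "*".
        return domain + "*", list(documents)
    # Single ordinary file: stored as a plain JSON object.
    return domain, documents[0]


if __name__ == "__main__":
    base = "CROMWELL/sl_dataset_reports/*/call-import_dataset_reports/execution/"
    print(merge_documents(base + "ccs.report.json", [{"barcode": "bc2012"}]))
    print(merge_documents(base + "raw_data.report.json", [{"n50": 21250}]))
```

Keeping ccs.report.json as an array in every case means the same '$[*]' query works whether a cell has one barcode or many.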
Metrics extracted through PacBio API
It turns out that some information is not available in the JSON/XML files but can be extracted through the SMRTLink endpoints. A few new domains have been added: “API/runs” and “API/collections”.
...
“API/runDataModel” domain
This is a special domain derived from the “apiRoot:/runs/UUID” API: the “dataModel” field (which turns out to be XML) is extracted, converted into JSON and recorded in the PACBIO datamart as the “API/runDataModel” domain. The same data is also available in the “DATAROOT/*/*/*.run.metadata.xml” domain, but it shows up there only later, e.g. once the cell “movies” start.
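A generic XML-to-JSON conversion of the kind described here might look like the sketch below. It is not the ETL’s converter (its conventions are not documented on this page); the function name, the “#text” key and the sample XML are all made up for illustration.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem: ET.Element) -> dict:
    """Convert an XML element into a JSON-friendly dict.

    Attributes become plain keys, child elements are collected into lists
    under their tag name (namespace stripped), and non-blank text goes
    under "#text".
    """
    node = dict(elem.attrib)
    for child in elem:
        tag = child.tag.split("}", 1)[-1]  # drop a "{namespace}" prefix if present
        node.setdefault(tag, []).append(element_to_dict(child))
    text = (elem.text or "").strip()
    if text:
        node["#text"] = text
    return node


if __name__ == "__main__":
    # Made-up XML; the real dataModel is far larger.
    xml_text = '<Run Name="demo"><Collection Well="A01"/><Collection Well="B01"/></Run>'
    print(json.dumps(element_to_dict(ET.fromstring(xml_text)), indent=2))
```

Always wrapping children in lists keeps the JSON shape stable whether an element occurs once or many times, which matters when the result is queried with fixed json_table paths.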
...
“API/runDataModel/RecordedEvents” domain
A number of intriguing “recorded events” were unearthed from PacBio’s dataModel. These are captured in the new “API/runDataModel/RecordedEvents” domain. Particularly interesting is the "AcquisitionInitializeInfo" event, which apparently provides reagent info, among other things (see below).
...
How files are scraped from the file system - the linux voodoo magic
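The first step of the scrape log in this section is a find with -regex ".*\.\(json\|xml\)" over the run directory. As a rough Python equivalent of that filter, the sketch below keeps only .json/.xml paths and groups them by cell well (the first directory under the run). The function name group_by_cell and the notes.txt entry are made up; the other paths follow the pattern seen in the log.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# find -regex matches the whole path, hence fullmatch() below.
METRICS_FILE = re.compile(r".*\.(json|xml)")


def group_by_cell(run_dir: str, paths: list) -> dict:
    """Group metrics file names by the first directory under run_dir."""
    root = PurePosixPath(run_dir)
    cells = defaultdict(list)
    for path in paths:
        if not METRICS_FILE.fullmatch(path):
            continue  # not a .json/.xml file
        relative = PurePosixPath(path).relative_to(root)
        cells[relative.parts[0]].append(relative.name)
    return dict(cells)


if __name__ == "__main__":
    run = "/seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557"
    listing = [
        run + "/3_C01/m64386e_220526_091216.sts.xml",
        run + "/3_C01/bc2012--bc2012/m64386e_220526_091216.bc2012--bc2012.consensusreadset.xml",
        run + "/4_D01/m64386e_220527_172851.ccs_reports.json",
        run + "/4_D01/notes.txt",  # hypothetical file, filtered out
    ]
    for cell, files in group_by_cell(run, listing).items():
        print(cell, files)
```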
...
Code Block | ||
---|---|---|
| ||
scala> analytics.tiger.utils.AnalyticsDB("analytics.tiger.agents.PacBio.Sodium", analytics.tiger.agents.PacBio.Sodium.perRunETL("r64386e_20220523_180557",Map("override"->"true","verbose"->"true")), toCommit=true) TIGERETL_RUNID: 4534558 find /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557 -regex ".*\.\(json\|xml\)" => 48 files returned /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/3_C01/bc2012--bc2012/m64386e_220526_091216.bc2012--bc2012.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/3_C01/m64386e_220526_091216.ccs_reports.json /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/3_C01/m64386e_220526_091216.lima_guess.json /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/3_C01/m64386e_220526_091216.sts.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/3_C01/.m64386e_220526_091216.run.metadata.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/3_C01/m64386e_220526_091216.5mc_report.json /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/3_C01/bc2095--bc2095/m64386e_220526_091216.bc2095--bc2095.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/3_C01/bc2090--bc2090/m64386e_220526_091216.bc2090--bc2090.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/3_C01/m64386e_220526_091216.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/3_C01/.m64386e_220526_091216.metadata.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/3_C01/m64386e_220526_091216.unbarcoded.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/3_C01/bc2011--bc2011/m64386e_220526_091216.bc2011--bc2011.consensusreadset.xml 
/seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/bc2012--bc2012/m64386e_220527_172851.bc2012--bc2012.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/m64386e_220527_172851.ccs_reports.json /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/.m64386e_220527_172851.run.metadata.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/m64386e_220527_172851.unbarcoded.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/.m64386e_220527_172851.metadata.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/bc2095--bc2095/m64386e_220527_172851.bc2095--bc2095.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/bc2090--bc2090/m64386e_220527_172851.bc2090--bc2090.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/m64386e_220527_172851.lima_guess.json /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/m64386e_220527_172851.sts.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/m64386e_220527_172851.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/bc2011--bc2011/m64386e_220527_172851.bc2011--bc2011.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/m64386e_220527_172851.5mc_report.json /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/m64386e_220525_014545.unbarcoded.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/bc2012--bc2012/m64386e_220525_014545.bc2012--bc2012.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/m64386e_220525_014545.consensusreadset.xml 
/seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/m64386e_220525_014545.lima_guess.json /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/bc2095--bc2095/m64386e_220525_014545.bc2095--bc2095.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/bc2090--bc2090/m64386e_220525_014545.bc2090--bc2090.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/.m64386e_220525_014545.run.metadata.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/m64386e_220525_014545.5mc_report.json /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/bc2011--bc2011/m64386e_220525_014545.bc2011--bc2011.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/.m64386e_220525_014545.metadata.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/m64386e_220525_014545.ccs_reports.json /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/m64386e_220525_014545.sts.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/1_A01/bc2012--bc2012/m64386e_220523_181627.bc2012--bc2012.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/1_A01/m64386e_220523_181627.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/1_A01/.m64386e_220523_181627.metadata.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/1_A01/bc2095--bc2095/m64386e_220523_181627.bc2095--bc2095.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/1_A01/bc2090--bc2090/m64386e_220523_181627.bc2090--bc2090.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/1_A01/.m64386e_220523_181627.run.metadata.xml 
/seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/1_A01/m64386e_220523_181627.unbarcoded.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/1_A01/m64386e_220523_181627.lima_guess.json /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/1_A01/m64386e_220523_181627.5mc_report.json /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/1_A01/bc2011--bc2011/m64386e_220523_181627.bc2011--bc2011.consensusreadset.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/1_A01/m64386e_220523_181627.sts.xml /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/1_A01/m64386e_220523_181627.ccs_reports.json find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/*/inputs/*.consensusreadset.xml" -type l -ls | grep /r64386e_20220523_180557/ | cat => 15 files returned 9256292134 32 lrwxrwxrwx 1 pbprod gppacbio 120 Jun 9 09:52 /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/a0823154-4bbd-4b0a-9817-f78742054619/call-pbreports_barcode/inputs/480159576/m64386e_220526_091216.consensusreadset.xml -> /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/3_C01/m64386e_220526_091216.consensusreadset.xml 9387650676 32 lrwxrwxrwx 1 pbprod gppacbio 120 Jun 9 10:02 /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/c58dc438-f021-426c-89ce-e82ee4728d62/call-pbreports_barcode/inputs/478310612/m64386e_220523_181627.consensusreadset.xml -> /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/1_A01/m64386e_220523_181627.consensusreadset.xml 9294923607 32 lrwxrwxrwx 1 pbprod gppacbio 120 Jun 9 10:07 
/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/b70fbab3-8f00-44b4-a0df-f8e0e607389e/call-pbreports_barcode/inputs/479235094/m64386e_220525_014545.consensusreadset.xml -> /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/m64386e_220525_014545.consensusreadset.xml 9167679957 32 lrwxrwxrwx 1 pbprod gppacbio 120 Jun 9 09:59 /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/5675af6e-2370-41f2-b4bd-8b41454ed14e/call-pbreports_barcode/inputs/481084058/m64386e_220527_172851.consensusreadset.xml -> /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/m64386e_220527_172851.consensusreadset.xml 9088193591 24 lrwxrwxrwx 1 pbprod gppacbio 120 Jun 9 10:07 /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/8578251e-cf2b-4a64-bf60-934ea70bdf8c/call-import_dataset_reports/inputs/479235094/m64386e_220525_014545.consensusreadset.xml -> /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/m64386e_220525_014545.consensusreadset.xml 9163607270 32 lrwxrwxrwx 1 pbprod gppacbio 131 Jun 9 09:58 /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/cfda101e-bd68-4ae3-a0d0-9e9491e60dd2/call-import_dataset_reports/inputs/481084058/m64386e_220527_172851.unbarcoded.consensusreadset.xml -> /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/m64386e_220527_172851.unbarcoded.consensusreadset.xml 9088194092 32 lrwxrwxrwx 1 pbprod gppacbio 131 Jun 9 10:56 /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/dde0dfad-e14c-47b9-b2da-b37c02c3ab1b/call-import_dataset_reports/inputs/480159576/m64386e_220526_091216.unbarcoded.consensusreadset.xml -> /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/3_C01/m64386e_220526_091216.unbarcoded.consensusreadset.xml 9387713582 24 lrwxrwxrwx 1 pbprod gppacbio 
150 Jun 9 09:58 /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/55615992-df0e-40e8-b131-5b45f7981a3a/call-import_dataset_reports/inputs/1130665717/m64386e_220527_172851.bc2012--bc2012.consensusreadset.xml -> /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/bc2012--bc2012/m64386e_220527_172851.bc2012--bc2012.consensusreadset.xml 9294923510 32 lrwxrwxrwx 1 pbprod gppacbio 120 Jun 9 09:59 /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/458fbc8c-f5d1-488c-982e-62dc87cfe4f2/call-import_dataset_reports/inputs/481084058/m64386e_220527_172851.consensusreadset.xml -> /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/m64386e_220527_172851.consensusreadset.xml 9390327733 32 lrwxrwxrwx 1 pbprod gppacbio 131 Jun 9 10:57 /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/76c55248-f8a2-4f04-9868-fa3fbe09fe65/call-import_dataset_reports/inputs/479235094/m64386e_220525_014545.unbarcoded.consensusreadset.xml -> /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/m64386e_220525_014545.unbarcoded.consensusreadset.xml 9390326404 32 lrwxrwxrwx 1 pbprod gppacbio 120 Jun 9 09:52 /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/d59cd1d9-d7f0-4283-bcc2-f1f4ef02669c/call-import_dataset_reports/inputs/480159576/m64386e_220526_091216.consensusreadset.xml -> /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/3_C01/m64386e_220526_091216.consensusreadset.xml 9390326842 32 lrwxrwxrwx 1 pbprod gppacbio 131 Jun 9 10:14 /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/05d863a1-8a18-4693-8faa-0153892341b7/call-import_dataset_reports/inputs/478310612/m64386e_220523_181627.unbarcoded.consensusreadset.xml -> 
/seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/1_A01/m64386e_220523_181627.unbarcoded.consensusreadset.xml 9167679974 32 lrwxrwxrwx 1 pbprod gppacbio 150 Jun 9 10:01 /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/13f78080-40dd-4bb3-bf7a-8a3bb21ff542/call-import_dataset_reports/inputs/1242954991/m64386e_220525_014545.bc2095--bc2095.consensusreadset.xml -> /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/2_B01/bc2095--bc2095/m64386e_220525_014545.bc2095--bc2095.consensusreadset.xml 9387713609 32 lrwxrwxrwx 1 pbprod gppacbio 150 Jun 9 09:59 /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/b5e71499-f314-408e-974e-0231e36b7098/call-import_dataset_reports/inputs/-1356847117/m64386e_220527_172851.bc2011--bc2011.consensusreadset.xml -> /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/4_D01/bc2011--bc2011/m64386e_220527_172851.bc2011--bc2011.consensusreadset.xml 9294923539 24 lrwxrwxrwx 1 pbprod gppacbio 120 Jun 9 10:02 /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/16cab23f-16d9-4784-b544-4d4b1ea41b34/call-import_dataset_reports/inputs/478310612/m64386e_220523_181627.consensusreadset.xml -> /seq/gp_pacbio_prod/smrtlink/userdata/data_root/r64386e_20220523_180557/1_A01/m64386e_220523_181627.consensusreadset.xml find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/a0823154-4bbd-4b0a-9817-f78742054619 -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/a0823154-4bbd-4b0a-9817-f78742054619/*/execution/*.json" => 7 files returned /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/a0823154-4bbd-4b0a-9817-f78742054619/call-pbreports_barcode/execution/barcode.report.json 
/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/a0823154-4bbd-4b0a-9817-f78742054619/call-pbreports_barcode/execution/per_barcode_reports.datastore.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/a0823154-4bbd-4b0a-9817-f78742054619/call-pbreports_barcode/execution/per_barcode_reports/a4377b6f-5ed5-45b9-8c6e-da74f67b4719/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/a0823154-4bbd-4b0a-9817-f78742054619/call-pbreports_barcode/execution/per_barcode_reports/df9c94a6-92d6-4c89-950c-5b34958b6bc0/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/a0823154-4bbd-4b0a-9817-f78742054619/call-pbreports_barcode/execution/per_barcode_reports/42abca05-d793-43ce-b552-c18fe68ad0ef/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/a0823154-4bbd-4b0a-9817-f78742054619/call-pbreports_barcode/execution/per_barcode_reports/055d5d05-40d4-4441-ac45-5e05dee0a85d/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/a0823154-4bbd-4b0a-9817-f78742054619/call-pbreports_barcode/execution/task-report.json find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/c58dc438-f021-426c-89ce-e82ee4728d62 -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/c58dc438-f021-426c-89ce-e82ee4728d62/*/execution/*.json" => 7 files returned /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/c58dc438-f021-426c-89ce-e82ee4728d62/call-pbreports_barcode/execution/barcode.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/c58dc438-f021-426c-89ce-e82ee4728d62/call-pbreports_barcode/execution/per_barcode_reports.datastore.json 
/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/c58dc438-f021-426c-89ce-e82ee4728d62/call-pbreports_barcode/execution/per_barcode_reports/b7214b4a-f2c7-4a2b-883c-08a98585d239/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/c58dc438-f021-426c-89ce-e82ee4728d62/call-pbreports_barcode/execution/per_barcode_reports/86ff6655-a08b-4222-aa4c-7fd132a2d2ec/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/c58dc438-f021-426c-89ce-e82ee4728d62/call-pbreports_barcode/execution/per_barcode_reports/3739008b-2f93-4855-9ba1-f367c886034a/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/c58dc438-f021-426c-89ce-e82ee4728d62/call-pbreports_barcode/execution/per_barcode_reports/613096b2-0df0-4c7b-8021-aedfdadcfcef/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/c58dc438-f021-426c-89ce-e82ee4728d62/call-pbreports_barcode/execution/task-report.json find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/b70fbab3-8f00-44b4-a0df-f8e0e607389e -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/b70fbab3-8f00-44b4-a0df-f8e0e607389e/*/execution/*.json" => 7 files returned /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/b70fbab3-8f00-44b4-a0df-f8e0e607389e/call-pbreports_barcode/execution/barcode.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/b70fbab3-8f00-44b4-a0df-f8e0e607389e/call-pbreports_barcode/execution/per_barcode_reports.datastore.json 
/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/b70fbab3-8f00-44b4-a0df-f8e0e607389e/call-pbreports_barcode/execution/per_barcode_reports/760e2d75-2397-4316-8a75-facd648bd127/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/b70fbab3-8f00-44b4-a0df-f8e0e607389e/call-pbreports_barcode/execution/per_barcode_reports/198bbcd8-56aa-4628-8241-fcccbcf7e8b7/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/b70fbab3-8f00-44b4-a0df-f8e0e607389e/call-pbreports_barcode/execution/per_barcode_reports/cb3377b8-7db7-430b-aa7e-27a8e7eae0dc/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/b70fbab3-8f00-44b4-a0df-f8e0e607389e/call-pbreports_barcode/execution/per_barcode_reports/eb7a0619-a212-4a0b-9019-be2995cfa6b0/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/b70fbab3-8f00-44b4-a0df-f8e0e607389e/call-pbreports_barcode/execution/task-report.json find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/5675af6e-2370-41f2-b4bd-8b41454ed14e -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/5675af6e-2370-41f2-b4bd-8b41454ed14e/*/execution/*.json" => 7 files returned /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/5675af6e-2370-41f2-b4bd-8b41454ed14e/call-pbreports_barcode/execution/barcode.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/5675af6e-2370-41f2-b4bd-8b41454ed14e/call-pbreports_barcode/execution/per_barcode_reports.datastore.json 
/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/5675af6e-2370-41f2-b4bd-8b41454ed14e/call-pbreports_barcode/execution/per_barcode_reports/3fdb0ce5-4004-432c-9903-8ab90e067e35/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/5675af6e-2370-41f2-b4bd-8b41454ed14e/call-pbreports_barcode/execution/per_barcode_reports/2dc32207-607f-4150-8443-8a3434f6b283/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/5675af6e-2370-41f2-b4bd-8b41454ed14e/call-pbreports_barcode/execution/per_barcode_reports/f86226fa-e794-40d7-b8c1-476079643dfa/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/5675af6e-2370-41f2-b4bd-8b41454ed14e/call-pbreports_barcode/execution/per_barcode_reports/22bb414b-6ea7-4060-90bf-a3fc06d395e9/dataset_stats.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_collection_reports/5675af6e-2370-41f2-b4bd-8b41454ed14e/call-pbreports_barcode/execution/task-report.json find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/8578251e-cf2b-4a64-bf60-934ea70bdf8c -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/8578251e-cf2b-4a64-bf60-934ea70bdf8c/*/execution/*.json" => 7 files returned /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/8578251e-cf2b-4a64-bf60-934ea70bdf8c/call-import_dataset_reports/execution/adapter.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/8578251e-cf2b-4a64-bf60-934ea70bdf8c/call-import_dataset_reports/execution/raw_data.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/8578251e-cf2b-4a64-bf60-934ea70bdf8c/call-import_dataset_reports/execution/control.report.json 
/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/8578251e-cf2b-4a64-bf60-934ea70bdf8c/call-import_dataset_reports/execution/loading.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/8578251e-cf2b-4a64-bf60-934ea70bdf8c/call-import_dataset_reports/execution/task-report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/8578251e-cf2b-4a64-bf60-934ea70bdf8c/call-import_dataset_reports/execution/detect_cpg_methyl.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/8578251e-cf2b-4a64-bf60-934ea70bdf8c/call-import_dataset_reports/execution/ccs.report.json find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/cfda101e-bd68-4ae3-a0d0-9e9491e60dd2 -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/cfda101e-bd68-4ae3-a0d0-9e9491e60dd2/*/execution/*.json" => 2 files returned /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/cfda101e-bd68-4ae3-a0d0-9e9491e60dd2/call-import_dataset_reports/execution/task-report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/cfda101e-bd68-4ae3-a0d0-9e9491e60dd2/call-import_dataset_reports/execution/ccs.report.json find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/dde0dfad-e14c-47b9-b2da-b37c02c3ab1b -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/dde0dfad-e14c-47b9-b2da-b37c02c3ab1b/*/execution/*.json" => 2 files returned /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/dde0dfad-e14c-47b9-b2da-b37c02c3ab1b/call-import_dataset_reports/execution/task-report.json 
/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/dde0dfad-e14c-47b9-b2da-b37c02c3ab1b/call-import_dataset_reports/execution/ccs.report.json find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/55615992-df0e-40e8-b131-5b45f7981a3a -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/55615992-df0e-40e8-b131-5b45f7981a3a/*/execution/*.json" => 2 files returned /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/55615992-df0e-40e8-b131-5b45f7981a3a/call-import_dataset_reports/execution/task-report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/55615992-df0e-40e8-b131-5b45f7981a3a/call-import_dataset_reports/execution/ccs.report.json find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/458fbc8c-f5d1-488c-982e-62dc87cfe4f2 -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/458fbc8c-f5d1-488c-982e-62dc87cfe4f2/*/execution/*.json" => 7 files returned /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/458fbc8c-f5d1-488c-982e-62dc87cfe4f2/call-import_dataset_reports/execution/adapter.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/458fbc8c-f5d1-488c-982e-62dc87cfe4f2/call-import_dataset_reports/execution/raw_data.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/458fbc8c-f5d1-488c-982e-62dc87cfe4f2/call-import_dataset_reports/execution/control.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/458fbc8c-f5d1-488c-982e-62dc87cfe4f2/call-import_dataset_reports/execution/loading.report.json 
/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/458fbc8c-f5d1-488c-982e-62dc87cfe4f2/call-import_dataset_reports/execution/task-report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/458fbc8c-f5d1-488c-982e-62dc87cfe4f2/call-import_dataset_reports/execution/detect_cpg_methyl.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/458fbc8c-f5d1-488c-982e-62dc87cfe4f2/call-import_dataset_reports/execution/ccs.report.json find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/76c55248-f8a2-4f04-9868-fa3fbe09fe65 -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/76c55248-f8a2-4f04-9868-fa3fbe09fe65/*/execution/*.json" => 2 files returned /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/76c55248-f8a2-4f04-9868-fa3fbe09fe65/call-import_dataset_reports/execution/task-report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/76c55248-f8a2-4f04-9868-fa3fbe09fe65/call-import_dataset_reports/execution/ccs.report.json find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/d59cd1d9-d7f0-4283-bcc2-f1f4ef02669c -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/d59cd1d9-d7f0-4283-bcc2-f1f4ef02669c/*/execution/*.json" => 7 files returned /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/d59cd1d9-d7f0-4283-bcc2-f1f4ef02669c/call-import_dataset_reports/execution/adapter.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/d59cd1d9-d7f0-4283-bcc2-f1f4ef02669c/call-import_dataset_reports/execution/raw_data.report.json 
/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/d59cd1d9-d7f0-4283-bcc2-f1f4ef02669c/call-import_dataset_reports/execution/control.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/d59cd1d9-d7f0-4283-bcc2-f1f4ef02669c/call-import_dataset_reports/execution/loading.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/d59cd1d9-d7f0-4283-bcc2-f1f4ef02669c/call-import_dataset_reports/execution/task-report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/d59cd1d9-d7f0-4283-bcc2-f1f4ef02669c/call-import_dataset_reports/execution/detect_cpg_methyl.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/d59cd1d9-d7f0-4283-bcc2-f1f4ef02669c/call-import_dataset_reports/execution/ccs.report.json find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/05d863a1-8a18-4693-8faa-0153892341b7 -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/05d863a1-8a18-4693-8faa-0153892341b7/*/execution/*.json" => 2 files returned /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/05d863a1-8a18-4693-8faa-0153892341b7/call-import_dataset_reports/execution/task-report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/05d863a1-8a18-4693-8faa-0153892341b7/call-import_dataset_reports/execution/ccs.report.json find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/13f78080-40dd-4bb3-bf7a-8a3bb21ff542 -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/13f78080-40dd-4bb3-bf7a-8a3bb21ff542/*/execution/*.json" => 2 files returned 
/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/13f78080-40dd-4bb3-bf7a-8a3bb21ff542/call-import_dataset_reports/execution/task-report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/13f78080-40dd-4bb3-bf7a-8a3bb21ff542/call-import_dataset_reports/execution/ccs.report.json find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/b5e71499-f314-408e-974e-0231e36b7098 -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/b5e71499-f314-408e-974e-0231e36b7098/*/execution/*.json" => 2 files returned /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/b5e71499-f314-408e-974e-0231e36b7098/call-import_dataset_reports/execution/task-report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/b5e71499-f314-408e-974e-0231e36b7098/call-import_dataset_reports/execution/ccs.report.json find /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/16cab23f-16d9-4784-b544-4d4b1ea41b34 -path "/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/16cab23f-16d9-4784-b544-4d4b1ea41b34/*/execution/*.json" => 7 files returned /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/16cab23f-16d9-4784-b544-4d4b1ea41b34/call-import_dataset_reports/execution/adapter.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/16cab23f-16d9-4784-b544-4d4b1ea41b34/call-import_dataset_reports/execution/raw_data.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/16cab23f-16d9-4784-b544-4d4b1ea41b34/call-import_dataset_reports/execution/control.report.json 
/seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/16cab23f-16d9-4784-b544-4d4b1ea41b34/call-import_dataset_reports/execution/loading.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/16cab23f-16d9-4784-b544-4d4b1ea41b34/call-import_dataset_reports/execution/task-report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/16cab23f-16d9-4784-b544-4d4b1ea41b34/call-import_dataset_reports/execution/detect_cpg_methyl.report.json /seq/gp_pacbio_prod/smrtlink/userdata/jobs_root/cromwell-executions/sl_dataset_reports/16cab23f-16d9-4784-b544-4d4b1ea41b34/call-import_dataset_reports/execution/ccs.report.json |
Technical caveats
API domains are derived from API calls, which appear to be sensitive to reinstalls. As a result, API domains in SODIUM from before Jun-2022 are unavailable because of a SMRTLink reinstall.
Not all workflows are triggered for all runs (for example, the cromwell ones). You may have to OUTER JOIN the per-file datasets to cope with metrics that are missing for some runs.
This framework is, unfortunately but inevitably, tightly coupled to PacBio’s internal file structure. The next time PacBio changes the SMRTLink version, this solution may need to be adjusted accordingly.
All metrics stored in the PACBIO datamart are in JSON format. Metrics from XML files are converted to JSON.
For each digested metrics file, a special “domain” field is generated; it allows similar metrics to be grouped and queried via SQL later on.
The examples shown are for the v11 installation on “sodium”. Once “skywalker” is operational, the switchover should be relatively easy.
The ANALYTICS.PACBIO datamart (along with the relevant views) is located in this Oracle instance:
Code Block db.analytics.url="jdbc:oracle:thin:@//seqprod.broadinstitute.org:1521/seqprod.broadinstitute.org"
username: REPORTING
The "ANALYTICS.PACBIO_STAR" view demonstrates how to merge multiple files (ccs_report, loading, etc.) into a flat datasource with one row per (run, cell_well). It is based on SMRTLink v10 hydrogen data (site_id=1), but the techniques used remain fully applicable.
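The merge-with-gaps idea can be sketched as follows. This is a minimal illustration and not the definition of the PACBIO_STAR view: an in-memory SQLite database stands in for the Oracle instance, and the table names (`ccs`, `loading`), the metric columns, and the sample rows are all made up for the example. It only shows why a LEFT OUTER JOIN keeps a (run, cell_well) row even when one of the metrics files was never produced for that cell.

```python
import sqlite3

# In-memory SQLite stands in for the Oracle datamart. Table names, columns
# and rows are hypothetical, chosen only to illustrate the join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ccs     (run_name TEXT, cell_well TEXT, hifi_reads INTEGER);
    CREATE TABLE loading (run_name TEXT, cell_well TEXT, productivity_p1 REAL);
    INSERT INTO ccs VALUES ('run_A', '1_A01', 1500000),
                           ('run_A', '2_B01', 1200000);
    -- loading metrics exist for only one of the two cells
    INSERT INTO loading VALUES ('run_A', '1_A01', 71.5);
""")

# LEFT OUTER JOIN: every ccs row survives; missing loading metrics become NULL.
rows = conn.execute("""
    SELECT c.run_name, c.cell_well, c.hifi_reads, l.productivity_p1
    FROM ccs c
    LEFT OUTER JOIN loading l
      ON l.run_name = c.run_name
     AND l.cell_well = c.cell_well
    ORDER BY c.cell_well
""").fetchall()

for row in rows:
    print(row)
```

With an inner join the second cell would silently disappear from the result; the outer join returns it with an empty loading metric, which downstream reports can then treat as "not available".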
Surgically extract fields from metrics-JSON via Oracle JSON
Progress of the Sodium PacBio flattened-metrics ETL can be checked here: ETL dashboard
Rollback protection is implemented: an ETL run is cancelled if previously seen files have been removed.
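The rollback check can be pictured with a small sketch. It is an assumption about the general shape of such a guard, not the actual ETL code: the function names, the exception class, and the file names are hypothetical. The idea is to compare the set of files digested in earlier runs with the set found now, and to refuse to proceed if anything seen before has vanished.

```python
class RollbackDetected(Exception):
    """Raised when files digested in an earlier run are no longer present."""


def missing_files(seen_before, found_now):
    """Return the sorted list of previously seen files that are now missing."""
    return sorted(set(seen_before) - set(found_now))


def plan_etl_run(seen_before, found_now):
    """Return the new files to digest, or cancel the run on a rollback.

    Both arguments are iterables of file paths (hypothetical inputs here).
    """
    missing = missing_files(seen_before, found_now)
    if missing:
        # Cancel the whole run rather than load from a shrunken file set.
        raise RollbackDetected(f"{len(missing)} previously seen file(s) removed: {missing}")
    return sorted(set(found_now) - set(seen_before))


if __name__ == "__main__":
    seen = ["job-1/ccs.report.json", "job-1/task-report.json"]
    now = ["job-1/ccs.report.json", "job-1/task-report.json", "job-2/ccs.report.json"]
    print(plan_etl_run(seen, now))  # only the job-2 file is new
```

Cancelling the run outright, instead of skipping the missing files, keeps the datamart from drifting out of sync with the file system after a reinstall or cleanup.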