Notes for DSDE/FC/GDAC Meeting 2016_08_25
At some level I do believe there is a good deal of commonality between the use cases for FC / GOTC and GDAC. But GenomicsPlatform-style production sequencing (which is what I understand GOTC to be largely for) and iterative, high-throughput scientific analysis (the GDAC use case) are ultimately distinct in their goals and in how they play out.
My (admittedly loose) understanding of GP-style workflows is that in the large they are like automobile assembly lines:
- Not exploratory: goal(s) are clear and well-defined
- Generally not cyclic or iterative:
- the intent of a workflow is to run a batch of samples through a well-defined set of processes
- then STOP
- FOREVER (for that set of inputs)
- The assembled car (or any part of it) is never put back onto the assembly line for touch-ups
- It's either given to the customer (if good) or scrapped (if bad)
- Usually the only time any part of the workflow is rerun is when a well-defined step fails (e.g. QC)
- Have a more linear DAG structure, possibly much more so
- Executed and inspected by internal staff
- Self-contained:
- everything needed for the workflow to completely execute is available at launch time
- And this is known a priori
- Example: batch of sample data off sequencer + some reference data
- Rarely (if ever) would one change values in the sample batch and then re-run the workflow
- Largely executed end-to-end on a single sample
- Are initiated from a single entry point
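To make the assembly-line picture concrete, here is a minimal Python sketch of the characteristics listed above: a fixed linear sequence of steps, one entry point, all inputs supplied at launch, and a rerun only of the step that fails. It is an illustration under my own assumptions, not how GOTC or GP actually implement anything; the function `run_linear_workflow`, the step names `align` and `quality_check`, and the file names are all hypothetical.

```python
from typing import Callable, Dict, List

# A step takes the accumulated state and returns an updated copy.
Step = Callable[[Dict[str, str]], Dict[str, str]]


def run_linear_workflow(
    inputs: Dict[str, str], steps: List[Step], max_retries: int = 1
) -> Dict[str, str]:
    """Single entry point: run each step once, in order, on one sample.

    Everything the workflow needs is passed in `inputs` at launch time.
    Only a step that raises RuntimeError is retried; earlier steps are
    never revisited, and the workflow stops after the last step.
    """
    state = dict(inputs)
    for step in steps:
        for attempt in range(max_retries + 1):
            try:
                state = step(state)
                break
            except RuntimeError:
                if attempt == max_retries:
                    raise  # out of retries: the sample is "scrapped"
    return state


# Hypothetical steps, standing in for real processing.
def align(state: Dict[str, str]) -> Dict[str, str]:
    aligned = f"aligned({state['reads']}, {state['reference']})"
    return {**state, "aligned": aligned}


def quality_check(state: Dict[str, str]) -> Dict[str, str]:
    return {**state, "qc": "pass"}


result = run_linear_workflow(
    {"reads": "sample_001.fastq", "reference": "ref.fasta"},
    [align, quality_check],
)
```

Note that nothing here feeds results back into an earlier step or accepts new inputs mid-run; that absence is the point of the analogy.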
If this is a fair description of GOTC/GP-style workflows, then it is worth noting that virtually none of these conditions hold for GDAC-style workflows.
In contrast, GDAC-style workflows largely represent an attempt to automate the Scientific Method.