Notes for DSDE/FC/GDAC Meeting 2016_08_25

 

I do believe that, at some levels, there is a good deal of commonality between the use cases for FC / GOTC and GDAC. But GenomicsPlatform-style production sequencing (which is what I understand GOTC to be largely for) and iterative, high-throughput scientific analysis (the GDAC use case) are ultimately distinct, both in their goals and in how they play out.

My (admittedly loose) understanding of GP-style workflows is that, by and large, they are like automobile assembly lines:

  1. Not exploratory:  goal(s) are clear and well-defined
  2. Generally not cyclic or iterative:
    1. The intent of a workflow is to run a batch of samples through a well-defined set of processes
    2. then STOP
    3. FOREVER (for that set of inputs)
    4. The assembled car (or any part of it) is never put back onto the assembly line for touch-ups
      1. It's either given to the customer (if good) or scrapped (if bad)
    5. Usually the only time any part of the workflow is rerun is when a well-defined step fails (e.g. QC)
  3. Have a more linear DAG structure than GDAC-style workflows, possibly much more so
  4. Executed and inspected by internal staff
  5. Self-contained:
    1. Everything needed for the workflow to completely execute is available at launch time
    2. And this is known a priori
    3. Example: batch of sample data off sequencer + some reference data
    4. Rarely (if ever) would one change values in the sample batch and then re-run the workflow
  6. Largely executed end-to-end on a single sample
  7. Are initiated from a single entry point
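
To make the assembly-line picture concrete, here is a minimal Python sketch of what I mean. It is only an illustration under my own assumptions: every name in it (`SampleInputs`, `align`, `call_variants`, `qc`, `run_pipeline`) is hypothetical and is not taken from GOTC or GP code. It shows a strictly linear chain of steps, with all inputs fixed at launch, run once per sample from a single entry point.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SampleInputs:
    """Everything the workflow needs, known up front (hypothetical fields)."""
    sample_id: str
    reads: str       # data off the sequencer
    reference: str   # reference data


State = Dict[str, object]
Step = Callable[[State], State]


# Placeholder steps: each takes the state so far and returns an extended copy.
def align(state: State) -> State:
    return {**state, "aligned": f"{state['reads']}->{state['reference']}"}


def call_variants(state: State) -> State:
    return {**state, "variants": f"variants({state['aligned']})"}


def qc(state: State) -> State:
    return {**state, "qc_passed": True}


# The "assembly line": a fixed, linear ordering with no cycles or branches.
PIPELINE: List[Step] = [align, call_variants, qc]


def run_pipeline(inputs: SampleInputs) -> State:
    """Single entry point: run one sample end-to-end, then stop."""
    state: State = {
        "sample_id": inputs.sample_id,
        "reads": inputs.reads,
        "reference": inputs.reference,
    }
    for step in PIPELINE:
        state = step(state)
    return state


result = run_pipeline(SampleInputs("S1", "reads.fq", "ref.fa"))
```

Note that nothing in the result is ever fed back in as an input to a later run; that absence of a feedback loop is the property I am trying to highlight.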

If this is a fair description of GOTC/GP-style workflows, then it is worth noting that virtually none of these conditions hold for GDAC-style workflows.

In contrast, GDAC-style workflows largely represent an attempt to automate the Scientific Method.