...

  • Since a multi-center MAF is a richer source of data, leading to better science, ideally the ultimate goal would be for multi-center MAFs to supplant single-center MAFs
  • As of this writing KICH is among the earliest, along with e.g. COADREAD, to submit a multi-center MAF to the DCC (through BCM)
  • THCA is an example where the multi-center mutation calling benchmark results have not yet been merged back into a single MAF for DCC submission
  • There is a seemingly implicit understanding that an analyst or programmer from the primary sequencing center has the responsibility of creating the merged multi-center MAF, by incorporating the calls from secondary centers in the benchmark exercise
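
As a rough illustration of what such a merge involves, the sketch below unions the calls from several single-center MAFs and records which centers reported each call. It rests on assumptions: the key columns use standard MAF column names, while the `read_maf` and `merge_mafs` helpers and the semicolon-joined `Center` value are hypothetical conventions chosen for this example, not the method used by any sequencing center. No filtering is applied.

```python
import csv
import io

# Columns assumed to identify one mutation call (standard MAF column names).
KEY_COLUMNS = [
    "Hugo_Symbol", "Chromosome", "Start_Position", "End_Position",
    "Reference_Allele", "Tumor_Seq_Allele2", "Tumor_Sample_Barcode",
]

def read_maf(handle):
    """Yield rows of a tab-delimited MAF as dicts, skipping '#' comment lines."""
    lines = (line for line in handle if not line.startswith("#"))
    yield from csv.DictReader(lines, delimiter="\t")

def merge_mafs(handles):
    """Union calls from several MAFs (hypothetical helper, not an existing tool)."""
    merged = {}
    for handle in handles:
        for row in read_maf(handle):
            key = tuple(row[column] for column in KEY_COLUMNS)
            if key not in merged:
                # First time this call is seen: keep the row, track its center.
                merged[key] = dict(row)
                merged[key]["Center"] = [row["Center"]]
            elif row["Center"] not in merged[key]["Center"]:
                # Same call reported by another center: record that center too.
                merged[key]["Center"].append(row["Center"])
    # Assumed convention: list all contributing centers, separated by ';'.
    for row in merged.values():
        row["Center"] = ";".join(row["Center"])
    return list(merged.values())
```

Calls are kept in first-seen order, so the primary center's MAF would normally be passed first. Where two centers report the same call, the other columns are taken from the first MAF that contained it.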

...

  • Mike/Broad will immediately ingest KICH multi-center MAF;  this MAF will be reflected in the Dec stddata run and next Analysis run
  • Mike will contact appropriate parties at Broad to generate THCA multi-center MAF
  • Mike will gauge (or raise) awareness at Broad of the need for a multi-center merge of MAFs for other disease studies sequenced at Broad
  • Heidi will advocate for:
    • clearer nomenclature across the TCGA to identify multi-center MAFs;  beginning with contacting BCM, so that the name of a multi-center MAF does not exactly match that of the single-center MAF (even though MD5 and file size can disambiguate, using the same names is VERY unfriendly)
    • standardization of multi-center VCF/MAFs across the AWGs;  getting the format nailed down is most important, but tools can be shared, too, if possible starting with the BCM tool that was likely written to submit the KICH multi-center MAF
      QUESTION 2013_12_02:  how is filtering going to happen during the merge process (e.g. what tool, again, did Baylor use for this in KICH, if any, or was it manual?)
    • broader understanding in primary sequencing centers of need/responsibility to merge/aggregate multi-center calling results into single VCF/MAF
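
Until the nomenclature is settled, two identically named MAFs can only be told apart by content, as noted above. A minimal sketch of that check follows, using only the Python standard library; the `file_fingerprint` and `same_file` names are hypothetical helpers introduced for this example.

```python
import hashlib
import os

def file_fingerprint(path, chunk_size=1 << 20):
    """Return (md5_hex, size_in_bytes) for the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large MAFs need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest(), os.path.getsize(path)

def same_file(path_a, path_b):
    """True if both files have the same MD5 checksum and size."""
    return file_fingerprint(path_a) == file_fingerprint(path_b)
```

This only tells a user that two same-named files differ, not which one is the multi-center MAF, which is why distinct names remain the friendlier fix.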