Meeting Minutes 09-16-2011
Telecon attendees: Sheila Reynolds & Adam Norberg (ISB), Dan Dicara & Mike Noble (Broad)
- Discussed the appropriate placement of the PVCA Pipeline in Firehose
- Three schemes were discussed as described here: BatchEffects.pdf
- Scheme 3 was determined to be optimal
- Place PVCA in the Normalizer Workflow after pertinent Merge Pipelines
- One pipeline per technology (i.e. array or sequencing platform)
- Merge individual reports for individual technologies into a single report
- This report could be added as an annotation in Firehose that can be referred to in downstream analysis pipelines
- One report per data type (i.e. expression and methylation)
- Perhaps add PVCA in other places (i.e. after aggregation/centering pipelines such as mRNA_Preprocess_Median)
- Discussed adding information to the data based on the PVCA results
- Create a new meta-data file
- Sample list for each tumor type, with a column indicating whether batch effects were discovered
- Downstream pipelines could add columns of information to this file
- Redactions could be entered at the end
- Adam mentioned he may have a Python script for doing this - follow up with him
- Creating a new file avoids adding columns to the preexisting data, which could break downstream parsers
- Package this with the normalized results
- Talk to Nils about allowing linking between reports (this is being tracked as GDAC-80)
- Discussed how to correct for batch effects
- This would be difficult to automate and should be decided by downstream pipelines
- Finally, we discussed maintaining a Batch Effects page on our TCGA-GDAC website (this is the first entry)
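
The meta-data file idea above (a per-tumor-type sample list with a batch-effects column that downstream pipelines can extend) could look roughly like the following. This is only a sketch for discussion, not Adam's script: the tab-separated layout, the column names (`sample_id`, `batch_effect`, `redacted`), the function names, and the sample IDs are all assumptions made for illustration.

```python
import csv
import io


def new_sample_table(sample_ids, flagged):
    """Return one row (dict) per sample with a batch-effect indicator column.

    Column names here are hypothetical, not an agreed format.
    """
    return [
        {"sample_id": s, "batch_effect": "yes" if s in flagged else "no"}
        for s in sample_ids
    ]


def add_column(rows, name, values, default="NA"):
    """Append a new column without touching the existing ones.

    `values` maps sample_id -> value; samples not in the map get `default`.
    """
    if rows and name in rows[0]:
        raise ValueError(f"column already exists: {name}")
    for row in rows:
        row[name] = values.get(row["sample_id"], default)
    return rows


def to_tsv(rows):
    """Serialize the rows as tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(rows[0].keys()), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Made-up sample IDs; S2 stands in for a sample flagged by PVCA.
    table = new_sample_table(["S1", "S2", "S3"], flagged={"S2"})
    # A downstream pipeline (or a final redaction step) adds its own column.
    add_column(table, "redacted", {"S3": "yes"}, default="no")
    print(to_tsv(table), end="")
```

Because every pipeline only appends columns to this separate file, the normalized data files keep their existing layout and existing parsers are unaffected.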