
For each data run, a new panel should be added here describing the significant functional or data changes in that analysis run. Mike will cut them from here and paste them to the public page upon releasing that run.

If the notes are sparse, diff against the last analysis run to see which tasks or data were added (and optionally ask the team, via email, to verify).  The /wiki/spaces/GDAC/pages/844334194

Spring 2016 Analysis Run
  1. This is likely to be either the penultimate or perhaps even final standard Firehose analysis run of the TCGA project. Custom AWG runs will continue for TCGA as needed.

  2. Summary of sample changes (see the comprehensive samples report for more details):

    Data type     Change   Total
    BCR           +1       11368
    Clinical      +32      11196
    CN            +2       10987
    MAF           +313     7099
    Methylation   +1       10972
    miRSeq        +2       10156
    mRNASeq       +164     10267
    rawMAF        +2072    6322
    RPPA          +627     7429

  3. APOBEC pipelines updated:
    1. Used median filtering in the primary APOBEC analysis.
    2. In downstream clinical correlations, corrected the names of categorical variables and the descriptions of how they were used.
  4. cNMF clustering improvement: new criteria are used to select the best cluster, identical to those described in the Summer 2014 run (see below) for consensus hierarchical clustering:
    The cophenetic correlation coefficients and average silhouette values are used to determine the k with the most robust clusterings. From the plot of cophenetic correlation versus k, we select modes and the point preceding the greatest decrease in cophenetic correlation coefficient, and from these choose the k with the highest average silhouette value.
  5. Survival analysis, for all clinical correlations:
    1. Modified the p-value calculation for survival analysis with continuous data. It now uses quantile-interval categorical values instead of the continuous values.
    2. Previously a single hazard ratio was reported for each continuous variable; there are now multiple hazard ratios, one per quantile-interval curve (these are now reflected in the plot legends).

  6. FireBrowse:
    1. Updated to reflect these run results.
    2. iCoMut:
      1. Loaded 4 additional disease cohorts: DLBC, ESCA, SARC, and THYM.
      2. Completed most of the work for a major new release incorporating many graphical and data exploration enhancements; stay tuned for the announcement next week.
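
The cluster-selection criteria in the cNMF item above can be sketched roughly as follows. This is only an illustration, not the pipeline code: the function name `select_k` is hypothetical, "modes" are assumed to mean local maxima of the cophenetic curve (endpoints compared against their single neighbor), and the scores in the example are made up.

```python
def select_k(ks, cophenetic, silhouette):
    """Choose k from candidate points on the cophenetic-vs-k curve.

    Candidates are (a) local maxima of the cophenetic correlation, our
    assumed reading of "modes", and (b) the point just before the largest
    drop in cophenetic correlation. The candidate with the highest average
    silhouette value wins.
    """
    n = len(ks)
    if n < 2 or len(cophenetic) != n or len(silhouette) != n:
        raise ValueError("need at least two k values with matching scores")

    candidates = set()

    # (a) Local maxima; an endpoint only has to beat its one neighbor.
    for i in range(n):
        beats_left = i == 0 or cophenetic[i] > cophenetic[i - 1]
        beats_right = i == n - 1 or cophenetic[i] > cophenetic[i + 1]
        if beats_left and beats_right:
            candidates.add(i)

    # (b) The point preceding the greatest decrease, if any decrease exists.
    drops = [cophenetic[i] - cophenetic[i + 1] for i in range(n - 1)]
    biggest = max(range(n - 1), key=lambda i: drops[i])
    if drops[biggest] > 0:
        candidates.add(biggest)

    # Flat curve: no candidate stands out, so fall back to every k.
    if not candidates:
        candidates = set(range(n))

    best = max(candidates, key=lambda i: silhouette[i])
    return ks[best]


# Made-up scores for k = 2..8.
example_ks = [2, 3, 4, 5, 6, 7, 8]
example_coph = [0.99, 0.97, 0.98, 0.90, 0.92, 0.91, 0.85]
example_sil = [0.50, 0.40, 0.62, 0.30, 0.55, 0.35, 0.20]
print(select_k(example_ks, example_coph, example_sil))  # prints 4
```

In the example, k = 2, 4 and 6 are local maxima, k = 4 also precedes the largest drop (0.98 to 0.90), and k = 4 has the highest silhouette among the candidates.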
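
For the survival analysis change above, the conversion of a continuous variable into quantile-interval categories might look like the sketch below. The notes do not say how many intervals the pipeline uses or how it handles ties, so `n_groups=4` and the rank-based assignment are assumptions, and `quantile_groups` is a hypothetical name. The p-value and hazard ratio calculations themselves are not shown.

```python
def quantile_groups(values, n_groups=4):
    """Label each value with a quantile-interval index from 0 to n_groups - 1.

    Values are ranked and split into n_groups groups of (nearly) equal size.
    Tied values may fall into adjacent groups; the real pipeline may treat
    ties differently.
    """
    if n_groups < 1 or len(values) < n_groups:
        raise ValueError("need at least one value per group")
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [0] * len(values)
    for rank, idx in enumerate(order):
        groups[idx] = rank * n_groups // len(values)
    return groups


# Made-up continuous clinical values for eight patients.
example_values = [5.0, 1.0, 9.0, 3.0, 7.0, 2.0, 8.0, 4.0]
print(quantile_groups(example_values))  # prints [2, 0, 3, 1, 2, 0, 3, 1]
```

Each group would then get its own survival curve, which matches the note that the plot legends now carry one hazard ratio per quantile-interval curve.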
Tasks Under Development
  • Migrated the implementation of our clustering code away from GenePattern into FH native jobs, consolidating and simplifying along the way (needs more description/tailoring)
  • Spearman correlation was used in the Correlate_mRNAseq_vs_Mutation_APOBEC pipeline.
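
As a reminder of what the Spearman correlation computes, here is a minimal pure-Python sketch: rank both variables (ties get the average rank), then take the Pearson correlation of the ranks. The function names and the sample numbers are hypothetical, and this is not the code used in Correlate_mRNAseq_vs_Mutation_APOBEC.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


# Made-up expression levels and mutation counts for four samples.
expression = [1.0, 2.0, 3.0, 4.0]
mutation_counts = [10, 100, 1000, 10000]
print(spearman(expression, mutation_counts))  # prints 1.0
```

Because only ranks matter, the strongly non-linear but monotone relationship in the example still gives a correlation of 1.0.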

 

Table from 2013_09_23 analysis run;  keep until next run is posted which corrects the GAF 3.0 issues

Cohort  GAF version  Samples  Mutations
THCA    2.1          323      6806
THCA    3.0          401      6736
SKCM    2.1          228      189759
SKCM    3.0          228      189948
LGG     2.1          217      25172
LGG     3.0          220      23947
KIRP    2.1          111      7907
KIRP    3.0          112      7367

Top 10 ranked genes per cohort. Columns: rank, gene at that rank under GAF 2.1, gene at that rank under GAF 3.0, and the GAF 2.1 rank of that GAF 3.0 gene.

THCA
Rank  Gene (2.1)  Gene (3.0)  2.1 Rank
1     NRAS        NRAS        1
2     BRAF        BRAF        2
3     HRAS        HRAS        3
4     EMG1        OTUD4       793
5     PTTG1IP     EIF1AX      14
6     RPTN        NUP93       500
7     TG          NLRP6       26
8     TMCO2       PPM1D       13
9     R3HDM2      MUC7        17
10    PRB2        OR56A1      43

SKCM
Rank  Gene (2.1)  Gene (3.0)  2.1 Rank
1     C15orf23    C15orf23    1
2     CDKN2A      POLDIP2     6212
3     NRAS        NUDT11      16950
4     BRAF        CDKN2A      2
5     OXA1L       NRAS        3
6     TP53        BRAF        4
7     STK19       OXA1L       5
8     PTEN        TTN         18
9     DSG1        UGT2B15     17
10    PPP6C       TP53        6

LGG
Rank  Gene (2.1)  Gene (3.0)  2.1 Rank
1     IL32        TEAD3       5275
2     IDH2        IL32        1
3     IDH1        ATRX        5
4     TP53        PRCP        950
5     ATRX        IDH2        2
6     CIC         IDH1        3
7     FUBP1       TP53        4
8     NOTCH1      HEATR3      239
9     PIK3R1      CIC         6
10    PIK3CA      FUBP1       7

KIRP
Rank  Gene (2.1)  Gene (3.0)  2.1 Rank
1     IL32        KCNK5       354
2     CDC27       CDC27       2
3     NF2         IL32        1
4     PPARGC1B    NF2         3
5     SFRS2IP     PPARGC1B    4
6     MET         PCDHGC5     12984
7     ELF3        MET         6
8     PCF11       PLAC4       36
9     LGI4        PCF11       8
10    RAB27B      LGI4        9