Additional reading

The Cancer Genomics Cloud online course

Comprehensive bioinformatic analysis of cancer genomes


Basics of next-generation sequencing (NGS) and the Cancer Genomics Cloud (CGC)

Basics in NGS
1. Goodwin S, McPherson JD, McCombie WR. Coming of age: ten years of next-generation sequencing technologies. Nat Rev Genet. 2016 May 17;17(6):333-51. doi: 10.1038/nrg.2016.49. Review. PubMed PMID: 27184599.

TCGA
1. Tomczak K, Czerwińska P, Wiznerowicz M. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. Contemp Oncol (Pozn). 2015;19(1A):A68-77. doi: 10.5114/wo.2014.47136. Review. PubMed PMID: 25691825; PubMed Central PMCID: PMC4322527.
2. The future of cancer genomics. Nat Med. 2015 Feb;21(2):99. doi: 10.1038/nm.3801. PubMed PMID: 25654590.
3. Chin L, Hahn WC, Getz G, Meyerson M. Making sense of cancer genomic data. Genes Dev. 2011 Mar 15;25(6):534-55. doi: 10.1101/gad.2017311. Review. Erratum in: Genes Dev. 2012 May 1;26(9):1003. PubMed PMID: 21406553; PubMed Central PMCID: PMC3059829.
4. Chin L, Andersen JN, Futreal PA. Cancer genomics: from discovery science to personalized medicine. Nat Med. 2011 Mar;17(3):297-303. doi: 10.1038/nm.2323. PubMed PMID: 21383744.


Analysis of gene expression by RNA-seq

RNA datasets: Recount
1. Leonardo Collado-Torres, Abhinav Nellore, Kai Kammers, Shannon E Ellis, Margaret A Taub, Kasper D Hansen, Andrew E Jaffe, Ben Langmead, Jeffrey Leek. Recount: A large-scale resource of analysis-ready RNA-seq expression data. bioRxiv 068478. doi: https://doi.org/10.1101/068478
2. Frazee AC, Langmead B, Leek JT. ReCount: a multi-experiment resource of analysis-ready RNA-seq gene count datasets. BMC Bioinformatics. 2011 Nov 16;12:449. doi: 10.1186/1471-2105-12-449. PubMed PMID: 22087737; PubMed Central PMCID: PMC3229291.

Counting reads: HTSeq
1. Anders S, Pyl PT, Huber W. HTSeq--a Python framework to work with high-throughput sequencing data. Bioinformatics. 2015 Jan 15;31(2):166-9. doi: 10.1093/bioinformatics/btu638. Epub 2014 Sep 25. PubMed PMID: 25260700; PubMed Central PMCID: PMC4287950.

Testing and controlling batch effects
1. Leek JT, Scharpf RB, Bravo HC, Simcha D, Langmead B, Johnson WE, Geman D, Baggerly K, Irizarry RA. Tackling the widespread and critical impact of batch effects in high-throughput data. Nat Rev Genet. 2010 Oct;11(10):733-9. doi: 10.1038/nrg2825. Review. PubMed PMID: 20838408; PubMed Central PMCID: PMC3880143.


Genomic analysis using whole-genome and whole-exome datasets

Variant calling
1. Nielsen R, Korneliussen T, Albrechtsen A, Li Y, Wang J. SNP calling, genotype calling, and sample allele frequency estimation from New-Generation Sequencing data. PLoS One. 2012;7(7):e37558. doi: 10.1371/journal.pone.0037558. PubMed PMID: 22911679; PubMed Central PMCID: PMC3404070.
2. Sequence Alignment/Map Format Specification. The SAM/BAM Format Specification Working Group. 6 Sep 2016.
3. Nielsen R, Paul JS, Albrechtsen A, Song YS. Genotype and SNP calling from next-generation sequencing data. Nat Rev Genet. 2011 Jun;12(6):443-51. doi: 10.1038/nrg2986. Review. PubMed PMID: 21587300; PubMed Central PMCID: PMC3593722.
4. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009 Jul 15;25(14):1754-60. doi: 10.1093/bioinformatics/btp324. PubMed PMID: 19451168; PubMed Central PMCID: PMC2705234.
5. Tian S, Yan H, Kalmbach M, Slager SL. Impact of post-alignment processing in variant discovery from whole exome data. BMC Bioinformatics. 2016 Oct 3;17(1):403. PubMed PMID: 27716037; PubMed Central PMCID: PMC5048557.
6. Cock PJ, Fields CJ, Goto N, Heuer ML, Rice PM. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res. 2010 Apr;38(6):1767-71. doi: 10.1093/nar/gkp1137. Review. PubMed PMID: 20015970; PubMed Central PMCID: PMC2847217.

Best practices
1. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010 Sep;20(9):1297-303. doi: 10.1101/gr.107524.110. PubMed PMID: 20644199; PubMed Central PMCID: PMC2928508.
2. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, McKenna A, Fennell TJ, Kernytsky AM, Sivachenko AY, Cibulskis K, Gabriel SB, Altshuler D, Daly MJ. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011 May;43(5):491-8. doi: 10.1038/ng.806. PubMed PMID: 21478889; PubMed Central PMCID: PMC3083463.
3. Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, Del Angel G, Levy-Moonshine A, Jordan T, Shakir K, Roazen D, Thibault J, Banks E, Garimella KV, Altshuler D, Gabriel S, DePristo MA. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr Protoc Bioinformatics. 2013;43:11.10.1-33. doi: 10.1002/0471250953.bi1110s43. PubMed PMID: 25431634; PubMed Central PMCID: PMC4243306.

Variant interpretation
1. Amendola LM, Jarvik GP, Leo MC, McLaughlin HM, Akkari Y, Amaral MD, Berg JS, Biswas S, Bowling KM, Conlin LK, Cooper GM, Dorschner MO, Dulik MC, Ghazani AA, Ghosh R, Green RC, Hart R, Horton C, Johnston JJ, Lebo MS, Milosavljevic A, Ou J, Pak CM, Patel RY, Punj S, Richards CS, Salama J, Strande NT, Yang Y, Plon SE, Biesecker LG, Rehm HL. Performance of ACMG-AMP Variant-Interpretation Guidelines among Nine Laboratories in the Clinical Sequencing Exploratory Research Consortium. Am J Hum Genet. 2016 Jun 2;98(6):1067-76. doi: 10.1016/j.ajhg.2016.03.024. PubMed PMID: 27181684; PubMed Central PMCID: PMC4908185.
2. Li MM, Datto M, Duncavage EJ, Kulkarni S, Lindeman NI, Roy S, Tsimberidou AM, Vnencak-Jones CL, Wolff DJ, Younes A, Nikiforova MN. Standards and Guidelines for the Interpretation and Reporting of Sequence Variants in Cancer: A Joint Consensus Recommendation of the Association for Molecular Pathology, American Society of Clinical Oncology, and College of American Pathologists. J Mol Diagn. 2017 Jan;19(1):4-23. doi: 10.1016/j.jmoldx.2016.10.002. Review. PubMed PMID: 27993330.
3. Carr TH, McEwen R, Dougherty B, Johnson JH, Dry JR, Lai Z, Ghazoui Z, Laing NM, Hodgson DR, Cruzalegui F, Hollingsworth SJ, Barrett JC. Defining actionable mutations for oncology therapeutic development. Nat Rev Cancer. 2016 Apr 26;16(5):319-29. doi: 10.1038/nrc.2016.35. Review. PubMed PMID: 27112209.


Advanced analytical topics and multi-omics

Genome-wide association studies
1. Balding DJ. A tutorial on statistical methods for population association studies. Nat Rev Genet. 2006 Oct;7(10):781-91. Review. PubMed PMID: 16983374.
2. McCarthy MI, Abecasis GR, Cardon LR, Goldstein DB, Little J, Ioannidis JP, Hirschhorn JN. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet. 2008 May;9(5):356-69. doi: 10.1038/nrg2344. Review. PubMed PMID: 18398418.
3. Sham PC, Purcell SM. Statistical power and significance testing in large-scale genetic studies. Nat Rev Genet. 2014 May;15(5):335-46. doi: 10.1038/nrg3706. Review. PubMed PMID: 24739678.
4. Marchini J, Howie B. Genotype imputation for genome-wide association studies. Nat Rev Genet. 2010 Jul;11(7):499-511. doi: 10.1038/nrg2796. Review. PubMed PMID: 20517342.

Rare-variant association studies
1. Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A, Cho JH, Guttmacher AE, Kong A, Kruglyak L, Mardis E, Rotimi CN, Slatkin M, Valle D, Whittemore AS, Boehnke M, Clark AG, Eichler EE, Gibson G, Haines JL, Mackay TF, McCarroll SA, Visscher PM. Finding the missing heritability of complex diseases. Nature. 2009 Oct 8;461(7265):747-53. doi: 10.1038/nature08494. Review. PubMed PMID: 19812666; PubMed Central PMCID: PMC2831613.
2. Spencer CC, Su Z, Donnelly P, Marchini J. Designing genome-wide association studies: sample size, power, imputation, and the choice of genotyping chip. PLoS Genet. 2009 May;5(5):e1000477. doi: 10.1371/journal.pgen.1000477. Epub 2009 May 15. PubMed PMID: 19492015; PubMed Central PMCID: PMC2688469.
3. Moutsianas L, Agarwala V, Fuchsberger C, Flannick J, Rivas MA, Gaulton KJ, Albers PK; GoT2D Consortium, McVean G, Boehnke M, Altshuler D, McCarthy MI. The power of gene-based rare variant methods to detect disease-associated variation and test hypotheses about complex disease. PLoS Genet. 2015 Apr 23;11(4):e1005165. doi: 10.1371/journal.pgen.1005165. eCollection 2015 Apr. PubMed PMID: 25906071; PubMed Central PMCID: PMC4407972.

Structural variation and gene fusions
1. McPherson A, Wu C, Hajirasouliha I, Hormozdiari F, Hach F, Lapuk A, Volik S, Shah S, Collins C, Sahinalp SC. Comrad: detection of expressed rearrangements by integrated analysis of RNA-Seq and low coverage genome sequence data. Bioinformatics. 2011 Jun 1;27(11):1481-8. doi: 10.1093/bioinformatics/btr184. Epub 2011 Apr 9. PubMed PMID: 21478487.
2. McPherson A, Wu C, Wyatt AW, Shah S, Collins C, Sahinalp SC. nFuse: discovery of complex genomic rearrangements in cancer using high-throughput sequencing. Genome Res. 2012 Nov;22(11):2250-61. doi: 10.1101/gr.136572.111. Epub 2012 Jun 28. PubMed PMID: 22745232; PubMed Central PMCID: PMC3483554.
3. Zhang J, White NM, Schmidt HK, Fulton RS, Tomlinson C, Warren WC, Wilson RK, Maher CA. INTEGRATE: gene fusion discovery using whole genome and transcriptome data. Genome Res. 2016 Jan;26(1):108-18. doi: 10.1101/gr.186114.114. Epub 2015 Nov 10. PubMed PMID: 26556708; PubMed Central PMCID: PMC4691743.

Integrating transcriptome and epigenome
1. Schübeler D. Function and information content of DNA methylation. Nature. 2015 Jan 15;517(7534):321-6. doi: 10.1038/nature14192. Review. PubMed PMID: 25592537.
2. Feinberg AP, Koldobskiy MA, Göndör A. Epigenetic modulators, modifiers and mediators in cancer aetiology and progression. Nat Rev Genet. 2016 May;17(5):284-99. doi: 10.1038/nrg.2016.13. Epub 2016 Mar 14. Review. PubMed PMID: 26972587; PubMed Central PMCID: PMC4888057.
3. Verma M. Genome-wide association studies and epigenome-wide association studies go together in cancer control. Future Oncol. 2016 Jul;12(13):1645-64. doi: 10.2217/fon-2015-0035. Epub 2016 Apr 15. PubMed PMID: 27079684.
4. Cramer D, Serrano L, Schaefer MH. A network of epigenetic modifiers and DNA repair genes controls tissue-specific copy number alteration preference. Elife. 2016 Nov 10;5. pii: e16519. doi: 10.7554/eLife.16519. PubMed PMID: 27831464; PubMed Central PMCID: PMC5122459.

Molecular subtyping in cancer
1. Landau DA, Carter SL, Stojanov P, McKenna A, Stevenson K, Lawrence MS, Sougnez C, Stewart C, Sivachenko A, Wang L, Wan Y, Zhang W, Shukla SA, Vartanov A, Fernandes SM, Saksena G, Cibulskis K, Tesar B, Gabriel S, Hacohen N, Meyerson M, Lander ES, Neuberg D, Brown JR, Getz G, Wu CJ. Evolution and impact of subclonal mutations in chronic lymphocytic leukemia. Cell. 2013 Feb 14;152(4):714-26. doi: 10.1016/j.cell.2013.01.019. PubMed PMID: 23415222; PubMed Central PMCID: PMC3575604.
2. Sims AH, Howell A, Howell SJ, Clarke RB. Origins of breast cancer subtypes and therapeutic implications. Nat Clin Pract Oncol. 2007 Sep;4(9):516-25. Review. PubMed PMID: 17728710.
3. Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature. 2012 Oct 4;490(7418):61-70. doi: 10.1038/nature11412. Epub 2012 Sep 23. PubMed PMID: 23000897; PubMed Central PMCID: PMC3465532.


Advanced features of the CGC platform