The Cancer Genomics Cloud Online Course

Comprehensive bioinformatic analysis of cancer genomes

Course Overview

In this online course, we demonstrate basic and advanced concepts for the bioinformatic analysis of next-generation sequencing (NGS) data. We showcase the tools and workflows available on the Cancer Genomics Cloud (CGC) platform and their applications. The CGC, built and maintained by Seven Bridges, provides access to the world’s largest cancer genomic datasets, alongside tools for analyzing genomic data in a secure and scalable cloud environment.

Topics covered

1. Next-Generation Sequencing and the Cancer Genomics Cloud
2. RNA-Seq analysis of gene expression data
3. DNA sequencing analysis of whole genomes and whole exomes
4. Multi-omic approaches for advanced analysis
5. Advanced features of the Seven Bridges Cancer Genomics Cloud, including applications for tool wrapping, workflow optimization, task automation using the API, and best practices for making a Docker container.

View the full syllabus.

Distinguished guest speakers

Ben Langmead, PhD

Assistant Professor | Johns Hopkins University, Computer Science | Summarizing tens of thousands of RNA-seq datasets in the cloud

James Knight, PhD

Director of Bioinformatics | Yale Center for Genome Analysis, Dept. of Genetics | Human Variant Calling and Disease Association


Paul Boutros, PhD

Principal Investigator, Informatics & Biocomputing | Ontario Institute for Cancer Research | The Three Genomes of Prostate Cancer


Brad Chapman, PhD

Research Scientist | Harvard Chan School, Bioinformatics Core | Interoperable Community Developed Variant Calling with bcbio and the Common Workflow Language

Basics of Next Generation Sequencing (NGS) and the Cancer Genomics Cloud (CGC)

This module is an introduction to the field of next-generation sequencing and its applications in biomedicine. The first lesson introduces NGS technologies, such as sequencing by synthesis. The second lesson introduces The Cancer Genome Atlas (TCGA) and the Cancer Genomics Cloud (CGC) platform.

Analysis of Gene Expression by RNA-Seq

The second module is dedicated to the analysis of gene expression by RNA sequencing (RNA-Seq), and to how these methods can be run on the CGC platform. In the first lesson, we review experimental design and methods to obtain read counts. In the second lesson, we explore how gene expression can be quantified from read counts, from count normalization to statistical analyses. In the third lesson, we cover tools for detecting gene fusions and de novo assembly methods.
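
As a small illustration of the count normalization step mentioned above, the sketch below scales raw read counts for one sample to counts per million (CPM), one common way to adjust for sequencing depth. It is a minimal example and not necessarily the method taught in the course; the gene names and counts are made up, and `counts_per_million` is a hypothetical helper name.

```python
def counts_per_million(counts):
    """Scale raw read counts for one sample to counts per million (CPM)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}


# Hypothetical raw read counts for three genes in a single sample.
sample = {"GENE_A": 500, "GENE_B": 1500, "GENE_C": 8000}
cpm = counts_per_million(sample)

for gene, value in cpm.items():
    print(f"{gene}\t{value:.1f}")
```

CPM only corrects for library size. Comparing expression between genes usually also requires accounting for gene length, and differential expression testing relies on dedicated statistical models.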

Genomic Analysis Using Whole-Genome and Whole-Exome Datasets

The third module focuses on whole-genome sequencing (WGS) and whole-exome sequencing (WES) analysis. In the first lesson, we cover genome reconstruction methods using the reference genome and variant calling. In the second lesson, we discuss methods and best practices for variant calling, including single nucleotide variants (SNVs) and copy number variations (CNVs). We also introduce the concept of a non-linear reference genome. The third lesson focuses on variant filtration, variant quality score recalibration, annotation, interpretation, and the identification of biologically relevant variants (e.g., driver genes).

Advanced Analytical Topics and Multi-omics

The fourth module introduces the concepts, methods, and applications for combined analysis of different NGS-based “omics” platforms. In the first lesson, we discuss integrative analysis of genomic, epigenomic, and transcriptomic data, expression quantitative trait locus (eQTL) analysis with SNVs and gene expression, and structural variant (SV) and gene fusion analyses. We also review other analytical methods, such as genome-wide association studies (GWAS), rare-variant association studies (RVAS), genotype/phenotype analysis, and microbiome analyses.

Advanced Features of the CGC Platform

In the final module, we present advanced features of the Seven Bridges Cancer Genomics Cloud, including portable and reproducible tools, workflow optimization, automation using the API, and best practices for making a Docker container.