Learn from TCGA data

The Cancer Genome Atlas (TCGA) is one of the largest and most complete cancer genomics datasets available. However, at more than a petabyte in size, TCGA is challenging to use. Housing it requires large storage facilities, and processing it requires high-performance computing capacity. Currently, researchers who want to compute over TCGA, or analyze their own data alongside it, must first download the dataset to their own hardware.

Analyze TCGA data immediately and securely

The CGC will allow researchers to immediately and securely access the complete TCGA dataset on the cloud. The dataset includes raw and processed data from whole-genome, whole-exome, RNA, microRNA and bisulfite sequencing studies, as well as array-based studies. Both Open Access and Controlled Access data will be available.

Explore TCGA data using smart queries

Researchers using the CGC can search for cases and data by their associated clinical metadata, and use a visual case explorer to browse the mutation status and expression levels of a gene in all patients with a particular disease. They can then retrieve all files associated with these patients, filter them further by metadata, and run analyses on them.

All you need is an internet connection

The CGC democratizes cancer genomics research. Scientists anywhere with an internet connection can manipulate and compute on TCGA to further their research. There is no need to provision, set up and maintain servers for storage and computation, and no time or bandwidth is spent waiting for data to download.