Release notes

SUPPORT FOR AMAZON WEB SERVICES SPOT INSTANCES

Seven Bridges has introduced support for Spot instances on the Amazon Web Services (AWS) deployment of the Cancer Genomics Cloud. Spot instance support can be selected as a default for projects and as an option for each task execution. Selecting a Spot instance can dramatically reduce execution costs. Our testing indicates execution cost savings of over 75% on common workflows.

DATASET IMPROVEMENTS

TARGET GRCh38 Dataset

The Therapeutically Applicable Research to Generate Effective Treatments (TARGET) dataset provides genomic, transcriptomic, and epigenomic data from patients representing several childhood cancers and serves as a valuable complement to the existing genomic and multi-omic datasets available on the CGC. The complete TARGET GRCh38 dataset is now available on the CGC. It includes both Open Data, accessible to all researchers, and Controlled Data, to which access is regulated by the Database of Genotypes and Phenotypes (dbGaP). You can query this dataset using the Data Browser to generate custom cohorts from within it, as well as cohorts derived from multiple similarly aligned datasets, such as the TARGET GRCh38 and TCGA GRCh38 datasets.

ELASTIC BLOCK STORE CUSTOMIZATION FEATURE

Overview

On August 1, we released a feature that lets you customize the amount of Elastic Block Store (EBS) disk space attached to different Amazon instance configurations. EBS customization is useful for bioinformatics workflows because it gives you greater control over requirements and costs, which helps you optimize your computation.

To learn more about EBS customization, please see our documentation.

PARALLEL INSTANCE LIMITS AND TASK QUEUING

Individual users on the CGC now have a default usage limit of 80 parallel instances. Seven Bridges sets this upper bound because the number of parallel instances used in total by the CGC Platform is limited by Amazon Web Services (AWS), the CGC’s underlying cloud service provider. While this means that tasks using more than 80 instances can take longer to complete, it also helps ensure that instances are available for all CGC users to run their analyses.
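As a rough illustration of why the limit can lengthen a task, the sketch below estimates how many sequential batches of instances a task would need under the default limit. It assumes that every job needs exactly one instance and that all jobs take the same time. This is a simplification and not a description of how the CGC actually queues tasks. `estimate_batches` is a hypothetical helper introduced only for this example.

```python
import math

# Default per-user limit stated in this release note.
DEFAULT_PARALLEL_INSTANCE_LIMIT = 80


def estimate_batches(jobs_needing_instances: int,
                     limit: int = DEFAULT_PARALLEL_INSTANCE_LIMIT) -> int:
    """Return how many sequential batches are needed when at most
    `limit` instances may run at once (simplified, assumed model)."""
    if jobs_needing_instances < 0 or limit <= 0:
        raise ValueError("expected a non-negative job count and a positive limit")
    return math.ceil(jobs_needing_instances / limit)


# A task that fans out to 200 single-instance jobs cannot run them all at once.
batches_for_200_jobs = estimate_batches(200)
```

Under these assumptions, a task that fits within 80 instances runs in a single batch, while larger tasks wait for earlier instances to finish before the remaining jobs start.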

HARMONIZED DATASET ON THE CGC

Datasets on the CGC are now categorized as "harmonized" or "legacy", following the terminology used by the Genomic Data Commons (GDC).

In 2016, the GDC started hosting and distributing previously generated data from The Cancer Genome Atlas (TCGA). Additionally, for all submitted sequence data (FASTQs and BAM alignment files), the GDC generated new alignments (BAM files) to the latest human reference genome, GRCh38, using standard workflows. Using these alignments, the GDC generated derived data, including normal and tumor variant and mutation calls, gene and miRNA expression, and splice junction quantification data. The GDC refers to this process of data generation through standard workflows as data harmonization.

EXPORT TO VOLUMES: COPY-ONLY PARAMETER

We've added an Advance Access copy-only parameter to the Start an Export job request for Volumes. Advance Access means that, while the parameter is fully operational, it is subject to change. If this parameter, copy_only, is set to true, the specified file is copied to a volume and the source file remains on the CGC. Learn more from our documentation.
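To make the effect of copy_only concrete, here is a minimal sketch that only builds a description of such a request and does not send anything. Only the parameter name copy_only comes from this note. The base URL, the endpoint path, the body field names, and the choice to pass copy_only as a query parameter are all assumptions, so check them against the API documentation before use. `build_export_request` is a hypothetical helper.

```python
import json
from urllib.parse import urlencode

# Assumed base URL; not stated in this release note.
API_BASE = "https://cgc-api.sbgenomics.com/v2"


def build_export_request(file_id, volume_id, location, copy_only=True):
    """Describe a 'Start an Export job' request without sending it."""
    # Assumption: copy_only travels as a query parameter.
    query = urlencode({"copy_only": str(copy_only).lower()})
    # Assumption: the body names the source file and the destination volume.
    body = {
        "source": {"file": file_id},
        "destination": {"volume": volume_id, "location": location},
    }
    return {
        "method": "POST",
        "url": f"{API_BASE}/storage/exports?{query}",
        "body": json.dumps(body),
    }


# Placeholder identifiers, for illustration only.
export_request = build_export_request("FILE_ID", "owner/my-volume", "exports/sample.bam")
```

With copy_only left at true, the intent is that the file is copied to the volume while the original stays on the CGC.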

SEARCH VIA THE DATASETS API

Use the Datasets API to search across datasets hosted on the CGC, such as TCGA and CCLE. First, use a free-text search to find all exact matches in metadata across all datasets. Then, use the URIs returned by that initial search to retrieve further details, including the individual members that match the designated resource. All responses are provided in JSON format.
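The two-step flow described above (search first, then follow the returned URIs) can be sketched as follows. This is an offline illustration: the URLs, the `matches` and `href` field names, and the response contents are all hypothetical and do not reflect the real Datasets API response format. The HTTP call is injected as a function so the example runs without network access.

```python
from typing import Callable, Dict, List

# A function that takes a URL and returns the parsed JSON response.
JsonFetcher = Callable[[str], dict]


def search_then_expand(fetch_json: JsonFetcher, search_url: str) -> Dict[str, dict]:
    """Step 1: run the free-text search. Step 2: follow each returned URI."""
    results = fetch_json(search_url)
    # Hypothetical response shape: {"matches": [{"href": "<uri>"}, ...]}
    uris: List[str] = [match["href"] for match in results.get("matches", [])]
    return {uri: fetch_json(uri) for uri in uris}


# Offline stand-in for the service, used only to show the flow.
FAKE_RESPONSES = {
    "https://example.invalid/search?term=lung": {
        "matches": [{"href": "https://example.invalid/cases/1"}]
    },
    "https://example.invalid/cases/1": {"label": "case-1"},
}

details = search_then_expand(FAKE_RESPONSES.__getitem__,
                             "https://example.invalid/search?term=lung")
```

In real use, `fetch_json` would wrap an authenticated HTTP GET that returns the decoded JSON body.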

Learn more about searching via the Datasets API.

USE MANIFEST FILES TO SET METADATA IN THE CGC UPLOADER

Use a manifest file to upload large batches of files along with their metadata in the CGC Uploader. This is similar to the functionality already present in the Command Line Uploader.

FILTER BY AND SET CUSTOM METADATA

Custom metadata fields are now visible in the visual interface. You can set the values for these fields from the Files tab of your project or from an individual file's page. Use these custom metadata fields for filtering alongside preset fields. Note that you cannot create new metadata fields in the visual interface. However, you can create custom metadata fields using the API.
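As a sketch of creating a custom metadata field through the API, the example below builds (but does not send) a request that sets a new key on a file. The endpoint URL, the PATCH method, and the `X-SBG-Auth-Token` header are assumptions based on common usage of the CGC API and should be confirmed in the API documentation. The `batch_label` field name and the `build_metadata_patch` helper are hypothetical.

```python
import json
import urllib.request


def build_metadata_patch(file_id: str, token: str, metadata: dict) -> urllib.request.Request:
    """Build a request that adds or updates metadata fields on one file."""
    # Assumed endpoint for modifying a file's metadata.
    url = f"https://cgc-api.sbgenomics.com/v2/files/{file_id}/metadata"
    return urllib.request.Request(
        url,
        data=json.dumps(metadata).encode("utf-8"),
        method="PATCH",
        headers={
            "X-SBG-Auth-Token": token,  # assumed authentication header
            "Content-Type": "application/json",
        },
    )


# Placeholder values; "batch_label" is a made-up custom field.
request = build_metadata_patch("FILE_ID", "YOUR_TOKEN", {"batch_label": "run-07"})
# urllib.request.urlopen(request) would send it; it is intentionally not called here.
```

Once a custom field exists on a file, it can be viewed, edited, and used for filtering in the visual interface as described above.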

EXPORT AND IMPORT MANIFEST FILES

Take advantage of new options, Export metadata manifest and Import metadata manifest, from the drop-down menu of the Files tab within a project.

Select Export metadata manifest to export your project files' metadata as an editable manifest file. Use this manifest file to modify the file metadata. Alternatively, use Export metadata manifest from filtered files to export and modify the metadata for only the subset of your project files that matches the filters you've applied.

Select Import metadata manifest to import a manifest file to the CGC and apply the metadata contained in that file to all the files in your project.

Note that a manifest file is formatted as a .CSV file.
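Because the manifest is a CSV file, it can be edited programmatically between export and import. The sketch below sets one metadata column on every row using Python's standard `csv` module. The column names (`name`, `sample_id`, `platform`) and the sample rows are assumptions made for illustration, so use the header row of your own exported manifest. `set_metadata_column` is a hypothetical helper.

```python
import csv
import io


def set_metadata_column(manifest_text: str, column: str, value: str) -> str:
    """Return manifest CSV text with `column` set to `value` on every row."""
    reader = csv.DictReader(io.StringIO(manifest_text))
    fieldnames = list(reader.fieldnames or [])
    if column not in fieldnames:
        fieldnames.append(column)  # add the column if the manifest lacks it
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = value
        writer.writerow(row)
    return out.getvalue()


# Made-up manifest content standing in for an exported file.
exported = "name,sample_id\nreads_1.fastq,S1\nreads_2.fastq,S2\n"
updated = set_metadata_column(exported, "platform", "Illumina")
```

The resulting text can be saved as a .CSV file and then applied with Import metadata manifest.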

Learn more about this feature from our Knowledge Center.

IMPROVED DISPLAY OF USER-DEFINED REPORTS

We’ve improved the display of base64html files. To access an expanded view of the report, select the file from the Files tab in your project dashboard. Then, click Expand report to view the base64html file full screen in a new tab.

DEVELOPER DASHBOARD

We've added a Developer Dashboard to the CGC's visual interface. It is accessible from the User Settings drop-down menu or from the footer.

GROUP PARAMETER SETTINGS

View parameter settings grouped by the apps they belong to on task-related pages, such as the Draft Task page and Task page. This functionality allows you to toggle between viewing:

  • all parameters for all apps within your workflow
  • editable parameters
  • only the parameters whose values are different from the default ones

Note that parameters that are both included in input ports and shared between nodes are linked automatically. Learn more from the documentation.