Release notes

SUPPORT FOR AMAZON WEB SERVICES SPOT INSTANCES

Seven Bridges has introduced support for Spot instances on the Amazon Web Services (AWS) deployment of the Cancer Genomics Cloud (CGC). Spot instances can be selected as the default for a project and as an option for each task execution. Selecting Spot instances can dramatically reduce execution costs: our testing indicates cost savings of over 75% on common workflows.

Release notes

DATASET IMPROVEMENTS

TARGET GRCh38 Dataset

The Therapeutically Applicable Research to Generate Effective Treatments (TARGET) dataset provides genomic, transcriptomic, and epigenomic data from patients representing several childhood cancers and serves as a valuable complement to the existing genomic and multi-omic datasets available on the CGC. The complete TARGET GRCh38 dataset is now available on the CGC. It includes both Open Data, accessible to all researchers, and Controlled Data, to which access is regulated through the database of Genotypes and Phenotypes (dbGaP). You can query this dataset in the Data Browser to build custom cohorts from within it, as well as cohorts that span multiple similarly aligned datasets, such as the TARGET GRCh38 and TCGA GRCh38 datasets.

Release notes

ELASTIC BLOCK STORE CUSTOMIZATION FEATURE

Overview

On August 1, we released a feature that lets you customize the amount of Elastic Block Store (EBS) disk space attached to different Amazon instance configurations. EBS customization is useful for bioinformatics workflows because it gives you greater control over storage requirements and costs, which helps you optimize your computation.

To learn more about EBS customization, please see our documentation.

Release notes

PARALLEL INSTANCE LIMITS AND TASK QUEUING

Individual users on the CGC now have a default usage limit of 80 parallel instances. Seven Bridges sets this upper bound because the number of parallel instances used in total by the CGC Platform is limited by Amazon Web Services (AWS), the CGC’s underlying cloud service provider. While this means that tasks using more than 80 instances can take longer to complete, it also helps ensure that instances are available for all CGC users to run their analyses.
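As a rough illustration of why the limit can lengthen completion time, the sketch below models a task whose jobs each need one instance and all take the same amount of time. This is a simplified assumption and does not describe how the CGC scheduler actually queues work; the function name and the equal-duration model are hypothetical.

```python
import math

PARALLEL_INSTANCE_LIMIT = 80  # default per-user limit stated above


def estimated_batches(job_count, limit=PARALLEL_INSTANCE_LIMIT):
    """Return how many sequential batches are needed when at most
    `limit` jobs can run at once (assumes one instance per job)."""
    if job_count < 0:
        raise ValueError("job_count must be non-negative")
    return math.ceil(job_count / limit)


if __name__ == "__main__":
    for jobs in (40, 80, 200):
        print(jobs, "jobs ->", estimated_batches(jobs), "batch(es)")
```

Under this model, a task that fans out to 200 single-instance jobs would need three batches instead of one, which is where the extra time comes from.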

Release notes

HARMONIZED DATASET ON THE CGC

Datasets on the CGC are now categorized as "harmonized" or "legacy", in accordance with the Genomic Data Commons (GDC).

In 2016, the GDC started hosting and distributing previously generated data from The Cancer Genome Atlas (TCGA). Additionally, for all submitted sequence data (FASTQs and BAM alignment files), the GDC generated new alignments (BAM files) to the latest human reference genome, GRCh38, using standard workflows. Using these alignments, the GDC generated derived data, including normal and tumor variant and mutation calls, gene and miRNA expression, and splice junction quantification data. The GDC refers to this process of data generation through standard workflows as data harmonization.

Release notes

EXPORT TO VOLUMES: COPY-ONLY PARAMETER

We've added an Advance Access copy-only parameter to the Start an Export job request for Volumes. Advance Access means that the parameter is fully operational but subject to change. If this parameter, copy_only, is set to true, the specified file is copied to a volume and the source file remains on the CGC. Learn more from our documentation.
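The sketch below shows one way such a request could be assembled in Python without sending it. Only the copy_only name comes from the note above. The base URL, the endpoint path, the body field names, and the placement of copy_only as a query parameter are all assumptions made for illustration, so check them against the API documentation before use.

```python
import json
from urllib.parse import urlencode

# Assumed base URL for the CGC API; verify against the documentation.
API_BASE = "https://cgc-api.sbgenomics.com/v2"


def build_export_request(file_id, volume_id, location, copy_only=True):
    """Describe an export job request without sending it.

    Returns (url, body_json). The path, the body fields, and passing
    copy_only in the query string are assumptions, not confirmed API.
    """
    query = urlencode({"copy_only": "true" if copy_only else "false"})
    url = f"{API_BASE}/storage/exports?{query}"
    body = {
        "source": {"file": file_id},
        "destination": {"volume": volume_id, "location": location},
    }
    return url, json.dumps(body)


if __name__ == "__main__":
    # Placeholder identifiers, not real resources.
    url, body = build_export_request("FILE_ID", "owner/volume", "exports/a.bam")
    print(url)
    print(body)
```

Building the request separately from sending it makes it easy to inspect what would be submitted while the parameter is still subject to change.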

Release notes

SEARCH VIA THE DATASETS API

Use the Datasets API to search across datasets hosted on the CGC, such as TCGA and CCLE. First, use a free-text search to find all exact matches in metadata across all databases. Then, use the URIs returned by the initial search to obtain further details, including individual members matching the designated resource. All responses are provided in JSON format.

Learn more about searching via the Datasets API.

Release notes

USE MANIFEST FILES TO SET METADATA IN THE CGC UPLOADER

Use a manifest file to upload large batches of files along with their metadata in the CGC Uploader. This is similar to the functionality already present in the Command Line Uploader.

FILTER BY AND SET CUSTOM METADATA

Custom metadata fields are visible via the visual interface. You can set the values for these fields from the Files tab of your project or from an individual file's page. Use these custom metadata fields for filtering alongside preset fields. Note that you cannot create new metadata fields on the visual interface. However, you can create custom metadata fields using the API.

Release notes

EXPORT AND IMPORT MANIFEST FILES

Take advantage of new options, Export metadata manifest and Import metadata manifest, from the drop-down menu of the Files tab within a project.

Select Export metadata manifest to export your project files' metadata as an editable manifest file. Use this manifest file to modify the file metadata. Alternatively, use Export metadata manifest from filtered files to export and modify the metadata for a subset of your project files based on the filters you've applied.

Select Import metadata manifest to import a manifest file to the CGC and apply the metadata contained in that file to all the files in your project.

Note that a manifest file is formatted as a .CSV file.
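Because the manifest is a CSV file, it can also be edited with a script before it is imported again. The following is a minimal sketch using Python's standard csv module. The column names (name, sample_id, library_id) are hypothetical, and an actual exported manifest may use different headers.

```python
import csv
import io


def set_metadata_column(manifest_text, column, value):
    """Return manifest CSV text with `column` set to `value` on every row.

    The column is appended to the header if it is not already present.
    """
    reader = csv.DictReader(io.StringIO(manifest_text))
    fieldnames = list(reader.fieldnames or [])
    if column not in fieldnames:
        fieldnames.append(column)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = value
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical manifest content; real column names may differ.
    example = "name,sample_id\nreads_1.fastq,S1\nreads_2.fastq,S1\n"
    print(set_metadata_column(example, "library_id", "L001"))
```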

Learn more about this feature from our Knowledge Center.

Release notes

IMPROVED DISPLAY OF USER-DEFINED REPORTS

We’ve improved the display of base64html files. To access an expanded view of the report, select the file from the Files tab in your project dashboard. Then, click Expand report to view the base64html file fullscreen in a new tab.

DEVELOPER DASHBOARD

We've added a Developer Dashboard to the CGC's visual interface. It is accessible from the User Settings drop-down menu or from the footer.

Release notes

GROUP PARAMETER SETTINGS

View parameter settings grouped by the apps they belong to on task-related pages, such as the Draft Task page and Task page. This functionality allows you to toggle between viewing:

  • all parameters for all apps within your workflow
  • editable parameters
  • only the parameters whose values are different from the default ones

Note that parameters which are both included in input ports and shared between nodes will be automatically linked. Learn more from the documentation.
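The three views listed above amount to different filters over the same set of parameters, grouped by app. The sketch below illustrates that idea only; the Parameter class, its fields, and the view names are hypothetical and do not reflect how the CGC stores or exposes parameters.

```python
from dataclasses import dataclass


@dataclass
class Parameter:
    """Hypothetical stand-in for one app parameter in a workflow."""
    app: str
    name: str
    value: object
    default: object
    editable: bool = True


def select_parameters(params, view="all"):
    """Return {app: [parameter names]} for the chosen view.

    view is "all", "editable", or "changed" (value differs from default).
    """
    if view == "all":
        chosen = list(params)
    elif view == "editable":
        chosen = [p for p in params if p.editable]
    elif view == "changed":
        chosen = [p for p in params if p.value != p.default]
    else:
        raise ValueError(f"unknown view: {view!r}")
    grouped = {}
    for p in chosen:
        grouped.setdefault(p.app, []).append(p.name)
    return grouped


if __name__ == "__main__":
    params = [
        Parameter("aligner", "threads", 8, 4),
        Parameter("aligner", "seed_length", 19, 19),
        Parameter("caller", "min_quality", 20, 20, editable=False),
    ]
    for view in ("all", "editable", "changed"):
        print(view, select_parameters(params, view))
```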

Release notes

SEVENBRIDGES-PYTHON ERROR HANDLING

sevenbridges-python 0.6.0 adds error handling libraries to overcome rate limits and platform issues. Users can also add detailed error logging.
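To illustrate the general idea behind a rate-limit handler, here is a generic retry loop written with the standard library only. It is not the sevenbridges-python API: the RateLimited exception, the call_with_retry function, and the retry_after attribute are hypothetical names used for this sketch, so refer to the library documentation for the handlers it actually provides.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised when the server reports a rate limit."""

    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_retry(request, max_attempts=3, sleep=time.sleep):
    """Call `request()`; on RateLimited, wait and try again.

    Re-raises the error once `max_attempts` calls have failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except RateLimited as exc:
            if attempt == max_attempts:
                raise
            sleep(exc.retry_after)


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_request():
        # Fails once with a rate limit, then succeeds.
        state["calls"] += 1
        if state["calls"] == 1:
            raise RateLimited(retry_after=0.1)
        return "ok"

    print(call_with_retry(flaky_request))
```

Passing the sleep function as an argument keeps the loop easy to test without real waiting.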

WORKING WITH VOLUMES

We've updated our tutorials on working with volumes for both S3 and Google Cloud Storage buckets.

Release notes

CREATE AND USE FOLDERS VIA THE API

You can organize your project files within folders via specific API requests. The Folders functionality is an Advance Access feature, which means specific feature details are subject to change. Each project is a root folder containing all your project files. Create nested folders within this root folder to sort your files or mirror external file structures. Currently, this functionality is only accessible via the API. This means that, on the visual interface, you can only see the files in your root folder. You won't be able to see the folders you've created via the API, or the files within them, on the visual interface. Learn more about folders.
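When mirroring an external file structure, parent folders need to exist before their children. The sketch below derives such a creation order from a list of relative file paths. It does not call the API; the function name is hypothetical, and each resulting folder would still have to be created through the folder requests described in the documentation.

```python
from pathlib import PurePosixPath


def folders_to_create(file_paths):
    """Return the folders needed to hold `file_paths`, parents first.

    Expects relative, slash-separated paths; files directly in the
    root folder need no new folder.
    """
    folders = set()
    for path in file_paths:
        for parent in PurePosixPath(path).parents:
            if str(parent) != ".":
                folders.add(parent)
    ordered = sorted(folders, key=lambda f: (len(f.parts), str(f)))
    return [str(f) for f in ordered]


if __name__ == "__main__":
    # Hypothetical external layout.
    paths = ["raw/run1/a.fastq", "raw/run2/b.fastq", "notes.txt"]
    for folder in folders_to_create(paths):
        print(folder)
```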

IMPROVED SEARCH BY PROPERTIES IN THE DATA BROWSER

Each entity in the Data Browser contains a search functionality you can use to quickly find properties such as Drug name or its value. Add desired values by selecting them from the returned matches. If you wish to add an additional value for the property, simply start searching again. Your previous selections will be preserved as you select further values. Learn more about the Data Browser.

Release notes

VISUAL INTERFACE IMPROVEMENTS FOR ADDING FILES AND APPS

We've updated the visual interface of the CGC to make adding files and apps intuitive and convenient. For instance, you can browse and add apps from the Public Apps repository, all without leaving your project.

TOOL INPUT AND OUTPUT DOCUMENTATION

Our documentation on tool input ports and tool output ports has been updated to cover additional features such as Load contents and Stage inputs. Check out our improved documentation on the Knowledge Center.

Release notes

JOB.TREE.LOG FOR JOB EXECUTIONS

After each job execution, the CGC will generate a job.tree.log file. This file contains the structure of the working directory and can be useful in the debugging process. Learn more from our documentation.

SEARCH FOR A PARTICULAR PROPERTY IN THE DATA BROWSER

When building queries in the Data Browser, take advantage of the new search functionality for each entity to quickly locate its associated properties.