Release notes
Disabling of inactive user accounts
As part of our ongoing effort to maximize security on the CGC, we have started identifying inactive user accounts, which will be temporarily disabled in the upcoming period. This procedure follows recommendations and best practices for data security and the prevention of unauthorized access to the system.
User accounts will be labelled as inactive if their owners haven't logged in to the CGC or used the API in 90 days. Before inactive accounts are temporarily disabled, notification emails will be sent to account owners, inviting them to log in to the CGC if they want to keep their accounts active. Accounts that do get disabled will not be removed, and data associated with them will remain on the CGC. They can be re-enabled at any time by emailing our Support Team at support@sbgenomics.com.
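The 90-day rule can be illustrated with a minimal sketch. It is not the CGC's actual implementation: the function name, the parameters, the treatment of accounts with no recorded activity, and the handling of the exact 90-day boundary are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# 90-day window taken from the release note.
INACTIVITY_WINDOW = timedelta(days=90)


def is_inactive(
    last_login: Optional[datetime],
    last_api_call: Optional[datetime],
    now: datetime,
) -> bool:
    """Return True if neither a login nor an API call happened within the window."""
    # Keep only the activity timestamps that exist.
    activity = [t for t in (last_login, last_api_call) if t is not None]
    if not activity:
        # Assumption: an account with no recorded activity counts as inactive.
        return True
    # Assumption: exactly 90 days since the last activity still counts as active.
    return now - max(activity) > INACTIVITY_WINDOW


now = datetime(2020, 3, 1, tzinfo=timezone.utc)
# Old login but recent API use: the account stays active.
print(is_inactive(now - timedelta(days=120), now - timedelta(days=10), now))
# Old login and no API use: the account would be labelled inactive.
print(is_inactive(now - timedelta(days=120), None, now))
```

The sketch takes the most recent of the two timestamps, since either a login or an API call is enough to keep an account active.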
Release notes
Recently published apps
Salmon, DeepVariant and STAR-Fusion were updated to the latest versions.
Release notes
Data model updates
Data models have been changed for a series of datasets on the CGC. You will be notified about the change through an info message next to your saved Data Browser queries, informing you that the queries might be affected by the data model update and instructing you to rebuild the queries that are not working, or contact Seven Bridges for assistance.
Data Cruncher file preview improvements
We have improved the file preview option in Data Cruncher analysis details to support additional file formats. The full list of supported formats now includes IPYNB, HTML, PDF, JPG, JPEG, SVG, PNG, GIF, and MD. Raw content preview is available for all other text file formats, for example RMD, R, and PY.
Release notes
Recently published apps
Cutadapt was updated to the latest version and upgraded to CWL1.0.
Release notes
Data model versions
To help you maintain your dataset queries and align them with changes in underlying data models, we have introduced data model versioning.
Data models determine the nomenclature for all elements that are included in a model, define all entities and their properties, as well as relationships between different entities. When there is a new version of a data model, the structure of these elements and their relationships can change, thus rendering previously saved queries non-runnable.
To make the changes more visible and help you rebuild your queries, data model versions and their update history are displayed at various locations in the Data Browser for each of the available datasets. Learn more.
Recently published apps
DESeq2 was updated to the latest version and upgraded to CWL1.0.
Release notes
Real-time job monitoring / Instance metrics [new feature, BETA]
With these new features we are providing additional job monitoring and debugging tools, which are readily accessible for each task during its execution. You are now able to monitor execution progress in a more extensive way and find out the reasons for stuck, prolonged, or misconfigured jobs on your own, while we continue to ensure the same level of reproducibility.
Real-time job monitoring
As long as the instance on which the job was running is active, you can now access additional information about the job execution environment.
Instance metrics
The CGC now also lets you access instance metrics information for all instances used in task execution. This information is available during task execution and for 15 days after the task has been executed.
Release notes
Data Cruncher multiple environment setups
When selecting your Data Cruncher environment, you are now able to choose between different environment setups for your JupyterLab or RStudio analysis. Each environment setup is a set of preinstalled libraries that is available every time an analysis is started and is intended for a specific purpose.
For this first release, we are enabling support for machine learning use cases using CPU or GPU instances. Learn more.
Release notes
Task validation allows additional characters in file names
We updated task validation to allow ~ (tilde) and # (hash) characters in input file names. When executing tasks on the CGC, you will be able to use input files containing those characters in their names.
Recently published apps
The following additions were made to the Public Apps Gallery:
FastQC was updated to CWL1.0 and its version was bumped to the latest one.
The seq2HLA tool was also published in CWL1.0. This tool performs HLA typing from RNA-Seq data.
Sleuth 0.30.0 was published in CWL1.0. This tool performs differential expression analysis on results coming from any of our favorite pseudoaligners (i.e. Salmon or Kallisto). The tool is wrapped in accordance with our existing differential expression tools (like DESeq2), so the user experience should be similar.
The Kallisto/Salmon Sleuth workflow was also published as a CWL1.0 app. This workflow is meant to provide an end-to-end read quantification and differential expression solution, starting from FASTQ files and generating a differential expression report with Sleuth. The FASTQ files are first processed with either Kallisto or Salmon (as chosen), allowing for ultra-fast transcript quantification, so that the Sleuth analysis can proceed as quickly as possible.
Release notes
PDC data update on the CGC
PDC data on the CGC has been updated to match the PDC Data Release from December 20, 2019.
New studies released:
CPTAC GBM Discovery Study - Proteome
CPTAC GBM Discovery Study - Phosphoproteome
CPTAC GBM Discovery Study - CompRef Proteome
CPTAC GBM Discovery Study - CompRef Phosphoproteome
The update includes 2090 files and ~500 GB of data.
See more information about the history and contents of each PDC data update on the CGC.
Release notes
Recently published apps
The following apps had their versions updated and were bumped to CWL1.0:
STAR
STAR Workflow
STAR-Fusion
DeepVariant
Release notes
Docker repository management improvements
With this release of Docker repository management improvements, we provide a better user interface as well as additional functionality. The new Docker registry section under the Developer tab is introduced to:
Provide more details on Docker images, such as tag, size, image ID, SHA digest, and time of last update.
Provide logs for all push and delete actions over a repository.
Provide example command lines for docker login, push and pull commands.
Allow users to delete a Docker repository or delete an image by tag.
Allow users to manage Docker repository membership and permission level.
Allow users to create new empty repositories. This means that users can now create a Docker repository and mark it as private before pushing any content (images) to the repository.
Release notes
GDC Datasets version update
As of December 27, GDC datasets available through the Data Browser and the API correspond to GDC Data Release 21.
Recently published apps
Tabix 1.9 toolkit was updated to CWL1.0.
Release notes
ICDC data now available for import on the CGC
The Integrated Canine Data Commons (ICDC) is a cloud-based repository of canine cancer data that was established to further research on human cancers by enabling comparative analysis with canine cancer. We have implemented a file import system that allows you to import ICDC data into your projects using manifest files generated on the ICDC website.
PDC data update
The currently available version of PDC data on the CGC has been updated with the following PDC releases:
International Cancer Proteogenomic Consortium - Proteogenomic Characterization of HBV-Related Hepatocellular Carcinoma data, December 2019.
Pediatric Brain Tumor Atlas - CBTTC program Pediatric/AYA brain tumor dataset, November 2019.
Release notes
Bulk moving of files and folders via the API
In order to help you further optimize your API usage and the number of calls required to organize files and folders within projects, we have introduced the option of moving files or folders in bulk from one project location to another. Bulk move is aimed at improving API usage and user experience in general for all users who use the API to run analyses at scale.
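As a rough illustration of how bulk move reduces the number of calls, the sketch below groups file IDs into batches so that each batch could be submitted as a single request. It makes no network calls. The payload shape (`items`, `file`, `parent`), the function name, and the `batch_size` default are assumptions made for the example and are not confirmed by this note; consult the API documentation for the actual endpoint and request format.

```python
from typing import Dict, Iterator, List, Sequence


def build_move_batches(
    file_ids: Sequence[str],
    destination_folder_id: str,
    batch_size: int = 100,
) -> Iterator[Dict[str, List[Dict[str, str]]]]:
    """Yield one hypothetical request body per batch of files to move."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(file_ids), batch_size):
        chunk = file_ids[start:start + batch_size]
        # Assumed payload shape: one item per file, all sharing the same destination.
        yield {
            "items": [
                {"file": file_id, "parent": destination_folder_id}
                for file_id in chunk
            ]
        }


# Three files with a batch size of two produce two request bodies instead of three calls.
batches = list(build_move_batches(["file-1", "file-2", "file-3"], "folder-A", batch_size=2))
print(len(batches))
```

With a larger batch size, moving thousands of files would take a handful of requests rather than one request per file.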
Recently published apps
The following toolkits had their versions updated and were bumped to CWL1.0:
SnpEff
Samtools
Release notes
Automatic billing notifications now sent via email to CGC users
When tasks are run on the CGC, billing group owners (both those using pilot funds and those using standard billing groups) will get automatic email notifications in the following situations:
when they are approaching their spending limit (10% left) and
when the spending limit is reached.
Emails for standard billing group owners will contain information about the limit amount and a link to the corresponding billing group. Automatic billing notifications will also create JIRA tickets for the Seven Bridges Support Team, so they can be informed about these situations.
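The two notification conditions above can be sketched as a simple threshold check. This is only an illustration: the names are hypothetical, and it assumes that "10% left" means the remaining amount is at most 10% of the spending limit.

```python
from enum import Enum


class BillingNotice(Enum):
    NONE = "none"
    APPROACHING_LIMIT = "approaching_limit"
    LIMIT_REACHED = "limit_reached"


def billing_notice(spent: float, limit: float) -> BillingNotice:
    """Classify a billing group's spending against its limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    remaining_fraction = (limit - spent) / limit
    if remaining_fraction <= 0:
        # The limit is reached (or exceeded).
        return BillingNotice.LIMIT_REACHED
    if remaining_fraction <= 0.10:
        # Assumption: the warning fires once 10% or less of the limit is left.
        return BillingNotice.APPROACHING_LIMIT
    return BillingNotice.NONE


print(billing_notice(500.0, 1000.0))   # half of the limit is still available
print(billing_notice(950.0, 1000.0))   # 5% left
print(billing_notice(1000.0, 1000.0))  # nothing left
```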
Release notes
Search by ID through multiple datasets at once
We have improved the existing Search by ID feature by enabling you to perform a search that is applied across all available datasets. To run the search, click Search by ID on the Data Browser’s dataset selection screen. The search returns sets of matched entities from all available datasets and allows you to select an entity (or a combination of entities) to start the Data Browser with. It covers every available UUID and ID belonging to either an entity or a property, while retaining the existing capability of searching by file name.
Release notes
Recently published apps
Several new CWL1.0 apps have been published to the Public Apps Gallery:
New Broad Best Practices workflows: Data Pre-processing and Germline SNPs and Indels variant calling in version 4.1.0.0. These workflows are built according to Broad’s best practices following their WDL scripts, and together they allow for producing analysis-ready BAM files and VCF files with germline mutations.
eQTL analysis workflows: FastQTL and MatrixEQTL. Expression quantitative trait loci (eQTLs) are genomic variants related to variation in expression levels of mRNAs. These loci can be either cis-eQTLs, located in the neighborhood of a gene transcription start site (TSS), or trans-eQTLs, which are distant from it.
NanoStringQCPro 1.10.0: NanoString® has introduced the nCounter technology for direct counting of molecules in samples, which enables direct detection of specific RNA, DNA and protein molecules. It provides highly robust data across clinically relevant samples while reducing hands-on time and simplifying analysis.
Release notes
Added support for Amazon EC2 P3 GPU Instances
We have added support for the Amazon EC2 P3 GPU instance family to the CGC. Amazon EC2 P3 instances deliver high-performance computing in the cloud with up to 8 NVIDIA® V100 Tensor Core GPUs and up to 100 Gbps of networking throughput. These instances deliver up to one petaflop of mixed-precision performance per instance to significantly accelerate machine learning and high-performance computing applications.
Release notes
CGC meets Dockstore
Now you can import CWL workflows from Dockstore.org with a single click. Dockstore is an open platform for sharing Docker-based apps described with the Common Workflow Language (CWL), Workflow Description Language (WDL) or Nextflow, which enables bioinformaticians to share analytical tools that can be executed in a compliant execution environment, such as the CGC. This integration should allow users to have streamlined interoperability between the two platforms without the need to manually port apps by exporting and importing CWL code. Learn more.
Define Compute Resources per Task Run
When creating a task via the visual interface, you are now able to set the top-level instance type and the maximum number of parallel instances for your execution without having to create a new version of the app. Learn more about setting execution hints on task level from our documentation.
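For readers who script their task creation, the same two per-task settings might be represented as in the sketch below. It only builds a dictionary and makes no API calls. The field names (`execution_settings`, `instance_type`, `max_parallel_instances`), the function name, and all placeholder values are assumptions made for the example rather than details confirmed by this note.

```python
from typing import Any, Dict, Optional


def build_task_body(
    name: str,
    project: str,
    app: str,
    inputs: Dict[str, Any],
    instance_type: Optional[str] = None,
    max_parallel_instances: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a hypothetical task description with per-task compute settings."""
    body: Dict[str, Any] = {
        "name": name,
        "project": project,
        "app": app,
        "inputs": dict(inputs),
    }
    settings: Dict[str, Any] = {}
    if instance_type is not None:
        settings["instance_type"] = instance_type
    if max_parallel_instances is not None:
        if max_parallel_instances < 1:
            raise ValueError("max_parallel_instances must be at least 1")
        settings["max_parallel_instances"] = max_parallel_instances
    if settings:
        # Only attach the settings block when something was overridden for this run.
        body["execution_settings"] = settings
    return body


# Placeholder identifiers; none of these refer to real projects or apps.
task = build_task_body(
    name="example-run",
    project="my-username/my-project",
    app="my-username/my-project/my-app",
    inputs={},
    instance_type="c4.2xlarge",
    max_parallel_instances=2,
)
print(task["execution_settings"])
```

Keeping the settings on the task rather than in the app means the app version stays unchanged between runs that need different resources.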
Release notes
Human Cell Atlas Preview Datasets Public Project
Human Cell Atlas Preview Datasets are now available as a public project on the CGC. The project contains files released to the research community within the first three single-cell sequencing datasets as “Human Cell Atlas Preview Datasets”. The available datasets are:
Census of Immune Cells
Ischaemic Sensitivity of Human Tissue
Melanoma Infiltration of Stromal and Immune Cells