When defining task execution settings, you can now enable memoization. Memoization lets the CGC reuse existing results from your previous runs, which can significantly reduce the time and cost of your project workload. It can be enabled at the project or task level; the task-level setting overrides the project-level one.
Learn more from our documentation.
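The precedence rule above (a task-level setting overrides the project-level one) can be illustrated with a minimal sketch. This is not the CGC API: the function name `resolve_memoization` and its parameters are hypothetical. It assumes that a task with no explicit setting falls back to the project-level value.

```python
from typing import Optional


def resolve_memoization(project_setting: bool,
                        task_setting: Optional[bool] = None) -> bool:
    """Return the effective memoization flag for a task.

    Hypothetical helper, not part of the CGC API. A task-level value,
    when explicitly set, takes precedence; otherwise the project-level
    value applies (an assumption made for this sketch).
    """
    if task_setting is not None:
        return task_setting
    return project_setting


# Project enables memoization, task does not specify: project value applies.
print(resolve_memoization(True))
# Project enables memoization, task disables it: task value wins.
print(resolve_memoization(True, False))
```

Using `None` for "not set" keeps an explicit task-level `False` distinguishable from a missing task-level setting.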
Multiple dataset selection for querying in Data Browser
You can now start querying in Data Browser by selecting more than one dataset. This allows you to search the selected datasets by common entities (e.g. Case) and property values (e.g. male for the gender property), and get combined results from all of them. In addition, you can restrict an entity (e.g. Case identifier) to only those instances that are common to all selected datasets, if such instances are available. These can then be filtered further by other entities and properties that the common instances have in the selected datasets.
Additionally, the List (table) view on the Data Browser page has been removed and priority has been given to the Detail page.
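Conceptually, selecting only the instances that are common to all selected datasets amounts to a set intersection. The sketch below illustrates that idea with made-up case identifiers and dataset names. It does not use the Data Browser or any CGC API, and `common_instances` is a hypothetical helper.

```python
from typing import Dict, Set


def common_instances(datasets: Dict[str, Set[str]]) -> Set[str]:
    """Return the identifiers that appear in every selected dataset."""
    if not datasets:
        return set()
    id_sets = list(datasets.values())
    return set.intersection(*id_sets)


# Made-up case identifiers for two hypothetical datasets.
selected = {
    "dataset_a": {"case-1", "case-2", "case-3"},
    "dataset_b": {"case-2", "case-3", "case-4"},
}

shared = common_instances(selected)
print(sorted(shared))  # ['case-2', 'case-3']
```

If the selected datasets share no identifiers, the result is an empty set, which corresponds to the case where no common instances are available.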
Improved organization of Public Reference Files
The Public Reference Files gallery has been renamed to Public Files and split into two categories: Public Reference Files, which holds all common reference files, and Public Test Files, which contains common test samples. Both categories can be accessed from the Data menu on the main menu bar.
This change does not affect the API, so all related API calls remain the same.
Recently published apps
GDC DNASeq Harmonization Workflow
The GDC DNASeq Harmonization Workflow was developed by the National Cancer Institute's Genomic Data Commons. It is used to harmonize genomic data for datasets such as The Cancer Genome Atlas (TCGA) and is publicly available on the CGC.