
Increasing Data Access and Interoperability of the Sequence Read Archive (SRA) through GA4GH Data Repository Service (DRS) API

In the ever-expanding landscape of genomics research, efficient data access and analysis are paramount. Enter the SRA to DRS Converter, a new workflow in the Cancer Genomics Cloud SRA Tools Suite that bridges the gap between the Sequence Read Archive (SRA) and the Data Repository Service (DRS). 

Unlocking 14 Petabytes of Sequencing Data 

The SRA houses over 14 petabytes of invaluable sequencing data relevant to human health and medical research. However, traditional methods of accessing that data involve downloading files using the SRA Toolkit, resulting in redundant copies and significant time overhead. 

Our innovative workflow flips the script. Instead of downloading files locally, researchers can now directly link to the SRA data via a Data Repository Service (DRS) server. Imagine the convenience of pointing to the exact location of your data on Amazon Web Services or Google Cloud, eliminating the need for duplicate copies. 

With the SRA to DRS Converter, you can integrate these data streams into your existing software pipelines. Run analyses, perform variant calling or transcript counting, and extract meaningful insights, all without the hassle of local file management. Researchers no longer have to pay to store duplicate files, which saves researchers and taxpayers money that can be reallocated to funding research.
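For readers curious what "linking" to data through DRS looks like in practice, here is a minimal Python sketch. It is not the SRA to DRS Converter itself. It only illustrates two conventions from the GA4GH DRS specification: a hostname-based `drs://` URI maps to an HTTPS endpoint under `/ga4gh/drs/v1/objects/`, and the returned object lists `access_methods` with a `type` such as `s3` or `gs`. The host name, object ID, bucket names, and the helper functions `drs_uri_to_url` and `pick_access_method` are hypothetical and chosen for illustration.

```python
from urllib.parse import urlparse


def drs_uri_to_url(drs_uri: str) -> str:
    """Map a hostname-based DRS URI to the HTTPS URL of that object's metadata."""
    parsed = urlparse(drs_uri)
    object_id = parsed.path.lstrip("/")
    if parsed.scheme != "drs" or not parsed.netloc or not object_id:
        raise ValueError(f"not a hostname-based DRS URI: {drs_uri!r}")
    return f"https://{parsed.netloc}/ga4gh/drs/v1/objects/{object_id}"


def pick_access_method(drs_object: dict, preferred=("s3", "gs")):
    """Return the first access method whose type matches the preference order."""
    methods = drs_object.get("access_methods", [])
    for wanted in preferred:
        for method in methods:
            if method.get("type") == wanted:
                return method
    return None


# Hypothetical DRS object, shaped like a DRS metadata response.
# The bucket names and the ID are made up for this example.
example_object = {
    "id": "SRR0000001",
    "access_methods": [
        {"type": "gs", "access_url": {"url": "gs://example-bucket/SRR0000001.sra"}},
        {
            "type": "s3",
            "access_url": {"url": "s3://example-bucket/SRR0000001.sra"},
            "region": "us-east-1",
        },
    ],
}

metadata_url = drs_uri_to_url("drs://drs.example.org/SRR0000001")
chosen = pick_access_method(example_object, preferred=("s3", "gs"))
```

A pipeline would fetch `metadata_url`, choose an access method for the cloud it runs in, and read the file in place rather than copying it. Authentication and any `access_id` lookup that a real server may require are left out of this sketch.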

Join us for an enlightening session with discussion:

  • Jared Rozowsky will explore the GA4GH standards that underpin the DRS protocol, providing a deeper understanding of its architecture. 

  • Cera Fisher will demonstrate the SRA to DRS Converter in action, showcasing its utility within real-world workflows. 

About the speakers

Dr. Jared Rozowsky received his PhD and MSc in Biomedical Engineering from the University of Florida, and holds BS degrees in Mathematics and Biomedical Engineering. Jared is the Program Manager for the CAVATICA platform powered by Seven Bridges, a multi-tenant platform that serves the Gabriella Miller Kids First Data Resource Center and the Common Fund Data Ecosystem, among others.

Dr. Cera Fisher received her PhD in Evolutionary Biology from the University of Connecticut and holds MS and BS degrees in Biology and Society from Arizona State University. Cera is a Community Engagement Manager for the Cancer Genomics Cloud. Cera’s research interests include arthropod genomics and transcriptomics, developmental evolution, and bioinformatics education.