DATASETS AVAILABLE ON CGC


The CGC provides access to more than 3 petabytes of publicly available data as part of the CRDC ecosystem, including both open-access and controlled-access data. To provide access to these data, the CGC connects to the Genomic Data Commons (GDC), Proteomic Data Commons (PDC), Integrated Canine Data Commons (ICDC), and Cancer Data Service (CDS). Together, these sources span thousands of files and dozens of disease types. To access any of the following datasets, refer to our guide on datasets.

NATIONAL CANCER INSTITUTE: GENOMIC DATA COMMONS (GDC)

The NCI's Genomic Data Commons (GDC) provides the cancer research community with a unified repository and cancer knowledge base that enables data sharing across cancer genomic studies in support of precision medicine. 

  • The Cancer Genome Atlas (TCGA)  

  • Therapeutically Applicable Research to Generate Effective Treatments (TARGET)  

  • Clinical Proteomic Tumor Analysis Consortium (CPTAC)  

  • Human Cancer Models Initiative (HCMI)

  • Cancer Genome Characterization Initiative (CGCI)

  • Foundation Medicine (FM)  

  • Multiple Myeloma Research Foundation (MMRF)  

  • Genomics Evidence Neoplasia Information Exchange (GENIE)  

  • Applied Proteogenomics Organizational Learning and Outcomes (APOLLO) 


NATIONAL CANCER INSTITUTE: PROTEOMIC DATA COMMONS (PDC)

The objectives of the National Cancer Institute’s Proteomic Data Commons (PDC) are:    

(1) to make cancer-related proteomic datasets easily accessible to the public, and   

(2) to facilitate direct multiomics integration in support of precision medicine through interoperability with accompanying data resources (genomic and medical image datasets).

The PDC was developed to advance our understanding of how proteins help to shape the risk, diagnosis, development, progression, and treatment of cancer. In-depth analysis of proteomic data allows us to study both how and why cancer develops and to devise ways of personalizing treatment for patients using precision medicine. 

  • Clinical Proteomic Tumor Analysis Consortium (CPTAC)

  • International Cancer Proteogenomic Consortium (ICPC)  

  • Children's Brain Tumor Tissue Consortium (CBTTC)  

  • Applied Proteogenomics Organizational Learning and Outcomes (APOLLO)   


NATIONAL CANCER INSTITUTE: INTEGRATED CANINE DATA COMMONS (ICDC)

The Integrated Canine Data Commons (ICDC) was established to further research on human cancers by enabling comparative analysis with canine cancer. Canines share many genes with humans, and these genes are involved in cancers in similar ways. Canines develop spontaneous diseases and respond to treatments as humans do, live in a family environment shared with humans, and, like humans, receive health care and participate in clinical trials. Canines are also of scientific interest because of their accelerated aging process.

The data in the ICDC are structured and queryable according to the ICDC data model. Submitted data are harmonized to maintain the consistency, integrity, and availability of data and metadata for ICDC users and the rest of the CRDC.

All data in the ICDC are open-access and, with appropriate attribution, can be included in publications. In addition to the ICDC’s graphical user interface, there is also an Application Programming Interface (API) that software programs may use to query the data. 
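As a rough illustration of the programmatic access mentioned above, the following Python sketch builds a query request using only the standard library. It rests on several assumptions that are not stated on this page: that the ICDC API accepts GraphQL queries over HTTP POST, that the endpoint URL is the one shown, and that a `study` type exposes the two fields named in the query. Treat all of these as placeholders and confirm them against the ICDC API documentation before use.

```python
import json
import urllib.request

# ASSUMPTION: illustrative endpoint; verify against the ICDC API documentation.
ICDC_GRAPHQL_ENDPOINT = "https://caninecommons.cancer.gov/v1/graphql/"

# ASSUMPTION: the type name `study` and both field names are hypothetical
# examples of what the ICDC data model might expose.
STUDY_QUERY = """
query {
  study {
    clinical_study_designation
    clinical_study_name
  }
}
"""


def build_graphql_request(endpoint, query, variables=None):
    """Build (but do not send) an HTTP POST request carrying a GraphQL query."""
    payload = {"query": query}
    if variables:
        payload["variables"] = variables
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def run_query(request, timeout=30):
    """Send a prepared request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

To try it, pass the result of `build_graphql_request(ICDC_GRAPHQL_ENDPOINT, STUDY_QUERY)` to `run_query`. Building the request separately from sending it makes it possible to inspect the payload before any network call is made.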

Clinical Trial 

  • COTC007B - Preclinical Comparison of Three Indenoisoquinoline Candidates in Tumor-Bearing Dogs 

  • COTC022 - A Contemporaneous Controlled Study of the Standard of Care (SOC) in Dogs with Appendicular Osteosarcoma 

  • UBC01 - Antitumor Activity and Molecular Effects of Vemurafenib in Dogs with BRAF-mutant Bladder Cancer 

Genomics 

  • GLIOMA01 - Comparative Molecular Life History of Spontaneous Canine and Human Gliomas 

  • MGT01 - Molecular Homology and Differences Between Spontaneous Canine Mammary Cancer and Human Breast Cancer 

  • ORGANOIDS01 - Characterization of Healthy and Urothelial Carcinoma Canine Organoids for Applications in Personalized Medicine and Translational Research 

  • OSA01 - A Multi-Platform Sequencing Analysis of Canine Appendicular Osteosarcoma 

  • TCL01 - Whole Exome Sequencing Analysis of Canine Cancer Cell Lines

  • UBC02 - Basal and Luminal Molecular Subtypes in Naturally-Occurring Canine Urothelial Carcinoma Are Associated With Tumor Immune Signatures and Dog Breed 

  • UC01 - Whole Exome Sequencing Analysis of Canine Urothelial Carcinomas without BRAF V595E 

Transcriptomics 

  • NCATS-COP01 - Models for Diagnosis and Treatment of Human Cancers Using Comparative Canine-Human Transcriptomics  


NATIONAL CANCER INSTITUTE: CANCER DATA SERVICE (CDS)

The Cancer Data Service (CDS) is one of several data commons within the Cancer Research Data Commons (CRDC).  

The CDS provides data storage and sharing capabilities for NCI-funded studies that fall under the following categories:  

  • Studies with data that do not match an existing CRDC data commons

  • Studies with data that do not fit current data type criteria and/or the minimum metadata standards for a CRDC data commons

The CDS currently hosts a variety of data types from NCI projects such as the Human Tumor Atlas Network (HTAN), Division of Cancer Control and Population Sciences (DCCPS), and Childhood Cancer Data Initiative (CCDI), as well as data from independent research projects.

  • PHS001287 - CPTAC Proteogenomic Study

  • PHS001437 - Pediatric Preclinical Testing Consortium (PPTC) 

  • PHS001524 - The Genetic Basis of Aggressive Prostate Cancer, The Role of Rare Variation 

  • PHS001554 - Detection of Colorectal Cancer Susceptibility Loci Using Genome-Wide Sequencing 

  • PHS001713 - Development of A Tumor Molecular Analyses Program and Its Use to Support Treatment Decisions (UNCseq™)

  • PHS001787 - Discovery of Colorectal Cancer Susceptibility Genes in High-Risk Families 

  • PHS001819 - Whole Genome Sequencing to Discover Familial Myeloma Risk Genes 

  • PHS001980 - University of Texas PDX Development and Trial Center Grant 

  • PHS002011 - Limited Use Pilot Test Data 

  • PHS002050 - Molecular Pathological Epidemiology of Colorectal Cancer 

  • PHS002250 - CIDR: Discovery, Biology, and Risk of Inherited Variants in Glioma 

  • PHS002305 - Washington University PDX Development and Trial Center (PDXNet) 

  • PHS002371 - Human Tumor Atlas Network (HTAN) 

  • PHS002432 - Wistar PDX Development and Trial Center 

  • PHS002504 - UCSF Database for the Advancement of JMML - Integration of Metadata with “Omic” Data (CCDI) 

  • PHS002517 - Molecular Characterization across Pediatric Brain Tumors and Other Solid and Hematologic Malignancies for Research, Diagnostic, and Precision Medicine (CCDI) 

  • PHS002518 - NGS Panel for Pediatric Malignancies (CCDI) 

  • PHS002529 - Comprehensive Genomic Sequencing of Pediatric Cancer Cases (CCDI) 

  • PHS002599 - Supplement data from Beat AML (acute myeloid leukemia) (CCDI) 

  • PHS002620 - Feasibility and Clinical Utility of Whole Genome Profiling in Pediatric and Young Adult Cancers (CCDI) 

  • PHS002637 - CIDR: The Role of Rare Coding Variation in Prostate Cancer in Men of African Ancestry - RESPOND Project 2

  • PHS002790 - Molecular Characterization Initiative (CCDI) 

  • PHS003111 - Clonal Evolution During Metastatic Spread in High-Risk Neuroblastoma (CCDI)