Divya Sain

Release notes

ICDC data now available for import on the CGC

The Integrated Canine Data Commons (ICDC) is a cloud-based repository of canine cancer data, established to further research on human cancers by enabling comparative analysis with canine cancer. We have implemented a file import system that lets you import ICDC data into your projects using manifest files generated on the ICDC website.

PDC data update

The version of PDC data currently available on the CGC has been updated with the following PDC releases:

  • International Cancer Proteogenomic Consortium - Proteogenomic Characterization of HBV-Related Hepatocellular Carcinoma data, December 2019.

  • Pediatric Brain Tumor Atlas - CBTTC program Pediatric/AYA brain tumor dataset, November 2019.


Bulk moving of files and folders via the API

To help you reduce the number of API calls needed to organize files and folders within projects, we have introduced the option of moving files or folders in bulk from one project location to another. Bulk move is intended to improve API usage and the overall experience for everyone who uses the API to run analyses at scale.
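As a rough illustration of why bulk operations cut down on calls, the sketch below groups move requests into batches and hands each batch to a caller-supplied function. It does not use the real CGC API: `send_batch`, the `file` and `parent` keys, and the default batch size of 100 are all assumptions made for this example.

```python
from typing import Callable, Dict, Iterable, List

MoveItem = Dict[str, str]


def bulk_move(
    file_ids: Iterable[str],
    target_folder_id: str,
    send_batch: Callable[[List[MoveItem]], None],
    batch_size: int = 100,  # assumed limit, not taken from the CGC documentation
) -> int:
    """Send move requests in batches and return the number of calls made."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    batch: List[MoveItem] = []
    calls = 0
    for file_id in file_ids:
        # Hypothetical item shape: which file to move and where to put it.
        batch.append({"file": file_id, "parent": target_folder_id})
        if len(batch) == batch_size:
            send_batch(batch)
            calls += 1
            batch = []
    if batch:  # flush the final, partially filled batch
        send_batch(batch)
        calls += 1
    return calls
```

With 250 files and a batch size of 100, this makes three calls rather than one call per file.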

Recently published apps

The following toolkits have been updated to newer versions and ported to CWL 1.0:

  • SnpEff

  • Samtools


Automatic billing notifications now sent via email to CGC users

When running a task on the CGC, billing group owners (both those using pilot funds and those using standard billing groups) will get automatic email notifications in the following situations:

  • when they are approaching their spending limit (10% left) and

  • when the spending limit is reached.

Emails for standard billing group owners will contain information about the limit amount and a link to the corresponding billing group. Automatic billing notifications will also create JIRA tickets for the Seven Bridges Support Team, so they can be informed about these situations.
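The two notification conditions can be sketched as a small classification function. This is only an illustration of the thresholds described above, not the CGC's actual implementation: the function and enum names are hypothetical, and treating exactly 10% remaining as "approaching" is an assumption.

```python
from enum import Enum


class BillingAlert(Enum):
    NONE = "none"
    APPROACHING_LIMIT = "approaching_limit"
    LIMIT_REACHED = "limit_reached"


def classify_spending(spent: float, limit: float, warn_fraction: float = 0.10) -> BillingAlert:
    """Classify spending against a limit using the 10%-left rule."""
    if limit <= 0:
        raise ValueError("limit must be positive")

    remaining = limit - spent
    if remaining <= 0:
        return BillingAlert.LIMIT_REACHED
    # Assumption: the warning fires when the remaining share is at or below 10%.
    if remaining <= warn_fraction * limit:
        return BillingAlert.APPROACHING_LIMIT
    return BillingAlert.NONE
```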


Search by ID through multiple datasets at once

We have improved the existing Search by ID feature so that a single search is applied across all available datasets. To start a search, click Search by ID on the Data Browser’s dataset selection screen. The search returns sets of matched entities from all available datasets and lets you select an entity (or a combination of entities) to start the Data Browser with. It covers every available UUID and ID, whether it belongs to an entity or a property, while retaining the existing capability of searching by file name.
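The behaviour described above can be pictured with a toy in-memory model. The sketch below is not how the Data Browser is implemented; the entity layout (`id`, `file_name`, `properties`) and the function name are hypothetical and chosen only to show one query being matched across several datasets.

```python
from typing import Any, Dict, List

# Hypothetical entity shape: {"id": ..., "file_name": ..., "properties": {...}}
Entity = Dict[str, Any]


def search_by_id(datasets: Dict[str, List[Entity]], query: str) -> Dict[str, List[Entity]]:
    """Return, per dataset, the entities whose ID, property value, or file name equals `query`."""
    matches: Dict[str, List[Entity]] = {}
    for dataset_name, entities in datasets.items():
        hits = []
        for entity in entities:
            properties = entity.get("properties", {})
            if (
                entity.get("id") == query
                or entity.get("file_name") == query
                or query in properties.values()
            ):
                hits.append(entity)
        if hits:  # only datasets with at least one match are reported
            matches[dataset_name] = hits
    return matches
```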


Recently published apps

Several new CWL 1.0 apps have been published to the Public Apps Gallery:

New BROAD Best Practices workflows: Data Pre-processing and Germline SNPs and Indels Variant Calling, version 4.1.0.0. These workflows are built according to BROAD’s best practices, following their WDL scripts, and together they produce analysis-ready BAM files and VCF files with germline mutations.

eQTL analysis workflows: FastQTL and MatrixEQTL. Expression quantitative trait loci (eQTLs) are genomic variants related to variation in the expression levels of mRNAs. These loci can be either cis (in the neighborhood of a gene's transcription start site, TSS) or trans (distant from it).

NanoStringQCPro 1.10.0: NanoString® has introduced the nCounter technology for direct counting of molecules in samples, which enables direct detection of specific RNA, DNA and protein molecules. It provides highly robust data across clinically relevant samples while reducing hands-on time and simplifying analysis.


Added support for Amazon EC2 P3 GPU Instances

We have added support for the Amazon EC2 P3 GPU instance family on the CGC. P3 instances deliver high-performance compute in the cloud with up to 8 NVIDIA® V100 Tensor Core GPUs and up to 100 Gbps of networking throughput. They deliver up to one petaflop of mixed-precision performance per instance, significantly accelerating machine learning and high-performance computing applications.


CGC meets Dockstore

You can now import CWL workflows from Dockstore.org with a single click. Dockstore is an open platform for sharing Docker-based apps described with the Common Workflow Language (CWL), Workflow Description Language (WDL) or Nextflow. It enables bioinformaticians to share analytical tools that can be executed in a compliant execution environment, such as the CGC. This integration streamlines interoperability between the two platforms, so you no longer need to port apps manually by exporting and importing CWL code. Learn more.

Define Compute Resources per Task Run

When creating a task via the visual interface, you can now set the top-level instance type and the maximum number of parallel instances for your execution without having to create a new version of the app. Learn more about setting execution hints at the task level in our documentation.


Human Cell Atlas Preview Datasets Public Project

Human Cell Atlas Preview Datasets are now available as a public project on the CGC. The project contains the files from the first three single-cell sequencing datasets released to the research community as “Human Cell Atlas Preview Datasets”. The available datasets are:

  • Census of Immune Cells

  • Ischaemic Sensitivity of Human Tissue

  • Melanoma Infiltration of Stromal and Immune Cells


Access task secondary files via the API

You can now use our sevenbridges-python client to access secondary files for task inputs and outputs.

New and improved functionality:

  1. API users can now see exactly which files were used as secondary files for inputs.

  2. The Python client can now retrieve those files with a single call.

  3. All of this is also supported for CWL 1.x tools and workflows, where the secondary files can be defined as JS expressions.

Whole Genome Sequencing - Quality Control - CWL1.0 Workflow

Whole Genome Sequencing - Quality Control - CWL1.0 Workflow is intended as a general-purpose QC flow for users processing WGS data, regardless of the number of samples. It is designed to produce plots that end users can easily inspect visually, as well as structured data output suitable for aggregation and parsing in an automated setup.


Export files to a volume within the same region

It is now possible to mount volumes from all supported cloud providers and regions in read-write (RW) mode on the CGC. Files can be exported only to volumes in the same location (cloud provider and region) as the file being exported, which prevents the export from incurring additional data transfer costs.
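The same-location rule amounts to comparing two (provider, region) pairs. The following is a minimal sketch of that check, assuming locations are represented as simple value objects; the `Location` class and `can_export` function are hypothetical and not part of any CGC client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    provider: str  # illustrative values such as "aws" or "gcp"
    region: str    # illustrative values such as "us-east-1"


def can_export(file_location: Location, volume_location: Location) -> bool:
    """Allow export only when both the cloud provider and the region match."""
    return file_location == volume_location
```

For example, a file in `Location("aws", "us-east-1")` could be exported to a volume in the same location, but not to one in `Location("gcp", "us-west1")`.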


GDC Datasets version update

As of August 7, GDC datasets available through the Data Browser and the API correspond to GDC Data Release 18.


Import Files from the PDC

We have implemented an additional file import system that allows you to import proteomic data into your projects using manifest files generated on the Proteomic Data Commons (PDC) Data Portal. The process consists of two stages:

  • generating a manifest file for the selected data on the PDC Data Portal; and

  • importing the selected data into a project on the CGC, with the help of the generated manifest file.

The process currently works with CPTAC3 data only, as indicated in our documentation.

We have also made PDC CPTAC3 metadata available on the CGC, as a single JSON file available in the Public Files gallery. For more details on how to find the file and use the metadata, please read our short tutorial.


Recently published apps

We have published two additional GDC workflows on the CGC: the mRNA Analysis pipeline and the Tumor-only Variant Calling pipeline.

The mRNA Analysis pipeline performs quantification analysis on raw RNA-Seq data (FASTQs or unmapped BAM files), using STAR for alignment and HTSeq for counting.

The Tumor-only Variant Calling workflow uses GATK4's MuTect2 to call variants on tumor samples. The workflow is used for harmonization of genomic data for datasets such as The Cancer Genome Atlas (TCGA).


GDC Datasets version update

As of July 10, GDC datasets available through the Data Browser and the API correspond to GDC Data Release 17.

CPTAC-3 data release

With this release, controlled-access data from the CPTAC-3 project is available on the CGC for searching and filtering in the Data Browser and through the API.


Supported browsers update

Internet Explorer is no longer a supported browser on the Cancer Genomics Cloud. If you try to access the CGC using Internet Explorer, you will see a message explaining that you are using an unsupported browser and suggesting that you switch to a supported one.

We have also updated the minimum required versions for the supported browsers:


Recently published apps

The following apps have been ported to CWL 1.0 and are now available as CWL 1.0 apps in the Public Apps gallery:

  • Optitype 1.2

  • VEP annotation workflow 90.5

  • Ensembl-VEP 90.5


Writing rate limit-efficient API scripts

We have published new documentation that helps you make your API scripts rate limit-efficient. Code snippets demonstrate the recommended use of the Seven Bridges Python client to minimize API calls for common tasks, including finding projects, iterating over the result sets of queries, importing files from volumes, exporting files to volumes, updating file metadata, copying files between projects, deleting files, and submitting tasks for execution.


Support for Google Cloud Preemptible Instances (beta)

In line with the availability of Spot Instances for AWS-based projects, we are now introducing support for Google Cloud Preemptible Instances in projects based in a Google Cloud location. Like AWS Spot Instances, Preemptible Instances can significantly reduce the cost of your task executions, because they draw on the cloud provider’s spare capacity, which is offered at lower prices than regular on-demand instances.

Learn more from our documentation.


Supported instances update

You can now use next-generation AWS Memory Optimized instances (R5) in task executions and Data Cruncher analyses. R5 instances support the high memory requirements of certain applications to increase performance and reduce latency.

Learn more about supported instance types.


Multi-cloud

If you store your files in the AWS US East (N. Virginia) and/or GCP US West (Oregon) regions, the CGC now allows you to manage all your work from a single space and spin up your chosen computation resources at the location where your data lives.

New CWL web editor is now live

We have released an updated version of our CWL web editor. This release integrates the functionality of our desktop editor, Rabix Composer, with the CGC.
