Release notes
Seven Bridges CLI now available for macOS ARM users
To make sure all of our users have a uniform experience and can make the most of the CGC, the Seven Bridges Command Line Interface (SB CLI) is now also available for macOS ARM (M1/M2) users. The new build allows the growing number of users with Apple computers based on M1 and M2 chips to install the SB CLI and interact with the CGC from the command line. The macOS ARM version of the SB CLI is available for download from the Data Tools section and from the related documentation page.
Release notes
Disabled accounts can now be reactivated in a snap
Accounts that are locked or disabled due to inactivity can now be reactivated by their owners. A new, streamlined flow allows you to initiate the process by logging in with your last used credentials and having a reactivation email sent to your email address. After you click the link in the email and set a new password on the CGC, you will have unrestricted access to your account and data again.
Recently published apps
We have just updated our Public Apps gallery with the following apps:
GATK VariantEval BETA 4.2.5.0, a tool which is used for evaluating variant calls.
GATK FilterMutectCalls 4.2.5.0, a tool which is used to filter somatic SNVs and indels called by Mutect2.
Picard CreateSequenceDictionary 2.25.7, a tool for creating a DICT index file for a sequence.
WARP ExomeGermlineSingleSample 2.4.4, a pipeline for data pre-processing and variant calling in human WES data.
Release notes
DRS import from manifest file now available on the CGC
Expanding on the current feature that enables you to import DRS files by entering DRS URIs, we have now enabled DRS file import using manifest files containing all relevant information to import the files, along with associated metadata. This provides an easy and streamlined way to import a large number of DRS files from different sources such as Seven Bridges academic platforms or external sources.
Recently published apps
We have just published and updated our Public Apps gallery with the BCFtools 1.15.1 toolkit - CWL1.2, containing the following tools:
BCFtools Annotate - edits VCF files, adds or removes annotations.
BCFtools Call - calls SNPs/indels (former “view”).
BCFtools Cnv - calls Copy Number Variations.
BCFtools Concat - concatenates VCF/BCF files from the same set of samples.
BCFtools Consensus - creates consensus sequence by applying VCF variants.
BCFtools Convert - converts VCF/BCF to other formats and back.
Release notes
Recently published apps
We have just published Regenie 3.1.3, a tool used for whole genome regression analysis, in our Public Apps gallery.
Release notes
Recently published apps
We have published the following apps in our Public Apps gallery:
Mosdepth 0.3.3 toolkit: Mosdepth, a tool used for fast depth calculation on WGS, WES or targeted BAM and CRAM files and Mosdepth plot_dist which plots Mosdepth results.
Personal Cancer Genome Reporter 1.0.3 which is used for functional annotation and classification of somatic variants.
Cancer Predisposition Sequencing Reporter 1.0.0 which analyzes cancer-predisposing germline variants.
We have also updated versions and published tools from the following two toolkits: SRA (v3.0.0, CWL1.2) and Salmon (v1.5.2, CWL1.2). Tools that got the update are:
SRA sam-dump that converts SRA data into SAM format. With aligned data, NCBI uses Compression by Reference, which only stores the differences in base pairs between sequence data and the segment it aligns to. The process to restore original data, for example as FASTQ, requires fast access to the reference sequences that the original data was aligned to.
SRA fasterq-dump that converts SRA data into FASTQ format while using temporary files and multi-threading to speed up the extraction.
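The Compression by Reference idea mentioned for SRA sam-dump can be illustrated with a minimal Python sketch. This is not NCBI's actual storage format or the SRA Toolkit API; the function names `compress` and `restore` and the record layout are hypothetical and chosen only for illustration. The sketch handles substitutions only and shows why restoring the original read requires access to the reference sequence.

```python
def compress(read, reference, start):
    """Keep only the positions where the read differs from the reference segment.

    Hypothetical record layout for illustration; substitutions only, no indels.
    """
    segment = reference[start:start + len(read)]
    if len(segment) != len(read):
        raise ValueError("read extends past the end of the reference")
    diffs = [
        (offset, base)
        for offset, (base, ref_base) in enumerate(zip(read, segment))
        if base != ref_base
    ]
    return {"start": start, "length": len(read), "diffs": diffs}


def restore(record, reference):
    """Rebuild the original read; this is impossible without the reference."""
    begin = record["start"]
    bases = list(reference[begin:begin + record["length"]])
    for offset, base in record["diffs"]:
        bases[offset] = base
    return "".join(bases)


reference = "ACGTACGTACGT"
read = "GTACCTAC"  # aligns at position 2 with a single mismatch
record = compress(read, reference, 2)
print(record)
print(restore(record, reference))
```

In this toy example, an eight-base read is reduced to its alignment position, its length, and a single stored difference.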
Release notes
Recently published apps
We have published three VarDict (v1.8.3, CWL1.2) tools and one workflow:
VarDictJava is the VarDict variant caller Java port. It can be used to call SNPs, MNVs, small indels or complex variants from DNA or RNA alignments. VarDictJava can be used for amplicon-based variant calling and supports both single sample and paired sample analysis.
VarDict var2vcf_valid, a CWL tool that takes a VarDict tabular variants file and outputs the variants in VCF format.
VarDict var2vcf_paired, a CWL tool that converts VarDict tabular output to VCF.
VarDict Variant Calling workflow (also VarDict v1.8.3, CWL1.2), which can be used for single sample and paired sample variant calling using VarDictJava starting from WES, WGS or amplicon data.
We have also published the following workflows and a toolkit:
CNVnator Analysis workflow 0.4.1 for CNV calling by doing read-depth (RD) analysis of input BAM files.
CNVpytor workflow 1.1 for CNV/CNA detection and analysis based on read depth and allele imbalance in WGS.
Release notes
Data Cruncher and Interactive Analysis become Data Studio and Interactive Browsers
Data Studio, previously Data Cruncher, is an interactive analysis tool which allows you to explore and visualize data using environments like JupyterLab and RStudio. Previously located under the Interactive Analysis tab, it has now been given a more prominent location in the project navigation by having its own tab located next to Tasks. With the removal of Data Studio from the Interactive Analysis tab, the tab's name has been changed to Interactive Browsers in order to better reflect its contents.
Recently published apps
We have just published an updated version (4.2.5.0) of Mutect2 workflows:
GATK Somatic SNVs and INDELs (Mutect2) 4.2.5.0, a workflow used for somatic short variant calling. It runs on a single tumor-normal pair or on a single tumor sample, and performs additional filtering and functional annotation tasks, and
GATK Create Mutect2 Panel of Normals 4.2.5.0 that creates a panel of normals for use in other GATK workflows. The workflow takes multiple normal sample callsets and passes them to GATK Somatic SNVs and INDELs (Mutect2) 4.2.5.0 in tumor-only mode (although the mode is called tumor-only, normal samples are given as the input). It then collates sites present in two or more samples into a sites-only VCF.
Three apps from the MetaXcan toolkit:
S-PrediXcan for computing associations between omic features and a complex trait starting from GWAS summary statistics.
S-MultiXcan for computing association from predicted gene expression to a trait, using multiple studies for each gene.
MetaMany for serially performing multiple MetaXcan runs on a GWAS study from summary statistics using multiple tissues.
The MetaXcan Workflow for computing associations between omic features and complex traits across multiple tissues. The workflow includes two tools from the MetaXcan framework, MetaMany and S-MultiXcan. It uses summary statistics from a GWAS study and multiple models that predict the expression or splicing quantification.
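The collation step described for GATK Create Mutect2 Panel of Normals (keeping sites present in two or more normal samples) can be sketched in Python as follows. This is a simplified illustration and not the GATK implementation: the function name `collate_sites`, the tuple representation of a site, and the example data are all assumptions made for this sketch.

```python
from collections import Counter


def collate_sites(callsets, min_samples=2):
    """Return sites that appear in at least `min_samples` callsets.

    Each callset is a list of (chrom, pos, ref, alt) tuples for one normal
    sample. Hypothetical helper for illustration, not part of GATK.
    """
    counts = Counter()
    for sites in callsets:
        counts.update(set(sites))  # count each site at most once per sample
    return sorted(site for site, n in counts.items() if n >= min_samples)


# Made-up calls from three normal samples.
normal_callsets = [
    [("chr1", 100, "A", "T"), ("chr2", 50, "G", "C")],
    [("chr1", 100, "A", "T")],
    [("chr2", 50, "G", "C"), ("chr3", 7, "C", "A")],
]
panel = collate_sites(normal_callsets)
print(panel)
```

Sites seen in only one normal sample (here the chr3 site) are left out of the resulting panel.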
Release notes
Recently published apps
We have just published the V-pipe 2.99.2 for SARS-CoV-2 workflow for analyzing high throughput SARS-CoV-2 sequencing data. V-pipe integrates several tools for the analysis of viral high throughput sequencing data. It allows for assessing viral diversity at the level of SNVs, short variant sequences (or local haplotypes), and long-range haplotypes (or global haplotypes).
Release notes
Recently published apps
We have just published the updated 0.7.17 version of BWA MEM Bundle, a well-known tool designed for aligning sequence reads onto a large reference genome, and BWA INDEX, used for indexing the reference sequence as a prior step required for BWA MEM Bundle. Both tools are published in CWL1.2.
Release notes
Recently published apps
We have published the following apps in our Public Apps gallery:
Cyrius (v1.1.1, CWL1.2), a tool that genotypes CYP2D6 in WGS data. It takes WGS BAM or CRAM files and creates a TSV report with CYP2D6 alleles.
Two PharmCAT (v1.6.0, CWL1.2) tools:
PharmCAT VCF Preprocess is a tool that takes a VCF file and prepares it for downstream processing with PharmCAT, and
PharmCAT, a tool for interpreting guideline variants in VCF files.
Two Biobambam2 (v2.0.183, CWL1.2) tools:
Biobambam2 Bamtofastq that converts BAM/CRAM/SAM files to FASTQ format, and
Biobambam2 Bamseqchksum, a tool for calculating hashes for the contents of the provided alignments file.
Release notes
Recently published apps
We have just published the following apps:
An updated version of the SRA Download and Set Metadata workflow (SRA Toolkit 3.0.0) that downloads the metadata associated with an SRA accession via SRA Run Info CGI, downloads the FASTQ files (on-demand instance), and sets the corresponding metadata.
OptiType (v1.3.5, CWL1.2), a tool designed for precision HLA typing from next-generation sequencing data. It is based on the assumption that the correct HLA genotype explains the highest number of mapped reads.
fastENLOC (v1.0, CWL1.2), a tool that enables integrative genetic association analysis of molecular QTL data and GWAS data.
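The assumption behind OptiType, that the correct HLA genotype explains the highest number of mapped reads, can be illustrated with a toy Python sketch. It uses a brute-force search over allele pairs at a single locus and does not reproduce the optimization the tool itself performs. The function name `best_genotype`, the input layout, and the read data are hypothetical.

```python
from itertools import combinations_with_replacement


def best_genotype(read_hits):
    """Pick the allele pair that explains the most reads (brute force).

    `read_hits` maps a read ID to the set of alleles the read maps to.
    Toy single-locus illustration only, not OptiType's actual method.
    """
    alleles = sorted(set().union(*read_hits.values()))

    def explained(pair):
        # A read counts as explained if it maps to either allele in the pair.
        return sum(1 for hits in read_hits.values() if hits & set(pair))

    return max(combinations_with_replacement(alleles, 2), key=explained)


# Made-up read-to-allele mappings.
read_hits = {
    "read1": {"A*01:01"},
    "read2": {"A*01:01", "A*02:01"},
    "read3": {"A*02:01"},
    "read4": {"A*01:01"},
    "read5": {"A*02:01", "A*03:01"},
}
print(best_genotype(read_hits))
```

Here the pair A*01:01 and A*02:01 explains all five reads, while every other pair explains at most four.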
Release notes
Recently published apps
We have just published the following apps in our Public Apps gallery:
TwoSampleMR, a tool that performs Mendelian randomization testing for a given exposure-outcome pair. It is a wrapper around the TwoSampleMR R package and uses summary statistics data for making causal inference.
CCS, a tool that combines multiple subreads of the same SMRTbell molecule and outputs one highly accurate consensus sequence.
lima, a tool used with PacBio single-molecule sequencing data to identify barcode and primer sequences.
PacBio Flowcell Data Processing, a workflow that can be used to process PacBio CCS or CLR data in preparation for variant calling.
PacBio CCS or CLR WGS Variant Calling workflow that can be used to call structural variants in PacBio CCS or CLR data. The workflow can also call small variants in CCS data using Clair3.
Release notes
Recently published apps
We’ve just published AnnotationDbi select and mapIds, a tool that maps one type of ID to another. It is based on Bioconductor annotation data packages.
Release notes
Recently published apps
New apps have been added to the CGC:
Two tools from the Samplot toolkit:
Samplot Plot takes alignment files and coordinates for a region containing the SV call of interest (Chromosome, Start position, and End position) and creates a plot of the SV region.
Samplot Vcf can be used to create visualizations of structural variant calls from a VCF file.
Seven tools from the Smoove toolkit:
Smoove Annotate annotates SV calls with SV quality and gene information from GFF3 files.
Smoove Call calls structural variants with Lumpy and optionally calls svtyper.
Smoove Duphold annotates SV calls in the file based on information from the provided alignment files.
Smoove Genotype runs svtyper in parallel on provided SV inputs.
Smoove Merge merges SV calls from individual files with SV calls and sorts them using svtools.
Smoove Paste squares matching SV calls from individual files to a single joint file with final calls.
Smoove Plot-counts takes a VCF file created by other Smoove tools and plots counts of split and discordant reads before and after filtering.
Upgraded four Sambamba tools to 0.8.1 (and CWL 1.2) and added three new tools:
Sambamba Flagstat generates statistics from read flags in a BAM file.
Sambamba Index creates a BAI or FAI index for the provided input.
Sambamba Markdup can be used to mark or remove duplicate reads from an input BAM file.
Sambamba Merge merges alignments in BAM format.
Sambamba Slice can be used to copy a slice (region) of the coordinate sorted and indexed input file in BAM or FASTA format.
Sambamba Sort sorts alignments in BAM format.
Sambamba View accepts alignments in BAM or SAM format and outputs data in a user-specified format.
Release notes
GDC Datasets version update
As of March 11, 2022, GDC datasets available through the Data Browser and the API correspond to GDC Data Release 31.
Recently published apps
We have added four apps to our public apps gallery:
Single cell RNA-seq velocity analysis with scVelo 0.2.4 workflow that performs preprocessing, marker gene analysis, and velocity analysis of single-cell expression data. It is based on SingleCellExperiment, Seurat, scran, scater, AnnotationHub, scuttle, and scVelo.
Velocyto.py - Velocyto 0.17.17 is a package for the analysis of expression dynamics in single cell RNAseq data. In particular, it enables estimations of RNA velocities of single cells by distinguishing unspliced and spliced mRNAs in standard single-cell RNA sequencing protocols. Velocyto.py is a command line tool (distributed with the package) that is used to generate spliced/unspliced count matrices.
SBG single cell object convertor, a tool that converts single cell data objects between commonly used formats: Seurat, AnnotatedData, and SingleCellExperiment.
Single cell RNA-seq trajectory analysis with slingshot and tradeSeq, a tool that performs single cell trajectory analysis with slingshot 2.0.0, and differential expression testing on inferred trajectories with tradeSeq 1.6.0. Slingshot takes advantage of single cell data principal components analysis (PCA) and clustering to infer probable paths of cell development.
Release notes
Support for Nextflow and WDL workflows available on the CGC
In addition to the significant contributions of Seven Bridges team members to the development of the Common Workflow Language (CWL) and its extensive implementation on the CGC, we are now going a step further and providing support for two more workflow description languages, Nextflow and WDL. This reduces the time needed to bring your apps to the CGC and eliminates the need to convert your Nextflow or WDL code, while still allowing you to use a better interface for running workflows and all other out-of-the-box features in the Seven Bridges ecosystem.
CDS data import updates
The latest update of the CDS data import functionality on the CGC removes the requirement to use a controlled data project as the target project for CDS data import. Controlled data projects are still required for importing controlled data from the CDS, but open access CDS data can now be freely imported into open data projects on the CGC.
Release notes
AWS i3 instances available on all environments
With this update, you can use the newest Amazon EC2 I3 and I3en instances, designed for data-intensive, high-transaction, low-latency workloads. They offer the best price per I/O performance (I3) and the lowest price per GB of SSD instance storage on Amazon EC2 (I3en).
Recently published apps
We have published GATK RNAseq short variant discovery 4.2.0.0 workflow, which represents a CWL implementation of the official GATK best practices workflow given in WDL for RNASeq variant discovery. Starting from an unmapped BAM file, the workflow performs alignment to the reference genome, followed by marking of duplicates, reassigning of mapping qualities, base recalibration, variant calling, and variant filtering.
Release notes
Recently published apps
We have published 10 tools from the GRIDSS module software suite (toolkit) containing tools useful for the detection of genomic rearrangements:
GRIDSS tool, a structural variation caller for Illumina sequencing data. It calls variants based on alignment-guided positional de Bruijn graph genome-wide break-end assembly, split read, and read pair evidence.
GRIDSS Extract Overlapping Fragments is used to extract reads of interest for targeted GRIDSS variant calling.
GRIDSS Annotate VCF Kraken2 adds Kraken2 classifications to single breakend and breakpoint inserted sequences.
GRIDSS Annotate VCF RepeatMasker adds RepeatMasker classifications to inserted sequences.
GRIDSS GeneratePonBedpe aggregates variants from multiple VCFs and counts the number of samples supporting each.
Release notes
Recently published apps
We’ve just published OlinkAnalyze DE, a tool that performs differential expression analysis on Olink Normalized Protein eXpression (NPX) data, and OlinkAnalyze QC that generates a quality control and exploratory analysis report on Olink NPX data.
Release notes
SBFS support for macFUSE 4.x
SBFS is a command-line tool that lets you interact with CGC project files mounted as a local file system. To use SBFS, you need to have the FUSE component installed. While FUSE is part of the Linux kernel, on macOS you need to install FUSE for macOS, which is now called macFUSE. We are now adding support for macFUSE version 4.x. macFUSE 4.0.0 was released in October 2020, which is when the name changed from “FUSE for macOS” to “macFUSE”, and its latest version is macFUSE 4.2.4. Please note that SBFS is available as a BETA tool. It is available only for Linux and macOS, not for the Windows operating system.