Bioinformatics & Data Analysis Services

Our computational pipelines and databases process more than 100 terabases of sequence per month. Our core services include:

development and assistance with experimental design
genetic variant detection
somatic mutation detection
mutation signature analysis
transcriptome analysis (including miRNAs, down to single-cell resolution)
structural variant analysis
copy number variant analysis
de novo genome assembly and annotation
epigenomic (bisulfite and ChIP-seq) analysis
sample tracking and database management

We also publish open-source software and scripts.

Our team of bioinformaticians includes sequence analysts, database experts, IT specialists and quality assessment professionals. Our technology platform is ISO 27001 certified for information security and includes 24 deep learning GPU devices, 30,000+ hyper-threaded cores and 20+ petabytes of storage.

Learn More:

All of our collaborators receive the raw sequence and binary alignment (BAM) files for all in-house sequenced libraries. We also offer alignment and analysis of externally generated libraries.

Options include:

Externally generated libraries

Analysis of externally generated FASTQ files is available upon request, provided that the data are compatible with our pipelines. The data will be trimmed to internal project requirements.

Re-alignment of previously sequenced samples

Alignment to a single reference, as a BAM file, is included for each sequenced library. Additional alignments can be generated if the reference sequence is publicly available.

Custom reference alignment

We support alignment to custom references that are publicly available or that can be provided as a FASTA file, depending on the quality of the file.

Somatic analysis is offered for single-sample, paired tumour/normal and multiple-timepoint designs. We offer a variety of bioinformatic services for these samples.

Single Sample:

SNV and Indels

Run on whole genome, exome or RNA sequencing data, this pipeline detects small variants relative to the reference genome, including single nucleotide variants (SNVs) and insertions/deletions (indels). Variants are annotated with gene and dbSNP information. *Genome, Exome or RNA offered

Copy number variation

Generated from whole genome sequencing, this pipeline calls non-diploid regions in a single genome, using the overall coverage of the genome to calculate a background coverage, and identifying areas with higher or lower than expected coverage. CNV regions are provided. *Genome, Exome or RNA offered
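The coverage-based idea described above can be illustrated with a minimal sketch. It is not the production pipeline: the fixed-size bins, the use of the median as background coverage, the pseudocount and the `log2_threshold` value are all assumptions made for illustration.

```python
import math
from statistics import median


def call_cnv_bins(bin_coverage, log2_threshold=0.4):
    """Label each coverage bin 'gain', 'loss' or 'neutral'.

    bin_coverage: mean read depth per genomic bin (hypothetical input).
    log2_threshold: illustrative cut-off, not a recommended value.
    """
    # Background coverage: the median over all bins, assumed here to
    # represent the diploid state of the genome.
    background = median(bin_coverage)
    if background <= 0:
        raise ValueError("background coverage must be positive")

    calls = []
    for depth in bin_coverage:
        # A small pseudocount avoids log2(0) for bins with no reads.
        ratio = math.log2((depth + 0.5) / (background + 0.5))
        if ratio >= log2_threshold:
            calls.append("gain")
        elif ratio <= -log2_threshold:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls


coverage = [30, 31, 29, 45, 46, 30, 15, 14, 30]
print(call_cnv_bins(coverage))
```

A real caller would normally also correct for GC content and mappability and merge adjacent bins into segments; those steps are omitted here.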

Paired Tumour/Normal:

Loss of heterozygosity (LOH)

This pipeline detects regions that are heterozygous in the normal and homozygous in the matched tumour. It can only be run on whole genome data. LOH regions are provided, along with plots. *Genome Only
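As a rough illustration of the definition above, the following sketch flags individual sites that look heterozygous in the normal and homozygous in the tumour, based on variant allele fractions (VAFs). The input dictionaries, the function names and the VAF cut-offs are hypothetical; a real LOH caller works on regions and accounts for tumour purity.

```python
def is_heterozygous(vaf, low=0.3, high=0.7):
    """Assumed heterozygous if the allele fraction is near 0.5."""
    return low <= vaf <= high


def is_homozygous(vaf, margin=0.1):
    """Assumed homozygous if the allele fraction is near 0 or 1."""
    return vaf <= margin or vaf >= 1.0 - margin


def loh_sites(normal_vaf, tumour_vaf):
    """Return sites heterozygous in the normal and homozygous in the tumour.

    Both arguments map (chromosome, position) to an allele fraction.
    """
    sites = []
    for site in sorted(normal_vaf):
        if site not in tumour_vaf:
            continue  # no tumour data at this site, so it cannot be assessed
        if is_heterozygous(normal_vaf[site]) and is_homozygous(tumour_vaf[site]):
            sites.append(site)
    return sites


normal = {("chr1", 100): 0.48, ("chr1", 200): 0.52, ("chr1", 300): 0.99}
tumour = {("chr1", 100): 0.95, ("chr1", 200): 0.50, ("chr1", 300): 1.00}
print(loh_sites(normal, tumour))
```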

SNV and Indels

This pipeline detects SNVs and indels that are present in one library of a pair but absent from the other, using whole genome or exome sequencing.
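At its simplest, the comparison between two libraries amounts to a set difference over variant keys. The sketch below assumes each call is represented as a `(chromosome, position, ref, alt)` tuple, which is an illustrative choice; a production somatic caller would also weigh read support and coverage in the other library before declaring a variant absent.

```python
def private_variants(library_a, library_b):
    """Return (only_in_a, only_in_b) for two lists of variant calls.

    Each call is a (chromosome, position, ref, alt) tuple; this
    representation is an assumption made for the example.
    """
    calls_a = set(library_a)
    calls_b = set(library_b)
    return sorted(calls_a - calls_b), sorted(calls_b - calls_a)


tumour_calls = [("chr1", 100, "A", "T"), ("chr2", 500, "G", "GA")]
normal_calls = [("chr1", 100, "A", "T"), ("chr3", 42, "C", "G")]

tumour_only, normal_only = private_variants(tumour_calls, normal_calls)
print("tumour only:", tumour_only)
print("normal only:", normal_only)
```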

Copy number variation

This pipeline detects regions of copy number change between a normal and matched tumour. It can be run on genomes and exomes, although results are generally noisier with exomes and with FFPE samples. Regions of copy number change are provided, along with gene annotation and plots. *Genome or Exome

Multiple timepoint:

Targeted analysis of specific variants

This pipeline checks a user-specified list of SNVs, indels or gene fusions to confirm whether each variant is present in a sample. It can be run on genome, exome or transcriptome data.

For cancer immunogenetics the GSC offers the following:

HLA Typing
Neoantigen prediction
Tumour cell type abundance
Clonotype analysis of T and B cell receptor repertoire

Please contact us for more information.

The epigenetic analysis we offer is methylation analysis.

Methylation analysis

This pipeline aligns bisulphite-treated genome libraries and reports the methylation status at each base. Alignment and conversion QC metrics are also provided.
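Per-base methylation status is commonly summarized as a fraction. After bisulphite treatment, unmethylated cytosines are read as T while methylated cytosines remain C, so the methylation level at a cytosine can be estimated as C reads divided by C plus T reads. The sketch below assumes the read counts have already been extracted from the alignment; the function name and inputs are hypothetical.

```python
def methylation_level(c_reads, t_reads):
    """Estimate the methylation fraction at one cytosine position.

    c_reads: reads showing C (protected from conversion, i.e. methylated).
    t_reads: reads showing T (converted, i.e. unmethylated).
    Returns None when the position has no coverage.
    """
    total = c_reads + t_reads
    if total == 0:
        return None
    return c_reads / total


print(methylation_level(18, 2))
print(methylation_level(0, 0))
```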

We offer several pipelines for structural variant (SV) analysis, including copy number change detection, alignment-based and assembly-based SV calling, and consensus SV calling.

Copy number changes

Alignment-based SV Calling

Can be performed on RNAseq or genomic reads.
This pipeline performs alignment-based SV calling on genome data. Events called include translocations, inversions, deletions, duplications and small insertions.

Assembly-based SV calling

Can be performed on RNAseq or genomic reads.
This pipeline performs de novo sequence assembly on the RNAseq reads (ABySS). The assembled contigs are used to call structural rearrangements including novel transcripts, alternative splicing, large scale rearrangements and fusions, small scale indels, ITDs and PTDs (RNAseq) (trans-ABySS).

Consensus SV calling

This pipeline compares SV calls from multiple samples from the same patient. Examples include identification of somatic SVs present in the tumour and absent from the matched normal, expressed SVs present in both RNA and the matched genome, or novel rare SVs present in a child but absent from both parents.
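The tumour-versus-normal case can be sketched as follows. Because breakpoints for the same event rarely match to the base across samples, the example treats two calls as the same event when their type and chromosomes agree and both breakpoints fall within a tolerance. The `SV` record, the function names and the 100 bp tolerance are assumptions for illustration only.

```python
from typing import NamedTuple


class SV(NamedTuple):
    """Minimal, hypothetical representation of a structural variant call."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int
    sv_type: str


def same_event(a, b, tolerance=100):
    """True if two calls plausibly describe the same rearrangement."""
    return (
        a.sv_type == b.sv_type
        and a.chrom1 == b.chrom1
        and a.chrom2 == b.chrom2
        and abs(a.pos1 - b.pos1) <= tolerance
        and abs(a.pos2 - b.pos2) <= tolerance
    )


def somatic_svs(tumour_calls, normal_calls, tolerance=100):
    """Keep tumour calls that have no matching call in the normal."""
    return [
        t for t in tumour_calls
        if not any(same_event(t, n, tolerance) for n in normal_calls)
    ]


tumour = [
    SV("chr9", 133_600_000, "chr22", 23_290_000, "translocation"),
    SV("chr1", 1_000_050, "chr1", 1_005_020, "deletion"),
]
normal = [SV("chr1", 1_000_000, "chr1", 1_005_000, "deletion")]
print(somatic_svs(tumour, normal))
```

The same matching function could be reused for the trio case by requiring that a child's call has no match in either parent.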

Our expression analysis options cover RNA gene- and exon-level quantification, RNA transcript isoform-level quantification, miRNA expression quantification and novel miRNA gene prediction. We also offer other custom analysis of RNA and miRNA expression data.

RNA gene and exon level quantification

This pipeline calculates normalized coverage (RPKM) at gene and exon level. QC metrics include gene diversity, 5'/3' bias, and strand-specificity.
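For reference, RPKM (reads per kilobase of feature per million mapped reads) follows the standard formula shown in this short sketch. The function name and example numbers are illustrative and do not reflect how the pipeline itself counts reads.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling.
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# 500 reads on a 2,000 bp gene in a library of 10 million mapped reads.
print(rpkm(500, 2_000, 10_000_000))
```

The RPM values reported for miRNAs use the same idea without the length term: read count multiplied by one million and divided by the total mapped reads.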

RNA transcript isoform level quantification

This pipeline calculates normalized coverage of all known transcripts from the raw sequence data. The pipeline does not detect novel isoforms.

miRNA expression quantification

This pipeline calculates normalized coverage of known miRNAs (RPM).

miRNA novel gene prediction

This pipeline identifies possible novel miRNAs that are not found in public databases.

Differential expression and other custom analysis of RNA and miRNA expression data

Custom analysis of RNAseq data beyond expression quantification on individual libraries may include:

Differential expression between two individual libraries or two groups of libraries to identify up- or down-regulated genes.
Clustering of samples based on gene expression to identify possible subgroups.
GO term annotation of significantly up- and down-regulated genes to identify dysregulated pathways.
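The comparison between two groups of libraries can be illustrated with a log2 fold change on normalized expression values. This sketch covers effect size only: a real differential expression analysis also needs a statistical test and multiple-testing correction, and the pseudocount and the cut-off of 1.0 are arbitrary choices for the example.

```python
import math
from statistics import mean


def log2_fold_change(group_a, group_b, pseudocount=1.0):
    """log2 of mean expression in group_b over group_a.

    The pseudocount keeps the ratio defined when a mean is zero.
    """
    return math.log2((mean(group_b) + pseudocount) / (mean(group_a) + pseudocount))


def classify_gene(group_a, group_b, min_abs_lfc=1.0):
    """Label a gene 'up', 'down' or 'unchanged' in group_b relative to group_a."""
    lfc = log2_fold_change(group_a, group_b)
    if lfc >= min_abs_lfc:
        return "up"
    if lfc <= -min_abs_lfc:
        return "down"
    return "unchanged"


# Hypothetical normalized expression values: (control libraries, treated libraries).
expression = {
    "GENE_A": ([7, 7], [31, 31]),
    "GENE_B": ([15, 15], [3, 3]),
    "GENE_C": ([10, 10], [11, 11]),
}
for gene, (control, treated) in expression.items():
    print(gene, classify_gene(control, treated))
```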

We provide a number of other bioinformatic services including genome assembly, germline analysis and many others. We are also able to submit data to public repositories.

Genome Assembly

non-reference organism
human genome assembly

Germline Analysis

Single sample
Trio analysis
Pedigree calling

Microbial characterization and classification

This pipeline calculates normalized (RPM) levels of known microbes, including bacteria, viruses and fungi. It can be run on genome, exome or RNAseq data. Quantification of user-specified microbial species that are not on our production list carries an additional cost for customizing our analysis.
In addition to quantification of standard microbial species, custom analysis can be done to investigate unclassified microbial content or to detect integration into the human genome using assembly-based methods.

Complex tissue cell characterization

This pipeline uses expression analysis to identify the cell composition of a tissue. Results can be compared and plotted against external data (e.g. TCGA cancer types).

We can help facilitate the submission of data to public repositories including EGA, SRA, dbGaP, DCC and ICGC ARGO.

Questions?

Didn't find what you're looking for?

Please feel free to contact us at info@bcgsc.ca and we will get back to you shortly.


Canada’s Michael Smith Genome Sciences Centre respectfully acknowledges that we operate on the traditional, ancestral and unceded territories of the xʷməθkwəy̓əm (Musqueam), Səl̓ílwətaʔ/Selilwitulh (Tsleil-Waututh), and Skwxwú7mesh (Squamish) nations who have cared and nurtured this land for all time. We give thanks, as uninvited guests, to be able to live and work on these lands.
