Bioinformatic Services

Analysis modules offered by the Bioinformatics Platform at the GSC are described below. Additional analysis is available upon request. Please contact us for more information.

Analysis

Description

Standard analysis, included with data generated at the GSC

The cost of Illumina sequencing includes a binary alignment (BAM) file for every sequenced library. Scripts are available for download to convert BAM files to FASTQ files for independent off-site alignment. If the UCSC Genome Browser supports the reference genome used for the alignment, additional files for visualizing the alignment results in the browser are available on request at no additional charge. These file types include 'wig' (and/or 'bigwig'), 'bedgraph' (and/or 'bigbedgraph') and 'bed' (and/or 'bigbed') formats. Additional support (i.e. requests for additional visualization files) is available for up to 45 days after the data are posted.

Genome/exome single nucleotide variants (SNVs) analysis

Single sample analysis: This pipeline detects small variants, namely SNVs and insertions and deletions (indels), in the sequence data as compared to the reference genome.

Paired sample analysis: This pipeline detects SNVs and indels that are present in one library but not in the other (e.g. tumour vs. matched normal comparison). The resulting variants are reported in four plain text files in VCF format: two files for SNVs and two files for indels. In each pair, one file lists all the somatic variants identified and the other lists only the very high confidence variants. Annotation is provided for the high confidence files.
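
The idea of "present in one library but not in the other" can be illustrated with a simple set difference over VCF records. This is only a sketch and not the GSC pipeline: real somatic callers model both samples jointly rather than subtracting call sets. The function names and the example records are hypothetical.

```python
def variant_keys(vcf_text):
    """Return the set of (chrom, pos, ref, alt) tuples found in VCF-formatted text."""
    keys = set()
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        for allele in alt.split(","):  # one key per alternate allele
            keys.add((chrom, int(pos), ref, allele))
    return keys


def tumour_only(tumour_vcf, normal_vcf):
    """Variants called in the tumour but absent from the matched normal calls."""
    return sorted(variant_keys(tumour_vcf) - variant_keys(normal_vcf))


# Hypothetical, made-up calls for illustration only.
tumour = "#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\nchr2\t200\t.\tC\tT\n"
normal = "#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\n"

print(tumour_only(tumour, normal))  # [('chr2', 200, 'C', 'T')]
```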

Transcriptome (RNA-Seq) SNV analysis

This pipeline detects single nucleotide variants in RNA-Seq sequence data as compared to the reference genome.

Both unfiltered and confidence-filtered SNV lists are provided, in tab delimited plain text and vcf formats, respectively.

If the sequencing data is from a human sample, the filtered SNVs are also annotated to describe whether the event occurs within a known transcript, whether it is known to occur in the general population (dbSNP membership), and whether it is listed in the Catalogue Of Somatic Mutations In Cancer (COSMIC).

Note: RNA-Seq indel detection is also available and is similar to the genome/exome indel detection described above.

Loss of heterozygosity (LOH) and copy number variation (CNV)


A healthy human genome should have two non-identical copies of chromosomes 1 to 22. However, in some regions, the information from one of the two copies may be lost. These regions are referred to as LOH (Loss of Heterozygosity) regions. The coordinates for the regions where LOH is observed will be provided in a text file.

The CNV pipeline compares two samples (e.g. tumour and matched normal) and detects any areas of the DNA with a variation in copy number (i.e. deletions or amplifications). The results are provided in a text file.

Plots of LOH and CNV by chromosome are also provided.
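
A common way to express copy number differences between two samples is the log2 ratio of their depth-normalized coverage per region. The sketch below illustrates that idea only; it is not the GSC pipeline, and the bin depths, the pseudocount and the 0.5 threshold are assumptions chosen for the example.

```python
import math


def log2_ratios(tumour_depth, normal_depth, pseudocount=1.0):
    """Per-bin log2(tumour/normal) after scaling each sample by its total depth."""
    t_total = sum(tumour_depth)
    n_total = sum(normal_depth)
    ratios = []
    for t, n in zip(tumour_depth, normal_depth):
        t_norm = (t + pseudocount) / t_total  # pseudocount avoids log of zero
        n_norm = (n + pseudocount) / n_total
        ratios.append(math.log2(t_norm / n_norm))
    return ratios


def classify(ratio, threshold=0.5):
    """Label a bin using an arbitrary, illustrative threshold."""
    if ratio >= threshold:
        return "amplification"
    if ratio <= -threshold:
        return "deletion"
    return "neutral"


# Hypothetical read depths for four genomic bins.
tumour_bins = [100, 200, 50, 50]
normal_bins = [100, 100, 100, 100]

calls = [classify(r) for r in log2_ratios(tumour_bins, normal_bins)]
print(calls)  # ['neutral', 'amplification', 'deletion', 'deletion']
```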

Coverage analysis (aka gene expression quantification)


 

This pipeline reports coverage of exons, introns, transcripts, and whole genes based on publicly available annotations. The organism must be specified. Splicing and other transcript isoform-level reporting are not supported. The result is a text file of raw and normalized coverage measures:

  • raw read counts
  • average coverage depth
  • RPKM (reads per kilobase of transcript per million mapped reads)
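
The RPKM measure mentioned above normalizes a raw read count by both feature length and sequencing depth. A minimal sketch of the standard formula follows; the function name and the example numbers are illustrative and are not taken from the GSC pipeline.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads.

    RPKM = read_count * 10^9 / (feature_length_bp * total_mapped_reads)
    """
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads on a 2,000 bp gene in a library of 20 million mapped reads.
print(rpkm(500, 2000, 20_000_000))  # 12.5
```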

QC - RNA coverage metrics

This pipeline reports coverage information for an RNA alignment including number of genes with coverage above various thresholds, 5'/3' bias, and strand-specificity. The result is a text file.

Exon-exon junction support

This pipeline can be run on RNA-Seq alignments generated internally. Using the list of genes used to create the RNA-Seq alignment reference, it reports the number of reads with large-gapped alignments that span an intron. The result is a text file.

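
In SAM/BAM alignments, a read that spans an intron carries an N operation (skipped reference region) in its CIGAR string, so junction-spanning reads can be counted by inspecting CIGAR strings. The sketch below shows that idea only. The `min_gap` parameter and the example CIGAR strings are hypothetical, and the criteria used by the GSC pipeline are not described here.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM specification.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def spans_intron(cigar, min_gap=1):
    """True if the CIGAR has an N (skipped reference) operation of at least min_gap bases."""
    return any(op == "N" and int(length) >= min_gap
               for length, op in CIGAR_OP.findall(cigar))


def count_junction_reads(cigars, min_gap=1):
    """Count the reads whose alignment skips over at least one intron-sized gap."""
    return sum(spans_intron(c, min_gap) for c in cigars)


# Hypothetical CIGAR strings: ungapped, one junction, small deletion, two junctions.
cigars = ["75M", "40M1200N35M", "30M2D45M", "20M500N30M800N25M"]

print(count_junction_reads(cigars))  # 2
```
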
In development: RNA-Seq isoform-level expression analysis

Information on coverage of exons and introns based on publicly available annotations.

RNA-Seq differential expression (DE) analysis

*Custom request only

This pipeline compares gene expression between two individual RNA-Seq libraries or two groups of libraries. It is a custom analysis that can be modified to suit the needs of the researcher, and can include:

  • coverage analysis for all samples.
  • case vs. control comparisons to identify the genes whose expression differs most significantly between the two groups.
  • clustering of samples based on gene expression to identify possible subgroups.
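
As a rough illustration of a case vs. control comparison, genes can be ranked by the log2 fold change of their mean expression between the two groups. This sketch ranks by effect size only and does not assess statistical significance, which a real differential expression analysis would do with a proper statistical model. The gene names, sample names, values and pseudocount are all hypothetical.

```python
import math
from statistics import mean


def log2_fold_change(case_values, control_values, pseudocount=1.0):
    """log2 of the ratio of group means; the pseudocount guards against zero expression."""
    return math.log2((mean(case_values) + pseudocount) /
                     (mean(control_values) + pseudocount))


def rank_genes(expression, case_samples, control_samples):
    """Sort genes by absolute log2 fold change, largest first.

    expression maps gene -> {sample name -> normalized expression value}.
    """
    changes = {}
    for gene, by_sample in expression.items():
        case = [by_sample[s] for s in case_samples]
        control = [by_sample[s] for s in control_samples]
        changes[gene] = log2_fold_change(case, control)
    return sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical normalized expression values for two case and two control libraries.
expression = {
    "geneA": {"case1": 7, "case2": 7, "ctrl1": 1, "ctrl2": 1},
    "geneB": {"case1": 3, "case2": 3, "ctrl1": 3, "ctrl2": 3},
    "geneC": {"case1": 0, "case2": 0, "ctrl1": 7, "ctrl2": 7},
}

ranked = rank_genes(expression, ["case1", "case2"], ["ctrl1", "ctrl2"])
print(ranked)  # [('geneC', -3.0), ('geneA', 2.0), ('geneB', 0.0)]
```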

miRNA expression analysis


This pipeline performs expression analysis by aligning the reads of an miRNA library and measuring expression levels. The result is a tab delimited expression matrix of all miRNAs in miRBase and all samples.

Example:

                            sample1  sample2  sample3
hsa-mir-144.MIMAT0000436          3        4       25
hsa-mir-144.MIMAT0004600         86       48      340
hsa-mir-145.MIMAT0000437      32506    96425    50975
hsa-mir-145.MIMAT0004601        312     1684      846
...
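
A tab delimited matrix like the example above can be loaded with Python's standard `csv` module. This is a sketch under assumptions: it supposes that the first row holds the sample names (with or without a leading label cell) and that the first column holds the miRNA identifier. The exact layout of the delivered file should be checked before relying on this.

```python
import csv
import io


def read_expression_matrix(handle):
    """Parse a tab-delimited matrix into {mirna_id: {sample: count}}."""
    rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    header, data = rows[0], rows[1:]
    n_samples = len(data[0]) - 1   # every data row is an ID followed by one count per sample
    samples = header[-n_samples:]  # tolerate a header with or without a leading label cell
    return {row[0]: dict(zip(samples, map(int, row[1:]))) for row in data}


# Two rows from the example above, written out as tab-delimited text.
text = (
    "\tsample1\tsample2\tsample3\n"
    "hsa-mir-144.MIMAT0000436\t3\t4\t25\n"
    "hsa-mir-145.MIMAT0000437\t32506\t96425\t50975\n"
)

matrix = read_expression_matrix(io.StringIO(text))
print(matrix["hsa-mir-145.MIMAT0000437"]["sample2"])  # 96425
```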

miRNA novel gene prediction

This pipeline identifies possible novel miRNAs not found in public databases.

miRNA differential expression (DE) analysis

*Custom request only


This pipeline compares miRNA expression between two individual libraries or two groups of libraries. It is a custom analysis that can be modified to suit the needs of the researcher, and can include:

  • coverage analysis for all samples.
  • case vs. control comparisons to identify the miRNAs whose expression differs most significantly between the two groups.
  • clustering of samples based on miRNA expression to identify possible subgroups.

Transcriptome assembly (Trans-ABySS)

 

This pipeline aligns assembled contigs to a specified reference genome. The process includes annotation of novel transcripts, splicing, untranslated regions (UTRs), exons/introns, large scale rearrangements and fusions, small scale indels, internal tandem duplications (ITDs) and partial tandem duplications (PTDs). The result is a custom file.

Genome assembly

 

This pipeline aligns assembled contigs to a specified reference genome. The process includes identification of large scale rearrangements and fusions and small scale indels. The result is a custom file.

Genome validator

 

This pipeline compares transcriptome and/or genome rearrangements with normal or other related BAM files to identify orthogonal support for events. The libraries for comparison must be specified, and their BAM files must be available. The result is a custom file. Example: tumour transcriptome data compared to tumour, normal and FFPE (archival) genome data.

Microbial detection

This pipeline can be added to a transcriptome or genome assembly analysis. It assembles RNA-Seq reads to identify possible microbial content. The result is a custom file and can include:

  • Detection of standard identified microbes
  • Detection of user specified microbes
  • Investigation of non-human reads not matching to standard/specified microbe species (on request)

Chromatin immunoprecipitation sequencing (ChIP-seq) analysis

This pipeline detects 'peaks' in the alignment of ChIP-seq data to the genome. Peaks are areas where the target proteins are likely to be interacting with the DNA in a cell. The output is a peaks file and a bigwig file, which can be used to visualize the peaks in a genome browser.

In Development: Analysis of bisulphite (bisulfite) genomes

Provides QC metrics, such as bisulphite conversion rates, and variant calling.
Page last modified Jun 11, 2015