Targeted Assembler of [short] Sequence Reads

Project Description


TASR is a genomics application that allows hypothesis-driven interrogation of genomic regions (sequence targets) of interest. For assembly, it only considers NGS reads that have the potential to overlap the input sequence targets.
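The read selection described above can be illustrated with a minimal Python sketch. This is not TASR's own code (TASR is written in Perl): the function names, the k-mer length of 15, and the rule "keep a read if it shares at least one k-mer with a target on either strand" are assumptions chosen for illustration only.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def target_kmers(targets, k=15):
    """Collect every k-mer found in the targets, on both strands."""
    kmers = set()
    for target in targets:
        target = target.upper()
        for strand in (target, reverse_complement(target)):
            for i in range(len(strand) - k + 1):
                kmers.add(strand[i:i + k])
    return kmers


def recruit_reads(reads, kmers, k=15):
    """Keep only reads that share at least one k-mer with a target.

    Hypothetical recruitment rule: reads with no shared k-mer are
    discarded and never reach the assembly step.
    """
    recruited = []
    for read in reads:
        read = read.upper()
        if any(read[i:i + k] in kmers for i in range(len(read) - k + 1)):
            recruited.append(read)
    return recruited
```

Because reads unrelated to the targets are dropped before assembly, the working set stays small no matter how large the input data set is.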


Applications of TASR include, but are not limited to, detecting expressed transcripts in cancer sample RNA-Seq data, profiling fusion transcripts or transcripts with variant bases, investigating chromosomal breakpoints in WGS data, and predicting HLA types.


It has enormous potential for comprehensively interrogating very large data sets, such as those from The Cancer Genome Atlas (TCGA).  For instance, with TASR, we profiled a novel lncRNA in over 7600 RNA-Seq bam files representing as many samples from 27 cancer types in less than 7 days (160 simultaneous processes at a time in a shared cluster environment) on modest hardware (total computing power of 974 cores, 79 IBM X3550 M2 Intel Xeon E5430, Sun X2270 M2 Intel Xeon X5355 and Sun X2200 M2 AMD Opteron 2354 computers averaging 2.5 GHz and 16GB RAM each, networked using Gigabit Ethernet infrastructure).  This is due in part to TASR's low memory requirements and speedy assembly execution (reading a .bam file with samtools took ~7h on average, while the assembly took 0.5 seconds).
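As a rough illustration of the .bam reading step mentioned above, the following Python sketch streams read sequences out of `samtools view`. It is an assumption about how such a step could be wired up, not a description of TASR's own input handling. The function names are hypothetical. The field positions (FLAG in column 2, SEQ in column 10) and the flag values for secondary (0x100) and supplementary (0x800) alignments come from the SAM format.

```python
import subprocess

# Skip secondary and supplementary alignments so that each read
# sequence is reported only once.
SKIP_FLAGS = 0x100 | 0x800


def sam_line_to_seq(line):
    """Return the read sequence from one SAM record, or None to skip it."""
    if line.startswith("@"):  # header line
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:  # SAM records have 11 mandatory fields
        return None
    flag, seq = int(fields[1]), fields[9]
    if flag & SKIP_FLAGS or seq == "*":
        return None
    return seq


def stream_bam_reads(bam_path):
    """Yield read sequences from a BAM file, one record at a time.

    Assumes the samtools executable is available on PATH.
    """
    proc = subprocess.Popen(
        ["samtools", "view", bam_path],
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        for line in proc.stdout:
            seq = sam_line_to_seq(line)
            if seq is not None:
                yield seq
    finally:
        proc.stdout.close()
        proc.wait()
```

Streaming the records this way keeps memory use independent of the size of the .bam file, since only one record is held at a time.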


TASR, which is a specialized de novo targeted assembler utility derived from our SSAKE assembler (http://www.bcgsc.ca/platform/bioinfo/software/ssake), is the engine behind HLAminer (http://www.bcgsc.ca/platform/bioinfo/software/hlaminer), the first published technology for obtaining HLA predictions directly from NGS shotgun datasets (WGS, RNA-Seq, Exon capture).


The concept of targeted assembly implemented in TASR is illustrated in this video: www.youtube.com/watch?v=j-g8Geh5ST8

TASR is implemented in Perl and runs on any platform where Perl is installed.


About the author: www.renewarren.ca 

If you use TASR in your research, please cite:

Warren RL, Holt RA. 2011. Targeted Assembly of Short Sequence Reads. PLoS ONE 6(5): e19816. doi:10.1371/journal.pone.0019816

Current Release
TASR 1.6.1

Released Aug 05, 2016

Support for compressed sequence reads files

All Releases

Version | Released     | Description | Compatibility | Licenses | Status
1.6.1   | Aug 05, 2016 | Support for compressed sequence reads files | | GPL | final
1.6     | Jan 01, 2016 | Initial Bloom filter implementation. Better prefix-tree handling. Whole pair recruitment leading to longer, more specific contigs. | | GPL | final
1.5.1   | Dec 24, 2013 | fixed TASR for Perl >= 5.16.0, where deprecated getopts.pl has been removed. | | GPL |

Project Resources

Project owner: Rene Warren