ABySS

Assembly By Short Sequences - a de novo, parallel, paired-end sequence assembler

Project Description

ABySS is a de novo, parallel, paired-end sequence assembler that is designed for short reads. The single-processor version is useful for assembling genomes up to 100 Mbases in size. The parallel version is implemented using MPI and is capable of assembling larger genomes.

To assemble transcriptome data, see Trans-ABySS.

Publications

  • ABySS: A parallel assembler for short read sequence data. Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I. Genome Research, June 2009.

  • De novo Transcriptome Assembly with ABySS. İnanç Birol, Shaun D Jackman, Cydney Nielsen, Jenny Q Qian, Richard Varhol, Greg Stazyk, Ryan D Morin, Yongjun Zhao, Martin Hirst, Jacqueline E Schein, Doug E Horsman, Joseph M Connors, Randy D Gascoyne, Marco A Marra and Steven JM Jones. Bioinformatics, June 2009.

  • De novo assembly and analysis of RNA-seq data. Gordon Robertson, Jacqueline Schein, Readman Chiu, Richard Corbett, Matthew Field, Shaun D Jackman, Karen Mungall, Sam Lee, Hisanaga Mark Okada, Jenny Q Qian, Malachi Griffith, Anthony Raymond, Nina Thiessen, Timothee Cezard, Yaron S Butterfield, Richard Newsome, Simon K Chan, Rong She, Richard Varhol, Baljit Kamoh, Anna-Liisa Prabhu, Angela Tam, YongJun Zhao, Richard A Moore, Martin Hirst, Marco A Marra, Steven J M Jones, Pamela A Hoodless and İnanç Birol. Nature Methods, October 2010.

Current Release
ABySS 1.5.2

Released Jul 10, 2014

In this release we introduce Konnector, a fast and memory-efficient tool that fills the gap between paired-end reads. Konnector determines the intervening sequence by building a Bloom filter de Bruijn graph and searching for paths between the paired-end reads within the graph. A companion tool, abyss-bloom, can be used to construct reusable Bloom filter files for input to Konnector; otherwise, Konnector builds an in-memory Bloom filter for one-time use. In addition to Konnector, we have fixed bugs related to compiling with GCC 4.8+ and parsing SAM files output by BWA.
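
The idea of searching a Bloom filter de Bruijn graph for a path between two reads can be illustrated with a small Python sketch. This is not Konnector's implementation: the `BloomFilter` class, the `connect` function, the SHA-256 based hashing, and the toy sequences are all assumptions made for illustration, and the sketch ignores reverse complements, sequencing errors, and multiple candidate paths.

```python
import hashlib
from collections import deque


class BloomFilter:
    """Toy Bloom filter: a bit array probed by several salted hashes (hypothetical)."""

    def __init__(self, size_bits=1 << 16, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May return a false positive, never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def kmers(seq, k):
    """Return every k-mer of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def connect(read1, read2, bloom, k, max_len=500):
    """Breadth-first search from the last k-mer of read1 to the first k-mer of read2.

    The de Bruijn graph is implicit: a k-mer has an edge to each of its four
    possible successors that the Bloom filter reports as present.
    Returns the merged sequence, or None if no path is found within max_len.
    """
    start, goal = read1[-k:], read2[:k]
    queue = deque([(start, read1)])
    seen = {start}
    while queue:
        kmer, path = queue.popleft()
        if kmer == goal:
            return path + read2[k:]
        if len(path) >= max_len:
            continue
        for base in "ACGT":
            nxt = kmer[1:] + base
            if nxt in bloom and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + base))
    return None


# Toy data: overlapping 12 bp "reads" tiled across a made-up 30 bp sequence.
genome = "ATGGCGTACCTGAATCGTTAGCCATGCAAG"
k = 6
bloom = BloomFilter()
for i in range(0, 19, 3):
    for kmer in kmers(genome[i:i + 12], k):
        bloom.add(kmer)

# Both mates are given on the same strand for simplicity.
read1, read2 = genome[:10], genome[20:]
merged = connect(read1, read2, bloom, k)
print(merged)
```

Because the graph is only ever queried through the Bloom filter, no explicit node or edge list is stored, which is where the memory saving in this kind of approach comes from. The price is an occasional false-positive k-mer, which a real tool has to tolerate or filter out.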

All Releases

Version | Released | Description | License | Status
1.5.2 | Jul 10, 2014 | In this release we introduce Konnector, a fast and memory-efficient tool that fills the gap between paired-end reads. Konnector determines the intervening sequence by building a Bloom filter de Bruijn graph and searching for paths between the paired-end reads within the graph. A companion tool, abyss-bloom, can be used to construct reusable Bloom filter files for input to Konnector; otherwise, Konnector builds an in-memory Bloom filter for one-time use. In addition to Konnector, we have fixed bugs related to compiling with GCC 4.8+ and parsing SAM files output by BWA. | GPLv3 for non-commercial usage | final
1.5.1 | May 08, 2014 | In this release we fix a compatibility issue with Trans-ABySS 1.5.0 where the output of abyss-filtergraph is not strand-specific. Also, we include additional FCC portability fixes. | GPLv3 for non-commercial usage | final
1.5.0 | May 01, 2014 | In this release we have added full strand-specific RNA-Seq support, so that output contigs are correctly oriented with respect to the original transcripts sequenced. Also, there are two new abyss-pe parameters, xtip and Q, that improve assembly in high-coverage regions such as highly expressed transcripts. Setting xtip=1 removes certain tips more aggressively. The Q parameter prevents low-quality bases from being used in the assembly. The version has been bumped to 1.5.0 to signify compatibility with Trans-ABySS 1.5.0. | GPLv3 for non-commercial usage | final
1.3.7 | Dec 11, 2013 | Scaffolds can now be rescaffolded using long sequences such as RNA-Seq assemblies produced by Trans-ABySS. Added support for GCC 4.8+ and Mac OS X 10.9 Mavericks with clang. Finally, we've licensed ABySS under the GPL for non-commercial purposes. Please read the LICENSE file for more details. | GPLv3 for non-commercial usage | final
1.3.6 | Jul 31, 2013 | ABYSS and ABYSS-P are now ~20% faster. Fixed many portability issues and bugs, and improved some error messages. | BCCA (academic use) | final
1.3.5 | Mar 05, 2013 | This release introduces new tools to merge overlapping read pairs, lay out and merge contigs with perfect sequence overlap, and calculate contig contiguity and correctness metrics. It also includes updates to the existing documentation, bug fixes, and attempts to fill scaffold gaps with a consensus of all paths between contigs. | BCCA (academic use) | final
1.3.4 | May 30, 2012 | This release eliminates two sources of misassemblies. The first is in the path extension logic of SimpleGraph. Second, the default value of m, the minimum overlap required between two contigs to merge them, is increased from 30 to 50. This release also fixes various portability issues. A new script, abyss-fatoagp, is included to create an AGP file for GenBank submission. | BCCA (academic use) | final
1.3.3 | Mar 13, 2012 | Specify the minimum alignment length when aligning the reads to the contigs with the parameter l. Improve the scaffolding algorithm that identifies repeats. Improve the documentation. | BCCA (academic use) | final
1.3.2 | Dec 13, 2011 | Improve distance estimates between contigs, enable scaffolding by default, and remove small shim contigs that don't add useful sequence to the assembly. The default aligner is abyss-map. MergePaths uses a non-greedy algorithm that reduces sequence duplication but may reduce contiguity. | BCCA (academic use) | final
1.3.1 | Oct 24, 2011 | Fix a bug in KAligner and fix a compiler error for Mac OS X. | BCCA (academic use) | final
1.3.0 | Sep 09, 2011 | Mate-pair data can be used to scaffold contigs. Specify your mate-pair libraries using the 'mp' parameter of abyss-pe. | BCCA (academic use) | final
1.2.7 | Apr 15, 2011 | Support using bwa or bowtie to align reads to contigs. New parameter, d, to specify the acceptable error of a distance estimate. | BCCA (academic use) | final
1.2.6 | Feb 07, 2011 | Sequence variants are popped if the two variants are at least 90% similar. Contigs that overlap by fewer than k-1 bp are found and may be merged. | BCCA (academic use) | final
1.2.5 | Nov 15, 2010 | Fix a colour-space-specific bug and a bug causing the error "Assertion `fstSol.size() == 1' failed." | BCCA (academic use) | final
1.2.4 | Oct 14, 2010 | Replace gaps of Ns that span a region of ambiguous sequence with a consensus sequence of the possible sequences that fill the gap. The consensus sequence uses IUPAC-IUB ambiguity codes. | BCCA (academic use) | final
1.2.3 | Sep 08, 2010 | Fix two bugs that caused the error messages "Assertion `m_comm.receiveEmpty()' failed." and "error: unexpected ID". | BCCA (academic use) | final
1.2.2 | Aug 25, 2010 | Merge contigs after popping bubbles. Handle multi-line FASTA sequences. Report the amount of memory used. | BCCA (academic use) | final
1.2.1 | Jul 12, 2010 | Handle mate-pair libraries with reverse-forward orientation as produced by circular, large-fragment libraries. Distance estimates are improved. | BCCA (academic use) | final
1.2.0 | May 26, 2010 | Scaffold over gaps in coverage and unresolved repeats. Read sequence from SAM and BAM files. Set q=3 by default. Set E=0 when coverage is low (<2). Generate a Graphviz dot file of the paired-end assembly. | BCCA (academic use) | final
1.1.2 | Feb 15, 2010 | Pop bubbles resulting from indels. Read tar files. Fix performance issues in ParseAligns by syncing KAligner threads periodically. | BCCA (academic use) | final
1.1.1 | Jan 19, 2010 | Pop complex bubbles either completely or not at all. Choose better (typically lower) default values for the parameters e and c. | AFL | final
1.1.0 | Dec 21, 2009 | ABySS will expand tandem repeats when it is possible to determine the exact number of the repeat. The paired-end path-finding algorithm, SimpleGraph, is multithreaded. Fixed a bug in MergePaths that could misassemble repeats larger than the paired-end fragment size. The output format of AdjList, DistanceEst and SimpleGraph has changed. | AFL | final
1.0.16 | Nov 13, 2009 | Improve the performance and memory usage of KAligner and AdjList, particularly for very large data sets. | AFL | final
1.0.15 | Oct 19, 2009 | New parameters, e and E, to set the coverage threshold of the erosion algorithm. Values for the parameter e and the coverage threshold, c, are chosen automatically if they are not specified. The read length is now an optional parameter. Includes two important bug fixes. | AFL | final
1.0.14 | Sep 08, 2009 | Assemble multiple libraries of different fragment sizes. | AFL | final
1.0.13 | Aug 26, 2009 | Read files compressed with gzip (.gz) or bzip2 (.bz2). | AFL | final
1.0.12 | Aug 19, 2009 | Both ABYSS and KAligner are run only once per assembly, which speeds up the paired-end assembly stage by nearly a factor of two. The k-mer coverage information is correct in every contig file. A tool is included to convert colour-space contigs to nucleotide contigs. Discard reads that fail the chastity filter. | AFL | final
1.0.11 | Jul 21, 2009 | Assemble colour-space reads. Read files in qseq format. KAligner is multithreaded. Integrate with Sun Grid Engine (SGE). Prevent misassemblies mediated by tandem segmental duplications. | AFL | final
1.0.10 | Jun 18, 2009 | ParseAligns is improved to handle any number of reads as long as mate pairs are found interleaved in the same file. Merge overlapping paired-end contigs that were previously being missed in some situations. Number paired-end contigs so that their IDs do not overlap with the single-end contigs. | AFL | final
1.0.9 | May 15, 2009 | Significantly reduce the memory usage of KAligner and ParseAligns. abyss-pe can read multiple input files and read FASTA or FASTQ files. | AFL | final
1.0.8 | Apr 02, 2009 | Fix the bug causing the error "Assertion `marked == split' failed." | AFL | final
1.0.7 | Mar 31, 2009 | The parallel MPI assembler is now deterministic; it will produce the same result every time. | AFL | final
1.0.6 | Mar 25, 2009 | Fix a race condition in the erosion algorithm. | AFL | final
1.0.5 | Mar 11, 2009 | Portability fixes. | AFL | final
1.0.4 | Mar 09, 2009 | Remove the need to specify the parameters -e,--erode and -b,--bubbles. Use less disk space by using pipes to avoid intermediate files. Many improvements to the paired-end algorithm. | BCCA (academic use) | final
1.0.3 | Feb 05, 2009 | Tidy up the ends of blunt contigs. Merge blunt contigs that are connected by pairs and overlap. | BCCA (academic use) | final
1.0.2 | Nov 21, 2008 | Include a parallel binary compiled for OpenMPI. | BCCA (academic use) | final
1.0 | Aug 07, 2008 | Initial version of ABySS. | BCCA (academic use) | final
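
The 1.2.4 entry above mentions a consensus sequence written with IUPAC-IUB ambiguity codes. As a rough illustration of what such a consensus looks like, here is a minimal Python sketch. The function name `iupac_consensus` is hypothetical, and the sketch assumes the candidate gap-filling sequences are uppercase and already have equal length; it is not ABySS's implementation, which would first need to align candidates of different lengths.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases observed.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def iupac_consensus(sequences):
    """Collapse equal-length sequences into one string, column by column.

    Each column becomes the IUPAC code for the set of bases seen there.
    """
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("expected one or more sequences of equal length")
    return "".join(IUPAC[frozenset(column)] for column in zip(*sequences))


# Two candidate sequences that differ only at the last base (A or T -> W).
print(iupac_consensus(["ACGT", "ACGA"]))
```

Compared with a run of Ns, a consensus of this kind keeps the positions on which all candidate paths agree and marks only the ambiguous positions.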