Sequence alignment, in which nucleotide or amino acid sequences are queried against targets to find regions of close similarity, is one of the bioinformatics applications most affected by the High-Throughput Sequencing (HTS) data deluge. When there are too many queries, the targets are too large, or both, alignment becomes computationally challenging. This is especially true when the targets are dynamic, such as the intermediate steps of a de novo assembly process. To address this problem, we have designed and developed DIDA, a distributed and parallel indexing and alignment algorithm. First, we partition the targets into smaller parts using a heuristic balanced cut. Next, we create an index for each partition. The reads are then “flowed” through a Bloom filter to dispatch the alignment task to the corresponding node(s). Finally, the reads are aligned on all partitions and the results are combined to create the final output.
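The partition, index, and dispatch steps described above can be illustrated with a small toy sketch. This is not DIDA's implementation: the k-mer length, the SHA-256-based Bloom filter, the greedy longest-first balancing (standing in for the heuristic balanced cut), and the rule that a read is dispatched to any partition sharing at least one k-mer are all assumptions made for illustration, as are the names `BloomFilter`, `partition_targets`, `build_filters`, and `dispatch`.

```python
import hashlib

K = 5  # toy k-mer length (an assumption, not a DIDA default)


def kmers(seq, k=K):
    """Return every overlapping k-mer of seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


class BloomFilter:
    """Minimal Bloom filter; uses one byte per bit for simplicity."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = bytearray(size)

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


def partition_targets(targets, num_parts):
    """Greedy longest-first balancing (hypothetical stand-in for the balanced cut)."""
    parts = [[] for _ in range(num_parts)]
    loads = [0] * num_parts
    for name, seq in sorted(targets.items(), key=lambda kv: -len(kv[1])):
        i = loads.index(min(loads))  # least-loaded partition so far
        parts[i].append(name)
        loads[i] += len(seq)
    return parts


def build_filters(targets, parts):
    """Build one Bloom filter per partition from the k-mers of its targets."""
    filters = []
    for names in parts:
        bf = BloomFilter()
        for name in names:
            for km in kmers(targets[name]):
                bf.add(km)
        filters.append(bf)
    return filters


def dispatch(read, filters):
    """Return indices of partitions whose filter reports at least one k-mer of the read."""
    return [i for i, bf in enumerate(filters)
            if any(km in bf for km in kmers(read))]


# Toy data: three targets split over two partitions.
targets = {"t1": "ACGTACGTTAGC", "t2": "GGGCCCAAATTT", "t3": "TTTTGGGGAC"}
parts = partition_targets(targets, 2)
filters = build_filters(targets, parts)
print(parts)
print(dispatch("CCCAAAT", filters))  # read drawn from t2
```

Because a Bloom filter has no false negatives, a read drawn from a target always reaches the partition holding that target; false positives can only send it to additional partitions, which costs extra alignment work but does not lose alignments. In a real deployment, each partition would then be aligned with its own index and the per-partition results merged.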
Mohamadi H, Vandervalk BP, Raymond A, Jackman SD, Chu J, et al. (2015) DIDA: Distributed Indexing Dispatched Alignment. PLoS ONE 10(4): e0126409. doi: 10.1371/journal.pone.0126409
Released Apr 24, 2015
This version introduces compression of intermediate files to reduce disk space requirements.
| Version | Released | Description | License | Status |
|---|---|---|---|---|
| 1.0.1 | Apr 24, 2015 | This version introduces compression of intermediate files to reduce disk space requirements. | BCCA (academic use) | final |
| 1.0.0 | Feb 25, 2015 | dida-wrapper, dida-mpi, and the batch versions are optimized. Fixed many portability issues and bugs, and improved some error messages. | BCCA (academic use) | final |
| 0.1.3 | Dec 13, 2014 | Adding wrapper and fully streamlined versions of DIDA. | BCCA (academic use) | final |
| 0.1.2 | Jul 10, 2014 | New merging step compatible with BWA, Bowtie, Novoalign, and ABySS-map with different merging strategies. | BCCA (academic use) | final |
| 0.1.1 | | This is not a final release. Experimental releases should only be used for testing and development. Do not use these on production sites, and make sure you have proper backups before installing. | BCCA (academic use) | pre-release |