Despite the rapid advance in single-cell RNA sequencing (scRNA-seq) technologies within the last decade, single-cell transcriptome analysis workflows have primarily used gene expression data while isoform sequence analysis at the single-cell level still remains fairly limited. Detection and discovery of isoforms in single cells is difficult because of the inherent technical shortcomings of scRNA-seq data, and existing transcriptome assembly methods are mainly designed for bulk RNA samples. To address this challenge, we developed RNA-Bloom, an assembly algorithm that leverages the rich information content aggregated from multiple single-cell transcriptomes to reconstruct cell-specific isoforms. Assembly with RNA-Bloom can be either reference-guided or reference-free, thus enabling unbiased discovery of novel isoforms or foreign transcripts. We compared both assembly strategies of RNA-Bloom against five state-of-the-art reference-free and reference-based transcriptome assembly methods. In our benchmarks on a simulated 384-cell data set, reference-free RNA-Bloom reconstructed 37.9%-38.3% more isoforms than the best reference-free assembler, whereas reference-guided RNA-Bloom reconstructed 4.1%-11.6% more isoforms than reference-based assemblers. When applied to a real 3840-cell data set consisting of more than 4 billion reads, RNA-Bloom reconstructed 9.7%-25.0% more isoforms than the best competing reference-based and reference-free approaches evaluated. We expect RNA-Bloom to boost the utility of scRNA-seq data beyond gene expression analysis, expanding what is informatically accessible now.
Journal
Genome Research, 2020