A multi-split mapping algorithm for circular RNA, splicing, trans-splicing and fusion detection

Genome Biol. 2014 Feb 10;15(2):R34. doi: 10.1186/gb-2014-15-2-r34.

Abstract

Numerous high-throughput sequencing studies have focused on detecting conventionally spliced mRNAs in RNA-seq data. However, non-standard RNAs arising through gene fusion, circularization or trans-splicing are often neglected. We introduce a novel, unbiased algorithm to detect splice junctions from single-end cDNA sequences. In contrast to other methods, our approach accommodates multi-junction structures. Our method compares favorably with competing tools for conventionally spliced mRNAs and, with a gain in recall of up to 40%, systematically outperforms them on reads with multiple splits, trans-splicing and circular products. The algorithm is integrated into our mapping tool segemehl (http://www.bioinf.uni-leipzig.de/Software/segemehl/).
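
To make the junction types named in the abstract concrete, the following is a minimal illustrative sketch of how the fragments of a multi-split read might be labeled once they have been aligned. It is not the segemehl implementation, and the abstract does not describe the algorithm at this level of detail. The names `Segment`, `classify_junction` and `classify_read` are hypothetical, and the rules are a deliberate simplification: fragments are assumed to be given in transcript order, any change of chromosome or strand is treated as a trans-splicing or fusion candidate, and any same-strand junction that runs against the genomic order is treated as a circular (backsplice) candidate.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """One aligned fragment of a read (hypothetical structure for illustration)."""
    chrom: str
    strand: str  # '+' or '-'
    start: int   # 0-based genomic start, inclusive
    end: int     # genomic end, exclusive


def classify_junction(a: Segment, b: Segment) -> str:
    """Label the junction between consecutive fragments a -> b (transcript order)."""
    # Different chromosome or strand: cannot be a conventional cis-splice.
    if a.chrom != b.chrom or a.strand != b.strand:
        return "trans/fusion candidate"
    # Same chromosome and strand: check whether b lies downstream of a
    # in transcript direction (assumption: fragments do not overlap).
    if a.strand == "+":
        collinear = b.start >= a.end
    else:
        collinear = b.end <= a.start
    if collinear:
        return "collinear splice"
    # The second fragment maps upstream of the first: order is reversed.
    return "backsplice (circular candidate)"


def classify_read(segments: List[Segment]) -> List[str]:
    """Label every junction of a read that was split into two or more fragments."""
    return [classify_junction(a, b) for a, b in zip(segments, segments[1:])]


if __name__ == "__main__":
    read = [
        Segment("chr1", "+", 100, 150),
        Segment("chr1", "+", 300, 350),
        Segment("chr1", "+", 50, 90),
    ]
    for label in classify_read(read):
        print(label)
```

A read with three fragments yields two junction labels, which is the multi-junction case the abstract emphasizes. A real mapper would additionally score the alignments, handle overlapping fragments and distinguish distant cis-junctions from trans-splicing, none of which this sketch attempts.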

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • DNA, Complementary / genetics
  • High-Throughput Nucleotide Sequencing
  • RNA / genetics*
  • RNA Splicing / genetics*
  • RNA, Circular
  • RNA, Messenger / metabolism
  • Software
  • Trans-Splicing / genetics*

Substances

  • DNA, Complementary
  • RNA, Circular
  • RNA, Messenger
  • RNA

Associated data

  • GEO/GSE29040
  • GEO/GSE43574
  • GEO/GSM951482
  • SRA/SRR018261
  • SRA/SRR018262
  • SRA/SRR166809
  • SRA/SRX151602