miRDeep2 accurately identifies known and hundreds of novel microRNA genes in seven animal clades

Nucleic Acids Res. 2012 Jan;40(1):37-52. doi: 10.1093/nar/gkr688. Epub 2011 Sep 12.

Abstract

microRNAs (miRNAs) are a large class of small non-coding RNAs that post-transcriptionally regulate the expression of a large fraction of all animal genes and are important in a wide range of biological processes. Recent advances in high-throughput sequencing allow miRNA detection at unprecedented sensitivity, but the computational task of accurately identifying miRNAs against the background of sequenced RNAs remains challenging. For this purpose, we have designed miRDeep2, a substantially improved algorithm that identifies canonical and non-canonical miRNAs, such as those derived from transposable elements, and reports high-confidence candidates that are detected in multiple independent samples. Analyzing data from seven animal species representing the major animal clades, miRDeep2 identified miRNAs with an accuracy of 98.6-99.9% and reported hundreds of novel miRNAs. To test the accuracy of miRDeep2, we knocked down the miRNA biogenesis pathway in a human cell line and sequenced small RNAs before and after the knockdown. The vast majority of the >100 novel miRNAs expressed in this cell line were indeed specifically downregulated, validating most miRDeep2 predictions. Finally, a new miRNA expression profiling routine, low time and memory usage, and user-friendly interactive graphic output can make miRDeep2 useful to a wide range of researchers.
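The knockdown validation described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the reads-per-million normalization, the pseudocount, and the twofold threshold are all assumptions chosen for illustration, and the read counts are invented. It only shows the general idea of comparing a candidate's normalized abundance before and after knockdown and flagging the candidates that drop.

```python
import math


def log2_fold_change(control_count, knockdown_count,
                     control_total, knockdown_total, pseudocount=1.0):
    """Log2 ratio of knockdown to control abundance in reads per million.

    The pseudocount (an assumed value) keeps the ratio finite when a
    candidate has zero reads in one of the two libraries.
    """
    control_rpm = control_count / control_total * 1e6
    knockdown_rpm = knockdown_count / knockdown_total * 1e6
    return math.log2((knockdown_rpm + pseudocount) / (control_rpm + pseudocount))


def downregulated(candidates, control_total, knockdown_total, threshold=-1.0):
    """Return the names of candidates whose log2 fold change is at or below
    the threshold (assumed here to be -1.0, i.e. at least a twofold decrease).

    `candidates` maps a candidate name to a (control_count, knockdown_count)
    pair of raw read counts.
    """
    return [
        name
        for name, (control, knockdown) in candidates.items()
        if log2_fold_change(control, knockdown,
                            control_total, knockdown_total) <= threshold
    ]


# Hypothetical read counts, invented for this example only.
example_candidates = {
    "candidate_a": (400, 100),   # clearly reduced after knockdown
    "candidate_b": (200, 210),   # essentially unchanged
}
flagged = downregulated(example_candidates,
                        control_total=1_000_000,
                        knockdown_total=1_000_000)
```

Normalizing by library size before taking the ratio matters because the two sequencing runs will rarely yield the same total number of reads; a real analysis would also need replicates and a statistical test rather than a fixed cutoff.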

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms*
  • Animals
  • Argonaute Proteins / metabolism
  • Cell Line, Tumor
  • Computer Graphics
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Interspersed Repetitive Sequences
  • Mice
  • MicroRNAs / chemistry
  • MicroRNAs / genetics*
  • MicroRNAs / metabolism
  • Polymerase Chain Reaction
  • RNA Interference
  • Ribonuclease III / antagonists & inhibitors
  • Ribonuclease III / genetics
  • Sequence Analysis, RNA
  • Software

Substances

  • Argonaute Proteins
  • MicroRNAs
  • Ribonuclease III

Associated data

  • GEO/GSE31069