Trends in Genetics
Volume 33, Issue 7, July 2017, Pages 464-478

Review
The Dimensions, Dynamics, and Relevance of the Mammalian Noncoding Transcriptome

https://doi.org/10.1016/j.tig.2017.04.004

Trends

The mammalian transcriptome is hugely diverse owing to pervasive transcription and alternative splicing. Advances in RNA-Seq (e.g., targeted, single-molecule, and single-cell techniques) continue to shape our understanding.

lncRNA diversity remains under-appreciated, and the dynamics of lncRNA expression, splicing, and functional roles remain poorly characterized.

High-resolution and single-cell studies show that lncRNAs are not poorly expressed but are expressed with heightened spatiotemporal precision. lncRNAs are also enriched for splicing, with near-universal alternative splicing of noncoding exons.

The emergence of high-throughput forward-genetic screens utilizing CRISPR/Cas9 targeted genome manipulation and precise, scalable methods for resolving RNA structure and RNA–protein interactions has accelerated lncRNA characterization.

Precise, dynamic expression and complex splicing fit with a central role of lncRNAs in the mammalian developmental program.

The combination of pervasive transcription and prolific alternative splicing produces a mammalian transcriptome of great breadth and diversity. The majority of transcribed genomic bases are intronic, antisense, or intergenic to protein-coding genes, yielding a plethora of short and long non-protein-coding regulatory RNAs. Long noncoding RNAs (lncRNAs) share most aspects of their biogenesis, processing, and regulation with mRNAs. However, lncRNAs are typically expressed in more restricted patterns, frequently from enhancers, and exhibit almost universal alternative splicing. These features are consistent with their role as modular epigenetic regulators. We describe here the key studies and technological advances that have shaped our understanding of the dimensions, dynamics, and biological relevance of the mammalian noncoding transcriptome.

Section snippets

Appreciating Transcriptome Diversity

Mammals possess roughly the same number and a similar repertoire of protein-coding genes as nematode worms. By contrast, the intergenic and intronic regions of the mammalian genome are far greater. Indeed, while the number of protein-coding genes is largely static across the animal kingdom, noncoding genome content increases in size with developmental complexity [1].

Initial studies of the mammalian transcriptome were premised on the assumption that most genes encode proteins and that mRNAs

The Reality of Pervasive Transcription

The first clear evidence that the mammalian transcriptome included large numbers of non-protein-coding intergenic and antisense RNAs, as well as many stable intron-derived RNAs, came from genome-wide tiling arrays [20, 21, 22] and sequencing of cloned cDNAs [4, 23, 24, 25]. These unexpected findings garnered controversy, and when very few reads obtained in early RNA-Seq experiments aligned outside known protein-coding genes, the evidence for ‘pervasive transcription’ was questioned [26].

However,

The Similar Life Histories of mRNAs and lncRNAs

Aside from several minor idiosyncrasies (reviewed elsewhere [40]), many if not most lncRNAs are regulated, transcribed, and processed in a similar fashion to mRNAs [41].

lncRNAs and mRNAs are roughly comparable in size and structure [37, 42], although some lncRNAs are very large, in excess of 100 kb [43]. Similarly to mRNAs, many lncRNAs are transcribed by RNA polymerase II, regulated by morphogens and conventional transcription factors, dysregulated in disease, capped at their 5′-ends, and

lncRNA Expression Is Highly Precise and Dynamic

One of the key concerns about the biological relevance of lncRNAs has been their low abundance in tissue samples, sometimes argued to be simply a manifestation of ‘transcriptional noise’ [51]. However, accumulating evidence suggests that this reflects heightened spatiotemporal precision rather than low background expression.
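The distinction between low background expression and spatially restricted expression can be illustrated with simple arithmetic. The sketch below is not taken from the review: the cell counts and copy numbers are hypothetical values chosen only to show that two very different expression patterns can yield the same bulk-tissue average, and the function names are illustrative.

```python
# Hypothetical numbers for illustration only; none come from the review.
N_CELLS = 1000

# Scenario A: every cell carries 1 copy of the transcript (uniform, low expression).
uniform = [1.0] * N_CELLS

# Scenario B: 10 cells carry 100 copies each; the remaining 990 cells carry none.
restricted = [100.0] * 10 + [0.0] * (N_CELLS - 10)


def bulk_mean(copies):
    """Average copies per cell, as a homogenized tissue sample would report."""
    return sum(copies) / len(copies)


def expressing_mean(copies):
    """Average copies per cell among only the cells that express the transcript."""
    expressing = [c for c in copies if c > 0]
    return sum(expressing) / len(expressing) if expressing else 0.0


def fraction_expressing(copies):
    """Fraction of cells with at least one copy."""
    return sum(1 for c in copies if c > 0) / len(copies)


if __name__ == "__main__":
    for name, copies in (("uniform", uniform), ("restricted", restricted)):
        print(name, bulk_mean(copies), expressing_mean(copies), fraction_expressing(copies))
```

Under these assumed numbers both scenarios give the same bulk average, yet in the restricted scenario the expressing cells carry a hundredfold more copies than that average suggests. Bulk abundance alone therefore cannot separate the two patterns, whereas measurements at single-cell resolution can.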

It is clear that, while some lncRNAs such as MALAT1 and NEAT1 are widely expressed [52], most lncRNAs are highly tissue-specific, more so than protein-coding genes [4, 30, 36]

Prolific Alternative Splicing Diversifies the Transcriptome

Extensive alternative splicing of human mRNAs was recognized many years ago [66, 67], but the scope of its influence on the mammalian transcriptome was not fully appreciated before the advent of RNA-Seq. Early systematic analyses of alternative splicing with RNA-Seq showed that 92–94% [68] or 92–97% [69] (i.e., probably all) multi-exon human protein-coding genes undergo alternative splicing. Unique isoforms may be deployed in specific contexts, remolding the transcriptome during development and

Near-Universal Alternative Splicing of Noncoding Exons

lncRNAs also undergo alternative splicing, although their relatively low abundance in homogenized tissues hinders accurate resolution of these events. The GENCODE catalog (v7) of noncoding RNAs lists alternative isoforms for only around a quarter of lncRNA loci, and indicates that lncRNAs generally have fewer exons and shorter mature transcripts than mRNAs [37]. However, a subsequent detailed characterization of 398 lncRNAs from the same catalog by rapid amplification of cDNA ends and long-read

Functional Characterization of lncRNAs: Unique Challenges and Emerging Solutions

Many well-characterized lncRNAs function as regulatory molecules in the epigenetic control of gene expression, and fulfill roles in differentiation and development [2, 89]. These roles are easily reconciled with the distinctive features of lncRNA biology described above, namely their precise expression and complex alternative splicing, providing a conceptual framework to guide further discovery and characterization.

We owe much of what we know about lncRNA function to the characterization of

lncRNAs as Modular Epigenetic Regulators

The phenomena interrogated by the techniques mentioned above are central to understanding lncRNA functionality – namely, the ability of lncRNAs to form specific and multilateral RNA–protein, RNA–DNA, and RNA–RNA interactions. Their diverse binding properties and flexibility in size and structure mean that lncRNAs are ideally suited to facilitate interactions between other biomolecules, and thereby organize and regulate cellular processes [105].

It is unsurprising then that many lncRNAs

Glossary

Adaptive radiation
an evolutionary process in which organisms diversify rapidly from an ancestral species into a multitude of new forms.
Branch point
a genetic element involved in splicing located near the 3′ end of the intron and immediately upstream of the poly-pyrimidine tract.
Clustered regularly interspaced short palindromic repeats (CRISPR)
a genetic element found in prokaryotes, which forms the basis of a recent genome engineering technology (CRISPR/Cas9) that enables permanent modification

References (121)

  • K. Sarma

    ATRX directs binding of PRC2 to Xist RNA and Polycomb targets

    Cell

    (2014)
  • G. Liu

    A meta-analysis of the genomic and transcriptomic composition of complex life

    Cell Cycle

    (2013)
  • K.V. Morris et al.

    The rise of regulatory RNA

    Nat. Rev. Genet.

    (2014)
  • J. Cheng

    Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution

    Science

    (2005)
  • P. Carninci

    The transcriptional landscape of the mammalian genome

    Science

    (2005)
  • Z. Wang

    RNA-Seq: a revolutionary tool for transcriptomics

    Nat. Rev. Genet.

    (2009)
  • T.R. Mercer

    Targeted RNA sequencing reveals the deep complexity of the human transcriptome

    Nat. Biotechnol.

    (2011)
  • D. Sharon

    A single-molecule long-read survey of the human transcriptome

    Nat. Biotechnol.

    (2013)
  • H. Tilgner

    Comprehensive transcriptome analysis using synthetic long-read sequencing reveals molecular co-association of distant splicing events

    Nat. Biotechnol.

    (2015)
  • S. Stephen

    Large-scale appearance of ultraconserved elements in tetrapod genomes and slowdown of the molecular clock

    Mol. Biol. Evol.

    (2008)
  • M.K. Iyer

    The landscape of long noncoding RNAs in the human transcriptome

    Nat. Genet.

    (2015)
  • I. Ulitsky

    Evolution to the rescue: using comparative genomics to understand long non-coding RNAs

    Nat. Rev. Genet.

    (2016)
  • M. Pheasant et al.

    Raising the estimate of functional human sequences

    Genome Res.

    (2007)
  • T.R. Mercer

    Specific expression of long noncoding RNAs in the mouse brain

    Proc. Natl. Acad. Sci. U. S. A.

    (2008)
  • S.J. Liu

    Single-cell analysis of long non-coding RNAs in the developing human neocortex

    Genome Biol.

    (2016)
  • I.W. Deveson

    Universal alternative splicing of noncoding exons

    bioRxiv

    (2017)
  • O. Shalem

    High-throughput functional genomics using CRISPR-Cas9

    Nat. Rev. Genet.

    (2015)
  • E.J. McFadden et al.

    Biochemical methods to investigate lncRNA and the influence of lncRNA:protein complexes on chromatin

    Biochemistry

    (2016)
  • P. Kapranov

    Large-scale transcriptional activity in chromosomes 21 and 22

    Science

    (2002)
  • J.L. Rinn

    The transcriptional activity of human chromosome 22

    Genes Dev.

    (2003)
  • P. Kapranov

    Examples of the complex architecture of the human transcriptome revealed by RACE and high-density tiling arrays

    Genome Res.

    (2005)
  • Y. Okazaki

    Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs

    Nature

    (2002)
  • S. Katayama

    Antisense transcription in the mammalian transcriptome

    Science

    (2005)
  • G. St Laurent

    Intronic RNAs constitute the major fraction of the non-coding RNA in mammalian cells

    BMC Genomics

    (2012)
  • H. van Bakel

    Most ‘dark matter’ transcripts are associated with known genes

    PLoS Biol.

    (2010)
  • L. Jiang

    Synthetic spike-in standards for RNA-seq experiments

    Genome Res.

    (2011)
  • S.A. Hardwick

    Spliced synthetic genes as internal controls in RNA sequencing experiments

    Nat. Methods

    (2016)
  • M.B. Clark

    The reality of pervasive transcription

    PLoS Biol.

    (2011)
  • S. Djebali

    Landscape of transcription in human cells

    Nature

    (2012)
  • E. Birney

    Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project

    Nature

    (2007)
  • ENCODE

    An integrated encyclopedia of DNA elements in the human genome

    Nature

    (2012)
  • FANTOM

    A promoter-level mammalian expression atlas

    Nature

    (2014)
  • F. Yue

    A comparative encyclopedia of DNA elements in the mouse genome

    Nature

    (2014)
  • C.I. Brannan

    The product of the H19 gene may function as an RNA

    Mol. Cell. Biol.

    (1990)
  • M.N. Cabili

    Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses

    Genes Dev.

    (2011)
  • T. Derrien

    The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression

    Genome Res.

    (2012)
  • X.C. Quek

    lncRNAdb v2.0: expanding the reference database for functional long noncoding RNAs

    Nucleic Acids Res.

    (2015)
  • C.-C. Hon

    An atlas of human long non-coding RNAs with accurate 5′ ends

    Nature

    (2017)
  • J.J. Quinn et al.

    Unique features of long non-coding RNA biogenesis and function

    Nat. Rev. Genet.

    (2016)
  • J.S. Mattick

    The genetic signatures of noncoding RNAs

    PLoS Genet.

    (2009)