Trends in Genetics
Review: The Dimensions, Dynamics, and Relevance of the Mammalian Noncoding Transcriptome
Appreciating Transcriptome Diversity
Mammals possess roughly the same number and a similar repertoire of protein-coding genes as nematode worms. By contrast, the intergenic and intronic regions of the mammalian genome are far more extensive. Indeed, while the number of protein-coding genes is largely static across the animal kingdom, noncoding genome content increases in size with developmental complexity [1].
Initial studies of the mammalian transcriptome were premised on the assumption that most genes encode proteins and that mRNAs
The Reality of Pervasive Transcription
The first clear evidence that the mammalian transcriptome included large numbers of non-protein-coding intergenic and antisense RNAs, as well as many stable intron-derived RNAs, came from genome-wide tiling arrays [20, 21, 22] and sequencing of cloned cDNAs [4, 23, 24, 25]. These unexpected findings garnered controversy, and when very few reads obtained in early RNA-Seq experiments aligned outside known protein-coding genes, the evidence for ‘pervasive transcription’ was questioned [26].
However,
The Similar Life Histories of mRNAs and lncRNAs
Aside from several minor idiosyncrasies (reviewed elsewhere [40]), many if not most lncRNAs are regulated, transcribed, and processed in a similar fashion to mRNAs [41].
lncRNAs and mRNAs are roughly comparable in size and structure [37, 42], although some lncRNAs are very large, in excess of 100 kb [43]. Similarly to mRNAs, many lncRNAs are transcribed by RNA polymerase II, regulated by morphogens and conventional transcription factors, dysregulated in disease, capped at their 5′-ends, and
lncRNA Expression Is Highly Precise and Dynamic
One of the key concerns about the biological relevance of lncRNAs has been their low abundance in tissue samples, sometimes argued to be simply a manifestation of ‘transcriptional noise’ [51]. However, accumulating evidence suggests that this reflects heightened spatiotemporal precision rather than low background expression.
It is clear that, while some lncRNAs such as MALAT1 and NEAT1 are widely expressed [52], most lncRNAs are highly tissue-specific, more so than protein-coding genes [4, 30, 36]
Prolific Alternative Splicing Diversifies the Transcriptome
Extensive alternative splicing of human mRNAs was recognized many years ago [66, 67], but the scope of its influence on the mammalian transcriptome was not fully appreciated before the advent of RNA-Seq. Early systematic analyses of alternative splicing with RNA-Seq showed that 92–94% [68] or 92–97% [69] (i.e., probably all) multi-exon human protein-coding genes undergo alternative splicing. Unique isoforms may be deployed in specific contexts, remolding the transcriptome during development and
Near-Universal Alternative Splicing of Noncoding Exons
lncRNAs also undergo alternative splicing, although their relatively low abundance in homogenized tissues hinders accurate resolution of these events. The GENCODE catalog (v7) of noncoding RNAs lists alternative isoforms for only around a quarter of lncRNA loci, and indicates that lncRNAs generally have fewer exons and shorter mature transcripts than mRNAs [37]. However, a subsequent detailed characterization of 398 lncRNAs from the same catalog by rapid amplification of cDNA ends and long-read
Functional Characterization of lncRNAs: Unique Challenges and Emerging Solutions
Many well-characterized lncRNAs function as regulatory molecules in the epigenetic control of gene expression, and fulfill roles in differentiation and development [2, 89]. These roles are easily reconciled with the distinctive features of lncRNA biology described above, namely their precise expression and complex alternative splicing, providing a conceptual framework to guide further discovery and characterization.
We owe much of what we know about lncRNA function to the characterization of
lncRNAs as Modular Epigenetic Regulators
The phenomena interrogated by the techniques mentioned above are central to understanding lncRNA functionality – namely, the ability of lncRNAs to form specific and multilateral RNA–protein, RNA–DNA, and RNA–RNA interactions. Their diverse binding properties and flexibility in size and structure mean that lncRNAs are ideally suited to facilitate interactions between other biomolecules, and thereby organize and regulate cellular processes [105].
It is unsurprising then that many lncRNAs
Glossary
- Adaptive radiation: an evolutionary process in which organisms diversify rapidly from an ancestral species into a multitude of new forms.
- Branch point: a genetic element involved in splicing located near the 3′ end of the intron and immediately upstream of the poly-pyrimidine tract.
- Clustered regularly interspaced short palindromic repeats (CRISPR): a genetic element found in prokaryotes, which forms the basis of a recent genome engineering technology (CRISPR/Cas9) that enables permanent modification
References (121)
The technology and biology of single-cell RNA sequencing. Mol. Cell (2015)
lincRNAs: genomics, evolution, and mechanisms. Cell (2013)
The expanding RNA polymerase III transcriptome. Trends Genet. (2007)
The long noncoding RNAs NEAT1 and MALAT1 bind active chromatin sites. Mol. Cell (2014)
Linking long noncoding RNA localization and function. Trends Biochem. Sci. (2016)
Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell (2007)
Origins and impacts of new mammalian exons. Cell Rep. (2015)
Long noncoding RNAs in cell-fate programming and reprogramming. Cell Stem Cell (2014)
Physiological roles of long noncoding RNAs: insight from knockout mice. Trends Cell Biol. (2014)
Role of a neuronal small non-messenger RNA: behavioural alterations in BC1 RNA-deleted mice. Behav. Brain Res. (2004)