Unexpected selection to retain high GC content and splicing enhancers within exons of multiexonic lncRNA loci

  1. Chris P. Ponting
  1. MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3PT, United Kingdom
  1. Corresponding author: wilfried.haerty{at}dpag.ox.ac.uk

Abstract

If sequencing was possible only for genomes, and not for RNAs or proteins, then functional protein-coding exons would be recognizable by their unusual patterns of nucleotide composition, specifically a high GC content across the body of exons, and an unusual nucleotide content near their edges. RNAs and proteins can, of course, be sequenced but the extent of functionality of intergenic long noncoding RNAs (lncRNAs) remains under question owing to their low nucleotide conservation. Inspired by the nucleotide composition patterns of protein-coding exons, we sought evidence for functionality across lncRNA loci from diverse species. We found that such patterns across multiexonic lncRNA loci mirror those of protein-coding genes, although to a lesser degree: Specifically, compared with introns, lncRNA exons are GC rich. Additionally we report evidence for the action of purifying selection to preserve exonic splicing enhancers within human multiexonic lncRNAs and nucleotide composition in fruit fly lncRNAs. Our findings provide evidence for selection for more efficient rates of transcription and splicing within lncRNA loci. Despite only a minor proportion of their RNA bases being constrained, multiexonic intergenic lncRNAs appear to require accurate splicing of their exons to transact their function.

Keywords

Footnotes

  • Received July 16, 2014.
  • Accepted November 25, 2014.

This article, published in RNA, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.

| Table of Contents
OPEN ACCESS ARTICLE