Unamplified cap analysis of gene expression on a single-molecule sequencer

  1. Yoshihide Hayashizaki2
  1. OMICs Science Center, RIKEN Yokohama Institute, Tsurumi-ku, Yokohama 230-0045, Japan
    1. 1 These authors contributed equally to this work.

    Abstract

    We report the development of a simplified cap analysis of gene expression (CAGE) protocol adapted for single-molecule sequencers that avoids second-strand synthesis, ligation, digestion, and PCR. HeliScopeCAGE directly sequences the 3′ end of cap-trapped first-strand cDNAs. As with previous versions of CAGE, we better define transcription start sites (TSS) than known models, identify novel regions of transcription and alternative promoters, and find two major classes of TSS signal: sharp peaks and broad regions. However, using this protocol, we observe reproducible evidence of regulation at the much finer level of individual TSS positions. The libraries are quantitative over 5 orders of magnitude and highly reproducible (Pearson's correlation coefficient of 0.987). We have also scaled down the sample requirement to 5 μg of total RNA for a standard HeliScopeCAGE library and 100 ng for a low-quantity version. When the same RNA was run as 5-μg and 100-ng versions, the 100-ng library still detected expression for ∼60% of the 13,468 loci detected by the 5-μg library using the same threshold, allowing comparative analysis of even rare cell populations. Testing the protocol for differential gene expression measurements on triplicate HeLa and THP-1 samples, we find that the log fold change is highly correlated with Illumina microarray measurements (0.871). In addition, HeliScopeCAGE finds differential expression for thousands more loci, including those with probes on the array. Finally, although the majority of tags are 5′ associated, we also observe a low level of signal on exons that is useful for defining gene structures.

    Footnotes

    • Received September 17, 2010.
    • Accepted April 4, 2011.