Trends in Genetics
Volume 36, Issue 10, October 2020, Pages 739-750

Opinion
Simple Repeats as Building Blocks for Genetic Computers

https://doi.org/10.1016/j.tig.2020.06.012

Highlights

  • Simple sequence repeats (SSRs) are important determinants of phenotype.

  • There is strong evidence that they adopt alternative non-B conformations in vivo.

  • These programmable elements change conformation according to context and are called flipons.

  • Their binary nature enables logical operations on both DNA and RNA substrates.

  • They embody elements of an instructive code that enhances the ability of genetic computers to evolve and adapt.

Processing of RNA involves heterogeneous nuclear ribonucleoproteins. The simple sequence repeats (SSRs) they bind can also adopt alternative DNA structures, like Z DNA, triplexes, G quadruplexes, and I motifs. Those SSRs capable of switching conformation under physiological conditions (called flipons) are genetic elements that can encode alternative RNA processing by their effects on RNA processivity, most likely as DNA:RNA hybrids. Flipons are elements of a binary, instructive genetic code directing how genomic sequences are compiled into transcripts. The combinatorial nature of this code provides a rich set of options for creating genetic computers able to reproduce themselves and use a heritable and evolvable code to optimize survival. The underlying computational logic potentiates a diverse set of genetic programs that modify cis-mediated heritability and disease risk.

Section snippets

Genetic Computation

Alternative DNA structures epitomize a nontraditional way of encoding genetic information [1,2]. Sequences that adopt alternative conformations under physiological conditions are called flipons [1,3]. Flipons are binary ON–OFF switches that enable genetic programming of adaptive responses to environmental changes. They offer a different way to perform computations with polynucleotides and represent an innovation that fast-tracks the evolution of metazoans, increasing the number of

What Are Flipons?

Flipons are nucleic acid sequences that can flip conformation under physiological conditions. They are elements of an instructive genetic code that specifies the compilation of transcripts from the genome (Figure 1, Key Figure) [1,3]. Examples of these non-B structures (NoBs) include the left-handed Z duplexes, three-stranded triplexes (T3Xs), four-stranded G quadruplexes (G4Qs), and I motifs (I4Ms) (Figure 1). They are referred to here as NoBs, regardless of whether they are made of RNA or DNA.

Flipons In Vivo

A number of criteria exist for demonstrating the relevance of NoBs to biology: biochemical validation in vitro; angstrom-resolution structures to eliminate ersatz explanations; functional studies to demonstrate mechanism in vivo; and Mendelian genetics to confirm causality (Box 1). These requirements are satisfied for Z DNA, and evidence exists for the biological function of the others [3] (Box 1).

Direct evidence for NoB formation in mammalian cells under physiological conditions comes from in vivo

What Are SSRs?

Many of the sequences involved in NoB formation map to SSRs, which are tandem repeats of motifs 1–6 base pairs long. SSRs comprise about 3% of the human genome [8]. The overlap between SSRs and NoBs is described in Table 1, and is not surprising given that simple repeats led to the discovery of NoBs soon after Watson and Crick modeled B-form duplex DNA (1953) [9]. The description of T3X (1957) and G4Q (1960) in synthetic homopolymer repeats [10–12] was followed by the characterization of Z DNA in d(CG) repeat crystals
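The definition of an SSR as a tandem repeat of a 1–6 base pair motif can be illustrated with a short scan over a DNA string. The following is only a minimal sketch: the function name find_ssrs and the threshold of at least three tandem copies are assumptions made for illustration, not criteria taken from the article or from any genome annotation pipeline.

```python
import re


def find_ssrs(seq, max_unit=6, min_repeats=3):
    """Return sorted (start, unit, copies) tuples for tandem repeats.

    max_unit=6 follows the 1-6 bp motif length given in the text;
    min_repeats=3 is an arbitrary illustrative threshold (assumption).
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(1, max_unit + 1):
        # One motif of unit_len bases followed by at least
        # (min_repeats - 1) further identical copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip motifs that are just a shorter motif repeated
            # (for example "CGCG" is "CG" twice), to avoid double reporting.
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    # A d(CG) repeat of the kind in which Z DNA was first characterized.
    print(find_ssrs("ATCGCGCGCGCGCGTA"))
```

Real SSR catalogs also handle imperfect repeats and strand orientation, which this sketch deliberately ignores.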

SSRs and Fliponware

Between 3% and 9% of SSRs have NoB signatures detectable in the permanganate experiments described previously [2]. The NoBs also overlap sites of transcription. Of those transcribed in a human Burkitt lymphoma cell line, 14.2% of Z-DNA sequence motifs colocalize with RNA polymerase II. For G4Qs, the overlap is 17%. Some signatures, like that for cruciforms, are too infrequent to detect in these experiments [2], even though plasmids can form cruciforms when transfected into cells [15]. Some

SSR and Codonware

SSRs alter gene expression and phenotype in many ways. For example, the Genotype–Tissue Expression Project identified 28 000 SSRs in 17 tissues affecting DNA methylation [19] and gene expression [20], accounting for 10–15% of the cis heritability [21]. They play an important role in RNA processing. Recognition of SSRs by heterogeneous nuclear ribonucleoprotein particles (hnRNPs) is necessary but not sufficient for alternative splicing (AS) [22]. The effects of AS are many. Exons are included or

SSRs, Flipons, and DRHs

Here, I posit that SSRs bridge the biology of flipons and codons (referred to as fliponware and codonware in Figure 2, Box 2): they facilitate the transfer of information from the genome to the transcriptome. SSRs do so by forming NoBs from DNA:RNA hybrids (DRHs) (Figure 3). The energy to flip DRHs to NoBs is supplied by RNA polymerases. These enzymes unwind DNA to generate negative supercoiling (NSC) in their wake [26], faster than topoisomerases can relax it. The NSC powers NoB formation [2,27–29].

The

Flipons as Binary Switches

Flipons, as exemplified by SSRs, represent a different way to perform computations with polynucleotides (Box 3). They can implement binary logic similar to the way sequence-specific binding proteins turn a gene on or off [40]. The probability of transition depends on SSR sequence, repeat length, base modifications, chromatin state, ionic conditions, and wetware [2,3,41–46]. The outcomes vary by flipon class.
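The idea of a flipon as a binary switch that feeds logical operations can be sketched in a few lines. This is a toy illustration only: the Flipon class, the gate functions, and the pairing of two flipons into one gate are hypothetical constructs, not a model proposed in the article, and the sketch ignores the probabilistic factors listed above.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Flipon:
    """Hypothetical two-state switch.

    flipped=False stands for the B-form (OFF) state and
    flipped=True for a non-B conformation (ON).
    """
    name: str
    flipped: bool = False


def both_flipped(a, b):
    """AND: the outcome requires both flipons in the non-B state."""
    return a.flipped and b.flipped


def either_flipped(a, b):
    """OR: the outcome requires at least one flipon in the non-B state."""
    return a.flipped or b.flipped


def truth_table(gate):
    """List (state_1, state_2, outcome) rows for a two-input gate."""
    rows = []
    for x, y in product((False, True), repeat=2):
        rows.append((x, y, gate(Flipon("f1", x), Flipon("f2", y))))
    return rows


if __name__ == "__main__":
    for row in truth_table(both_flipped):
        print(row)
```

In a cell the inputs would be set by context such as supercoiling or protein binding rather than assigned directly, so the gates here only show the logical form of the argument.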

Flipons and R Loops

Not all DRHs involve SSRs. R loops are larger structures formed when hybridization of the nascent transcript to the DNA template displaces the DNA-coding strand. Their formation is driven by NSC [46]. R loops form efficiently with G-rich transcripts, probably due to the stability of rG-dC DRHs [46,66] (Figure 3F). They occur most frequently at gene ends, spanning 100–2000 base pairs, similar in size to negatively supercoiled domains detected in vivo [26].

R loops provide another mechanism to

Flipon Failure and Disease

Flipons are dynamic structures. When they freeze or fail, disease often results. Such malfunctions provide indirect evidence that NoBs do form in vivo. NoBs that persist are thought to explain the association of SSRs with genome fragility [72,73], DNA damage [4], cancer progression [74,75], and Mendelian disease [74–76]. Frozen flipons also arise in diseases due to loss-of-function helicase mutations [37]. The same may happen with DNA repair pathway defects [77]. The failure of wetware

Computing with Flipons

The flipon code is by nature binary, based on events that require a transition from one conformation to another. The number of possible transcripts expands combinatorially as the flipon count in a gene increases. The variability generated is larger than is possible with a strictly linear genome.
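The combinatorial growth can be made concrete by enumerating the states of a gene with n flipons. This is a sketch under two stated assumptions that the article does not quantify: each flipon is an independent two-state switch, and each combination of states could in principle yield a distinct transcript. Under those assumptions the count is an upper bound of 2^n. The function names are illustrative.

```python
from itertools import product


def flipon_states(n):
    """Enumerate every ON/OFF combination for n independent flipons.

    Each state is a tuple of 0 (B-form) and 1 (non-B conformation).
    """
    return list(product((0, 1), repeat=n))


def max_transcript_variants(n):
    """Upper bound on distinct outcomes, assuming one per state: 2**n."""
    return 2 ** n


if __name__ == "__main__":
    for n in (1, 2, 3, 10):
        print(n, max_transcript_variants(n))
    print(flipon_states(2))
```

Interactions between neighboring flipons or shared dependence on supercoiling would reduce the number of states actually reachable, so the true figure may be well below this bound.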

Flipons act locally to alter the information extracted from their neighborhood. Their targeting of wetware to different locations has the potential to enhance alternative splicing, base modification, 3D RNA

Bootstrapping Flipon Computation

The output from a digital genome depends on the initial flipon conformations. During development, these states are set by specific epigenetic markings on the pioneer nucleosomes transmitted via sperm and bound with high affinity to G-rich sequences [80–82]. These marks direct nucleosome phasing and have the potential to generate sufficient NSC to form NoBs that localize the necessary wetware. Editing of the epigenetic marks by the wetware on these and other nucleosomes influences future

A Code Is a Code by Any Other Name

Flipons are elements of a digital genome that evolves by programming rather than by mutation. The flipon code is instructive, based on low-complexity SSRs, directing the compilation of genomic sequences into a variety of RNAs. By contrast, codons embody a semantic code enriched in high complexity sequences that specify the detailed architecture of a single protein, one nucleotide triplet at a time. Flipons and codons are both genetic codes, each with a different function. Flipons are

Concluding Remarks

The formation of alternative flipon conformations by SSRs requires work. It is a trade of energy for information. The structures formed allow alternative processing of transcripts. The design is sloppy and inefficient. Most of the transcripts made are intronic or defective and junked without ever being used. Flipons enable exploration of a larger transcript space. They allow rapid updating of genetic programs in response to environmental challenges. Their programmability potentially enables

References (114)

  • M.P. Crossley. R-loops as cellular regulators and genomic threats. Mol. Cell (2019)
  • B.P. Belotserkovskii et al. Anchoring nascent RNA to the DNA template could interfere with transcription. Biophys. J. (2011)
  • H. Paulson. Repeat expansion diseases. Handb. Clin. Neurol. (2018)
  • R. Blossey et al. The latest twists in chromatin remodeling. Biophys. J. (2018)
  • M. Caputi et al. Determination of the RNA binding specificity of the heterogeneous nuclear ribonucleoprotein (hnRNP) H/H'/F/2H9 family. J. Biol. Chem. (2001)
  • N.M. Mannion. The RNA-editing enzyme ADAR1 controls innate immune responses to RNA. Cell Rep. (2014)
  • A. Herbert. ALU non-B-DNA conformations, flipons, binary codes and evolution. R. Soc. Open Sci. (2020)
  • A.M. Fleming. Human DNA repair genes possess potential G-quadruplex sequences in their promoters and 5′-untranslated regions. Biochemistry (2018)
  • M. Duca. The triple helix: 50 years later, the outcome. Nucleic Acids Res. (2008)
  • M. Zeraati. I-motif DNA structures are formed in the nuclei of human cells. Nat. Chem. (2018)
  • D. Varshney. The regulation and functions of DNA and RNA G-quadruplexes. Nat. Rev. Mol. Cell Biol. (2020)
  • J.A. Shortt. Finding and extending ancient simple sequence repeat-derived regions in the human genome. Mob. DNA (2020)
  • J.D. Watson et al. Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature (1953)
  • G. Felsenfeld. Formation of a three-stranded polynucleotide molecule. J. Am. Chem. Soc. (1957)
  • M. Gellert. Helix formation by guanylic acid. Proc. Natl. Acad. Sci. U. S. A. (1962)
  • A.H. Wang. Molecular structure of a left-handed double helical DNA fragment at atomic resolution. Nature (1979)
  • K. Gehring. A tetrameric DNA structure with protonated cytosine.cytosine base pairs. Nature (1993)
  • D. Zhabinskaya et al. Theoretical analysis of competing conformational transitions in superhelical DNA. PLoS Comput. Biol. (2012)
  • D.A.T. Sekibo et al. The effects of DNA supercoiling on G-quadruplex formation. Nucleic Acids Res. (2017)
  • W. Li. Structural competition involving G-quadruplex DNA and its complement. Biochemistry (2003)
  • J. Quilez. Polymorphic tandem repeats within gene promoters act as modifiers of gene expression and DNA methylation in humans. Nucleic Acids Res. (2016)
  • S.F. Fotsing. The impact of short tandem repeat variation on gene expression. Nat. Genet. (2019)
  • M. Gymrek. Abundant contribution of short tandem repeats to gene expression variation in humans. Nat. Genet. (2016)
  • G. Dreyfuss. Messenger-RNA-binding proteins and the messages they carry. Nat. Rev. Mol. Cell Biol. (2002)
  • S. Ahmad. Breaching self-tolerance to Alu duplex RNA underlies MDA5-mediated inflammation. Cell (2018)
  • B. Swinnen. RNA toxicity in non-coding repeat expansion disorders. EMBO J. (2020)
  • J. Ma et al. DNA supercoiling during transcription. Biophys. Rev. (2016)
  • X. Fernández. Chromatin regulates DNA torsional energy via topoisomerase II-mediated relaxation of positive supercoils. EMBO J. (2014)
  • C. Naughton. Transcription forms and remodels supercoiling domains unfolding large-scale chromatin structures. Nat. Struct. Mol. Biol. (2013)
  • S.S. Teves et al. Transcription-generated torsional stress destabilizes nucleosomes. Nat. Struct. Mol. Biol. (2013)
  • B.P. Belotserkovskii. Mechanisms and implications of transcription blockage by guanine-rich DNA sequences. Proc. Natl. Acad. Sci. U. S. A. (2010)
  • Y. Zhao. Real-time detection reveals responsive cotranscriptional formation of persistent intramolecular DNA and intermolecular DNA:RNA hybrid G-quadruplexes stabilized by R-loop. Anal. Chem. (2017)
  • B.P. Belotserkovskii. DNA sequences that interfere with transcription: implications for genome function and stability. Chem. Rev. (2013)
  • A.F. Voter. A guanine-flipping and sequestration mechanism for G-quadruplex unwinding by RecQ helicases. Nat. Commun. (2018)
  • I.X. Wang. Human proteins that interact with RNA/DNA hybrids. Genome Res. (2018)
  • T. Lee et al. The biology of DHX9 and its potential as a therapeutic target. Oncotarget (2016)
  • M. Sauer et al. G-quadruplex unwinding helicases and their function in vivo. Biochem. Soc. Trans. (2017)
  • M. Lee. Minute negative superhelicity is sufficient to induce the B-Z transition in the presence of low tension. Proc. Natl. Acad. Sci. U. S. A. (2010)
  • S. Bae. Intrinsic Z-DNA is stabilized by the conformational selection mechanism of Z-DNA-binding proteins. J. Am. Chem. Soc. (2011)
  • A.A. Nielsen. Genetic circuit design automation. Science (2016)