Trends in Genetics
Opinion
Simple Repeats as Building Blocks for Genetic Computers
Genetic Computation
Alternative DNA structures epitomize a nontraditional way of encoding genetic information [1,2]. Sequences that adopt alternative conformations under physiological conditions are called flipons [1,3]. Flipons are binary ON–OFF switches that enable genetic programming of adaptive responses to environmental changes. Flipons proffer a different way to perform computations with polynucleotides. They represent an innovation that fast-tracks the evolution of metazoans, increasing the number of
What Are Flipons?
Flipons are nucleic acid sequences that can flip conformation under physiological conditions. They are elements of an instructive genetic code that specifies the compilation of transcripts from the genome (Figure 1, Key Figure) [1,3]. Examples of these non-B structures (NoBs) include the left-handed Z duplexes, three-stranded triplexes (T3Xs), four-stranded G quadruplexes (G4Qs), and I motifs (I4Ms) (Figure 1). They are referred to here as NoBs, regardless of whether they are made of RNA or DNA.
Flipons In Vivo
A number of criteria exist for demonstrating the relevance of NoBs to biology: biochemical validation in vitro; angstrom resolution structures to eliminate ersatz explanations; functional studies to demonstrate mechanism in vivo; Mendelian genetics to confirm causality (Box 1). These requirements are satisfied for Z DNA and evidence exists for the biological function of others [3] (Box 1).
Direct evidence for NoB formation in mammalian cells under physiological conditions comes from in vivo
What Are SSRs?
Many of the sequences involved in NoB formation map to simple sequence repeats (SSRs), whose repeat units are 1–6 base pairs long. SSRs comprise about 3% of the human genome [8]. The overlap between SSRs and NoBs is described in Table 1, and is not surprising given that simple repeats led to the discovery of NoBs soon after Watson and Crick modeled B-form duplex DNA (1953) [9]. The description of T3X (1957) and G4Q (1960) in synthetic homopolymer repeats [10–12] was followed by the characterization of Z DNA in d(CG) repeat crystals
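To make the definition of an SSR concrete, the following minimal sketch scans a DNA string for tandem repeats whose unit is 1–6 bases long. It is an illustration only, not a method from the article: the function name `find_ssrs` and the `min_repeats` threshold of four copies are assumptions chosen for the example, and real SSR annotation tools apply more elaborate rules for imperfect repeats.

```python
import re

def find_ssrs(seq, max_unit=6, min_repeats=4):
    """Return (start, unit, copies) for perfect tandem repeats.

    max_unit follows the 1-6 base pair unit length given in the text;
    min_repeats is a hypothetical cutoff picked for this sketch.
    """
    # The lazy quantifier tries the shortest unit first, so "AAAA" is
    # reported with unit "A" rather than unit "AA".
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits

# A d(CG) repeat of the kind used to crystallize Z DNA, with flanking bases.
print(find_ssrs("TT" + "CG" * 5 + "AA"))  # [(2, 'CG', 5)]
```

The scan reports only perfect repeats; interrupted or degenerate repeats would need a scoring scheme that this sketch does not attempt.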
SSRs and Fliponware
Between 3% and 9% of SSRs have NoB signatures detectable in the permanganate experiments described previously [2]. The NoBs also overlap sites of transcription. Of those transcribed in a human Burkitt lymphoma cell line, 14.2% of Z-DNA sequence motifs colocalize with RNA polymerase II. For G4Qs, the overlap is 17%. Some signatures, like that for cruciforms, are too infrequent to detect in these experiments [2], even though plasmids can form cruciforms when transfected into cells [15]. Some
SSR and Codonware
SSRs alter gene expression and phenotype in many ways. For example, the Genotype–Tissue Expression Project identified 28 000 SSRs in 17 tissues affecting DNA methylation [19] and gene expression [20], accounting for 10–15% of the cis heritability [21]. They play an important role in RNA processing. Recognition of SSRs by heterogeneous nuclear ribonucleoprotein particles (hnRNPs) is necessary but not sufficient for alternative splicing (AS) [22]. The effects of AS are many. Exons are included or
SSRs, Flipons, and DRHs
Here, I posit that SSRs bridge the biology of flipons and codons (referred to as fliponware and codonware in Figure 2, Box 2): they facilitate the transfer of information from the genome to the transcriptome. SSRs do so by forming NoBs from DNA–RNA hybrids (DRHs) (Figure 3). The energy to flip DRHs to NoBs is supplied by RNA polymerases. These enzymes unwind DNA to generate negative supercoiling (NSC) in their wake [26], faster than topoisomerases can relax it. The NSC powers NoB formation [2,27–29].
The
Flipons as Binary Switches
Flipons, as exemplified by SSRs, represent a different way to perform computations with polynucleotides (Box 3). They can implement binary logic similar to the way sequence-specific binding proteins turn a gene on or off [40]. The probability of transition depends on SSR sequence, repeat length, base modifications, chromatin state, ionic conditions, and wetware [2,3,41–46]. The outcomes vary by flipon class.
Flipons and R Loops
Not all DRHs involve SSRs. R loops are larger structures formed when hybridization of the nascent transcript to the DNA template displaces the DNA-coding strand. Their formation is driven by NSC [46]. R loops form efficiently with G-rich transcripts, probably due to the stability of rG-dC DRHs [46,66] (Figure 3F). They occur most frequently at gene ends, spanning 100–2000 base pairs, similar in size to negatively supercoiled domains detected in vivo [26].
R loops provide another mechanism to
Flipon Failure and Disease
Flipons are dynamic structures. When they freeze or fail, disease often results. Such malfunctions provide indirect evidence that NoBs do form in vivo. NoBs that persist are thought to explain the association of SSRs with genome fragility [72,73], DNA damage [4], cancer progression [74,75], and Mendelian disease [74–76]. Frozen flipons also arise in diseases due to loss-of-function helicase mutations [37]. The same may happen with DNA repair pathway defects [77]. The failure of wetware
Computing with Flipons
The flipon code is by nature binary, based on events that require a transition from one conformation to another. The number of possible transcripts expands combinatorially as the flipon count in a gene increases. The variability generated is larger than is possible with a strictly linear genome.
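The combinatorial growth can be illustrated with a short sketch. It assumes, for simplicity, that each flipon in a gene switches independently between two states and that every combination of states could yield a distinct transcript; neither assumption is stated in the text, so the counts are an upper bound rather than a prediction. The function names are hypothetical.

```python
from itertools import product

def flipon_states(n_flipons):
    """List every ON/OFF combination for n independent binary flipons."""
    return list(product((0, 1), repeat=n_flipons))

def state_count(n_flipons):
    """Number of combinations: two states per flipon gives 2**n."""
    return 2 ** n_flipons

# Three flipons give eight combinations, from (0, 0, 0) to (1, 1, 1).
for state in flipon_states(3):
    print(state)

# The count doubles with each added flipon.
for n in (1, 2, 3, 10):
    print(n, state_count(n))
```

Under these assumptions a gene with 10 flipons has 1024 possible state combinations, which is the sense in which the space of outcomes outgrows that of a single linear readout.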
Flipons act locally to alter the information extracted from their neighborhood. Their targeting of wetware to different locations has the potential to enhance alternative splicing, base modification, 3D RNA
Bootstrapping Flipon Computation
The output from a digital genome depends on the initial flipon conformations. During development, these states are set by specific epigenetic markings on the pioneer nucleosomes transmitted via sperm and bound with high affinity to G-rich sequences [80–82]. These marks direct nucleosome phasing and have the potential to generate sufficient NSC to form NoBs that localize the necessary wetware. Editing of the epigenetic marks by the wetware on these and other nucleosomes influences future
A Code Is a Code by Any Other Name
Flipons are elements of a digital genome that evolves by programming rather than by mutation. The flipon code is instructive, based on low-complexity SSRs, directing the compilation of genomic sequences into a variety of RNAs. By contrast, codons embody a semantic code enriched in high-complexity sequences that specify the detailed architecture of a single protein, one nucleotide triplet at a time. Flipons and codons are both genetic codes, each with a different function. Flipons are
Concluding Remarks
The formation of alternative flipon conformations by SSRs requires work. It is a trade of energy for information. The structures formed allow alternative processing of transcripts. The design is sloppy and inefficient. Most of the transcripts made are intronic or defective and junked without ever being used. Flipons enable exploration of a larger transcript space. They allow rapid updating of genetic programs in response to environmental challenges. Their programmability potentially enables
References (114)
A genetic instruction code based on DNA conformation
Trends Genet. (2019)
Permanganate/S1 nuclease footprinting reveals non-B DNA structures with regulatory potential across a mammalian genome
Cell Syst. (2017)
Enzymic synthesis of polynucleotides I. polynucleotide phosphorylase of Azotobacter vinelandii
Biochim. Biophys. Acta (1956)
Short inverted repeats are hotspots for genetic instability: relevance to cancer genomes
Cell Rep. (2015)
Circular RNAs in human cancer
Mol. Cancer (2017)
RNA/DNA hybrid interactome identifies DXH9 as a molecular player in transcriptional termination and R-loop-associated DNA damage
Cell Rep. (2018)
Effect of dC --> d(m(5)C) substitutions on the folding of intramolecular triplexes with mixed TAT and C(+)GC base triplets
Biochimie (2018)
Kinetic hybrid i-motifs: intercepting DNA with RNA to form a DNA(2)-RNA(2) i-motif
Biochimie (2008)
Epigenome regulation by dynamic nucleosome unwrapping
Trends Biochem. Sci. (2020)
Regulation of CSF1 promoter by the SWI/SNF-like BAF complex
Cell (2001)
R-loops as cellular regulators and genomic threats
Mol. Cell
Anchoring nascent RNA to the DNA template could interfere with transcription
Biophys. J.
Repeat expansion diseases
Handb. Clin. Neurol.
The latest twists in chromatin remodeling
Biophys. J.
Determination of the RNA binding specificity of the heterogeneous nuclear ribonucleoprotein (hnRNP) H/H'/F/2H9 family
J. Biol. Chem.
The RNA-editing enzyme ADAR1 controls innate immune responses to RNA
Cell Rep.
ALU non-B-DNA conformations, flipons, binary codes and evolution
R. Soc. Open Sci.
Human DNA repair genes possess potential G-quadruplex sequences in their promoters and 5′-untranslated regions
Biochemistry
The triple helix: 50 years later, the outcome
Nucleic Acids Res.
I-motif DNA structures are formed in the nuclei of human cells
Nat. Chem.
The regulation and functions of DNA and RNA G-quadruplexes
Nat. Rev. Mol. Cell Biol.
Finding and extending ancient simple sequence repeat-derived regions in the human genome
Mob. DNA
Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid
Nature
Formation of a three-stranded polynucleotide molecule
J. Am. Chem. Soc.
Helix formation by guanylic acid
Proc. Natl. Acad. Sci. U. S. A.
Molecular structure of a left-handed double helical DNA fragment at atomic resolution
Nature
A tetrameric DNA structure with protonated cytosine.cytosine base pairs
Nature
Theoretical analysis of competing conformational transitions in superhelical DNA
PLoS Comput. Biol.
The effects of DNA supercoiling on G-quadruplex formation
Nucleic Acids Res.
Structural competition involving G-quadruplex DNA and its complement
Biochemistry
Polymorphic tandem repeats within gene promoters act as modifiers of gene expression and DNA methylation in humans
Nucleic Acids Res.
The impact of short tandem repeat variation on gene expression
Nat. Genet.
Abundant contribution of short tandem repeats to gene expression variation in humans
Nat. Genet.
Messenger-RNA-binding proteins and the messages they carry
Nat. Rev. Mol. Cell Biol.
Breaching self-tolerance to Alu duplex RNA underlies MDA5-mediated inflammation
Cell
RNA toxicity in non-coding repeat expansion disorders
EMBO J.
DNA supercoiling during transcription
Biophys. Rev.
Chromatin regulates DNA torsional energy via topoisomerase II-mediated relaxation of positive supercoils
EMBO J.
Transcription forms and remodels supercoiling domains unfolding large-scale chromatin structures
Nat. Struct. Mol. Biol.
Transcription-generated torsional stress destabilizes nucleosomes
Nat. Struct. Mol. Biol.
Mechanisms and implications of transcription blockage by guanine-rich DNA sequences
Proc. Natl. Acad. Sci. U. S. A.
Real-time detection reveals responsive cotranscriptional formation of persistent intramolecular DNA and intermolecular DNA:RNA hybrid G-quadruplexes stabilized by R-loop
Anal. Chem.
DNA sequences that interfere with transcription: implications for genome function and stability
Chem. Rev.
A guanine-flipping and sequestration mechanism for G-quadruplex unwinding by RecQ helicases
Nat. Commun.
Human proteins that interact with RNA/DNA hybrids
Genome Res.
The biology of DHX9 and its potential as a therapeutic target
Oncotarget
G-quadruplex unwinding helicases and their function in vivo
Biochem. Soc. Trans.
Minute negative superhelicity is sufficient to induce the B-Z transition in the presence of low tension
Proc. Natl. Acad. Sci. U. S. A.
Intrinsic Z-DNA is stabilized by the conformational selection mechanism of Z-DNA-binding proteins
J. Am. Chem. Soc.
Genetic circuit design automation
Science