Close encounters: Moving along bumps, breaks, and bubbles on expanded trinucleotide tracts
Introduction
Alterations in simple repetitive sequences lie at the center of DNA evolution and sequence diversity, which drives adaptation [1], [2], [3], [4], [5], [6]. However, rapid changes in repetitive sequences can also have deleterious effects on gene expression and function, leading to disease. Simple trinucleotide repeats (TNRs) have taken on special significance in this regard since their expansion underlies more than 30 human neurodegenerative and neuromuscular diseases [7], [8], [9], [10].
The structure-forming potential of TNRs lies at the heart of the instability (Fig. 1A–C) [7], [8], [9], [10], and the structural details of their features were established early on [11], [12]. Trinucleotide repeats are quasi-palindromes and form hydrogen-bonded hairpins harboring A/A, T/T, or C/C mismatches (Fig. 1A, N = A,T,G) within their stem [11], [12], [13], or, depending on the sequence, triplex [14], [15], [16], [17] (Fig. 1B) and quadruplex structures [18], [19] (Fig. 1C), among others [20], [21], [22] (Fig. 1B,C). Indeed, some TNRs can form quadruplexes comprising both DNA and RNA [23], [24]. Regardless of the type of structure, it must be removed to avoid mutation [25]; if it is not, the gene grows by roughly the size of the extrahelical DNA loop that forms during a biological transaction (e.g., Fig. 1C). Indeed, in yeast, triplet repeats capable of forming secondary structure escape repair [26] and increase the frequency of expansion by 5–1000-fold [27], [28], [29]. Equivalent small unstructured loops are rapidly removed and do not expand [26].
Expanded TNRs can occur anywhere in or around a gene (Fig. 1D), but whether they occur in coding or non-coding regions determines the size of the expansion [7], [8], [9], [10]. One can easily predict that genetic selection would rid organisms of cells expressing a toxic, expanded protein, while non-coding expansions would be better tolerated [9]. However, there is an unexplained linkage between the gene position of the disease-length TNR allele and the parent who transmits the expansion. Short expansions in coding alleles tend to be transmitted through the male germline, while long expansions in non-coding alleles tend to be transmitted through the female germline [9]. In HD, for example, it was noted early on that patients develop disease if they are above a premutation threshold of roughly 35 CAGs (Fig. 1E) [30], [31], [32], [33]. Transmitted CAG expansions rarely grow beyond 100 CAG repeats in a living patient [9], [34], and intergenerational “jumps” are most often between 3 and 15 TNRs [30], [31], [34], [35]. If transmitted maternally, the size of the CAG expansion changes little or even shortens [36]. The same patterns are observed in HD primate models [37] and in mouse models [38], [39], [40] of HD, although the threshold for expansion occurs at longer tract lengths.
In non-coding TNR alleles in Fragile X syndrome (FXS) [41], [42], [43], myotonic dystrophy 1 (DM1) [44], [45], and Friedreich's ataxia (FRDA) [46], [47] patients (Fig. 1C), the expansions are larger. The premutation range for CGG, CTG and GAA repeats, respectively, is around 55–200, and TNR expansion to the full mutation occurs in the oocytes with gains of hundreds of repeats, and thousands in congenital cases [48], [49], [50]. In the non-coding TNRs, contractions typically occur during male transmission [51], [52], indicating that the sperm is far more restrictive in the TNR lengths that are tolerated [53]. For example, the full mutation allele in FXS is deleted back to its premutation length in the sons of premutation mothers around fetal day 13, in their dividing spermatogonia [54], [55]. Sons of premutation mothers cannot transmit a disease-length allele, yet it is maintained in their somatic cells and causes disease [54], [55]. In oocytes from FXS mothers, the large expanded allele continues to expand with age even though cell division has ceased.
TNRs not only expand during transmission in the germ cell [9], [30], [32] but also continue to expand in somatic cells throughout life [56], [57]. In HD, for example, the somatic expansion exceeds the size of the inherited lengths, and rare alleles of up to 1000 repeats have been detected in animals [57] (Fig. 1F). The somatic changes in the long TNRs are observed in most non-coding genes, but the size changes of the somatic lengths are small by comparison to the large jumps that occur in early germ cell development. Importantly, both the germ cell and the somatic expansions contribute to toxicity [58], [59], [60], [61]. In this regard, the somatic expansion has raised the possibility that shortening the repeats during life may be therapeutic [62], [63]. Indeed, use of genetic and pharmacological methods to reduce the TNR length in somatic cells has improved pathophysiology in HD [34], [35], [64], FXS [65], GAA [66], [67], and DM1 [68].
Not surprisingly, these tantalizing features of expansion continue to spark interest in this fascinating mutation and raise the question of whether there are multiple underlying mechanisms. Secondary structure within the TNRs is at the root of the problem; it can be constitutive or arise actively during polymerase passage, but in both cases it creates a need to process and remove the impediment. Self-folding of a structure-capable DNA is stable enough either to block polymerase passage or to peel away from the duplex just long enough that the advancing polymerase prevents reannealing of the extrahelical strand [69]. In either case, the structure is trapped, and the cell faces a decision to unwind, to remove, to move past, or to incorporate the looped DNA. Genome stability or instability ensues depending on the choice. By this simple definition, polymerase fidelity and processivity over these tracts are likely to be as important as the stability of the structure in determining the mechanisms and the extent of mutation.
Surprisingly little is known about the role of polymerases in completing error-prone synthesis, or what features of a lesion might determine or exclude the use of a particular polymerase at TNR tracts. Based on structure, mammalian polymerases fall into three of five classes (Table 1), which play pivotal roles in DNA replication (pols γ, φ, ν, α, δ, ε) (family A,B), base excision repair (pol β) (family X), replication and repair in mitochondrial DNA (pol γ) (family A), non-homologous end-joining and immunological diversity (pols λ, μ, and terminal deoxynucleotidyl transferase) (family X), and DNA damage tolerance including translesion synthesis (η, κ, ζ, Rev1) (family Y). Some of their known roles are summarized (Table 2), and each polymerase is described in detail in excellent reviews [70], [71], [72].
Over the past 3 years, however, there have been surprising new insights into the mechanisms that build on our understanding of expansion and highlight the importance of polymerases. Here, we discuss polymerase encounters on abnormal structures or “difficult to copy” TNRs. In the following sections, we will focus on polymerase encounters at three representative DNA barriers, which we refer to as bumps (small chemical lesions), bubbles (RNA loops and secondary structures), and breaks. Each of these presents a distinct perturbation of the DNA helix, and the outcome of expansion relies on distinct genomic and polymerase contexts. Three questions will be considered: (1) is expansion a process of replication or repair, i.e., would expansion arise from use of a repair or replicative polymerase; (2) do disease-length repeats alter the ways in which they are viewed by a polymerase when damage is encountered; and (3) how might secondary structures alter the polymerase that is chosen.
So, does expansion occur during replication or repair, and how do we distinguish among the polymerases that might be used? The hereditary nature of the expansion diseases prompted early focus on the germ cell, and early replication-dependent expansion in spermatogonia was a popular hypothesis. However, over the last 20 years, the consensus viewpoint has shifted to DNA repair as the major expansion mechanism in both dividing and non-dividing cells [7], [8], [9], [10]. Expansion occurs in neurons and quiescent oocytes [38], [73], [74]. Neither cell type divides, but widespread age-dependent expansion occurs in both [49], [50], [51], [52]. No new oocytes are created after birth, indicating that age-dependent expansion in adult oocytes cannot arise from a new round of proliferation [75]. Proliferating patient fibroblasts or lymphoblasts harboring disease-length alleles do not typically expand in culture. However, in one cell line isolated from a DM1 transgenic mouse, the rate of CTG tract expansion is unrelated to the rate of replication [76]. On long CGG tracts, replicative polymerases stall [77], and frequently break in cells [78], [79]. As modeled in yeast and in mammalian cells, expansions during proliferation arise from rescue of stalled forks by recombination [80], break-induced replication (BIR) [81], and strand switching [82], [83]. Thus, unlike early models, the consensus viewpoint is that expansion is a process of repair even in proliferating cells.
If expansion occurs during repair, what polymerases are used? All DNA replication and repair polymerases carry out gap-filling synthesis in three basic reactions, and pair the template with the incoming nucleotide using a common structural framework, depicted as a finger, palm, and thumb structure (Fig. 2) [84]. Moreover, DNA secondary structures apparently influence polymerase function [85], [86], [87]. High-fidelity polymerases have precise active site geometry and closely monitor the DNA minor groove during translocation of the nascent base pair, which can trigger the 3′–5′ exonuclease proofreading activity if a mismatch or abnormal base pair is sensed (Fig. 2A,B) [70], [71], [72], [86], [88], [89]. However, their rigid structural constraints require correctly matched base pairs, and unusual DNA structures are not accepted in the active site [70], [71], [72], [86], [88], [89], [90], [91], [92], [93]. A DNA translesion polymerase (TLP) typically lacks exonuclease activity and has evolved a larger active site to accommodate the damaged template, as well as the incoming nucleotide, in the optimal geometry [70], [71], [72], [86], [88], [89], [91], [92], [93]. However, the impact of encountering a TNR secondary structure for these error-prone polymerases is unknown in most cases [86].
Encountering bumps within triplet repeat tracts
The cell’s main strategy to deal with an oxidized base is to remove it and fill the gap with the correct base using base excision repair (BER), and to a lesser extent NER/TCR [7], [8], [9], [10], [93]. In BER, the oxidized bases are recognized and removed by a DNA glycosylase, followed by AP endonuclease 1 (APE1) processing to generate a 3′OH suitable for polymerase extension [7], [8], [9], [10]. Gap-filling synthesis is performed by polymerase β (pol β) [90], [93] (or back-up by other X family
Encountering bubbles and breaks within TNRs
But what happens if a polymerase encounters a stable TNR secondary structure or an R-loop? We refer to both of these structures as DNA “bubbles” since the two anti-parallel DNA strands remain open and cannot reanneal to reform duplex DNA. They differ, however, in that a secondary structure blocks polymerase by intra-strand pairing while the R-loop uses interstrand pairing between DNA and RNA.
Concluding remarks and future perspectives
Although our understanding of instability is ever increasing, new questions keep emerging, even as older ones are solved. For example, if stable R-loops lead to transcriptional silencing and hypermethylation at the FXS allele, why do oocytes harboring full mutations expand in smaller steps with age? DNA repair nuclease factors must be able to access the site to cause instability, yet these sites are generally “nuclease resistant” in heterochromatin. If BIR occurs during replication stalling and
Conflict of interest
The authors have no conflicts of interest.
Acknowledgements
This work was supported by National Institutes of Health grants NS060115 (to CTM), GM119161 (to CTM) and CA092584 (to CTM). The authors would like to thank S. Mirkin, L. Symington, K. Cimprich, S. Doublie, S. Wallace, S. Wilson, and T. Kunkel, who generously gave us permission to use their images for illustration in this review.
References (194)
- et al. Divergent microsatellite evolution in the human and chimpanzee lineages. FEBS Lett. (2007)
- et al. Epigenetic modifications in trinucleotide repeat diseases. Trends Mol. Med. (2013)
- et al. The balancing act of DNA repeat expansions. Curr. Opin. Genet. Dev. (2013)
- et al. When secondary comes first – The importance of non-canonical DNA structures. Biochimie (2013)
- et al. GAA instability in Friedreich's ataxia shares a common, DNA-directed and intraallelic mechanism with other trinucleotide diseases. Mol. Cell (1998)
- et al. The high-resolution structure of the triplex formed by the GAA/TTC triplet repeat associated with Friedreich’s ataxia. J. Mol. Biol. (1999)
- et al. i-Motif DNA: Structure, stability and targeting with ligands. Bioorg. Med. Chem. (2014)
- et al. Metabolism of DNA secondary structures at the eukaryotic replication fork. DNA Repair (2014)
- et al. Inhibition of FEN-1 processing by DNA secondary structure at trinucleotide repeats. Mol. Cell (1999)
- et al. Variation of the CGG repeat at the fragile X site results in genetic instability: resolution of the Sherman paradox. Cell (1991)