Journal of Molecular Biology
Review
RGG/RG Motif Regions in RNA Binding and Phase Separation
Introduction
RGG/RG repeats were first identified in the mid-1980s, with papers on the adenovirus hexon-assembly protein [1], [2], nucleolin [3] and fibrillarin [4]. Involvement of RGG/RG repeat-containing proteins in phase separation processes has sparked renewed interest in these regions. Protein phase separation refers to the emergence of a protein-enriched phase and a protein-depleted phase from a uniformly dispersed solution [5], [6], [7], [8]. The enriched phase often assumes the form of liquid droplets containing highly concentrated protein, akin to oil droplets in an oil–water mixture. Protein phase separation is now understood to underlie the formation of membraneless organelles and enables cells to compartmentalize specific molecules and functions, much like membranes do. Unlike membrane-enclosed organelles, formation and dissolution of membraneless organelles are readily controlled by post-translational modifications [9], [10], [11], protein concentration [12] or environmental cues [13], providing a powerful mechanism for cellular regulation. Examples of membraneless organelles include the nucleolus, P-bodies and stress granules, with roles that include, but are not limited to, ribosome biogenesis [14], mRNA translational repression and degradation [15] and stalling of mRNA translation, respectively [15]. These membraneless organelles are frequently termed ribonucleoprotein (RNP) bodies or granules, because they generally contain both proteins and RNA. This review will provide a molecular-level perspective on RGG/RG motif interactions with RNA, their involvement in phase separation in vitro and within cellular RNP bodies/membraneless organelles and regulation of their interactions by methylation.
Since RGG/RG motifs were first described, they have been identified in many other proteins, often occurring as multiple repeats. In humans, there are more than 100 proteins with at least two RGG motifs that are spaced less than 5 residues apart and more than 1700 proteins with at least two RG motifs that are spaced less than 5 residues apart [16]. As shown in Table 1, there is a large range in the number of RGG/RG repeats found in proteins. The FET proteins [fused in sarcoma (FUS), Ewing's sarcoma protein (EWS) and TATA binding associated factor 15 (TAF15)] contain approximately 20 RGG repeats each, while FMR1 contains only 2 full RGG repeats. This range in the number of RGG repeats can have an effect on the affinity/avidity of interactions formed by these RGG/RG regions [17], [18], as well as on multivalency, key for phase separation [19]. RGG/RG repeats usually occur in low-complexity regions, intrinsically disordered regions (IDRs) composed largely of limited amino acid types, or as part of intrinsically disordered proteins (IDPs) [20]. These IDRs and IDPs do not assume a single folded structure but instead rapidly interconvert between highly heterogeneous conformational states. IDRs that phase separate are usually low complexity [21]. RGG/RG repeats are sometimes associated with other repetitive sequences, including polyalanine, polyaspartate, polyglutamate, polyglutamine and polyglycine repeats, as well as other diglycine-containing motifs such as FGG, PGG, SGG and YGG (Table 1). These accompanying features, which vary between RGG regions, likely modulate their interactions, although the effect of these features on RGG-containing regions is not well studied. Some associated sequence features, such as FG repeats [22] and polyglutamine and polyglutamine/asparagine-rich repeats [23], [24], [25], [26], are also known to promote phase separation and/or aggregation on their own.
Note that GR and GGR sequences are also quite common in RGG/RG-containing proteins, leading to the term glycine–arginine-rich (GAR) regions, which specifies content but not sequence.
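The spacing criterion quoted above (at least two motifs spaced less than 5 residues apart) can be illustrated with a short sequence scan. The following is a minimal sketch, not the procedure used in [16]: the function names, the toy sequence and the reading of "less than 5 residues apart" as 0 to 4 residues between the end of one motif and the start of the next are all assumptions made for illustration. The last helper reports glycine plus arginine content, which is the composition-only notion behind the GAR term.

```python
import re

# Hypothetical toy sequence for illustration only; it is not a real protein.
TOY_SEQ = "MS" + "RGG" + "FGG" + "RGG" + "A" * 9 + "RGG"


def motif_starts(seq: str, motif: str) -> list[int]:
    """Return 0-based start positions of every occurrence of motif in seq.

    A zero-width lookahead is used so that overlapping matches would also
    be reported (RGG and RG cannot overlap with themselves, but a general
    motif could).
    """
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]


def has_clustered_repeats(seq: str, motif: str, max_gap: int = 4) -> bool:
    """True if two consecutive motif occurrences lie within max_gap residues.

    Assumption: the gap is counted from the end of one motif to the start
    of the next, so max_gap=4 encodes "less than 5 residues apart".
    """
    starts = motif_starts(seq, motif)
    for first, second in zip(starts, starts[1:]):
        gap = second - (first + len(motif))
        if 0 <= gap <= max_gap:
            return True
    return False


def gr_fraction(seq: str) -> float:
    """Fraction of residues that are glycine or arginine (content, not order)."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("R")) / len(seq)
```

Note that a plain search for RG also matches inside every RGG, so the two motif classes are not mutually exclusive in this sketch; a real survey would need to state how such cases are counted.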
RGG Interaction with RNA
RGG-containing regions are RNA-binding regions. This was first realized when the RGG/RG region of hnRNPU was shown to be responsible for the RNA binding properties of the protein [27]. The authors also pointed out that the RGG region was likely “unordered, extended and flexible,” which was at odds with the prevailing paradigm in the early 1990s that proteins needed to assume a folded structure to function. Binding was demonstrated for RNA homopolymers, including poly(G), poly(U), poly(A) and to
RGG/RG Motifs in Cellular RNA Granules
Many RGG/RG repeat-containing proteins are localized to micron-sized membraneless organelles that are visible in micrographs of cells [5], [6], [7], [8]. Ddx4 is an important component of perinuclear granules or nuage in mammals and Drosophila [55], [56], analogous to Caenorhabditis elegans P-granules [55] that contain Laf-1, PGL-1 and PGL-3 [57]. Nucleolin and fibrillarin are two important components of the nucleolus [4], [58], [59]. Stress granules contain members of the FET family—FUS, EWS
Identification of Methylation Sites
The discovery of RGG/RG repeats coincided with the discovery that many arginines in RGG/RG motifs are extensively methylated. Asymmetric dimethylarginine (aDMA) was observed in a heterogeneous nuclear ribonucleoprotein (hnRNP)-homologous protein purified from the slime mold Physarum polycephalum. Amino acid analysis of this slime mold protein indicated that roughly half of all the arginines in the protein were present as aDMA [102]. Purification of nucleolin from hepatoma cells and tryptic
Conclusion
RGG/RG repeats are important mediators of protein:protein and protein:RNA interactions that are critical for formation of membraneless ribonucleoprotein granules. Arginine side chains can form electrostatic, hydrogen-bonding, hydrophobic, π–π and cation–π interactions with protein and nucleic acid groups, which drive phase separation and interactions with DNA and RNA. Glycine backbone amide groups also hydrogen bond and form π–π stacking interactions with protein and RNA. Although RGG regions
Acknowledgments
The authors wish to acknowledge valuable discussions with Brian Tsang, Michael Nosella and Dr. Tae Hun Kim, as well as funding to J.D.F.-K. from the Canadian Institutes of Health Research (114985) and the Natural Sciences and Engineering Research Council of Canada (06718).
References (151)
- et al. Nucleotide sequence of the EcoRI-F fragment of adenovirus 2 genome. Gene (1979)
- et al. Liquid–liquid phase separation in cellular signaling systems. Curr. Opin. Struct. Biol. (2016)
- et al. Stress granules. Curr. Biol. (2009)
- Defining the RGG/RG motif. Mol. Cell (2013)
- et al. High affinity interactions of nucleolin with G–G-paired rDNA. J. Biol. Chem. (1999)
- RNA controls PolyQ protein phase transitions. Mol. Cell (2015)
- Fragile X mental retardation protein targets G quartet mRNAs important for neuronal function. Cell (2001)
- Microarray identification of FMRP-associated brain mRNAs and altered mRNA translational profiles in fragile X syndrome. Cell (2001)
- et al. Design of RNA-binding proteins and ligands. Curr. Opin. Struct. Biol. (2001)
- Phase transition of a disordered nuage protein generates environmentally responsive membraneless organelles. Mol. Cell (2015)
- RNA–protein complexes. Curr. Opin. Struct. Biol.
- Proteins C23 and B23 are the major nucleolar silver staining proteins. Life Sci.
- Requirements for stress granule recruitment of fused in sarcoma (FUS) and TAR DNA-binding protein of 43 kDa (TDP-43). J. Biol. Chem.
- RAP55, a cytoplasmic mRNP component, represses translation in Xenopus oocytes. J. Biol. Chem.
- ATPase-modulated stress granules contain a diverse proteome and substructure. Cell
- The cold-inducible RNA-binding protein migrates from the nucleus to cytoplasmic stress granules by a methylation-dependent mechanism and acts as a translational repressor. Exp. Cell Res.
- Phase separation by low complexity domains promotes stress granule assembly and drives pathological fibrillization. Cell
- Cell-free formation of RNA granules: low complexity sequence domains form dynamic fibers within hydrogels. Cell
- Formation and maturation of phase-separated liquid droplets by RNA-binding proteins. Mol. Cell
- RNA seeds higher-order assembly of FUS protein. Cell Rep.
- Coacervation of tropoelastin. Adv. Colloid Interf. Sci.
- LSm proteins form heptameric rings that bind to RNA via repeating motifs. Trends Biochem. Sci.
- The structure and dynamics of higher-order assemblies: amyloids, signalosomes, and granules. Cell
- Eukaryotic stress granules: the ins and outs of translation. Mol. Cell
- Structure and organization of the gene coding for the DNA binding protein of adenovirus type 5. Nucleic Acids Res.
- Clustering of glycine and NG,NG-dimethylarginine in nucleolar protein C23. Biochemistry
- Fibrillarin: a new protein of the nucleolus identified by autoimmune sera. Biol. Cell.
- Phase separation in biology; functional organization of a higher order. Cell Commun. Signal.
- Liquid phase condensation in cell physiology and disease. Science
- Biomolecular condensates: organizers of cellular biochemistry. Nat. Rev. Mol. Cell Biol.
- The C-terminal RGG domain of human Lsm4 promotes processing body formation stimulated by arginine dimethylation. Mol. Cell. Biol.
- Analysis of the physiological activities of Scd6 through its interaction with Hmt1. PLoS One
- PRMT1 is required for RAP55 to localize to processing bodies. RNA Biol.
- Germline P granules are liquid droplets that localize by controlled dissolution/condensation. Science
- New roles for the nucleolus in health and disease. BioEssays
- The discovery and analysis of P bodies. Adv. Exp. Med. Biol.
- Identification of Ewing's sarcoma protein as a G-quadruplex DNA- and RNA-binding protein. FEBS J.
- Phase transitions in the assembly of multivalent signalling proteins. Nature
- Sequence complexity of disordered protein. Proteins
- Pi–Pi contacts are an overlooked protein feature relevant to phase separation. eLife
- FG-rich repeats of nuclear pore proteins form a three-dimensional meshwork with hydrogel-like properties. Science
- Fluorescence correlation spectroscopy shows that monomeric polyglutamine molecules form collapsed structures in aqueous solutions. Proc. Natl. Acad. Sci. U. S. A.
- PolyQ-mediated regulation of mRNA granules assembly. Biochem. Soc. Trans.
- Edc3p and a glutamine/asparagine-rich domain of Lsm4p function in processing body assembly in Saccharomyces cerevisiae. J. Cell Biol.
- Primary structure and binding activity of the hnRNP U protein: binding RNA through RGG box. EMBO J.
- The EWS gene, involved in Ewing family of tumors, malignant melanoma of soft parts and desmoplastic small round cell tumors, codes for an RNA binding protein with novel regulatory domains. Oncogene
- Liquid demixing of intrinsically disordered proteins is seeded by poly(ADP-ribose). Nat. Commun.
- Thermodynamics of the fragile X mental retardation protein RGG box interactions with G quartet forming RNA. Biochemistry
- Discrimination of common and unique RNA-binding activities among fragile X mental retardation protein paralogs. Hum. Mol. Genet.
- Structure-function studies of FMRP RGG peptide recognition of an RNA duplex–quadruplex junction. Nat. Struct. Mol. Biol.