Journal of Molecular Biology
Volume 430, Issue 23, 2 November 2018, Pages 4650-4665

Review
RGG/RG Motif Regions in RNA Binding and Phase Separation

https://doi.org/10.1016/j.jmb.2018.06.014

Highlights

  • RGG/RG regions are disordered protein regions with multiple RGG/RG repeats.

  • They bind RNA and DNA likely via weak multivalent interactions.

  • RGG/RG regions undergo LLPS with or without RNA.

  • RNA binding and LLPS are modulated by methylation of arginines in RGG/RG motifs.

Abstract

RGG/RG motifs are RNA binding segments found in many proteins that can partition into membraneless organelles. They occur in the context of low-complexity disordered regions and often in multiple copies. Although short RGG/RG-containing regions can sometimes form high-affinity interactions with RNA structures, multiple RGG/RG repeats are generally required for high-affinity binding, suggestive of the dynamic, multivalent interactions that are thought to underlie phase separation in formation of cellular membraneless organelles. Arginine can interact with nucleotide bases via hydrogen bonding and π-stacking; thus, nucleotide conformers that provide access to the bases provide enhanced opportunities for RGG interactions. Methylation of RGG/RG regions, which is accomplished by protein arginine methyltransferase enzymes, occurs to different degrees in different cell types and may regulate the behavior of proteins containing these regions.

Introduction

RGG/RG repeats were first identified in the mid-1980s, with papers on the adenovirus hexon-assembly protein [1], [2], nucleolin [3] and fibrillarin [4]. Involvement of RGG/RG repeat-containing proteins in phase separation processes has sparked renewed interest in these regions. Protein phase separation refers to the emergence of a protein-enriched phase and a protein-depleted phase from a uniformly dispersed solution [5], [6], [7], [8]. The enriched phase often assumes the form of liquid droplets containing highly concentrated protein, akin to oil droplets in an oil–water mixture. Protein phase separation is now understood to underlie the formation of membraneless organelles and enables cells to compartmentalize specific molecules and functions, much like membranes do. Unlike membrane-enclosed organelles, formation and dissolution of membraneless organelles are readily controlled by post-translational modifications [9], [10], [11], protein concentration [12] or environmental cues [13], providing a powerful mechanism for cellular regulation. Examples of membraneless organelles include the nucleolus, P-bodies and stress granules, with roles that include, but are not limited to, ribosome biogenesis [14], mRNA translational repression and degradation [15] and stalling of mRNA translation, respectively [15]. These membraneless organelles are frequently termed ribonucleoprotein (RNP) bodies or granules, because they generally contain both proteins and RNA. This review will provide a molecular-level perspective on RGG/RG motif interactions with RNA, their involvement in phase separation in vitro and within cellular RNP bodies/membraneless organelles and regulation of their interactions by methylation.

Since RGG/RG motifs were first described, they have been identified in many other proteins, often occurring as multiple repeats. In humans, there are more than 100 proteins with at least two RGG motifs that are spaced less than 5 residues apart and more than 1700 proteins with at least two RG motifs that are spaced less than 5 residues apart [16]. As shown in Table 1, there is a large range in the number of RGG/RG repeats found in proteins. The FET proteins [fused in sarcoma (FUS), Ewing's sarcoma protein (EWS) and TATA binding associated factor 15 (TAF15)] contain approximately 20 RGG repeats each, while FMR1 contains only 2 full RGG repeats. This range in the number of RGG repeats can have an effect on the affinity/avidity of interactions formed by these RGG/RG regions [17], [18], as well as on multivalency, key for phase separation [19]. RGG/RG repeats usually occur in low-complexity regions, intrinsically disordered regions (IDRs) composed largely of limited amino acid types, or as part of intrinsically disordered proteins (IDPs) [20]. These IDRs and IDPs do not assume a single folded structure but instead rapidly interconvert between highly heterogeneous conformational states. IDRs that phase separate are usually low complexity [21]. RGG/RG repeats are sometimes associated with other repetitive sequences, including polyalanine, polyaspartate, polyglutamate, polyglutamine and polyglycine repeats, as well as other diglycine-containing motifs such as FGG, PGG, SGG and YGG (Table 1). These accompanying features, which vary between RGG regions, likely modulate their interactions, although the effect of these features on RGG-containing regions is not well studied. Some associated sequence features, such as FG repeats [22] and polyglutamine and polyglutamine/asparagine-rich repeats [23], [24], [25], [26], are also known to promote phase separation and/or aggregation on their own. Note that GR and GGR sequences are also quite common in RGG/RG-containing proteins, leading to the term glycine–arginine-rich (GAR) regions, which specifies content but not sequence.
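The spacing criterion quoted above (at least two RGG or RG motifs less than 5 residues apart) can be illustrated with a short scan over a protein sequence. The following is a minimal sketch, not the procedure used in [16]: the function names, the toy sequences and the definition of spacing as the number of residues between the end of one motif and the start of the next are all assumptions made here for illustration.

```python
import re


def motif_starts(sequence, motif):
    """Return 0-based start positions of each motif occurrence, left to right."""
    return [m.start() for m in re.finditer(re.escape(motif), sequence)]


def has_clustered_motifs(sequence, motif="RGG", max_gap=4):
    """True if two neighbouring motif copies are separated by at most max_gap residues.

    Assumption: "spaced less than 5 residues apart" is read as a gap of
    0 to 4 residues between the end of one copy and the start of the next.
    """
    starts = motif_starts(sequence.upper(), motif)
    # Only neighbouring copies need checking: any non-neighbouring pair
    # is separated by a larger gap.
    for first, second in zip(starts, starts[1:]):
        gap = second - (first + len(motif))  # residues between the two copies
        if gap <= max_gap:
            return True
    return False


# Hypothetical toy sequences, not taken from any real protein.
examples = {
    "two_close_RGG": "MRGGFGGRGGS",    # gap of 3 residues
    "two_far_RGG": "RGGAAAAAAAARGG",   # gap of 8 residues
    "single_RGG": "MKRGGA",            # only one copy
}
for name, seq in examples.items():
    print(name, has_clustered_motifs(seq, motif="RGG"))
```

Passing `motif="RG"` applies the same test to RG motifs. A proteome-wide count would repeat this check over every sequence in a database, and the totals would depend on the spacing definition and the sequence set chosen.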

RGG Interaction with RNA

RGG-containing regions are RNA-binding regions. This was first realized when the RGG/RG region of hnRNPU was shown to be responsible for the RNA binding properties of the protein [27]. The authors also pointed out that the RGG region was likely “unordered, extended and flexible,” which was at odds with the prevailing paradigm in the early 1990s that proteins needed to assume a folded structure to function. Binding was demonstrated for RNA homopolymers, including poly(G), poly(U), poly(A) and to

RGG/RG Motifs in Cellular RNA Granules

Many RGG/RG repeat-containing proteins are localized to micron-sized membraneless organelles that are visible in micrographs of cells [5], [6], [7], [8]. Ddx4 is an important component of perinuclear granules or nuage in mammals and Drosophila [55], [56], analogous to Caenorhabditis elegans P-granules [55] that contain Laf-1, PGL-1 and PGL-3 [57]. Nucleolin and fibrillarin are two important components of the nucleolus [4], [58], [59]. Stress granules contain members of the FET family—FUS, EWS

Identification of Methylation Sites

The discovery of RGG/RG repeats coincided with the discovery that many arginines in RGG/RG motifs are extensively methylated. Asymmetric dimethylarginine (aDMA) was observed in a heterogeneous nuclear ribonucleoprotein (hnRNP) homologous protein purified from the slime mold Physarum polycephalum. Amino acid analysis of this slime mold protein indicated that roughly half of all the arginines in the protein were present as aDMA [102]. Purification of nucleolin from hepatoma cells and tryptic

Conclusion

RGG/RG repeats are important mediators of protein:protein and protein:RNA interactions that are critical for formation of membraneless ribonucleoprotein granules. Arginine side chains can form electrostatic, hydrogen-bonding, hydrophobic, π–π and cation–π interactions with protein and nucleic acid groups, which drive phase separation and interactions with DNA and RNA. Glycine backbone amide groups also hydrogen bond and form π–π stacking interactions with protein and RNA. Although RGG regions

Acknowledgments

The authors wish to acknowledge valuable discussions with Brian Tsang, Michael Nosella and Dr. Tae Hun Kim, as well as funding to J.D.F.-K. from the Canadian Institutes of Health Research (114985) and Natural Sciences and Engineering Research Council of Canada (06718).

References (151)

  • K. Nagai. RNA–protein complexes. Curr. Opin. Struct. Biol. (1996)
  • M.A. Lischwe. Proteins C23 and B23 are the major nucleolar silver staining proteins. Life Sci. (1979)
  • E. Bentmann. Requirements for stress granule recruitment of fused in sarcoma (FUS) and TAR DNA-binding protein of 43 kDa (TDP-43). J. Biol. Chem. (2012)
  • K.J. Tanaka. RAP55, a cytoplasmic mRNP component, represses translation in Xenopus oocytes. J. Biol. Chem. (2006)
  • S. Jain. ATPase-modulated stress granules contain a diverse proteome and substructure. Cell (2016)
  • F. De Leeuw. The cold-inducible RNA-binding protein migrates from the nucleus to cytoplasmic stress granules by a methylation-dependent mechanism and acts as a translational repressor. Exp. Cell Res. (2007)
  • A. Molliex. Phase separation by low complexity domains promotes stress granule assembly and drives pathological fibrillization. Cell (2015)
  • M. Kato. Cell-free formation of RNA granules: low complexity sequence domains form dynamic fibers within hydrogels. Cell (2012)
  • Y. Lin. Formation and maturation of phase-separated liquid droplets by RNA-binding proteins. Mol. Cell (2015)
  • J.C. Schwartz. RNA seeds higher-order assembly of FUS protein. Cell Rep. (2013)
  • G.C. Yeo et al. Coacervation of tropoelastin. Adv. Colloid Interf. Sci. (2011)
  • P. Khusial et al. LSm proteins form heptameric rings that bind to RNA via repeating motifs. Trends Biochem. Sci. (2005)
  • H. Wu et al. The structure and dynamics of higher-order assemblies: amyloids, signalosomes, and granules. Cell (2016)
  • J.R. Buchan et al. Eukaryotic stress granules: the ins and outs of translation. Mol. Cell (2009)
  • W. Kruijer et al. Structure and organization of the gene coding for the DNA binding protein of adenovirus type 5. Nucleic Acids Res. (1981)
  • M.A. Lischwe. Clustering of glycine and NG,NG-dimethylarginine in nucleolar protein C23. Biochemistry (1985)
  • R.L. Ochs. Fibrillarin: a new protein of the nucleolus identified by autoimmune sera. Biol. Cell. (1985)
  • D.M. Mitrea et al. Phase separation in biology; functional organization of a higher order. Cell Commun. Signal. (2016)
  • Y. Shin et al. Liquid phase condensation in cell physiology and disease. Science (2017)
  • S.F. Banani. Biomolecular condensates: organizers of cellular biochemistry. Nat. Rev. Mol. Cell Biol. (2017)
  • M. Arribas-Layton. The C-terminal RGG domain of human Lsm4 promotes processing body formation stimulated by arginine dimethylation. Mol. Cell. Biol. (2016)
  • P.T. Lien. Analysis of the physiological activities of Scd6 through its interaction with Hmt1. PLoS One (2016)
  • K. Matsumoto. PRMT1 is required for RAP55 to localize to processing bodies. RNA Biol. (2012)
  • C.P. Brangwynne. Germline P granules are liquid droplets that localize by controlled dissolution/condensation. Science (2009)
  • L. Nunez Villacis. New roles for the nucleolus in health and disease. BioEssays (2018)
  • S. Jain et al. The discovery and analysis of P bodies. Adv. Exp. Med. Biol. (2013)
  • K. Takahama. Identification of Ewing's sarcoma protein as a G-quadruplex DNA- and RNA-binding protein. FEBS J. (2011)
  • P. Li. Phase transitions in the assembly of multivalent signalling proteins. Nature (2012)
  • P. Romero. Sequence complexity of disordered protein. Proteins (2001)
  • R.M. Vernon. Pi–Pi contacts are an overlooked protein feature relevant to phase separation. eLife (2018)
  • S. Frey et al. FG-rich repeats of nuclear pore proteins form a three-dimensional meshwork with hydrogel-like properties. Science (2006)
  • S.L. Crick. Fluorescence correlation spectroscopy shows that monomeric polyglutamine molecules form collapsed structures in aqueous solutions. Proc. Natl. Acad. Sci. U. S. A. (2006)
  • V. Castilla-Llorente et al. PolyQ-mediated regulation of mRNA granules assembly. Biochem. Soc. Trans. (2014)
  • C.J. Decker et al. Edc3p and a glutamine/asparagine-rich domain of Lsm4p function in processing body assembly in Saccharomyces cerevisiae. J. Cell Biol. (2007)
  • M. Kiledjian et al. Primary structure and binding activity of the hnRNP U protein: binding RNA through RGG box. EMBO J. (1992)
  • T. Ohno. The EWS gene, involved in Ewing family of tumors, malignant melanoma of soft parts and desmoplastic small round cell tumors, codes for an RNA binding protein with novel regulatory domains. Oncogene (1994)
  • M. Altmeyer. Liquid demixing of intrinsically disordered proteins is seeded by poly(ADP-ribose). Nat. Commun. (2015)
  • K.J. Zanotti. Thermodynamics of the fragile X mental retardation protein RGG box interactions with G quartet forming RNA. Biochemistry (2006)
  • J.C. Darnell. Discrimination of common and unique RNA-binding activities among fragile X mental retardation protein paralogs. Hum. Mol. Genet. (2009)
  • A.T. Phan. Structure-function studies of FMRP RGG peptide recognition of an RNA duplex–quadruplex junction. Nat. Struct. Mol. Biol. (2011)