Methods
Transparent Process
Open Access

Cost-effective DNA methylation profiling by FML-seq

Joseph W Foley, Shirley X Zhu, Robert B West
Joseph W Foley
Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
Roles: Conceptualization, Resources, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing
  • For correspondence: jwfoley@nus.edu.sg
Shirley X Zhu
Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
Roles: Investigation
Robert B West
Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
Roles: Conceptualization, Supervision, Funding acquisition, Writing—original draft, Project administration, Writing—review and editing
Published 29 September 2023. DOI: 10.26508/lsa.202302326

Figures

    Figure 1. Diagram of fragmentation at methylated loci and sequencing.

    (A) Library preparation reactions. Genomic DNA is digested by a methylation-dependent restriction endonuclease that cuts at a known distance from the methylated cytosine in its motif and leaves a short single-strand overhang of unknown bases. Stem-loop (hairpin) sequencing adapters with complementary random overhangs are ligated to the digested genomic DNA fragments, but the phosphodiester backbone is completed only on one strand because the adapters lack a 5′ phosphate. The resulting single-strand nick is extended by DNA polymerase to fill in a second strand complementary to the adapter’s loop, whereas the unneeded stem strand is degraded. This library of genomic DNA inserts between double-stranded linear short adapters is then amplified by standard polymerase chain reaction with long indexing primers to produce a sequencing-ready library. A standard solid-phase reversible immobilization bead cleanup without size selection is sufficient to purify the library. Paired-end sequencing reads imply the location of the two methylated cytosines resulting in each observed fragment. (B) Counting fragmentation at methylated loci and sequencing (FML-seq) fragments as hits at methylated motif sites. The restriction endonuclease used here, MspJI, recognizes the motif mCNNR. Each copy of this motif on either strand implies a potential cut site at a certain distance past its 3′ end. When paired-end sequence reads are aligned to the reference genome, each end of a sequenced fragment counts as one hit for the corresponding motif site; for example, the fragment marked by an asterisk tallies one hit each for the red and green motif sites. The number of hits for a given motif site corresponds to the fraction of genome copies methylated at that motif’s cytosine position.

    Figure S1. DNA digestion by MspJI restriction endonuclease.

    (A) The enzyme’s recognition domain (blue) binds a motif (blue) containing methylated cytosine (red), mCNNR, whereas its endonuclease domain (magenta) cuts at distances of 13 and 17 bp from the methylated cytosine position. Digestion leaves a 5′ overhang of 4 nt with a terminal phosphate, and a terminal hydroxyl on the 3′ underhang. (B) Digestion by MspJI cannot result in very short fragments. The shortest theoretically possible library insert would be generated by recognition of an mCNNR site immediately flanking the 4 nt overhang resulting from a previous digestion, leaving a total insert of 21 bp after ligating the adapters. In reality, MspJI may not be able to bind its motif without additional paired bases on both sides. Empirically, in high-input libraries without size selection, we observe that only about 0.04% of library inserts are shorter than 22 bp, 0.05% shorter than 25 bp, and 0.33% shorter than 30 bp (Fig S7A). (C) In the special case of a fully methylated CpG within a YNCGNR palindrome, two MspJI enzymes may digest the DNA symmetrically and produce a 32 bp insert. In fragmentation at methylated loci and sequencing libraries from human genomic DNA, inserts of this length are disproportionately common but not the majority, at about 4% of all inserts (Fig S7A).
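
The 21 bp and 32 bp insert lengths quoted in (B) and (C) can be reproduced with simple coordinate arithmetic. The sketch below is illustrative only: it assumes the methylated cytosine sits at offset 0 and that the outermost base retained downstream of it, overhang included, sits at offset +16, which is one reading of the cut distances in (A). The constant and function names are hypothetical and do not come from the article.

```python
# Illustrative coordinate arithmetic for MspJI insert lengths.
# ASSUMPTION: offsets are counted from the methylated cytosine (offset 0),
# and the outermost base kept after digestion, overhang included, lies at
# offset +16 on the strand carrying that cytosine.
FAR_END_OFFSET = 16
OVERHANG_NT = 4


def shortest_insert_bp() -> int:
    """Motif begins immediately after the 4 nt overhang of an earlier cut."""
    left_end = -OVERHANG_NT       # first base of the earlier overhang
    right_end = FAR_END_OFFSET    # last base kept past the new motif
    return right_end - left_end + 1


def palindrome_insert_bp() -> int:
    """Fully methylated CpG inside YNCGNR: one cut on each side."""
    top_c, bottom_c = 0, 1        # the two cytosines of the CpG, one per strand
    right_end = top_c + FAR_END_OFFSET
    left_end = bottom_c - FAR_END_OFFSET
    return right_end - left_end + 1


print(shortest_insert_bp(), palindrome_insert_bp())
```

Under these assumptions the two functions return the 21 bp and 32 bp values stated in the caption.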

    Figure S2. Fragmentation at methylated loci and sequencing molecule sequences using MspJI restriction endonuclease and Nextera sequencing adapters.

    For convenience, a palindromic YNCGNR site with two complementary CGNR motif sites is shown, though fragments can be produced by a non-palindromic single site and a given fragment does not necessarily contain the motif sites responsible for its digestion. The stem-loop (hairpin) adapters are an equimolar mix of two versions, one for each end of a sequencing-ready library molecule (Illumina P5 and P7), but except for their slightly different sequences, they are used identically and the ligation product has no polarity. By chance, half of the digestion products will be ligated between two of the same adapter (P5 and P5 or P7 and P7), producing an unsequenceable molecule that is unlikely to amplify because of PCR suppression; this limits the efficiency of the library synthesis to 50%. Nick extension replaces the second adapter strand, whereas uracil-DNA glycosylase excises uracil and the abasic sites are hydrolyzed, leaving an amplifiable library molecule. Because the stem-loop adapters lack 5′ phosphate, they ligate to a genomic DNA fragment on only one strand, and if two adapters anneal without a genomic DNA insert, the nick extension stops at the nick on the opposite strand. Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works created by Illumina customers are authorized for use with Illumina instruments and products only. All other uses are strictly prohibited.

    Figure S3. Specificity of fragmentation at methylated loci and sequencing for methylated DNA.

    (A) Electropherograms of fragmentation at methylated loci and sequencing libraries prepared from 60 ng methylated or unmethylated lambda bacteriophage genomic DNA with varying numbers of PCR cycles, four replicates per condition. The optimized protocol uses 15 cycles for this amount of methylated genomic DNA, but unmethylated DNA requires about 24 cycles to produce a similar yield. Note: with more than 15 PCR cycles, the overamplified libraries from methylated genomic DNA are converted into slow-migrating artifacts that appear to the right of the upper marker, giving the appearance of low yield in the measurable length range. (B) Alignment of sequence reads at the expected distance from motif sites. Methylation by the host’s Dcm is expected at the CmCWGG motif; however, MspJI cuts at any mCNNR.

    Figure S4. Specificity of fragmentation at methylated loci and sequencing for digested genomic DNA (gDNA).

    Electropherograms of fragmentation at methylated loci and sequencing libraries prepared from 60 ng intact human gDNA from fresh cells or degraded human gDNA from formalin-fixed, paraffin-embedded tissue, using either methylation-specific digestion by MspJI restriction endonuclease or a no-endonuclease control, two replicates per condition. The no-endonuclease control shows no library yield unless the number of PCR cycles is greatly increased, at which point, its yield is similar to a no-DNA control (compare Fig S5C). Note: as in Fig S3A, the yields of libraries from the MspJI condition appear to decrease with high PCR cycles because of overamplification artifacts; furthermore, undigested gDNA fragments from the formalin-fixed, paraffin-embedded sample are visible near and beyond the upper marker because of their short length, even where no library is detectable.

    Figure S5. Properties of fragmentation at methylated loci and sequencing libraries from human cell-line genomic DNA (gDNAs).

    (A) Electropherograms of libraries from 60 ng gDNA, two replicates per cell line. (B) Molar length distributions on a linear axis, normalized to equal totals within the graphing window. Insert length is calculated by subtracting the combined length of the indexed sequencing adapters, 136 bp, from the molecule length. The TapeStation lacks sufficient resolution to show the expected peak of 32 bp inserts (168 bp molecules). (C) Libraries from serial-dilution gDNA samples and no-DNA control.

    Figure S6. Properties of fragmentation at methylated loci and sequencing reads from human cell-line genomic DNAs.

    (A) Alignability of read sequences to the reference genome. “Adapter dimers” are inserts of 10 bp or less and “too short” reads are longer than adapter dimers but shorter than bwa-mem2’s minimum seed length of 19. “Confidently” and “poorly aligned” are uniquely aligned reads above or below the minimum MAPQ of 10 (posterior probability of correct alignment 0.9). (B) Alignment of sequence reads at the expected distance from CGNR motif sites.
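
The read categories in (A) amount to a small decision rule, and the MAPQ threshold maps to a posterior probability through the usual Phred scaling. The following is a minimal sketch of that rule using the thresholds quoted in the caption; the function names and the "unaligned" label are hypothetical additions, and the sketch does not distinguish uniquely aligned from multiply aligned reads.

```python
from typing import Optional

ADAPTER_DIMER_MAX_BP = 10   # "adapter dimers": inserts of 10 bp or less
MIN_SEED_LENGTH = 19        # bwa-mem2 minimum seed length quoted in the caption
MIN_MAPQ = 10               # threshold for "confidently aligned"


def mapq_to_probability(mapq: int) -> float:
    """Convert a Phred-scaled MAPQ to the probability the alignment is correct."""
    return 1.0 - 10.0 ** (-mapq / 10.0)


def classify_read(insert_bp: int, mapq: Optional[int]) -> str:
    """Assign one read to a category; mapq is None when no alignment exists."""
    if insert_bp <= ADAPTER_DIMER_MAX_BP:
        return "adapter dimer"
    if insert_bp < MIN_SEED_LENGTH:
        return "too short"
    if mapq is None:
        return "unaligned"  # ASSUMPTION: label not taken from the caption
    return "confidently aligned" if mapq >= MIN_MAPQ else "poorly aligned"


print(classify_read(32, 60), mapq_to_probability(MIN_MAPQ))
```

With MAPQ 10 the conversion gives a probability of about 0.9, matching the value stated in the caption.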

    Figure S7. Lengths of fragmentation at methylated loci and sequencing library inserts inferred from sequencing.

    Lengths of “all read pairs” are inferred by adapter trimming, but this is limited to inserts shorter than the sequence read length, 154 nt (untrimmed reads are not shown). Lengths of “confidently aligned” read pairs (MAPQ ≥ 10) are inferred by aligned positions in the reference genome. For direct comparison, the denominator of both fractions is the total number of post-filter sequenced read pairs, that is, “all read pairs” sum to 100% but “confidently aligned” read pairs sum to the proportion of confidently alignable read pairs in each library, and for any given insert length below the read length, the relative height of the “confidently aligned” bar reflects the proportion of “all read pairs” at that length. There are prominent peaks at 4 bp from adapter dimers (no insert of genomic DNA [gDNA] sequence); at 32 bp from symmetrically digested, fully methylated CpG sites; and at 151 bp because of the three-base minimum for detection of adapter sequence. (A) All tested cell lines, 60 ng gDNA each. (B) Serial dilutions of HeLa-S3 and IMR-90. (C) No-template controls (no gDNA).

    Figure S8. Distribution of methylation scores among different categories of annotated functional genome elements.

    (A) Density of sequence reads from fragmentation at methylated loci and sequencing assigned to CGNR sites within the genome elements. Here, reads per million per motif is normalized against the total number of reads assigned to all CGNR sites, rather than the sum within only a given category of genome elements, to compare the different categories despite covering widely different proportions of the genome (286,435 exons, 297,593 introns, 40,351 promoters, 1,402 enhancers). (B) Methylation of CpG sites within the genome elements according to whole-genome bisulfite sequencing.
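
The normalization described in (A) can be written out as a short calculation. This is a sketch of one plausible formula, not the authors' code: reads assigned to an element are scaled by the library-wide total of reads assigned to CGNR sites (in millions) and then divided by the number of motif sites in the element. The function name, argument names, and example numbers are hypothetical.

```python
def rpm_per_motif(reads_in_element: int,
                  motif_sites_in_element: int,
                  total_assigned_reads: int) -> float:
    """Reads per million per motif for one genome element.

    ASSUMPTION: the denominator is the total number of reads assigned to all
    CGNR sites in the library, as the caption describes; elements without
    motif sites are scored as zero.
    """
    if motif_sites_in_element == 0 or total_assigned_reads == 0:
        return 0.0
    reads_per_million = reads_in_element / (total_assigned_reads / 1_000_000)
    return reads_per_million / motif_sites_in_element


# Hypothetical element: 50 reads over 20 CGNR sites in a 2 M-read library.
print(rpm_per_motif(50, 20, 2_000_000))
```

Dividing by the library-wide total rather than a per-category total keeps scores comparable between categories that cover very different fractions of the genome.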

    Figure S9. Properties of the 40,351 annotated human promoters used for data analysis.

    (A) Lengths of promoter regions in the reference genome. (B) Number of targets for each DNA methylation profiling method per promoter (CG: canonical human DNA methylation; CGNR: the MspJI restriction endonuclease used to target methylated cytosine in fragmentation at methylated loci and sequencing; CCGG: the MspI restriction endonuclease used to reduce the number of targets in reduced-representation bisulfite sequencing). Both copies of palindromic motifs CG and CCGG are counted unless the potentially methylated CpG straddles the boundary of the promoter, resulting in counts that are nearly always even. The promoters are enriched for CpG dinucleotides (average 6.5 per 100 bp), relative to the entire reference genome (1.1 per 100 bp). (C) Fragmentation at methylated loci and sequencing read counts per promoter in a representative sample, K562 replicate 1. The 1,126 promoters without CGNR motif sites are included as zero counts.
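
Counting targets per promoter, as in (B), comes down to scanning a sequence for a degenerate motif on both strands. The sketch below is a simplified illustration with hypothetical names: it searches the forward sequence for the motif and for its reverse complement, counts overlapping matches, and ignores the boundary-straddling rule mentioned in the caption.

```python
import re

# IUPAC codes needed for the motifs discussed here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTRYWN", "TGCAYRWN")


def reverse_complement(motif: str) -> str:
    """Reverse complement of a motif written in IUPAC codes."""
    return motif.translate(COMPLEMENT)[::-1]


def count_sites(sequence: str, motif: str) -> int:
    """Count motif occurrences on both strands, overlaps included."""
    sequence = sequence.upper()
    total = 0
    for strand_motif in (motif, reverse_complement(motif)):
        # A lookahead matches at every start position without consuming bases.
        pattern = "(?=" + "".join(IUPAC[base] for base in strand_motif) + ")"
        total += len(re.findall(pattern, sequence))
    return total


example = "ACGTCCGGAT"  # hypothetical sequence
print(count_sites(example, "CG"), count_sites(example, "CGNR"),
      count_sites(example, "CCGG"))
```

Because CG and CCGG equal their own reverse complements, every occurrence is counted once per strand, which is why those counts come out even.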

    Figure S10. Fragmentation at methylated loci and sequencing reads per million per motif by number of restriction motif sites in the promoter.
    Figure 2. Fragmentation at methylated loci and sequencing (FML-seq) recovers a similar biological signal as more costly DNA methylation profiling methods on the same samples.

    Methylated DNA immunoprecipitation data were not available for two cell lines. (A) Normalized methylation signal (blue: EPIC β, whole-genome bisulfite sequencing [WGBS] and reduced-representation bisulfite sequencing percent methylation, Methylated DNA immunoprecipitation VSD, FML-seq log reads per million per motif) at the 500 promoters with the most variation among cell lines according to WGBS. Technical replicates are shown separately. (B) Methylation scores by promoter for K562 in FML-seq or EPIC versus WGBS, replicates pooled. Only 200 randomly sampled promoters are graphed and used for the blue least-squares regression line, but Spearman’s ρ is calculated with all promoters. See Fig S11 for a visualization of all promoters. (C) Methylation fold change by promoter for K562 versus HeLa-S3 in FML-seq or EPIC versus WGBS, replicates pooled. The same subset of promoters is graphed as in (B). The red line shows y = x. (D) Overlap of significantly (BH-adjusted P < 0.01) differentially methylated promoters between K562 and HeLa-S3 according to WGBS (24,240 significant), reduced-representation bisulfite sequencing (6,929), and FML-seq (14,023).
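
Spearman’s ρ, used in (B) to compare platforms, is the Pearson correlation of the ranks of the two score vectors, which makes it insensitive to the different scales of read counts and percent methylation. A minimal pure-Python sketch follows; it is a generic implementation with average ranks for ties, not the authors' analysis code, and the example values are invented.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical promoter scores on two platforms with different scales.
fml_log_rpm = [0.1, 0.5, 1.2, 2.0, 3.1]
wgbs_percent = [5.0, 20.0, 55.0, 80.0, 95.0]
print(spearman_rho(fml_log_rpm, wgbs_percent))
```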

    Figure S11. Comparison of different platforms' results.

    (A) Methylation scores in K562. (B) Fold changes for K562 versus HeLa-S3. Results are calculated from all promoters. The red line shows y = x.

    Figure S12. Analysis of fragmentation at methylated loci and sequencing (FML-seq) at individual cytosine positions.

    Only cytosines within CpG dinucleotides are considered overall and only the first cytosine of the CGNR motif is considered for FML-seq read counts. Only chromosome 22, the shortest autosome, is considered to reduce the scale of the analysis (1,686,636 cytosine positions, 874,441 in CGNR). (A) FML-seq read counts per cytosine position in a representative sample, K562 replicate 1. (B) Different platforms’ methylation scores in K562. (C) Different platforms’ fold changes for K562 versus HeLa-S3, at all cytosine positions within CpG dinucleotides on chromosome 22. The red line shows y = x.

    Figure S13. Long sequencing reads are unnecessary for fragmentation at methylated loci and sequencing.

    Reads shown are from the MiSeq v2 sequencing kit at 2 × 154 nt. Confident alignment is MAPQ ≥ 10. Commonly available read lengths are marked: 2 × 38, 2 × 51, 2 × 76, 2 × 101, 2 × 151.

    Figure S14. Deep sequencing is unnecessary for fragmentation at methylated loci and sequencing.

    Lower sequencing depths were simulated by subsampling the full data. (A) Promoters with at least 1 fragmentation at methylated loci and sequencing read attributed to a CGNR motif site within the promoter boundaries. (B) Pearson correlation of log reads per million per motif across all promoters. (C) Spearman correlation of pooled log reads per million per motif with whole-genome bisulfite sequencing pooled percent methylation from the same cell line across all promoters. (D) Silhouette metric of cluster distance (Pearson distance 1 − r to the nearest member of another class ÷ mean distance within the same class), considering only the two cell types used in serial dilutions, HeLa-S3 and IMR-90, for fair comparison. (E) Promoters with significantly differential methylation (DESeq2 BH-adjusted P < 0.01) between HeLa-S3 and IMR-90. The number of promoters with nonzero read counts or statistical significance inevitably increases with total sequencing depth as smaller methylation signals and effect sizes become detectable; however, replicate correlations and silhouettes indicate the total amount of usable information in the sequencing library is reached more quickly as additional depth yields only duplicate reads.
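
The silhouette metric in (D) is defined in words as a ratio of Pearson distances. The sketch below is one possible reading of that definition, computed for a single sample: the distance 1 − r to the nearest sample of another class, divided by the mean distance to the other samples of its own class. The function names and toy profiles are hypothetical, and whether the authors average this ratio over samples is not stated here.

```python
from statistics import mean


def pearson_distance(x, y):
    """Pearson distance 1 - r between two equal-length profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    norm = (sum((a - mx) ** 2 for a in x)
            * sum((b - my) ** 2 for b in y)) ** 0.5
    return 1.0 - cov / norm


def silhouette_ratio(index, profiles, labels):
    """Nearest other-class distance divided by mean same-class distance.

    ASSUMPTION: each class has at least two samples, so the within-class
    mean is defined.
    """
    own = labels[index]
    within, between = [], []
    for i, (profile, label) in enumerate(zip(profiles, labels)):
        if i == index:
            continue
        d = pearson_distance(profiles[index], profile)
        (within if label == own else between).append(d)
    return min(between) / mean(within)


# Toy log reads-per-million-per-motif profiles for two cell types.
profiles = [[1, 2, 3, 4], [1, 2, 3, 5], [4, 3, 2, 1], [5, 3, 2, 1]]
labels = ["HeLa-S3", "HeLa-S3", "IMR-90", "IMR-90"]
print(silhouette_ratio(0, profiles, labels))
```

Values well above 1 indicate that a sample lies much closer to its own class than to any sample of the other class.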

Tables

    Table 1.

    Fragmentation at methylated loci and sequencing requires less labor and reagent cost than representative cytosine-methylation profiling methods.

                       | Microarray (Illumina)           | WGBS (IDT)         | EM-seq (NEB)       | RRBS (Zymo)        | Targeted BS (Illumina) | FML-seq
    Reagent cost       | $265                            | $90                | $40                | $45                | $280                   | $5
    Protocol time      | 4 d                             | 1 d                | 2 d                | 2 d                | 2 d                    | 2 h
    Limiting scale     | Eight-sample chip               | 24-tube centrifuge | 96-well plate      | 24-tube centrifuge | 24-tube centrifuge     | 96-well plate
    Special equipment  | Incubators and water circulator | Sonicator          | Sonicator          |                    | Sonicator              |
    Minimum gDNA       | 250 ng                          | 100 pg             | 10 ng              | 10 ng              | 500 ng                 | 6 ng (1,000 cells)
    Sequencing reads   |                                 | 2 × 150 nt × 375 M | 2 × 150 nt × 375 M | 2 × 50 nt × 90 M   | 2 × 100 nt × 55 M      | 2 × 50 nt × 40 M
    Sequencing cost    |                                 | $615               | $615               | $180               | $135                   | $80
    Targeted CpG sites | 0.85 M                          | 68 M               | 68 M               | 5 M                | 3 M                    | 36 M
    • Cost per sample includes all reagents but not standard consumables (tubes, pipet tips) and is rounded to the nearest five USD. Protocol time includes all steps from isolated gDNA to sequencing-ready library. “Special equipment” for sample preparation excludes common instruments such as pipets, thermal cyclers, centrifuges, and separation magnets. Sequencing read lengths are per the manufacturers’ recommendations. Base-conversion sequencing methods use 30X coverage per ENCODE guidelines, assuming 80% read alignment to the human genome. Sequencing costs use list prices for appropriate Illumina NovaSeq 6000 High Output reagents with up to 96-plex indexing. CpG sites are counted in the CHM13v2 reference sequence, both strands.
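
The 375 M read-pair figure for the whole-genome base-conversion methods is consistent with the coverage rule in the footnote. The sketch below works through that arithmetic; the genome size of 3.0 Gb is an assumption chosen because it reproduces the tabulated value, and the function name is hypothetical.

```python
def read_pairs_needed(coverage: float, genome_bp: float,
                      read_length_nt: int, aligned_fraction: float) -> float:
    """Read pairs required to reach a target mean coverage.

    Each pair contributes 2 * read_length_nt bases, of which only the
    aligned fraction counts toward coverage.
    """
    bases_per_pair = 2 * read_length_nt
    return coverage * genome_bp / (bases_per_pair * aligned_fraction)


pairs = read_pairs_needed(
    coverage=30,           # 30X, per the footnote
    genome_bp=3.0e9,       # ASSUMPTION: approximate human genome size
    read_length_nt=150,    # 2 x 150 nt reads
    aligned_fraction=0.8,  # 80% read alignment, per the footnote
)
print(f"{pairs / 1e6:.0f} M read pairs")
```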

Supplementary Materials

  • Table S1. ENCODE accession numbers.

  • Table S2. Protocol costs.

  • Supplemental Data 1.

    Supplemental protocol. [LSA-2023-02326_Supplemental_Data_1.html]

  • Supplemental Data 2.

    Supplemental code. [LSA-2023-02326_Supplemental_Data_2.zip]

Cost-effective DNA methylation profiling by FML-seq
Joseph W Foley, Shirley X Zhu, Robert B West
Life Science Alliance Sep 2023, 6 (12) e202302326; DOI: 10.26508/lsa.202302326

ISSN: 2575-1077
© 2025 Life Science Alliance LLC
