  • Analysis
A reassessment of DNA-immunoprecipitation-based genomic profiling

Abstract

DNA immunoprecipitation followed by sequencing (DIP-seq) is a common enrichment method for profiling DNA modifications in mammalian genomes. However, the results of independent DIP-seq studies often show considerable variation between profiles of the same genome and between profiles obtained by alternative methods. Here we show that these differences are primarily due to the intrinsic affinity of IgG for short unmodified DNA repeats. This pervasive experimental error accounts for 50–99% of regions identified as ‘enriched’ for DNA modifications in DIP-seq data. Correction of this error profoundly altered DNA-modification profiles for numerous cell types, including mouse embryonic stem cells, and subsequently revealed novel associations among DNA modifications, chromatin modifications and biological processes. We conclude that both matched input and IgG controls are essential in order for the results of DIP-based assays to be interpreted correctly, and that complementary, non-antibody-based techniques should be used to validate DIP-based findings to avoid further misinterpretation of genome-wide profiling data.

Fig. 1: Characterization of off-target antibody binding in DIP-seq.
Fig. 2: Characterization of similarities between 6mA and IgG DIP-seq data in different species.
Fig. 3: Biological effects of IgG correction.

Acknowledgements

This work was supported by the Swedish Research Council (2015-03495 to C.E.N.; 2015-02575 to M.B.), LiU-Cancer (2016-007 to C.E.N.), the Swedish Cancer Society (CAN 2017/625 to C.E.N.; CAN 2016/602 to H.G.) and the Medical Research Council, UK (MC_PC_U127574433 to R.R.M. and H.K.M.).

Author information

Contributions

C.L., S.V., K.D. and H.K.M. performed experiments; A.L., C.E.N. and S.V. analyzed data; A.L., R.R.M. and C.E.N. wrote the manuscript; and H.V., H.G., R.R.M., M.B. and C.E.N. supervised the work.

Corresponding author

Correspondence to Colm E. Nestor.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Reproducibility of off-target binding in DIP-seq between studies.

Signal track for multiple marks and tissues in mice over repetitive regions. STRs, short tandem repeats. All tracks are automatically scaled.

Supplementary Figure 2 Extended identification and validation of off-target binding in DIP-seq.

(a, b) Immuno dot blot (a; n = 1) and ELISA (b; n = 3 biologically independent experiments) of 5mC, 5hmC, 5fC and 5caC antibodies in synthetic 426 bp oligos containing the different marks. Boxplots represent the median and first and third quartiles, with whiskers extending 1.5 × the interquartile range. (c) Enrichment of IgG or Input reads over the intersection of DIP-seq 5modC (5mC+5hmC+5fC+5caC) n = 592 enriched regions or non-intersecting (5mC/5hmC/5fC/5caC) n = 259002 enriched regions. Data represented as in b. P-values calculated using two-tailed t-test. (d) Correlation matrix of enriched DIP-seq regions per Mbp of mm9. Correlation was calculated as pairwise two-tailed Pearson correlation r² for each n = 1 biologically independent experiment. (e) Venn diagram of overlapping enriched regions for 5hmC, 5mC and IgG (left). Dinucleotide frequencies for overlapping IgG+5mC+5hmC n = 23317 regions, 5mC+5hmC n = 6683 regions and mm9 n = 23317 randomly sampled regions. Data represented as in b. (f) Number of methylated CpH from WGBS data per IgG n = 137557 enriched region or 5mC n = 19091 enriched region. P-values calculated using two-tailed Mann-Whitney U-test. (g) Enrichment profile of IgG and 5hmC in DnmtTKO (left) or TetTKO (right) and WT mESCs over IgG n = 137557 enriched regions. Data shown as mean for WT and DnmtTKO n = 1 biologically independent sample (left) and mean and 95% confidence intervals for WT n = 2 and TetTKO n = 3 biologically independent samples (right). (h) DIP using a 5hmC antibody in wild-type (WT) (left) and DnmtTKO (right) mESCs for DIP-qPCR n = 3 and DIP-seq n = 1 biologically independent samples. Data shown as mean ± s.d. Correlation between mean DIP-qPCR and DIP-seq values calculated using two-tailed Spearman correlation. (i) CG content of enriched fragments for DIP and Seal profiling for 5hmC (left) and 5fC (right). Theoretical normal distribution modelled based on mean and s.d. for each mark (Norm). P-values calculated by two-tailed Kolmogorov-Smirnov test using the mean of n = 2 biologically independent experiments. (j) Estimation of PCR duplication for sequencing libraries at a depth of 10 million reads, shown as the non-redundant fraction (i.e., the fraction of reads that are not duplicates) for n = 2 biologically independent samples. Data represented as in b.
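The non-redundant fraction in panel (j) can be illustrated with a minimal Python sketch. It assumes that a read counts as a duplicate when it shares chromosome, start position and strand with another read; the function name and the example reads are hypothetical, and the sketch does not reproduce the subsampling or extrapolation to 10 million reads described in the caption.

```python
def non_redundant_fraction(read_positions):
    """Return distinct read positions divided by total reads.

    read_positions: iterable of hashable keys, here assumed to be
    (chromosome, start, strand) tuples. This duplicate definition is
    an illustrative assumption, not taken from the article.
    """
    reads = list(read_positions)
    if not reads:
        raise ValueError("no reads supplied")
    return len(set(reads)) / len(reads)


# Hypothetical toy library: five reads, one of which is a duplicate.
example_reads = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),  # duplicate of the first read
    ("chr1", 250, "-"),
    ("chr2", 40, "+"),
    ("chr2", 900, "-"),
]
example_nrf = non_redundant_fraction(example_reads)
print(example_nrf)
```

A value close to 1 indicates few duplicated reads; lower values indicate a more heavily duplicated library.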

Supplementary Figure 3 Extended analysis of 6mA DIP-seq in multiple species.

(a) Scatterplot showing correlation between IgG motif similarity for DIP-seq and percentage CA-repeats in the respective genomes for M. musculus n = 11, D. rerio n = 2, X. laevis n = 8, C. elegans n = 1 and E. coli n = 2 biologically independent samples. Correlation calculated as two-tailed Spearman's rho for all samples together (n = 24). Line represents linear correlation and 95% confidence interval. (b) Number and overlap of 6mA enriched regions in X. laevis testes identified using Input or IgG controls shown as Venn diagrams (left) and bar plots (right). (c) Number of reads mapping to Mycoplasma species. Kidney n = 3, mESC n = 2, Brain n = 6 biologically independent samples. Boxplots represent the median and first and third quartiles, with whiskers extending 1.5 × the interquartile range.

Supplementary Figure 4 Effect of IgG correction in DIP-seq data.

(a) Schematic visualization of false positive rate for enriched regions. Briefly, false positive rate (FPR) was estimated based on the inverse fraction of regions identified by both Input and IgG versus total regions. (b) Estimated false positive rate of enriched regions using IgG or Input as control for Tdg knockdown mESCs for n = 2 biologically independent samples. Data shown as mean. (c) Estimated false positive rate for individual mESC or MEF datasets. *Estimated based on controls from mESCs. (d) Venn diagram of enriched 5hmC regions in mESCs identified with different techniques and controls, each with n = 1 biologically independent sample. (e, f) Fraction of enriched 5modC regions identified using IgG or Input overlapping repetitive elements (e) and dinucleotide repeats (f) for 5caC n = 2, 5fC n = 2, 5hmC n = 7 and 5mC n = 6 biologically independent samples. Presented as mean ± s.d. P-values calculated using two-tailed t-test. (g) Venn diagram of 5mC and 5hmC overlap using IgG or Input controls (top) and paired line plot of 5mC and 5hmC overlap using IgG or Input controls for multiple studies (indicated by symbols, bottom). Data shown as mean and individual data points of n = 6 biologically independent samples. P-values calculated using two-tailed paired t-test. ▲ = ERP000570, = GSE31343, ■ = GSE24841, = GSE42250. (h) GO term enrichment for top genes (n = 500) enriched for 5hmC in mouse embryonic fibroblasts (MEFs) using DIP-seq with either IgG or Input controls. P-values calculated using PANTHER overrepresentation test GO biological processes. (i) Signal track in mESCs of ChIP-seq controls over IgG DIP-seq enriched regions.
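One possible reading of the FPR estimate in panel (a) is sketched below in Python. It assumes "inverse fraction" means one minus the share of a control's enriched regions that are also identified with the other control, and it treats regions as matching identifiers rather than overlapping genomic intervals. Both are simplifying assumptions, and the function name and region labels are hypothetical.

```python
def estimated_fpr(called_regions, other_control_regions):
    """Estimate FPR as 1 - (regions found with both controls / total called).

    called_regions: regions called as enriched against one control.
    other_control_regions: regions called against the other control.
    Regions are compared by identity; a real analysis would use
    interval overlap (an assumption made here for brevity).
    """
    called = set(called_regions)
    if not called:
        raise ValueError("no enriched regions were called")
    shared = called & set(other_control_regions)
    return 1.0 - len(shared) / len(called)


# Hypothetical toy data: four regions called against Input, two against IgG,
# with a single region (r1) identified using both controls.
input_called = {"r1", "r2", "r3", "r4"}
igg_called = {"r1", "r5"}

fpr_input = estimated_fpr(input_called, igg_called)  # 1 - 1/4
fpr_igg = estimated_fpr(igg_called, input_called)    # 1 - 1/2
print(fpr_input, fpr_igg)
```

Under this reading, the estimate for each control depends only on how many of its regions are confirmed by the other control, so it is comparable between datasets with different numbers of enriched regions.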

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–4, Supplementary Discussion and Supplementary Table 5

Reporting Summary

Supplementary Table 1

Summary of analyzed datasets and their relationship to figures

Supplementary Table 2

Motif-enrichment analysis of 21 published 5modC DIP-seq datasets

Supplementary Table 3

Motif-enrichment analysis of 23 published 6mA DIP-seq datasets

Supplementary Table 4

Analysis of cell-culture contamination in 36 DIP-seq datasets

About this article

Cite this article

Lentini, A., Lagerwall, C., Vikingsson, S. et al. A reassessment of DNA-immunoprecipitation-based genomic profiling. Nat Methods 15, 499–504 (2018). https://doi.org/10.1038/s41592-018-0038-7

  • DOI: https://doi.org/10.1038/s41592-018-0038-7
