Integrative Analysis of ChIP-Chip and ChIP-Seq Dataset

Part of the book series: Methods in Molecular Biology (MIMB, volume 1067)

Abstract

Epigenetic regulation and interactions between transcription factors and regulatory genomic regions play crucial roles in controlling the transcriptional regulatory networks that drive development, environmental responses, and disease. Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-seq) and ChIP followed by genomic tiling microarray hybridization (ChIP-chip) are two of the most widely used technologies for genome-wide identification of DNA-protein interactions and histone modifications in vivo. Many algorithms and tools have been developed and evaluated for identifying transcription factor binding sites from ChIP-seq or ChIP-chip datasets. However, binding site identification is only the first step; the ultimate goal is to discover the regulatory network of the transcription factor (TF). Here, we present a common workflow for downstream analysis of ChIP-chip and ChIP-seq data, with an emphasis on annotating binding sites and integrating them with gene expression data to identify direct and indirect targets of the TF. These tools will help with the overall goal of unraveling transcriptional regulatory networks using datasets publicly available in GEO.
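
The annotation and integration steps outlined in the abstract can be illustrated with a minimal sketch. It is not the chapter's own implementation. It assumes peaks are given as (chromosome, summit) pairs, each binding site is assigned to the gene with the nearest transcription start site (TSS), a gene counts as "direct" when it is both bound within a distance cutoff and differentially expressed, and "indirect" when it is differentially expressed but not bound. The function names, the 5,000 bp cutoff, and all coordinates and gene names are hypothetical.

```python
from bisect import bisect_left


def nearest_gene(chrom, summit, tss_by_chrom):
    """Return (gene, distance) for the TSS closest to a peak summit.

    tss_by_chrom maps chromosome -> list of (tss_position, gene) sorted by
    position. Returns None when the chromosome has no annotated genes.
    """
    entries = tss_by_chrom.get(chrom)
    if not entries:
        return None
    positions = [pos for pos, _ in entries]
    i = bisect_left(positions, summit)
    # Only the TSS just before and just after the summit can be nearest.
    candidates = entries[max(i - 1, 0): i + 1]
    pos, gene = min(candidates, key=lambda e: abs(e[0] - summit))
    return gene, abs(pos - summit)


def classify_targets(peaks, tss_by_chrom, de_genes, max_distance=5000):
    """Split differentially expressed genes into direct and indirect targets.

    max_distance is a hypothetical cutoff (in bp) for calling a gene "bound".
    """
    bound = set()
    for chrom, summit in peaks:
        hit = nearest_gene(chrom, summit, tss_by_chrom)
        if hit is not None and hit[1] <= max_distance:
            bound.add(hit[0])
    direct = bound & de_genes      # bound and differentially expressed
    indirect = de_genes - bound    # differentially expressed, not bound
    return direct, indirect


# Toy data: hypothetical coordinates and gene names, for illustration only.
tss_by_chrom = {
    "chrI": [(1000, "geneA"), (20000, "geneB")],
    "chrII": [(5000, "geneC")],
}
peaks = [("chrI", 1200), ("chrI", 12000), ("chrII", 5100)]
de_genes = {"geneA", "geneB"}

direct, indirect = classify_targets(peaks, tss_by_chrom, de_genes)
print("direct:", sorted(direct))
print("indirect:", sorted(indirect))
```

In this toy example, geneC is bound but not differentially expressed, so it is left out of both sets. A real analysis would also need to account for strand, multiple TSSs per gene, and the statistical thresholds used to call differential expression.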

Acknowledgment

I would like to thank Dr. Michael Brodsky of the Program in Gene Function and Expression at the University of Massachusetts Medical School for his critical review of the manuscript and his excellent suggestions.

Copyright information

© 2013 Springer Science+Business Media New York

About this protocol

Cite this protocol

Zhu, L.J. (2013). Integrative Analysis of ChIP-Chip and ChIP-Seq Dataset. In: Lee, TL., Shui Luk, A. (eds) Tiling Arrays. Methods in Molecular Biology, vol 1067. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-607-8_8

  • DOI: https://doi.org/10.1007/978-1-62703-607-8_8

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-606-1

  • Online ISBN: 978-1-62703-607-8

  • eBook Packages: Springer Protocols
