Knowledge Integration for Analyzing ChIP-seq



Chromatin immunoprecipitation (ChIP) combined with next-generation sequencing, known as ChIP-seq, is used to capture genomic profiles of histone modifications. However, the enriched regions derived from ChIP-seq data are typically evaluated against only the limited knowledge gained by manually examining the relevant biological literature. This paper proposes a novel framework that integrates multiple knowledge sources, such as biological literature, Gene Ontology, and microarray data. To analyze ChIP-seq data for histone modification more precisely, the knowledge sources are integrated through a unified probabilistic model. The model is used to re-rank the enriched regions produced by peak-finding algorithms. Filtering the re-ranked regions with a predefined threshold could then yield more reliable and precise results. Combining multiple knowledge sources with a peak-finding algorithm offers a new paradigm for ChIP-seq data analysis.
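The re-rank-then-filter idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the unified probabilistic model, so the combination rule here (summing log-odds under an independence assumption with a uniform prior) is an assumption made purely for illustration, not the authors' method. The names `Region`, `combined_probability`, `rerank`, and `filter_regions`, as well as all scores and the 0.5 threshold, are hypothetical.

```python
from dataclasses import dataclass
from math import exp, log
from typing import Dict, List


@dataclass
class Region:
    """An enriched region reported by a peak-finding algorithm (hypothetical structure)."""
    name: str
    peak_score: float           # peak-caller confidence, assumed normalized to (0, 1)
    evidence: Dict[str, float]  # per-source support in (0, 1), e.g. literature, GO, microarray


def _logit(p: float) -> float:
    # Clamp to avoid infinite log-odds at exactly 0 or 1.
    p = min(max(p, 1e-6), 1 - 1e-6)
    return log(p / (1 - p))


def combined_probability(region: Region) -> float:
    """Combine peak score and knowledge-source support by summing log-odds.

    ASSUMPTION: sources are treated as independent and the prior is uniform.
    This stands in for the paper's model, which the abstract does not describe.
    """
    total = _logit(region.peak_score)
    total += sum(_logit(p) for p in region.evidence.values())
    return 1.0 / (1.0 + exp(-total))


def rerank(regions: List[Region]) -> List[Region]:
    """Order regions by combined probability, highest first."""
    return sorted(regions, key=combined_probability, reverse=True)


def filter_regions(regions: List[Region], threshold: float) -> List[Region]:
    """Keep only re-ranked regions whose combined probability exceeds the threshold."""
    return [r for r in rerank(regions) if combined_probability(r) > threshold]


# Made-up example values, for illustration only.
regions = [
    Region("A", 0.9, {"literature": 0.2, "gene_ontology": 0.3, "microarray": 0.2}),
    Region("B", 0.7, {"literature": 0.8, "gene_ontology": 0.7, "microarray": 0.9}),
    Region("C", 0.6, {"literature": 0.5, "gene_ontology": 0.5, "microarray": 0.5}),
]

reranked_names = [r.name for r in rerank(regions)]
kept_names = [r.name for r in filter_regions(regions, threshold=0.5)]
```

In this toy input, region A has the strongest peak score but weak support from every knowledge source, so it drops below regions that the external evidence corroborates. A source reporting 0.5 is neutral in log-odds space and leaves the peak score unchanged.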



Advanced Materials Research (Volumes 532-533)

Edited by:

Suozhang Cai and Mingli Li




D. Y. Zhou and Y. L. He, "Knowledge Integration for Analyzing ChIP-seq", Advanced Materials Research, Vols. 532-533, pp. 1344-1348, 2012

Online since:

June 2012






