Hierarchical hidden Markov model with application to joint analysis of ChIP-chip and ChIP-seq data

Cited by: 21
Authors
Choi, Hyungwon [3]
Nesvizhskii, Alexey I. [3,4]
Ghosh, Debashis [1,2]
Qin, Zhaohui S. [4,5]
Affiliations
[1] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
[2] Penn State Univ, Dept Publ Hlth Sci, University Pk, PA 16802 USA
[3] Univ Michigan, Dept Pathol, Ann Arbor, MI 48109 USA
[4] Univ Michigan, Ctr Computat Med & Biol, Ann Arbor, MI 48109 USA
[5] Univ Michigan, Ctr Stat Genet, Dept Biostat, Ann Arbor, MI 48109 USA
Keywords
BINDING-SITES; TRANSCRIPTION; FORMALDEHYDE; EPIGENETICS; EXPRESSION; FRAMEWORK; NETWORK; CTCF; MAP
DOI
10.1093/bioinformatics/btp312
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Chromatin immunoprecipitation (ChIP) followed by array hybridization, or ChIP-chip, is a powerful and widely used approach for identifying transcription factor binding sites (TFBS). Recently, massively parallel sequencing coupled with ChIP experiments (ChIP-seq) has been increasingly used as an alternative to ChIP-chip, offering cost-effective genome-wide coverage and resolution up to a single base pair. For many well-studied TFs, both ChIP-seq and ChIP-chip experiments have been performed and their data are publicly available. Previous analyses have revealed substantial technology-specific binding signals despite strong correlation between the two sets of results. It is therefore of interest to see whether the two data sources can be combined to enhance the detection of TFBS.
Results: In this work, a hierarchical hidden Markov model (HHMM) is proposed for combining data from ChIP-seq and ChIP-chip. In the HHMM, inference results from individual HMMs in ChIP-seq and ChIP-chip experiments are summarized by a higher-level HMM. Simulation studies show the advantage of the HHMM when data from both technologies co-exist. Analysis of two well-studied TFs, NRSF and CCCTC-binding factor (CTCF), also suggests that the HHMM yields improved TFBS identification in comparison to analyses using individual data sources or a simple merger of the two.
Pages: 1715-1721
Page count: 7