TIP: A probabilistic method for identifying transcription factor target genes from ChIP-seq binding profiles

Citations: 47
Authors
Cheng, Chao [1, 2]
Min, Renqiang [1, 2]
Gerstein, Mark [1, 2, 3]
Affiliations
[1] Yale Univ, Program Computat Biol & Bioinformat, New Haven, CT 06511 USA
[2] Yale Univ, Dept Mol Biophys & Biochem, New Haven, CT 06511 USA
[3] Yale Univ, Dept Comp Sci, New Haven, CT 06511 USA
Funding
US National Institutes of Health (NIH)
Keywords
DNA-BINDING; REGULATORY CIRCUITRY; FUNCTIONAL ELEMENTS; HUMAN GENOME; PREDICTION; IDENTIFICATION; THOUSANDS; NETWORK;
DOI
10.1093/bioinformatics/btr552
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: ChIP-seq and ChIP-chip experiments have been widely used to identify transcription factor (TF) binding sites and target genes. Conventionally, a fairly 'simple' approach is employed for target gene identification, e.g. finding genes with binding sites within 2 kb of a transcription start site (TSS). However, this does not take into account the number of sites upstream of the TSS, their exact positioning or the fact that different TFs appear to act at different characteristic distances from the TSS.

Results: Here we propose a probabilistic model called target identification from profiles (TIP) that quantitatively measures the regulatory relationships between TFs and target genes. For each TF, our model builds a characteristic, averaged profile of binding around the TSS and then uses this to weight the sites associated with a given gene, providing a continuous-valued 'regulatory' score relating each TF and potential target. Moreover, the score can readily be turned into a ranked list of target genes and an estimate of significance, which is useful for case-dependent downstream analysis.

Conclusion: We show the advantages of TIP by comparing it to the 'simple' approach on several representative datasets, using motif occurrence and relationship to knock-out experiments as metrics of validation. Moreover, we show that the probabilistic model is less sensitive than the simple approach to various experimental parameters (including sequencing depth and peak-calling method); in fact, the lesser dependence on sequencing depth potentially allows the result of a ChIP-seq experiment to be used in a more 'cost-effective' manner.
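
The scoring idea described in the abstract (an averaged binding profile around the TSS used as position weights, then a ranked list with a significance estimate) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the input layout (one row of binding signal per gene, one column per position bin around the TSS), the z-score standardization and the normal-tail p-value are all assumptions made for illustration.

```python
from math import erfc, sqrt

import numpy as np


def averaged_profile_weights(signal):
    """Average the binding signal over all genes at each position bin
    around the TSS and normalize so the weights sum to 1.
    (Hypothetical helper; the input layout is an assumption.)"""
    signal = np.asarray(signal, dtype=float)
    profile = signal.mean(axis=0)
    total = profile.sum()
    if total <= 0:
        raise ValueError("binding signal is empty; cannot build a profile")
    return profile / total


def regulatory_scores(signal):
    """Return (raw, z, p): the profile-weighted score per gene, its
    z-score across genes, and a one-sided normal-tail p-value.
    The normal approximation is an assumption of this sketch."""
    signal = np.asarray(signal, dtype=float)
    weights = averaged_profile_weights(signal)
    raw = signal @ weights                      # weighted sum per gene
    spread = raw.std()
    if spread == 0:
        raise ValueError("all genes have identical scores")
    z = (raw - raw.mean()) / spread
    p = np.array([0.5 * erfc(v / sqrt(2.0)) for v in z])
    return raw, z, p


def rank_targets(gene_ids, signal):
    """Rank genes from strongest to weakest regulatory score."""
    raw, z, p = regulatory_scores(signal)
    order = np.argsort(-z)
    return [(gene_ids[i], float(z[i]), float(p[i])) for i in order]


if __name__ == "__main__":
    # Toy data: 3 genes x 3 position bins (upstream, TSS, downstream).
    toy_signal = [[0, 4, 0],
                  [0, 2, 0],
                  [1, 0, 1]]
    for gene, z, p in rank_targets(["geneA", "geneB", "geneC"], toy_signal):
        print(f"{gene}\tz={z:.2f}\tp={p:.3f}")
```

In this toy example the bin at the TSS dominates the averaged profile, so a gene with signal there outranks a gene with the same amount of signal spread over the flanking bins. A fixed-window rule such as "any site within 2 kb" would treat the two alike.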
Pages: 3221-3227
Page count: 7