On the detection and refinement of transcription factor binding sites using ChIP-Seq data

Cited by: 76
Authors
Hu, Ming [1, 2]
Yu, Jindan [3, 4, 5, 6]
Taylor, Jeremy M. G. [2, 5]
Chinnaiyan, Arul M. [3, 4, 5, 7, 8]
Qin, Zhaohui S. [1, 2]
Affiliations
[1] Univ Michigan, Ctr Stat Genet, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
[3] Univ Michigan, Michigan Ctr Translat Pathol, Ann Arbor, MI 48109 USA
[4] Univ Michigan, Dept Pathol, Ann Arbor, MI 48109 USA
[5] Univ Michigan, Ctr Comprehens Canc, Ann Arbor, MI 48109 USA
[6] Northwestern Univ, Dept Med, Div Hematol Oncol, Chicago, IL 60660 USA
[7] Univ Michigan, Howard Hughes Med Inst, Ann Arbor, MI 48109 USA
[8] Univ Michigan, Dept Urol, Sch Med, Ann Arbor, MI 48109 USA
Keywords
PROTEIN-DNA INTERACTIONS; HUMAN GENOME; CHROMATIN-IMMUNOPRECIPITATION; MOTIF DISCOVERY; SEQUENCES; MODEL; EXPRESSION; IDENTIFICATION; PATTERNS; GENE
DOI
10.1093/nar/gkp1180
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Coupling chromatin immunoprecipitation (ChIP) with recently developed massively parallel sequencing technologies has enabled genome-wide detection of protein-DNA interactions with unprecedented sensitivity and specificity. This new technology, ChIP-Seq, presents opportunities for in-depth analysis of transcription regulation. In this study, we explore the value of using ChIP-Seq data to better detect and refine transcription factor binding sites (TFBS). We introduce a novel computational algorithm named Hybrid Motif Sampler (HMS), specifically designed for TFBS motif discovery in ChIP-Seq data. We propose a Bayesian model that incorporates sequencing depth information to aid motif identification. Our model also allows intra-motif dependency to describe more accurately the underlying motif pattern. Our algorithm combines stochastic sampling and deterministic 'greedy' search steps into a novel hybrid iterative scheme. This combination accelerates the computation process. Simulation studies demonstrate favorable performance of HMS compared to other existing methods. When applying HMS to real ChIP-Seq datasets, we find that (i) the accuracy of existing TFBS motif patterns can be significantly improved; and (ii) there is significant intra-motif dependency inside all the TFBS motifs we tested; modeling these dependencies further improves the accuracy of these TFBS motif patterns. These findings may offer new biological insights into the mechanisms of transcription factor regulation.
Pages: 2154-2167
Page count: 14