On the detection and refinement of transcription factor binding sites using ChIP-Seq data

Cited by: 76
Authors
Hu, Ming [1, 2]
Yu, Jindan [3, 4, 5, 6]
Taylor, Jeremy M. G. [2, 5]
Chinnaiyan, Arul M. [3, 4, 5, 7, 8]
Qin, Zhaohui S. [1, 2]
Affiliations
[1] Univ Michigan, Ctr Stat Genet, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
[3] Univ Michigan, Michigan Ctr Translat Pathol, Ann Arbor, MI 48109 USA
[4] Univ Michigan, Dept Pathol, Ann Arbor, MI 48109 USA
[5] Univ Michigan, Ctr Comprehens Canc, Ann Arbor, MI 48109 USA
[6] Northwestern Univ, Dept Med, Div Hematol Oncol, Chicago, IL 60660 USA
[7] Univ Michigan, Howard Hughes Med Inst, Ann Arbor, MI 48109 USA
[8] Univ Michigan, Dept Urol, Sch Med, Ann Arbor, MI 48109 USA
Keywords
PROTEIN-DNA INTERACTIONS; HUMAN GENOME; CHROMATIN-IMMUNOPRECIPITATION; MOTIF DISCOVERY; SEQUENCES; MODEL; EXPRESSION; IDENTIFICATION; PATTERNS; GENE
DOI
10.1093/nar/gkp1180
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Coupling chromatin immunoprecipitation (ChIP) with recently developed massively parallel sequencing technologies has enabled genome-wide detection of protein-DNA interactions with unprecedented sensitivity and specificity. This new technology, ChIP-Seq, presents opportunities for in-depth analysis of transcription regulation. In this study, we explore the value of using ChIP-Seq data to better detect and refine transcription factor binding sites (TFBS). We introduce a novel computational algorithm named Hybrid Motif Sampler (HMS), specifically designed for TFBS motif discovery in ChIP-Seq data. We propose a Bayesian model that incorporates sequencing depth information to aid motif identification. Our model also allows intra-motif dependency to describe more accurately the underlying motif pattern. Our algorithm combines stochastic sampling and deterministic 'greedy' search steps into a novel hybrid iterative scheme. This combination accelerates the computation process. Simulation studies demonstrate favorable performance of HMS compared to other existing methods. When applying HMS to real ChIP-Seq datasets, we find that (i) the accuracy of existing TFBS motif patterns can be significantly improved; and (ii) there is significant intra-motif dependency inside all the TFBS motifs we tested; modeling these dependencies further improves the accuracy of these TFBS motif patterns. These findings may offer new biological insights into the mechanisms of transcription factor regulation.
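
The hybrid scheme described in the abstract (stochastic sampling interleaved with deterministic greedy steps, with sequencing depth informing where sites are likely to be) can be illustrated with a minimal sketch. This is not the HMS implementation: the function names, the `depth_prior` weighting, the `greedy_every` schedule and the independent-column motif model are all assumptions made for illustration, and intra-motif dependency is not modeled here.

```python
import math
import random

BASES = "ACGT"

def depth_prior(depth, width, eps=1e-3):
    """Hypothetical prior: mean read depth over each candidate window."""
    return [sum(depth[a:a + width]) / width + eps
            for a in range(len(depth) - width + 1)]

def build_pwm(seqs, starts, width, pseudo=0.5):
    """Column-wise base frequencies of the currently aligned sites."""
    pwm = []
    for j in range(width):
        counts = {b: pseudo for b in BASES}
        for seq, a in zip(seqs, starts):
            counts[seq[a + j]] += 1
        total = sum(counts.values())
        pwm.append({b: counts[b] / total for b in BASES})
    return pwm

def site_weights(seq, pwm, prior, background=0.25):
    """Unnormalised posterior weight of every candidate start position."""
    width = len(pwm)
    weights = []
    for a in range(len(seq) - width + 1):
        log_ratio = sum(math.log(pwm[j][seq[a + j]] / background)
                        for j in range(width))
        weights.append(prior[a] * math.exp(log_ratio))
    return weights

def hybrid_motif_search(seqs, priors, width, n_iter=50, greedy_every=5, seed=0):
    """Alternate Gibbs-style sampling sweeps with greedy (argmax) sweeps."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for it in range(1, n_iter + 1):
        greedy = (it % greedy_every == 0)  # assumed schedule, not from the paper
        for i, seq in enumerate(seqs):
            # Estimate the motif from all sequences except the current one.
            pwm = build_pwm(seqs[:i] + seqs[i + 1:],
                            starts[:i] + starts[i + 1:], width)
            w = site_weights(seq, pwm, priors[i])
            if greedy:
                starts[i] = max(range(len(w)), key=w.__getitem__)
            else:
                starts[i] = rng.choices(range(len(w)), weights=w)[0]
    return starts, build_pwm(seqs, starts, width)
```

With one depth vector per sequence, `priors = [depth_prior(d, width) for d in depths]` followed by `hybrid_motif_search(seqs, priors, width)` returns one start position per sequence and the resulting position weight matrix. Sequences are assumed to contain only A, C, G and T.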
Pages: 2154-2167
Page count: 14