On the detection and refinement of transcription factor binding sites using ChIP-Seq data
Cited by: 76
Authors:
Hu, Ming [1,2]
Yu, Jindan [3,4,5,6]
Taylor, Jeremy M. G. [2,5]
Chinnaiyan, Arul M. [3,4,5,7,8]
Qin, Zhaohui S. [1,2]
Affiliations:
[1] Univ Michigan, Ctr Stat Genet, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
[3] Univ Michigan, Michigan Ctr Translat Pathol, Ann Arbor, MI 48109 USA
[4] Univ Michigan, Dept Pathol, Ann Arbor, MI 48109 USA
[5] Univ Michigan, Ctr Comprehens Canc, Ann Arbor, MI 48109 USA
[6] Northwestern Univ, Dept Med, Div Hematol Oncol, Chicago, IL 60660 USA
[7] Univ Michigan, Howard Hughes Med Inst, Ann Arbor, MI 48109 USA
[8] Univ Michigan, Dept Urol, Sch Med, Ann Arbor, MI 48109 USA
Keywords:
PROTEIN-DNA INTERACTIONS;
HUMAN GENOME;
CHROMATIN-IMMUNOPRECIPITATION;
MOTIF DISCOVERY;
SEQUENCES;
MODEL;
EXPRESSION;
IDENTIFICATION;
PATTERNS;
GENE;
DOI:
10.1093/nar/gkp1180
Chinese Library Classification:
Q5 [Biochemistry];
Q7 [Molecular Biology];
Subject classification codes:
071010 ;
081704 ;
Abstract:
Coupling chromatin immunoprecipitation (ChIP) with recently developed massively parallel sequencing technologies has enabled genome-wide detection of protein-DNA interactions with unprecedented sensitivity and specificity. This new technology, ChIP-Seq, presents opportunities for in-depth analysis of transcription regulation. In this study, we explore the value of using ChIP-Seq data to better detect and refine transcription factor binding sites (TFBS). We introduce a novel computational algorithm named Hybrid Motif Sampler (HMS), specifically designed for TFBS motif discovery in ChIP-Seq data. We propose a Bayesian model that incorporates sequencing depth information to aid motif identification. Our model also allows intra-motif dependency to describe more accurately the underlying motif pattern. Our algorithm combines stochastic sampling and deterministic 'greedy' search steps into a novel hybrid iterative scheme. This combination accelerates the computation process. Simulation studies demonstrate favorable performance of HMS compared to other existing methods. When applying HMS to real ChIP-Seq datasets, we find that (i) the accuracy of existing TFBS motif patterns can be significantly improved; and (ii) there is significant intra-motif dependency inside all the TFBS motifs we tested; modeling these dependencies further improves the accuracy of these TFBS motif patterns. These findings may offer new biological insights into the mechanisms of transcription factor regulation.
Pages: 2154-2167
Page count: 14