DREME: motif discovery in transcription factor ChIP-seq data

Citations: 779
Authors
Bailey, Timothy L. [1]
Affiliations
[1] Univ Queensland, Inst Mol Biosci, Brisbane, Qld 4072, Australia
Keywords
EMBRYONIC STEM-CELLS; FACTOR-BINDING SITES; DNA-BINDING; EVOLUTION; ERYTHROPOIESIS; EXPRESSION; SEQUENCE; COMPLEX; GENOME; GENES
DOI
10.1093/bioinformatics/btr261
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Transcription factor (TF) ChIP-seq datasets have particular characteristics that provide unique challenges and opportunities for motif discovery. Most existing motif discovery algorithms do not scale well to such large datasets, or fail to report many motifs associated with cofactors of the ChIP-ed TF.
Results: We present DREME, a motif discovery algorithm specifically designed to find the short, core DNA-binding motifs of eukaryotic TFs, and optimized to analyze very large ChIP-seq datasets in minutes. Using DREME, we discover the binding motifs of the ChIP-ed TF and many cofactors in mouse ES cell (mESC), mouse erythrocyte and human cell line ChIP-seq datasets. For example, in mESC ChIP-seq data for the TF Esrrb, we discover the binding motifs for eight cofactor TFs important in the maintenance of pluripotency. Several other commonly used algorithms find at most two cofactor motifs in this same dataset. DREME can also perform discriminative motif discovery, and we use this feature to provide evidence that Sox2 and Oct4 do not bind in mES cells as an obligate heterodimer. DREME is much faster than many commonly used algorithms, scales linearly in dataset size, finds multiple, non-redundant motifs and reports a reliable measure of statistical significance for each motif found. DREME is available as part of the MEME Suite of motif-based sequence analysis tools (http://meme.nbcr.net).
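As an illustration of what "discriminative motif discovery" with a per-motif significance value can mean in practice, the sketch below scores a single candidate word by counting how many positive (for example, ChIP-seq peak) and negative (control) sequences contain it on either strand, then computes a one-sided Fisher's exact test p-value for enrichment in the positives. The abstract does not spell out the scoring procedure, so the choice of test, the function names (`revcomp`, `contains`, `fisher_enrichment_p`, `score_word`), the example word and the toy sequences are all assumptions made for illustration. This is not the DREME implementation.

```python
from math import comb

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def contains(seq: str, word: str) -> bool:
    """True if the word occurs in the sequence on either strand."""
    return word in seq or revcomp(word) in seq


def fisher_enrichment_p(pos_hits: int, pos_total: int,
                        neg_hits: int, neg_total: int) -> float:
    """One-sided Fisher's exact test (hypergeometric upper tail).

    Probability of seeing at least `pos_hits` matching sequences among the
    positives if the matching sequences were spread at random over the
    positive and negative sets.
    """
    hits = pos_hits + neg_hits
    total = pos_total + neg_total
    denom = comb(total, pos_total)
    upper = min(hits, pos_total)
    tail = sum(
        comb(hits, k) * comb(total - hits, pos_total - k)
        for k in range(pos_hits, upper + 1)
    )
    return tail / denom


def score_word(word: str, positives: list[str], negatives: list[str]) -> float:
    """Enrichment p-value of one candidate word (hypothetical helper)."""
    pos_hits = sum(contains(s, word) for s in positives)
    neg_hits = sum(contains(s, word) for s in negatives)
    return fisher_enrichment_p(pos_hits, len(positives),
                               neg_hits, len(negatives))


# Toy data, invented for this example only.
positives = ["TTCAAGGTCATT", "GGTGACCTTGAA", "ACAAGGTCAC"]
negatives = ["TTTTTTTTTT", "ACACACACAC", "GGGGCCCCGG"]
p_value = score_word("AAGGTCA", positives, negatives)
print(p_value)
```

A real analysis would score many candidate words and correct for multiple testing; the sketch covers only the scoring of one word against a control set.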
Pages: 1653-1659
Page count: 7