The value of position-specific priors in motif discovery using MEME

Cited by: 75
Authors
Bailey, Timothy L. [1]
Boden, Mikael [1]
Whitington, Tom [1]
Machanick, Philip [1]
Affiliations
[1] Univ Queensland, Inst Mol Biosci, Brisbane, Qld 4072, Australia
Source
BMC BIOINFORMATICS | 2010 | Volume 11
Keywords
DNA; SEQUENCES; GENOME; ALGORITHM;
DOI
10.1186/1471-2105-11-179
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Background: Position-specific priors have been shown to be a flexible and elegant way to extend the power of Gibbs sampler-based motif discovery algorithms. Many types of information, including sequence conservation, nucleosome positioning, and negative examples, can be converted into a prior over the location of motif sites, which then guides the sequence motif discovery algorithm. This approach has been shown to confer many of the benefits of conservation-based and discriminative motif discovery approaches on Gibbs sampler-based motif discovery methods, but has not previously been studied with methods based on expectation maximization (EM).

Results: We extend the popular EM-based MEME algorithm to utilize position-specific priors and demonstrate their effectiveness for discovering transcription factor (TF) motifs in yeast and mouse DNA sequences. Utilizing a discriminative, conservation-based prior dramatically improves MEME's ability to discover motifs in 156 yeast TF ChIP-chip datasets, more than doubling the number of datasets where it finds the correct motif. On these datasets, MEME using the prior has a higher success rate than eight other conservation-based motif discovery approaches. We also show that the same type of prior improves the accuracy of motifs discovered by MEME in mouse TF ChIP-seq data, and that the motifs tend to be of slightly higher quality than those found by a Gibbs sampling algorithm using the same prior.

Conclusions: We conclude that using position-specific priors can substantially increase the power of EM-based motif discovery algorithms such as MEME.
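
To make the idea in the abstract concrete, the sketch below shows one way a position-specific prior could enter the E-step of an EM motif finder. It is an illustration only, not MEME's implementation. It assumes exactly one motif site per sequence, a fixed motif width, and strictly positive motif and background probabilities. The names `site_posterior`, `pwm`, `background`, and `prior` are hypothetical and do not come from the paper or from the MEME software.

```python
def site_posterior(seq, pwm, background, prior):
    """Posterior probability of each motif start position in one sequence.

    Assumptions (illustrative, not taken from MEME):
      - exactly one site per sequence;
      - pwm is a list of dicts, one per motif column, mapping base -> probability;
      - background maps base -> probability, all strictly positive;
      - prior[i] is the prior weight that a site starts at position i.
    """
    width = len(pwm)
    n_starts = len(seq) - width + 1
    if n_starts <= 0:
        raise ValueError("sequence is shorter than the motif")
    if len(prior) != n_starts:
        raise ValueError("prior must have one entry per start position")

    weights = []
    for start in range(n_starts):
        # Likelihood ratio of motif model vs. background for this window.
        ratio = 1.0
        for offset in range(width):
            base = seq[start + offset]
            ratio *= pwm[offset][base] / background[base]
        # The position-specific prior simply reweights each candidate site.
        weights.append(prior[start] * ratio)

    total = sum(weights)
    if total <= 0.0:
        raise ValueError("prior and likelihood leave no admissible site")
    return [w / total for w in weights]


# Toy example: a width-2 motif that prefers "TA", and a sequence with two
# equally good matches (start positions 0 and 4).
toy_pwm = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
]
toy_background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
toy_seq = "TATTTA"

uniform = site_posterior(toy_seq, toy_pwm, toy_background, [0.2] * 5)
skewed = site_posterior(toy_seq, toy_pwm, toy_background,
                        [0.1, 0.1, 0.1, 0.1, 0.6])
```

With the uniform prior the two matching windows receive equal posterior weight. With the skewed prior, which might stand in for higher conservation near the end of the sequence, the posterior shifts toward the last window even though the sequence evidence is identical.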
Pages: 14