A boosting approach for motif modeling using ChIP-chip data

Cited by: 41
Authors
Hong, PY
Liu, XS
Zhou, Q
Lu, X
Liu, JS
Wong, WH [1]
Affiliations
[1] Harvard Univ, Dept Stat, Cambridge, MA 02138 USA
[2] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
DOI
10.1093/bioinformatics/bti402
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Building an accurate binding model for a transcription factor (TF) is essential for distinguishing its true binding targets from spurious ones, an important step toward understanding gene regulation. Results: This paper describes a boosting approach to modeling TF-DNA binding. Unlike the widely used weight matrix model, which predicts TF-DNA binding from a linear combination of position-specific contributions, our approach builds a TF binding classifier by combining a set of weight-matrix-based classifiers, yielding a non-linear binding decision rule. The approach was applied to ChIP-chip data from Saccharomyces cerevisiae. Compared with the weight matrix method, it showed significant improvements in specificity in a majority of cases.
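The contrast drawn in the abstract, a single linear weight-matrix score versus a weighted vote over several weight-matrix-based classifiers, can be illustrated with a minimal sketch. This is generic discrete AdaBoost over a fixed pool of thresholded weight-matrix classifiers, not the authors' algorithm; the helper names (`make_pwm`, `weak_classifier`, `adaboost`, `predict`), the +1/-1 match scores, and the toy sequences are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def make_pwm(consensus, match=1.0, mismatch=-1.0):
    """Hypothetical helper: build a toy weight matrix from a consensus string."""
    return [{b: (match if b == c else mismatch) for b in BASES} for c in consensus]

def pwm_score(pwm, site):
    """Linear weight-matrix score: sum of position-specific contributions."""
    return sum(pwm[i][b] for i, b in enumerate(site))

def best_site_score(pwm, seq):
    """Highest weight-matrix score over all windows of the sequence."""
    w = len(pwm)
    return max(pwm_score(pwm, seq[i:i + w]) for i in range(len(seq) - w + 1))

def weak_classifier(pwm, threshold):
    """Thresholded weight matrix: +1 (bound) or -1 (not bound)."""
    return lambda seq: 1 if best_site_score(pwm, seq) >= threshold else -1

def adaboost(classifiers, seqs, labels, rounds):
    """Discrete AdaBoost: returns a list of (alpha, classifier) pairs."""
    n = len(seqs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Select the weak classifier with the lowest weighted error.
        errors = [sum(w for w, s, y in zip(weights, seqs, labels) if h(s) != y)
                  for h in classifiers]
        best_err = min(errors)
        best = classifiers[errors.index(best_err)]
        if best_err >= 0.5:
            break  # no classifier is better than chance on the current weights
        err = min(max(best_err, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, best))
        # Up-weight misclassified sequences, then renormalize.
        weights = [w * math.exp(-alpha * y * best(s))
                   for w, s, y in zip(weights, seqs, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, seq):
    """Sign of the weighted vote: a non-linear rule in the sequence."""
    return 1 if sum(a * h(seq) for a, h in ensemble) >= 0 else -1

# Toy usage with made-up sequences (not ChIP-chip data).
seqs = ["TTACGTT", "GGGACGA", "TTTTTTT", "GGGGCCC"]
labels = [1, 1, -1, -1]
pool = [weak_classifier(make_pwm("ACG"), 3.0),
        weak_classifier(make_pwm("TTT"), 3.0)]
ensemble = adaboost(pool, seqs, labels, rounds=3)
print([predict(ensemble, s) for s in seqs])
```

Each weak classifier is still a linear weight-matrix score followed by a threshold; the non-linearity comes from taking a weighted vote over several such thresholded scores.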
Pages: 2636-2643
Page count: 8