CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction

Cited by: 57
Authors
Gross, Samuel S. [1]
Do, Chuong B. [1]
Sirota, Marina [1]
Batzoglou, Serafim [1]
Affiliation
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
DOI
10.1186/gb-2007-8-12-r269
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
We describe CONTRAST, a gene predictor which directly incorporates information from multiple alignments rather than employing phylogenetic models. This is accomplished through the use of discriminative machine learning techniques, including a novel training algorithm. We use a two-stage approach, in which a set of binary classifiers designed to recognize coding region boundaries is combined with a global model of gene structure. CONTRAST predicts exact coding region structures for 65% more human genes than the previous state-of-the-art method, misses 46% fewer exons and displays comparable gains in specificity.
Pages: 16