CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction

Cited by: 57
Authors
Gross, Samuel S. [1]
Do, Chuong B. [1]
Sirota, Marina [1]
Batzoglou, Serafim [1]
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
DOI
10.1186/gb-2007-8-12-r269
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification code
071005; 0836; 090102; 100705
Abstract
We describe CONTRAST, a gene predictor which directly incorporates information from multiple alignments rather than employing phylogenetic models. This is accomplished through the use of discriminative machine learning techniques, including a novel training algorithm. We use a two-stage approach, in which a set of binary classifiers designed to recognize coding region boundaries is combined with a global model of gene structure. CONTRAST predicts exact coding region structures for 65% more human genes than the previous state-of-the-art method, misses 46% fewer exons and displays comparable gains in specificity.
Pages: 16