Las Vegas algorithms for gene recognition: Suboptimal and error-tolerant spliced alignment

Cited by: 23
Authors
Sze, SH [1]
Pevzner, PA [1]
Affiliations
[1] UNIV SO CALIF, DEPT MATH, LOS ANGELES, CA 90089
Keywords
DOI
10.1089/cmb.1997.4.297
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Recently, Gelfand, Mironov and Pevzner (1996) proposed a spliced alignment approach to gene recognition that provides 99% accurate recognition of human genes if a related mammalian protein is available. However, even 99% accurate gene predictions are insufficient for automated sequence annotation in large-scale sequencing projects and therefore have to be complemented by experimental gene verification. One hundred percent accurate gene predictions would lead to a substantial reduction of experimental work on gene identification. Our goal is to develop an algorithm that either predicts an exon assembly with accuracy sufficient for sequence annotation or warns a biologist that the accuracy of a prediction is insufficient and further experimental work is required. We study suboptimal and error-tolerant spliced alignment problems as the first steps towards such an algorithm, and report an algorithm which provides 100% accurate recognition of human genes in 37% of cases (if a related mammalian protein is available). In 52% of genes, the algorithm predicts at least one exon with 100% accuracy.
Pages: 297-309
Page count: 13
Related papers
30 in total
[1]   AMINO-ACID SUBSTITUTION MATRICES FROM AN INFORMATION THEORETIC PERSPECTIVE [J].
ALTSCHUL, SF .
JOURNAL OF MOLECULAR BIOLOGY, 1991, 219 (03) :555-565
[2]   THE TURNING-POINT IN GENOME RESEARCH [J].
BOGUSKI, MS .
TRENDS IN BIOCHEMICAL SCIENCES, 1995, 20 (08) :295-296
[3] Brassard G, 1996, FUNDAMENTALS ALGORIT
[4]   Evaluation of gene structure prediction programs [J].
Burset, M ;
Guigo, R .
GENOMICS, 1996, 34 (03) :353-367
[5] Chao K M, 1994, J Comput Biol, V1, P271, DOI 10.1089/cmb.1994.1.271
[6]   The gene identification problem: An overview for developers [J].
Fickett, JW .
COMPUTERS & CHEMISTRY, 1996, 20 (01) :103-118
[7] Gelfand M S, 1995, J Comput Biol, V2, P87, DOI 10.1089/cmb.1995.2.87
[8]   PREDICTION OF THE EXON-INTRON STRUCTURE BY A DYNAMIC-PROGRAMMING APPROACH [J].
GELFAND, MS ;
ROYTBERG, MA .
BIOSYSTEMS, 1993, 30 (1-3) :173-182
[9]   COMPUTER-PREDICTION OF THE EXON-INTRON STRUCTURE OF MAMMALIAN PRE-MESSENGER-RNAS [J].
GELFAND, MS .
NUCLEIC ACIDS RESEARCH, 1990, 18 (19) :5865-5869
[10]   Gene recognition via spliced sequence alignment [J].
Gelfand, MS ;
Mironov, AA ;
Pevzner, PA .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1996, 93 (17) :9061-9066