SGP-1: Prediction and validation of homologous genes based on sequence alignments

Cited by: 71
Authors
Wiehe, T [1]
Gebauer-Jung, S
Mitchell-Olds, T
Guigó, R
Affiliations
[1] Max Planck Inst Chem Ecol, Jena, Germany
[2] Univ Pompeu Fabra, Inst Municipal Invest Med, Grp Recerca Informat Biomed, Barcelona, Spain
Keywords
DOI
10.1101/gr.177401
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential, or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors.
Pages: 1574-1583
Page count: 10