Gene recognition in eukaryotic DNA by comparison of genomic sequences

Cited by: 23
Authors
Novichkov, PS [1]
Gelfand, MS [1]
Mironov, AA [1]
Affiliations
[1] State Sci Ctr GosNIIGenet, Moscow 113545, Russia
DOI
10.1093/bioinformatics/17.11.1011
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Sequencing of complete eukaryotic genomes and large syntenic fragments of genomes makes it possible to apply genomic comparison for gene recognition.

Results: This paper describes a spliced alignment algorithm that aligns candidate exon chains of two homologous genomic sequence fragments from different species. The algorithm is implemented in Pro-Gen software. Unlike other algorithms, Pro-Gen does not assume conservation of the exon-intron structure. Amino acid sequences obtained by the formal translation of candidate exons are aligned instead of nucleotide sequences, which allows for distant comparisons. The algorithm was tested on a sample of human-mammal (mouse), human-vertebrate (Xenopus) and human-invertebrate (Drosophila) gene pairs. Surprisingly, the best results, 97-98% correlation between the actual and predicted genes, were obtained for more distant comparisons, whereas the correlation on the human-mouse sample was only 93%. The latter value increases to 95% if conservation of the exon-intron structure is assumed. This is caused by a large amount of sequence conservation in non-coding regions of the human and mouse genes, probably due to regulatory elements.
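
The abstract's point that translated candidate exons are compared instead of nucleotide sequences can be illustrated with a minimal sketch. This is not the Pro-Gen spliced alignment algorithm: the function names (`translate`, `align_score`), the unit match/mismatch/gap scores, and the two toy sequences are assumptions made purely for illustration, and a plain global alignment score stands in for the exon-chain alignment the paper describes.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Formally translate a candidate exon in the given reading frame (0, 1 or 2)."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    # Unknown codons (e.g. containing N) are rendered as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def align_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global (Needleman-Wunsch) alignment score with hypothetical unit scores."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


# Two toy candidate exons that differ at synonymous codon positions.
exon_a = "ATGGCCAAA"
exon_b = "ATGGCTAAG"
protein_a = translate(exon_a)
protein_b = translate(exon_b)
protein_score = align_score(protein_a, protein_b)
```

In this toy case the two nucleotide sequences differ at two positions, yet both translate to the same peptide, so the protein-level comparison sees a perfect match. That is the general reason protein-level comparison tolerates more divergence between distant species.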
Pages: 1011-1018
Page count: 8