Orphan gene finding - an exon assembly approach

Cited by: 10
Authors
Blayo, P
Rouzé, P
Sagot, MF
Affiliations
[1] Univ Lyon 1, INRIA Rhone Alpes, Lab Biometrie & Biol Evolut, F-69622 Villeurbanne, France
[2] State Univ Ghent, Lab Associe, INRA, B-9000 Ghent, Belgium
[3] Univ Marne la Vallee, Inst Gaspard Monge, F-77454 Marne La Vallee, France
Keywords
orphan gene; gene finding; exon assembly; DNA/DNA and DNA/protein comparison; coding DNA comparison models; dynamic programming;
DOI
10.1016/S0304-3975(02)00043-9
Chinese Library Classification
TP301 [Theory, methods];
Subject classification code
081202 ;
Abstract
This paper introduces an algorithm for finding eukaryotic genes. It particularly addresses the problem of orphan genes, that is, genes that cannot be connected to any known gene family on the basis of homology alone, and to which traditional gene finding methods therefore cannot be applied. To the best of our knowledge, this is also the first algorithm that attempts to compare, in an exact way, two DNA sequences that contain both coding (i.e. exonic) and non-coding (i.e. intronic and, possibly, intergenic) parts. The comparison follows an algorithmic model of a gene that is as close as possible to the biological one (this paper considers the "one ORF, one gene" problem only). A gene is seen as a set of exons that are pieces of an assembly and are not independent of one another. The algorithm is reasonably efficient: although the constants are higher than for usual sequence comparison, its time complexity is proportional to the product of the sequence lengths, while its space complexity scales linearly with the length of the shorter sequence. (C) 2002 Elsevier Science B.V. All rights reserved.
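The complexity claim in the abstract (time proportional to the product of the sequence lengths, space linear in the shorter sequence) can be illustrated with a minimal sketch. This is not the exon assembly algorithm of the paper. It is an ordinary global alignment score computed with two rolling rows, and the function name and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen only for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of a and b (illustrative scoring).

    Time is O(len(a) * len(b)); memory is O(min(len(a), len(b))) because
    only the previous and the current row of the DP table are kept.
    """
    # Put the shorter sequence along the row. The scoring is symmetric,
    # so swapping the arguments does not change the result.
    if len(b) > len(a):
        a, b = b, a

    # Row 0: aligning the empty prefix of a with each prefix of b costs gaps.
    prev = [j * gap for j in range(len(b) + 1)]

    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of b.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # a[i-1] against a gap
                curr[j - 1] + gap,  # b[j-1] against a gap
            )
        prev = curr  # the older row is no longer needed

    return prev[-1]
```

Keeping only two rows gives the score but not the alignment itself. Recovering the alignment in linear space needs an additional technique such as divide and conquer, which this sketch does not attempt.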
Pages: 1407-1431
Number of pages: 25