Translation initiation start prediction in human cDNAs with high accuracy

Cited by: 57
Authors
Hatzigeorgiou, AG
Affiliations
[1] Metagen GmbH, D-14195 Berlin, Germany
[2] Synapt Ltd, Iraklion 71110, Greece
DOI
10.1093/bioinformatics/18.2.343
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Correct identification of the Translation Initiation Start (TIS) in cDNA sequences is an important issue for genome annotation. The aim of this work is to improve upon current methods and provide a performance-guaranteed prediction. Methods: This is achieved by using two modules, one sensitive to the conserved motif and the other sensitive to the coding/non-coding potential around the start codon. Both modules are based on Artificial Neural Networks (ANNs). By applying a simplified version of the ribosome scanning model, the algorithm starts a linear search at the beginning of the coding ORF and stops once the combination of the two modules predicts a positive score. Results: On the test set, 94% of the TIS were correctly predicted. A confident decision is obtained through the use of the Las Vegas algorithm idea. Incorporating this algorithm leads to highly accurate recognition of the TIS in human cDNAs for 60% of the cases.
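The scanning step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The two scorer callables stand in for the ANN modules, and the product combination, the 0.5 threshold, and the toy scorers are all assumptions made for illustration. The sketch also scans every ATG from the start of the supplied sequence, so restricting the search to the coding ORF is left to the caller.

```python
from typing import Callable, List, Optional

# A scorer takes (sequence, ATG position) and returns a score in [0, 1].
Scorer = Callable[[str, int], float]

STOP_CODONS = {"TAA", "TAG", "TGA"}


def atg_positions(seq: str) -> List[int]:
    """Return the 0-based positions of every ATG, in 5'-to-3' order."""
    s = seq.upper()
    return [i for i in range(len(s) - 2) if s[i:i + 3] == "ATG"]


def predict_tis(seq: str, motif_score: Scorer, coding_score: Scorer,
                threshold: float = 0.5) -> Optional[int]:
    """Scan ATGs linearly and stop at the first one scored as positive.

    The product of the two scores and the threshold are assumptions;
    the abstract only says the two modules are combined.
    """
    for pos in atg_positions(seq):
        combined = motif_score(seq, pos) * coding_score(seq, pos)
        if combined > threshold:
            return pos
    return None  # no candidate was scored as positive


def toy_motif_score(seq: str, pos: int) -> float:
    """Hypothetical stand-in for the motif module: purine at position -3."""
    return 1.0 if pos >= 3 and seq[pos - 3].upper() in "AG" else 0.0


def toy_coding_score(seq: str, pos: int) -> float:
    """Hypothetical stand-in for the coding module: no in-frame stop
    codon within the three codons that follow the ATG."""
    s = seq.upper()
    end = min(pos + 12, len(s) - 2)
    codons = [s[i:i + 3] for i in range(pos + 3, end, 3)]
    return 0.0 if any(c in STOP_CODONS for c in codons) else 1.0


# Usage: the first ATG (position 3) is rejected, the second is accepted.
example = "CCCATGTAACCGCCATGGCTGCAGGT"
print(predict_tis(example, toy_motif_score, toy_coding_score))  # 14
```

Stopping at the first positive candidate mirrors the scanning idea: an upstream ATG that passes both checks is preferred over any downstream one, whatever score the downstream one would get.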
Pages: 343-350
Number of pages: 8