Automatic prediction of protein function

Cited by: 169
Authors
Rost, B
Liu, J
Nair, R
Wrzeszczynski, KO
Ofran, Y
Affiliations
[1] Columbia Univ, Dept Biochem & Mol Biophys, NE Struct Genom Consortium, New York, NY 10032 USA
[2] Columbia Univ, Ctr Computat Biol & Bioinformat C2B2, New York, NY 10032 USA
[3] Columbia Univ, Dept Pharmacol, New York, NY 10032 USA
[4] Columbia Univ, Dept Phys, New York, NY 10027 USA
[5] Columbia Univ, Dept Biomed Informat, New York, NY 10032 USA
Keywords
genome analysis; protein function prediction; ab initio prediction; neural networks; multiple alignments; sequence analysis; subcellular localization; post-translational modifications; protein-protein interactions; bioinformatics
DOI
10.1007/s00018-003-3114-8
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
Most methods annotating protein function utilise sequence homology to proteins of experimentally known function. Such a homology-based annotation transfer is problematic and limited in scope. Therefore, computational biologists have begun to develop ab initio methods that predict aspects of function, including subcellular localization, post-translational modifications, functional type and protein-protein interactions. For the first two cases, the most accurate approaches rely on identifying short signalling motifs, while the most general methods utilise tools of artificial intelligence. An outstanding new method predicts classes of cellular function directly from sequence. Similarly, promising methods have been developed predicting protein-protein interaction partners at acceptable levels of accuracy for some pairs in entire proteomes. No matter how difficult the task, successes over the last few years have clearly paved the way for ab initio prediction of protein function.
Pages: 2637-2650
Number of pages: 14