Functionality of system components: Conservation of protein function in protein feature space

Cited by: 32
Authors
Jensen, LJ [1]
Ussery, DW [1]
Brunak, S [1]
Affiliations
[1] Tech Univ Denmark, Ctr Biol Sequence Anal, Biocentrum DTU, DK-2800 Lyngby, Denmark
DOI
10.1101/gr.1190803
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Many protein features useful for prediction of protein function can be predicted from sequence, including posttranslational modifications, subcellular localization, and physical/chemical properties. We show here that such protein features are more conserved among orthologs than paralogs, indicating they are crucial for protein function and thus subject to selective pressure. This means that a function prediction method based on sequence-derived features may be able to discriminate between proteins with different function even when they have highly similar structure. Also, such a method is likely to perform well on organisms other than the one on which it was trained. We evaluate the performance of such a method, ProtFun, which relies on protein features as its sole input, and show that the method gives similar performance for most eukaryotes and performs much better than anticipated on archaea and bacteria. From this analysis, we conclude that for the posttranslational modifications studied, both the cellular use and the sequence motifs are conserved within Eukarya.
Pages: 2444-2449
Page count: 6