Prediction of subcellular protein localization based on functional domain composition

Cited by: 33
Authors
Jia, Peilin
Qian, Ziliang
Zeng, ZhenBin
Cai, Yudong
Li, Yixue
Affiliations
[1] Chinese Acad Sci, Shanghai Inst Biol Sci, Key Lab Syst Biol, Bioinformat Ctr, Shanghai 200031, Peoples R China
[2] Chinese Acad Sci, Grad Sch, Beijing 100039, Peoples R China
[3] Chinese Acad Sci, Shanghai Inst Biol Sci, MPG Partner Inst Computat Biol, Shanghai, Peoples R China
[4] Univ Manchester, Dept Math, Manchester M60 1QD, Lancs, England
[5] Shanghai Ctr Bioinformat Technol, Shanghai 200235, Peoples R China
[6] Shanghai Jiao Tong Univ, Life Sci Sch, Shanghai 200030, Peoples R China
[7] E China Normal Univ, Software Engn Inst, Shanghai 200062, Peoples R China
Keywords
protein subcellular localization; Pfam; nearest neighbor algorithm
DOI
10.1016/j.bbrc.2007.03.139
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Assigning subcellular localization (SL) to proteins is one of the major tasks of functional proteomics. Despite the impressive technical advances of the past decades, it is still time-consuming and laborious to determine SL experimentally on a high-throughput scale. Thus, computational prediction is the preferred method for large-scale assignment of protein SL, followed up by experimental studies where appropriate. In this report, using a machine learning approach, the Nearest Neighbor Algorithm (NNA), we developed a prediction system for protein SL in which we incorporated a protein functional domain profile. The overall accuracy achieved by this system is 93.96%. Furthermore, comparisons with other methods have been conducted to demonstrate the validity and efficiency of our prediction system. We also provide an implementation of our Subcellular Location Prediction System (SLPS), which is available at http://pcal.biosino.org. © 2007 Elsevier Inc. All rights reserved.
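To make the idea in the abstract concrete, the following is a minimal sketch of nearest-neighbor classification over functional domain composition. It is not the authors' SLPS implementation. The binary encoding, the cosine similarity measure, the function names, and the toy Pfam-style accessions and location labels are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math


def domain_vector(domains, vocabulary):
    """Encode a protein as a binary vector: 1.0 if it contains the domain, else 0.0."""
    return [1.0 if d in domains else 0.0 for d in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_location(query_domains, training_set, vocabulary):
    """Return the label of the most similar training protein.

    Returns None when the query shares no domain with any training protein.
    """
    query = domain_vector(query_domains, vocabulary)
    best_label, best_score = None, 0.0
    for domains, label in training_set:
        score = cosine_similarity(query, domain_vector(domains, vocabulary))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data (assumption): not taken from the paper.
VOCABULARY = ["PF00001", "PF00069", "PF00125", "PF00153"]
TRAINING_SET = [
    ({"PF00001"}, "plasma membrane"),
    ({"PF00125"}, "nucleus"),
    ({"PF00153"}, "mitochondrion"),
    ({"PF00069", "PF00001"}, "cytoplasm"),
]

if __name__ == "__main__":
    print(predict_location({"PF00125"}, TRAINING_SET, VOCABULARY))
```

A single nearest neighbor is used here to mirror the "nearest neighbor" wording of the abstract. A real system would need a full domain vocabulary and a curated, annotated training set.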
Pages: 366-370
Number of pages: 5