Prediction of protease types in a hybridization space

Cited by: 66
Authors
Chou, KC
Cai, YD
Affiliations
[1] Gordon Life Sci Inst, San Diego, CA 92130 USA
[2] Shanghai Univ, Coll Sci, Dept Chem, Shanghai 200436, Peoples R China
[3] Univ Manchester Sci & Technol, Dept Biomed Sci, Manchester M60 1QD, Lancs, England
Keywords
protease; FunD-PseAA predictor; functional domain; ISort predictor; hybridization space; proteomics; bioinformatics; jackknife cross-validation
DOI
10.1016/j.bbrc.2005.10.196
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
By controlling the activation, synthesis, and turnover of proteins, proteases regulate most physiological processes and play pivotal roles in conception, birth, digestion, growth, maturation, ageing, and death of all organisms. Different types of proteases have different functions and take part in different biological processes. Therefore, it is important for both basic research and drug discovery to consider the following two problems. (1) Given the sequence of a protein, can we identify whether it is a protease or a non-protease? (2) If it is a protease, what type does it belong to? Although the two problems can be solved by various experimental means, doing so is both time-consuming and costly. The avalanche of protein sequences generated in the post-genomic era has challenged us to develop an automated method for fast and reliable identification. By hybridizing the functional domain composition and the pseudo-amino acid composition, we have introduced a new method called the "FunD-PseAA predictor", which operates in a hybridization space. To avoid redundancy and bias, demonstrations were performed on a dataset in which no protein has >= 25% sequence identity to any other. The overall success rate obtained by the jackknife cross-validation test in distinguishing proteases from non-proteases was 92.95%, and that in identifying the protease type was 94.75% among the following six types: (1) aspartic, (2) cysteine, (3) glutamic, (4) metallo, (5) serine, and (6) threonine. A demonstration was also made on an independent dataset, and the corresponding overall success rates were 98.36% and 97.11%, respectively, suggesting that the FunD-PseAA predictor is very powerful and may become a useful tool in bioinformatics and proteomics. (c) 2005 Elsevier Inc. All rights reserved.
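The abstract names two ideas that a small example can make concrete: representing a protein in a "hybridization space" built from two feature blocks, and scoring a predictor by the jackknife (leave-one-out) test. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy feature values, and the plain nearest-neighbour rule used as the classifier are all hypothetical stand-ins chosen for illustration.

```python
import math

def hybrid_vector(domain_part, pseaa_part):
    """Concatenate two feature blocks into one vector.

    Assumption: the functional-domain block is a 0/1 vector and the
    pseudo-amino-acid block is a vector of real numbers.
    """
    return list(domain_part) + list(pseaa_part)

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_neighbour_label(query, samples):
    """Return the label of the sample closest to the query vector."""
    closest = min(samples, key=lambda s: euclidean(query, s[0]))
    return closest[1]

def jackknife_success_rate(samples):
    """Leave each sample out in turn, predict it from the rest,
    and return the fraction of correct predictions."""
    correct = 0
    for i, (vec, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_neighbour_label(vec, rest) == label:
            correct += 1
    return correct / len(samples)

# Toy data (hypothetical values, not taken from the paper).
toy = [
    (hybrid_vector([1, 0, 0], [0.10, 0.20]), "protease"),
    (hybrid_vector([1, 0, 0], [0.12, 0.18]), "protease"),
    (hybrid_vector([0, 1, 1], [0.40, 0.05]), "non-protease"),
    (hybrid_vector([0, 1, 1], [0.38, 0.07]), "non-protease"),
]
rate = jackknife_success_rate(toy)
print(f"jackknife success rate: {rate:.2%}")
```

In the jackknife test every protein is predicted by a model that has never seen it, so the reported rate is not inflated by memorising the training set; the cost is one prediction pass per sample.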
Pages: 1015-1020
Number of pages: 6