Loop-Length-Dependent SVM Prediction of Domain Linkers for High-Throughput Structural Proteomics

Cited by: 35
Authors
Ebina, Teppei [2 ]
Toh, Hiroyuki [1 ]
Kuroda, Yutaka [2 ]
Affiliations
[1] Kyushu Univ, Med Inst Bioregulat, Div Bioinformat, Higashi Ku, Fukuoka 8128582, Japan
[2] Tokyo Univ Agr & Technol, Dept Biotechnol & Life Sci, Koganei, Tokyo 1848588, Japan
Funding
Japan Society for the Promotion of Science
Keywords
support vector machine; high throughput protein dissection; structural domains; proteomics; protein secondary structure; boundary prediction; identification; regions; assignment
DOI
10.1002/bip.21105
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The prediction of structural domains in novel protein sequences is becoming of practical importance. One important area of application is the development of computer-aided techniques for identifying, at a low cost, novel protein domain targets for large-scale functional and structural proteomics. Here, we report a loop-length-dependent support vector machine (SVM) prediction of domain linkers, which are loops separating two structural domains. (DLP-SVM is freely available at: http://www.tuat.ac.jp/~domserv/cgi-bin/DLP-SVM.cgi.) We constructed three loop-length-dependent SVM predictors of domain linkers (SVM-All, SVM-Long and SVM-Short), and also built SVM-Joint, which combines the results of SVM-Short and SVM-Long into a single consolidated prediction. The performance of SVM-Joint was, in most respects, the highest, with a sensitivity of 59.7% and a specificity of 43.6%, which indicates that the specificity and the sensitivity were improved by over 2% and 3%, respectively, when loop-length-dependent characteristics were taken into account. Furthermore, the sensitivity and specificity of SVM-Joint were, respectively, 37.6% and 17.4% higher than those of a random guess, and also superior to those of previously reported domain linker predictors. These results indicate that SVMs can be used to predict domain linkers, and that loop-length-dependent characteristics are useful for improving SVM prediction performance. © 2008 Wiley Periodicals, Inc. Biopolymers (Pept Sci) 92: 1-8, 2009.
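To make the reported figures concrete, the following is a minimal sketch of how a joint prediction might be scored. It is not the authors' implementation. The abstract does not define its metrics or say how SVM-Joint merges its inputs, so the sketch assumes sensitivity = TP / (TP + FN), specificity = TP / (TP + FP) (the fraction of predicted linkers that are correct), and a simple union of the two predictors' hits. The function names and the example linker identifiers are hypothetical.

```python
def joint_prediction(short_hits, long_hits):
    """Merge two predictors' hits into one set.

    Assumption: a plain union; the abstract does not say how
    SVM-Joint actually consolidates SVM-Short and SVM-Long.
    """
    return set(short_hits) | set(long_hits)


def linker_metrics(true_linkers, predicted_linkers):
    """Return (sensitivity, specificity) for sets of linker identifiers.

    Assumption: sensitivity = TP / (TP + FN) and
    specificity = TP / (TP + FP); the paper's exact definitions
    are not given in the abstract.
    """
    true_set = set(true_linkers)
    pred_set = set(predicted_linkers)
    tp = len(true_set & pred_set)  # correctly predicted linkers
    sensitivity = tp / len(true_set) if true_set else 0.0
    specificity = tp / len(pred_set) if pred_set else 0.0
    return sensitivity, specificity


# Hypothetical toy data: integers stand in for linker identifiers.
true_linkers = {1, 2, 3, 4, 5}
short_hits = {1, 2, 9}   # hits from a short-loop predictor
long_hits = {3, 8}       # hits from a long-loop predictor

joint = joint_prediction(short_hits, long_hits)
sensitivity, specificity = linker_metrics(true_linkers, joint)
print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
```

Under these assumptions, merging the two predictors can only keep or raise sensitivity, while specificity depends on how many of the added hits are false positives.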
Pages: 1-8
Page count: 8