Prediction of protein secondary structure content for the twilight zone sequences

Cited by: 28
Authors
Homaeian, Leila
Kurgan, Lukasz A.
Ruan, Jishou
Cios, Krzysztof J.
Chen, Ke
Affiliations
[1] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6G 2V4, Canada
[2] Nankai Univ, Coll Math Sci, Chern Inst Math, Tianjin 300071, Peoples R China
[3] Nankai Univ, LPMC, Tianjin 300071, Peoples R China
[4] Univ Colorado, Dept Comp Sci & Engn, Denver, CO USA
[5] Univ Colorado, Hlth Sci Ctr, Denver, CO USA
Keywords
AMINO-ACID-COMPOSITION; ACCURATE PREDICTION; HELIX/STRAND CONTENT; HOMOLOGY;
DOI
10.1002/prot.21527
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Protein secondary structure carries information about local structural arrangements, which include three major conformations: α-helices, β-strands, and coils. The vast majority of successful methods for predicting secondary structure are based on multiple sequence alignment. However, multiple alignment fails to provide accurate results when a sequence comes from the twilight zone, that is, when it is characterized by low (< 30%) homology. To this end, we propose a novel method for prediction of secondary structure content through comprehensive sequence representation, called PSSC-core. The method uses a multiple linear regression model and introduces a comprehensive feature-based sequence representation to predict the amount of helices and strands for sequences from the twilight zone. The PSSC-core method was tested and compared with two other state-of-the-art prediction methods on a set of 2187 twilight zone sequences. The results indicate that our method provides better predictions for both helix and strand content. PSSC-core is shown to provide statistically significantly better results than the competing methods, reducing the prediction error by 5-7% for helix and 7-9% for strand content predictions. The proposed feature-based sequence representation uses a comprehensive set of physicochemical properties that are custom-designed for each of the helix and strand content predictions. It includes composition and composition moment vectors, frequency of tetra-peptides associated with helical and strand conformations, various property-based groups such as exchange groups, chemical groups of the side chains, and the hydrophobic group, auto-correlations based on hydrophobicity, side-chain masses, hydropathy, and conformational patterns for beta-sheets. The PSSC-core method provides an alternative for predicting the secondary structure content that can be used to validate and constrain the results of other structure prediction methods. At the same time, it also provides useful insight into the design of successful protein sequence representations that can be used in developing new methods related to prediction of different aspects of protein secondary structure. © 2007 Wiley-Liss, Inc.
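The general idea of predicting content with a multiple linear regression over sequence features can be illustrated with a minimal sketch. This is not the PSSC-core implementation: it uses only the 20-dimensional amino acid composition vector (one of the feature families named in the abstract), and the training labels come from a made-up toy rule (the fraction of residues in the hypothetical set `TOY_HELIX_FORMERS`) rather than from real structures. The function names `composition`, `fit_linear_model`, and `predict_content` are assumptions introduced for illustration.

```python
import random

import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the 20-dimensional amino acid composition vector of a sequence."""
    counts = np.array([seq.count(a) for a in AMINO_ACIDS], dtype=float)
    return counts / len(seq)


def fit_linear_model(seqs, targets):
    """Least-squares fit of content = composition . w.

    No intercept column is used: composition entries sum to 1, so a
    constant offset is already expressible through the weights.
    """
    X = np.array([composition(s) for s in seqs])
    y = np.asarray(targets, dtype=float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def predict_content(w, seq):
    """Predict the content fraction of one sequence from fitted weights."""
    return float(composition(seq) @ w)


# Synthetic toy data (assumption): random sequences labelled by a made-up
# rule, NOT by experimentally determined helix content.
rng = random.Random(0)
TOY_HELIX_FORMERS = set("AELM")
train_seqs = [
    "".join(rng.choice(AMINO_ACIDS) for _ in range(rng.randint(40, 120)))
    for _ in range(200)
]
train_labels = [
    sum(c in TOY_HELIX_FORMERS for c in s) / len(s) for s in train_seqs
]

weights = fit_linear_model(train_seqs, train_labels)
print(round(predict_content(weights, "AAAALLLLGGGG"), 3))
```

Because the toy labels are an exact linear function of composition, the fit recovers them; real helix and strand content is not, which is why the method described above adds richer features such as composition moments, tetra-peptide frequencies, and auto-correlations.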
Pages: 486-498
Page count: 13