Protein linear indices of the 'macromolecular pseudograph α-carbon atom adjacency matrix' in bioinformatics. Part 1: Prediction of protein stability effects of a complete set of alanine substitutions in Arc repressor

Cited by: 66
Authors
Marrero-Ponce, Y [1]
Medina-Marrero, R
Castillo-Garit, JA
Romero-Zaldivar, V
Torrens, F
Castro, EA
Affiliations
[1] Cent Univ Las Villas, Fac Chem Pharm, Dept Pharm, Santa Clara 54830, Villa Clara, Cuba
[2] Cent Univ Las Villas, Chem Bioact Ctr, Dept Drug Design, Santa Clara 54830, Villa Clara, Cuba
[3] Cent Univ Las Villas, Chem Bioact Ctr, Dept Microbiol, Santa Clara 54830, Villa Clara, Cuba
[4] Cent Univ Las Villas, Appl Chem Res Ctr, Santa Clara 54830, Villa Clara, Cuba
[5] Univ Cienfuegos, Fac Informat, Cienfuegos 55500, Cuba
[6] Univ Valencia, Inst Univ Ciencia Mol, E-46100 Burjassot, Valencia, Spain
[7] INIFTA, Div Quim Teor, RA-1900 La Plata, Argentina
Keywords
protein stability; Arc repressor; alanine-substitution mutant; TOMOCOMD-CAMPS software; protein linear indices; QSAR
DOI
10.1016/j.bmc.2005.01.062
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
A novel approach to bio-macromolecular design from a linear algebra point of view is introduced. A protein's total (whole-protein) and local (one or more amino acids) linear indices are a new set of bio-macromolecular descriptors relevant to protein QSAR/QSPR studies. These amino-acid-level biochemical descriptors are based on the calculation of linear maps on $\mathbb{R}^n$ [$f_k(x_{mi}): \mathbb{R}^n \to \mathbb{R}^n$] in the canonical basis. They are calculated from the $k$th power of the macromolecular pseudograph α-carbon atom adjacency matrix. Total linear indices are linear functionals on $\mathbb{R}^n$; that is, the $k$th total linear indices are linear maps from $\mathbb{R}^n$ to the scalar field $\mathbb{R}$ [$f_k(x_m): \mathbb{R}^n \to \mathbb{R}$]. Thus, the $k$th total linear indices are calculated by summing the amino-acid linear indices of all amino acids in the protein molecule. A study of the protein stability effects of a complete set of alanine substitutions in the Arc repressor illustrates this approach. A quantitative model was obtained that discriminates alanine mutants with near wild-type stability from reduced-stability mutants in a training series. This model correctly classified 97.56% (40/41) of the proteins in the training set and 91.67% (11/12) of those in the test set. It shows a high Matthews correlation coefficient for the training set (MCC = 0.952) and an MCC of 0.837 for the external prediction set. Additionally, canonical regression analysis corroborated the statistical quality of the classification model ($R_{canc}$ = 0.824). This analysis was also used to compute biological stability canonical scores for each Arc alanine mutant. In addition, a piecewise linear regression model compared favorably with the linear regression model in predicting the melting temperature ($t_m$) of the Arc alanine mutants.
The linear model explains almost 81% of the variance of the experimental $t_m$ ($R$ = 0.90 and $s$ = 4.29), and the leave-one-out (LOO) PRESS statistics evidenced its predictive ability ($q^2$ = 0.72 and $s_{cv}$ = 4.79). Moreover, the TOMOCOMD-CAMPS method produced a piecewise linear regression ($R$ = 0.97) between protein backbone descriptors and $t_m$ values for alanine mutants of the Arc repressor. A break-point value of 51.87 °C characterized two mutant clusters and coincided perfectly with the experimental scale. For this reason, the linear discriminant analysis and piecewise models can be used in combination to classify and predict the stability of the mutant Arc homodimers. These models also permitted the interpretation of the driving forces of this folding process, indicating that topologic/topographic protein backbone interactions control the stability profile of wild-type Arc and its alanine mutants. © 2005 Elsevier Ltd. All rights reserved.
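The construction described in the abstract (local indices from the $k$th power of an adjacency matrix, total indices as their sum) can be illustrated with a minimal sketch. This is not the TOMOCOMD-CAMPS implementation. It assumes a toy adjacency matrix in which only consecutive α-carbons are adjacent (no loops and no non-covalent contacts), and the amino-acid property values in `x` are hypothetical numbers chosen only for illustration. The function names are likewise invented for this sketch.

```python
import numpy as np


def chain_adjacency(n):
    """Adjacency matrix of a linear chain of n alpha-carbons.

    Assumption for this sketch: only consecutive residues are adjacent.
    The pseudograph used in the paper may contain further entries.
    """
    m = np.zeros((n, n), dtype=float)
    idx = np.arange(n - 1)
    m[idx, idx + 1] = 1.0
    m[idx + 1, idx] = 1.0
    return m


def local_linear_indices(m, x, k):
    """k-th local (amino-acid) linear indices: the vector M^k x."""
    return np.linalg.matrix_power(m, k) @ x


def total_linear_index(m, x, k):
    """k-th total linear index: sum of all local indices of order k."""
    return float(local_linear_indices(m, x, k).sum())


# Hypothetical property values for a 4-residue toy chain.
x = np.array([1.0, 2.0, 3.0, 4.0])
m = chain_adjacency(4)

for k in range(3):
    print(k, local_linear_indices(m, x, k), total_linear_index(m, x, k))
```

For $k = 0$ the matrix power is the identity, so the local indices equal the property vector itself; higher orders spread each residue's property value over progressively more distant neighbors along the chain.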
Pages: 3003-3015
Page count: 13
Related papers
65 entries in total
[1]  
ALBER T, 1989, ANNU REV BIOCHEM, V58, P765, DOI 10.1146/annurev.biochem.58.1.765
[2]  
Alberts B., 1994, MOL BIOL CELL
[3]   KINETICS OF FORMATION OF NATIVE RIBONUCLEASE DURING OXIDATION OF REDUCED POLYPEPTIDE CHAIN [J].
ANFINSEN, CB ;
HABER, E ;
SELA, M ;
WHITE, FH .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1961, 47 (09) :1309-+
[4]   PRINCIPLES THAT GOVERN FOLDING OF PROTEIN CHAINS [J].
ANFINSEN, CB .
SCIENCE, 1973, 181 (4096) :223-230
[5]  
[Anonymous], CHEMOMETRIC METHODS
[6]  
[Anonymous], MOLECULES
[7]  
AXLER S, 1996, LINEAR ALGEBRA DONE, P37
[8]  
Belsley D.A., 1980, Regression Diagnostics: Identifying Influential Data and Sources of Collinearity
[9]   IDENTIFYING DETERMINANTS OF FOLDING AND ACTIVITY FOR A PROTEIN OF UNKNOWN STRUCTURE [J].
BOWIE, JU ;
SAUER, RT .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1989, 86 (07) :2152-2156
[10]   EQUILIBRIUM DISSOCIATION AND UNFOLDING OF THE ARC REPRESSOR DIMER [J].
BOWIE, JU ;
SAUER, RT .
BIOCHEMISTRY, 1989, 28 (18) :7139-7143