Mining α-helix-forming molecular recognition features with cross species sequence alignments

Cited by: 249
Authors
Cheng, Yugong
Oldfield, Christopher J.
Meng, Jingwei
Romero, Pedro [1]
Uversky, Vladimir N.
Dunker, A. Keith
Affiliations
[1] Indiana Univ, Sch Med, Dept Biochem & Mol Biol, Ctr Computat Biol & Bioinformat, Indianapolis, IN 46202 USA
[2] Mol Kinet Inc, Indianapolis, IN 46268 USA
[3] Indiana Univ Purdue Univ, Sch Informat, Indianapolis, IN 46202 USA
[4] Russian Acad Sci, Inst Biol Instrumentat, Pushchino 142292, Moscow Region, Russia
Keywords
DOI
10.1021/bi7012273
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Previously described algorithms for mining α-helix-forming molecular recognition elements (MoREs), described by Oldfield et al. (Oldfield, C. J., Cheng, Y., Cortese, M. S., Brown, C. J., Uversky, V. N., and Dunker, A. K. (2005) Comparing and combining predictors of mostly disordered proteins, Biochemistry 44, 1989-2000), also known as molecular recognition features (MoRFs) (Mohan, A., Oldfield, C. J., Radivojac, P., Vacic, V., Cortese, M. S., Dunker, A. K., and Uversky, V. N. (2006) Analysis of Molecular Recognition Features (MoRFs), J. Mol. Biol. 362, 1043-1059), revealed that regions undergoing disorder-to-order transition are involved in many molecular recognition events and are crucial for protein-protein interactions. However, these algorithms were developed using a training data set of a limited size. Here we propose to improve the prediction algorithms by (1) including additional α-MoRF examples and their cross species homologues in the positive training set, (2) carefully extracting monomer structure chains from the Protein Data Bank (PDB) as the negative training set, (3) including attributes from recently developed disorder predictors, secondary structure predictions, and amino acid indices, and (4) constructing neural network based predictors and performing validation. Over 50 regions which undergo disorder-to-order transition that were identified in the PDB together with a set of corresponding cross species homologues of each structure-based example were included in a new positive training set. Over 1500 attributes, including disorder predictions, secondary structure predictions, and amino acid indices, were evaluated by the conditional probability method. The top attributes, including VSL2 and VL3 disorder predictions and several physicochemical propensities of amino acid residues, were used to develop the feed forward neural networks. The sensitivity, specificity and accuracy of the resulting predictor, α-MoRF-PredII, were 0.87 ± 0.10, 0.87 ± 0.11, and 0.87 ± 0.08 over 10 cross validations, respectively. We present the results of these analyses and validation examples to discuss the potential improvement of the α-MoRF-PredII prediction accuracy.
Pages: 13468 / 13477
Page count: 10
Related papers
77 in total
  • [1] Anderson C. W., 2004, Handbook of Cell Signaling, P237
  • [2] Protein-protein interactions and cancer: small molecules going in for the kill
    Arkin, M
    [J]. CURRENT OPINION IN CHEMICAL BIOLOGY, 2005, 9 (03) : 317 - 324
  • [3] USE OF CONDITIONAL PROBABILITIES FOR DETERMINING RELATIONSHIPS BETWEEN AMINO-ACID-SEQUENCE AND PROTEIN SECONDARY STRUCTURE
    ARNOLD, GE
    DUNKER, AK
    JOHNS, SJ
    DOUTHART, RJ
    [J]. PROTEINS-STRUCTURE FUNCTION AND GENETICS, 1992, 12 (04) : 382 - 399
  • [4] CHARACTERIZATION OF THE TUMOR SUPPRESSOR PROTEIN-P53 AS A PROTEIN-KINASE-C SUBSTRATE AND A S100B-BINDING PROTEIN
    BAUDIER, J
    DELPHIN, C
    GRUNWALD, D
    KHOCHBIN, S
    LAWRENCE, JJ
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1992, 89 (23) : 11627 - 11631
  • [5] Functional consequences of preorganized helical structure in the intrinsically disordered cell-cycle inhibitor p27Kip1
    Bienkiewicz, EA
    Adkins, JN
    Lumb, KJ
    [J]. BIOCHEMISTRY, 2002, 41 (03) : 752 - 759
  • [6] The C-terminal domain of measles virus nucleoprotein belongs to the class of intrinsically disordered proteins that fold upon binding to their physiological partner
    Bourhis, JM
    Johansson, K
    Receveur-Bréchot, V
    Oldfield, CJ
    Dunker, KA
    Canard, B
    Longhi, S
    [J]. VIRUS RESEARCH, 2004, 99 (02) : 157 - 167
  • [7] Studies of the RNA degradosome-organizing domain of the Escherichia coli ribonuclease RNase E
    Callaghan, AJ
    Aurikko, JP
    Ilag, LL
    Grossmann, JG
    Chandran, V
    Kühnel, K
    Poljak, L
    Carpousis, AJ
    Robinson, CV
    Symmons, MF
    Luisi, BF
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 2004, 340 (05) : 965 - 979
  • [8] Recognition of enolase in the Escherichia coli RNA degradosome
    Chandran, V
    Luisi, BF
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 2006, 358 (01) : 8 - 15
  • [9] CHENG J, 2005, DATA MIN KNOWL DISCO, V11
  • [10] Structural basis for recruitment of glycogen synthase kinase 3β to the axin-APC scaffold complex
    Dajani, R
    Fraser, E
    Roe, SM
    Yeo, M
    Good, VM
    Thompson, V
    Dale, TC
    Pearl, LH
    [J]. EMBO JOURNAL, 2003, 22 (03) : 494 - 501