De novo prediction of three-dimensional structures for major protein families

Cited by: 191
Authors
Bonneau, R
Strauss, CEM
Rohl, CA
Chivian, D
Bradley, P
Malmström, L
Robertson, T
Baker, D
Affiliations
[1] Univ Washington, Dept Biochem, Seattle, WA 98195 USA
[2] Los Alamos Natl Lab, Biosci Div, Los Alamos, NM 87544 USA
Keywords
Rosetta; structure prediction; gene annotation; structural genomics; Pfam
DOI
10.1016/S0022-2836(02)00698-8
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
We use the Rosetta de novo structure prediction method to produce three-dimensional structure models for all Pfam-A sequence families with average length under 150 residues and no link to any protein of known structure. To estimate the reliability of the predictions, the method was calibrated on 131 proteins of known structure. For approximately 60% of the proteins, one of the top five models was correctly predicted for 50 or more residues, and for approximately 35%, the correct SCOP superfamily was identified in a structure-based search of the Protein Data Bank using one of the models. This performance is consistent with results from the fourth critical assessment of structure prediction (CASP4). Correct and incorrect predictions could be partially distinguished using a confidence function based on a combination of simulation convergence, protein length and the similarity of a given structure prediction to known protein structures. While the limited accuracy and reliability of the method preclude definitive conclusions, the Pfam models provide the only tertiary structure information available for the 12% of publicly available sequences represented by these large protein families. © 2002 Elsevier Science Ltd. All rights reserved.
Pages: 65-78
Page count: 14
Related papers
55 in total
  • [1] AB E, 1997, PROTEIN SCI, V6, P304
  • [2] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [3] Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins
    Bateman, A
    Birney, E
    Durbin, R
    Eddy, SR
    Finn, RD
    Sonnhammer, ELL
    [J]. NUCLEIC ACIDS RESEARCH, 1999, 27 (01) : 260 - 262
  • [4] Bonneau R, 2001, PROTEINS, P119
  • [5] Bonneau R, 2001, PROTEINS, V43, P1, DOI 10.1002/1097-0134(20010401)43:1<1::AID-PROT1012>3.0.CO;2-A
  • [7] Functional inferences from blind ab initio protein structure predictions
    Bonneau, R
    Tsai, J
    Ruczinski, I
    Baker, D
    [J]. JOURNAL OF STRUCTURAL BIOLOGY, 2001, 134 (2-3) : 186 - 190
  • [8] The solution structure of the S1 RNA binding domain: A member of an ancient nucleic acid-binding fold
    Bycroft, M
    Hubbard, TJP
    Proctor, M
    Freund, SMV
    Murzin, AG
    [J]. CELL, 1997, 88 (02) : 235 - 242
  • [9] Crystal structure determination of aristolochene synthase from the blue cheese mold, Penicillium roqueforti
    Caruthers, JM
    Kang, I
    Rynkiewicz, MJ
    Cane, DE
    Christianson, DW
    [J]. JOURNAL OF BIOLOGICAL CHEMISTRY, 2000, 275 (33) : 25533 - 25539
  • [10] Functional analysis of the Escherichia coli genome using the sequence-to-structure-to-function paradigm: Identification of proteins exhibiting the glutaredoxin/thioredoxin disulfide oxidoreductase activity
    Fetrow, JS
    Godzik, A
    Skolnick, J
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1998, 282 (04) : 703 - 711