Selecting protein targets for structural genomics of Pyrobaculum aerophilum: Validating automated fold assignment methods by using binary hypothesis testing

Cited by: 23
Authors
Mallick, P [1]
Goodwill, KE [1]
Fitz-Gibbon, S [1]
Miller, JH [1]
Eisenberg, D [1]
Affiliations
[1] Univ Calif Los Angeles, DOE, Lab Struct Biol & Mol Med, Dept Chem & Biochem,Mol Biol Inst, Los Angeles, CA 90095 USA
DOI
10.1073/pnas.050589297
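Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification code
07; 0710; 09
Abstract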
Three-dimensional protein folds were assigned to all ORFs of the recently sequenced genome of the hyperthermophilic archaeon Pyrobaculum aerophilum. Binary hypothesis testing was used to estimate a confidence level for each assignment. A separate test was conducted to assign a probability for whether each sequence has a novel fold, i.e., one that is not yet represented in the experimental database of known structures. Of the 2,130 predicted nontransmembrane proteins in this organism, 916 matched a fold at a cumulative 90% confidence level, and 245 could be assigned at a 99% confidence level. Likewise, 286 proteins were predicted to have a previously unobserved fold with a 90% confidence level, and 14 at a 99% confidence level. These statistically based tools are combined with homology searches against the Online Mendelian Inheritance in Man (OMIM) human genetics database and other protein databases for the selection of attractive targets for crystallographic or NMR structure determination. Results of these studies have been collated and placed at http://www.doe-mbi.ucla.edu/people/parag/PA_HOME/, the University of California, Los Angeles-Department of Energy Pyrobaculum aerophilum web site.
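The abstract does not spell out how the confidence levels were computed, so the following is only a minimal sketch of one common way to attach a cumulative confidence to a score threshold under a binary (correct vs. incorrect) hypothesis. It assumes a calibration set of fold-assignment scores whose correctness is already known; the function names, the sample scores, and the "higher score is better" convention are all hypothetical and are not taken from the paper.

```python
# Hypothetical calibration data: scores of assignments known to be
# correct (true_scores) and known to be incorrect (false_scores).
# Higher scores are assumed to indicate stronger matches.
true_scores = [9, 8, 7, 6, 5, 4]
false_scores = [5, 3, 2, 1]


def cumulative_confidence(threshold, true_scores, false_scores):
    """Fraction of calibration assignments scoring >= threshold that are correct.

    Returns None when no calibration assignment reaches the threshold.
    """
    tp = sum(1 for s in true_scores if s >= threshold)   # correct, accepted
    fp = sum(1 for s in false_scores if s >= threshold)  # incorrect, accepted
    if tp + fp == 0:
        return None
    return tp / (tp + fp)


def threshold_for_confidence(target, true_scores, false_scores):
    """Lowest observed score whose cumulative confidence meets the target.

    Returns None if no observed score reaches the target confidence.
    """
    for t in sorted(set(true_scores) | set(false_scores)):
        conf = cumulative_confidence(t, true_scores, false_scores)
        if conf is not None and conf >= target:
            return t
    return None


# Score cutoff above which at least 90% of calibration assignments are correct.
cutoff_90 = threshold_for_confidence(0.90, true_scores, false_scores)
```

Under this reading, a statement such as "916 matched a fold at a cumulative 90% confidence level" would correspond to counting the genome's proteins whose best score lies at or above the 90% cutoff; the actual procedure used by the authors may differ.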
Pages: 2450-2455
Number of pages: 6
Related papers
29 in total
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]  
ALTSCHUL SF, 1990, J MOL BIOL, V215, P403, DOI 10.1006/jmbi.1990.9999
[3]   A METHOD TO IDENTIFY PROTEIN SEQUENCES THAT FOLD INTO A KNOWN 3-DIMENSIONAL STRUCTURE [J].
BOWIE, JU ;
LUTHY, R ;
EISENBERG, D .
SCIENCE, 1991, 253 (5016) :164-170
[4]   INTERNAL DUPLICATION AND HOMOLOGY WITH BACTERIAL TRANSPORT PROTEINS IN THE MDR1 (P-GLYCOPROTEIN) GENE FROM MULTIDRUG-RESISTANT HUMAN-CELLS [J].
CHEN, CJ ;
CHIN, JE ;
UEDA, K ;
CLARK, DP ;
PASTAN, I ;
GOTTESMAN, MM ;
RONINSON, IB .
CELL, 1986, 47 (03) :381-389
[5]   THE HYDROPHOBIC MOMENT DETECTS PERIODICITY IN PROTEIN HYDROPHOBICITY [J].
EISENBERG, D ;
WEISS, RM ;
TERWILLIGER, TC .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA-BIOLOGICAL SCIENCES, 1984, 81 (01) :140-144
[6]   Assigning folds to the proteins encoded by the genome of Mycoplasma genitalium [J].
Fischer, D ;
Eisenberg, D .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1997, 94 (22) :11929-11934
[7]  
Fischer D, 1996, PROTEIN SCI, V5, P947
[8]   A fosmid-based genomic map and identification of 474 genes of the hyperthermophilic archaeon Pyrobaculum aerophilum [J].
FitzGibbon, S ;
Choi, AJ ;
Miller, JH ;
Stetter, KO ;
Simon, MI ;
Swanson, R ;
Kim, UJ .
EXTREMOPHILES, 1997, 1 (01) :36-51
[9]   IDENTIFICATION OF PROTEIN CODING REGIONS BY DATABASE SIMILARITY SEARCH [J].
GISH, W ;
STATES, DJ .
NATURE GENETICS, 1993, 3 (03) :266-272
[10]   CRYSTAL-STRUCTURE OF THE HETERODIMERIC BZIP TRANSCRIPTION FACTOR C-FOS-C-JUN BOUND TO DNA [J].
GLOVER, JNM ;
HARRISON, SC .
NATURE, 1995, 373 (6511) :257-261