GenTHREADER: An efficient and reliable protein fold recognition method for genomic sequences

Cited by: 699
Author
Jones, DT [1]
Affiliation
[1] Univ Warwick, Dept Biol Sci, Coventry CV4 7AL, W Midlands, England
Keywords
genome; protein structure prediction; fold recognition; threading; sequence alignment
DOI
10.1006/jmbi.1999.2583
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
A new protein fold recognition method is described which is both fast and reliable. The method uses a traditional sequence alignment algorithm to generate alignments which are then evaluated by a method derived from threading techniques. As a final step, each threaded model is evaluated by a neural network in order to produce a single measure of confidence in the proposed prediction. The speed of the method, along with its sensitivity and very low false-positive rate, makes it ideal for automatically predicting the structure of all the proteins in a translated bacterial genome (proteome). The method has been applied to the genome of Mycoplasma genitalium, and analysis of the results shows that as many as 46% of the proteins derived from the predicted protein coding regions have a significant relationship to a protein of known structure. In some cases, however, only one domain of the protein can be predicted, giving a total coverage of 30% when calculated as a fraction of the number of amino acid residues in the whole proteome. © 1999 Academic Press.
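The abstract reports two different coverage figures: one counted per protein (46%) and one counted per residue (30%). The following is a minimal sketch of how the two measures can diverge when only one domain of a protein is matched. The `Protein` record, the function names, and the toy data are all hypothetical and are not taken from the paper; they only illustrate the arithmetic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Protein:
    name: str              # hypothetical identifier
    length: int            # total number of residues
    matched_residues: int  # residues covered by a confident match (0 if none)


def protein_coverage(proteins: List[Protein]) -> float:
    """Fraction of proteins with at least one confidently matched region."""
    hits = sum(1 for p in proteins if p.matched_residues > 0)
    return hits / len(proteins)


def residue_coverage(proteins: List[Protein]) -> float:
    """Fraction of all residues in the proteome that fall inside a match."""
    total = sum(p.length for p in proteins)
    covered = sum(p.matched_residues for p in proteins)
    return covered / total


# Toy proteome (invented numbers, for illustration only).
toy = [
    Protein("orf1", 300, 300),  # whole protein matched
    Protein("orf2", 500, 150),  # only one domain matched
    Protein("orf3", 400, 0),    # no match
    Protein("orf4", 800, 0),    # no match
]

print(f"per-protein coverage: {protein_coverage(toy):.3f}")
print(f"per-residue coverage: {residue_coverage(toy):.3f}")
```

In this toy case half of the proteins have a match, but less than a quarter of the residues are covered, because one match spans a single domain and the unmatched proteins are long. The same effect explains why the per-residue figure in the abstract is lower than the per-protein figure.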
Pages: 797-815
Page count: 19