Comparison of sequence and structure alignments for protein domains

Cited by: 31
Authors
Marchler-Bauer, A [1]
Panchenko, AR [1]
Ariel, N [1]
Bryant, SH [1]
Affiliations
[1] NIH, Computat Biol Branch, Natl Ctr Biotechnol Informat, Bethesda, MD 20894 USA
Keywords
protein domain identification; sequence alignment; structure alignment; conserved domain database
DOI
10.1002/prot.10163
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
Profile search methods based on protein domain alignments have proven to be useful tools in comparative sequence analysis. Domain alignments used by currently available search methods have been computed by sequence comparison. With the growth of the protein structure database, however, alignments of many domain pairs have also been computed by structure comparison. Here, we examine the extent to which information from these two sources agrees. We measure agreement with respect to identification of homologous regions in each protein, that is, with respect to the location of domain boundaries. We also measure agreement with respect to identification of homologous residue sites by comparing alignments and assessing the accuracy of the molecular models they predict. We find that domain alignments in publicly available collections based on sequence and structure comparison are largely consistent. However, the homologous regions identified by sequence comparison are often shorter than those identified by 3D structure comparison. In addition, when overall sequence similarity is low, alignments from sequence comparison produce less accurate molecular models, suggesting that they less accurately identify homologous sites. These observations suggest that structure comparison results might be used to improve the overall accuracy of domain alignment collections and the performance of profile search methods based on them.
Pages: 439-446
Page count: 8