PIR-ALN: a database of protein sequence alignments

Cited by: 18
Authors
Srinivasarao, GY [1]
Yeh, LSL [1]
Marzec, CR [1]
Orcutt, BC [1]
Barker, WC [1]
Affiliations
[1] Natl Biomed Res Fdn, Prot Informat Resource, Washington, DC 20007 USA
DOI
10.1093/bioinformatics/15.5.382
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: The Protein Information Resource (PIR) maintains a database of annotated and curated alignments in order to visually represent interrelationships among sequences in the PIR-International Protein Sequence Database, to spread and standardize protein names, features and keywords among members of a family or superfamily, and to aid us in classifying sequences, in identifying conserved regions, and in defining new homology domains. Results: Release 22.0 (December 1998) of the PIR-ALN database contains a total of 3806 alignments, including 1303 superfamily, 2131 family and 372 homology domain alignments. This is an appropriate dataset to develop and extract patterns, test profiles, train neural networks or build Hidden Markov Models (HMMs). These alignments can be used to standardize and spread annotation to newer members by homology as well as to understand the modular architecture of multidomain proteins. PIR-ALN includes 529 alignments that can be used to develop patterns not represented in the PROSITE, Blocks, PRINTS and Pfam databases. The ATLAS information retrieval system can be used to browse and query the PIR-ALN alignments.
Pages: 382-390
Page count: 9