PANTHER: a browsable database of gene products organized by biological function, using curated protein family and subfamily classification

Cited by: 515
Authors
Thomas, PD [1]
Kejariwal, A [1]
Campbell, MJ [1]
Mi, HY [1]
Diemer, K [1]
Guo, N [1]
Ladunga, I [1]
Ulitsky-Lazareva, B [1]
Muruganujan, A [1]
Rabkin, S [1]
Vandergriff, JA [1]
Doremieux, O [1]
Affiliation
[1] Celera Genom, Prot Informat, Foster City, CA 94404 USA
DOI
10.1093/nar/gkg115
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The PANTHER database was designed for high-throughput analysis of protein sequences. One of its key features is a simplified ontology of protein function, which allows the database to be browsed by biological function. Biologist curators have associated the ontology terms with groups of protein sequences rather than with individual sequences. Statistical models (Hidden Markov Models, or HMMs) are built from each of these groups. The advantage of this approach is that new sequences can be classified automatically as they become available. To ensure accurate functional classification, HMMs are constructed not only for families but also for functionally distinct subfamilies. Multiple sequence alignments and phylogenetic trees, including curator-assigned information, are available for each family. The current version of the PANTHER database includes training sequences from all organisms in the GenBank non-redundant protein database, and the HMMs have been used to classify gene products across the entire genomes of human and Drosophila melanogaster. PANTHER is publicly available on the web at http://panther.celera.com.
Pages: 334-341
Number of pages: 8
Related papers
11 in total
  • [1] ALTSCHUL SF, 1990, J MOL BIOL, V215, P403, DOI 10.1006/jmbi.1990.9999
  • [2] Gene Ontology: tool for the unification of biology
    Ashburner, M
    Ball, CA
    Blake, JA
    Botstein, D
    Butler, H
    Cherry, JM
    Davis, AP
    Dolinski, K
    Dwight, SS
    Eppig, JT
    Harris, MA
    Hill, DP
    Issel-Tarver, L
    Kasarskis, A
    Lewis, S
    Matese, JC
    Richardson, JE
    Ringwald, M
    Rubin, GM
    Sherlock, G
    [J]. NATURE GENETICS, 2000, 25 (01) : 25 - 29
  • [3] The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000
    Bairoch, A
    Apweiler, R
    [J]. NUCLEIC ACIDS RESEARCH, 2000, 28 (01) : 45 - 48
  • [4] The Celera Discovery System™
    Kerlavage, A
    Bonazzi, V
    di Tommaso, M
    Lawrence, C
    Li, P
    Mayberry, F
    Mural, R
    Nodell, M
    Yandell, M
    Zhang, JH
    Thomas, P
    [J]. NUCLEIC ACIDS RESEARCH, 2002, 30 (01) : 129 - 136
  • [5] MI H, UNPUB ASSESSMENT GEN
  • [6] Introducing RefSeq and LocusLink: curated human genome resources at the NCBI
    Pruitt, KD
    Katz, KS
    Sicotte, H
    Maglott, DR
    [J]. TRENDS IN GENETICS, 2000, 16 (01) : 44 - 47
  • [7] SMART, a simple modular architecture research tool: Identification of signaling domains
    Schultz, J
    Milpetz, F
    Bork, P
    Ponting, CP
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (11) : 5857 - 5864
  • [8] Sonnhammer ELL, 1997, PROTEINS, V28, P405, DOI 10.1002/(SICI)1097-0134(199707)28:3<405::AID-PROT10>3.0.CO;2-L
  • [9] THOMAS PD, UNPUB PANTHER LIB PR