PANTHER: A library of protein families and subfamilies indexed by function

Cited by: 2301
Authors
Thomas, PD [1]
Campbell, MJ [1]
Kejariwal, A [1]
Mi, HY [1]
Karlak, B [1]
Daverman, R [1]
Diemer, K [1]
Muruganujan, A [1]
Narechania, A [1]
Affiliations
[1] Celera Genom, Prot Informat, Foster City, CA 94404 USA
DOI
10.1101/gr.772403
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
In the genomic era, one of the fundamental goals is to characterize the function of proteins on a large scale. We describe a method, PANTHER, for relating protein sequence relationships to function relationships in a robust and accurate way. PANTHER is composed of two main components: the PANTHER library (PANTHER/LIB) and the PANTHER index (PANTHER/X). PANTHER/LIB is a collection of "books," each representing a protein family as a multiple sequence alignment, a Hidden Markov Model (HMM), and a family tree. Functional divergence within the family is represented by dividing the tree into subtrees based on shared function, and by subtree HMMs. PANTHER/X is an abbreviated ontology for summarizing and navigating molecular functions and biological processes associated with the families and subfamilies. We apply PANTHER to three areas of active research. First, we report the size and sequence diversity of the families and subfamilies, characterizing the relationship between sequence divergence and functional divergence across a wide range of protein families. Second, we use the PANTHER/X ontology to give a high-level representation of gene function across the human and mouse genomes. Third, we use the family HMMs to rank missense single nucleotide polymorphisms (SNPs), on a database-wide scale, according to their likelihood of affecting protein function.
Pages: 2129-2141
Page count: 13