Structure-based function inference using protein family-specific fingerprints

Cited by: 29
Authors
Bandyopadhyay, Deepak
Huan, Jun
Liu, Jinze
Prins, Jan
Snoeyink, Jack
Wang, Wei
Tropsha, Alexander
Affiliations
[1] Univ N Carolina, Sch Pharm Med Chem & Nat Prod, Chapel Hill, NC 27599 USA
[2] Univ N Carolina, Dept Comp Sci, Chapel Hill, NC 27599 USA
Keywords
subgraph mining; Delaunay; almost-Delaunay; protein classification; structure-based function inference; structural genomics; orphan structures
DOI
10.1110/ps.062189906
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
We describe a method to assign a protein structure to a functional family using family-specific fingerprints. Fingerprints represent amino acid packing patterns that occur in most members of a family but are rare in the background, a nonredundant subset of the PDB; the information they carry is additional to that from sequence alignments, sequence patterns, structural superposition, and active-site templates. Fingerprints were derived for 120 families in SCOP using Frequent Subgraph Mining. For a new structure, all occurrences of these family-specific fingerprints may be found by a fast algorithm for subgraph isomorphism; the structure can then be assigned to a family with a confidence value derived from the number of fingerprints found and their distribution in background proteins. In validation experiments, we infer the function of new members added to SCOP families and we discriminate between structurally similar but functionally divergent TIM barrel families. We then apply our method to predict function for several structural genomics proteins, including orphan structures. Some predictions have been corroborated by other computational methods and some validated by subsequent functional characterization.
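The assignment step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the graph encoding (residue labels plus contact edges), the brute-force matcher standing in for the fast subgraph isomorphism algorithm, the function names `occurs`, `count_fingerprints`, and `confidence`, and the confidence formula (the fraction of background proteins containing fewer fingerprints than the query).

```python
from itertools import permutations

def occurs(fp_labels, fp_edges, labels, edges):
    """Return True if the labeled fingerprint graph embeds in the protein graph.

    Brute-force, non-induced subgraph matching; only practical for tiny graphs.
    """
    contacts = {frozenset(e) for e in edges}
    fp_nodes = list(fp_labels)
    for image in permutations(list(labels), len(fp_nodes)):
        mapping = dict(zip(fp_nodes, image))
        # Residue types must agree for every mapped node.
        if any(fp_labels[n] != labels[mapping[n]] for n in fp_nodes):
            continue
        # Every fingerprint edge must be a contact in the protein graph.
        if all(frozenset((mapping[a], mapping[b])) in contacts for a, b in fp_edges):
            return True
    return False

def count_fingerprints(fingerprints, protein):
    """Count how many of a family's fingerprints occur in one protein graph."""
    labels, edges = protein
    return sum(occurs(fl, fe, labels, edges) for fl, fe in fingerprints)

def confidence(n_found, background_counts):
    """Hypothetical score: share of background proteins with fewer fingerprints."""
    return sum(c < n_found for c in background_counts) / len(background_counts)

# Toy family with two fingerprints (labels are one-letter residue codes).
family = [
    ({0: "H", 1: "D", 2: "S"}, [(0, 1), (1, 2)]),
    ({0: "C", 1: "C"}, [(0, 1)]),
]
# Toy query structure: four residues with three contacts.
query = ({1: "H", 2: "D", 3: "S", 4: "G"}, [(1, 2), (2, 3), (3, 4)])

found = count_fingerprints(family, query)
score = confidence(found, background_counts=[0, 0, 1, 0])
```

In this toy case only the H-D-S pattern is found in the query, and three of the four hypothetical background proteins contain fewer fingerprints than the query does. A real implementation would replace the permutation search with an indexed or pruned matcher, since brute force scales factorially with fingerprint size.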
Pages: 1537-1543
Number of pages: 7