Structure-based function inference using protein family-specific fingerprints

Cited by: 29
Authors
Bandyopadhyay, Deepak
Huan, Jun
Liu, Jinze
Prins, Jan
Snoeyink, Jack
Wang, Wei
Tropsha, Alexander
Affiliations
[1] Univ N Carolina, Sch Pharm Med Chem & Nat Prod, Chapel Hill, NC 27599 USA
[2] Univ N Carolina, Dept Comp Sci, Chapel Hill, NC 27599 USA
Keywords
subgraph mining; Delaunay; almost-Delaunay; protein classification; structure-based function inference; structural genomics; orphan structures
DOI
10.1110/ps.062189906
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification code
071010; 081704
Abstract
We describe a method to assign a protein structure to a functional family using family-specific fingerprints. Fingerprints represent amino acid packing patterns that occur in most members of a family but are rare in the background, a nonredundant subset of PDB; their information is additional to sequence alignments, sequence patterns, structural superposition, and active-site templates. Fingerprints were derived for 120 families in SCOP using Frequent Subgraph Mining. For a new structure, all occurrences of these family-specific fingerprints may be found by a fast algorithm for subgraph isomorphism; the structure can then be assigned to a family with a confidence value derived from the number of fingerprints found and their distribution in background proteins. In validation experiments, we infer the function of new members added to SCOP families and we discriminate between structurally similar, but functionally divergent TIM barrel families. We then apply our method to predict function for several structural genomics proteins, including orphan structures. Some predictions have been corroborated by other computational methods and some validated by subsequent functional characterization.
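The selection and counting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it does not perform Frequent Subgraph Mining, it uses a brute-force matcher rather than a fast subgraph isomorphism algorithm, and it stops at counting fingerprints rather than deriving a confidence value. The graph encoding (residue-type labels on nodes, contact edges), the helper names, the toy structures, and the thresholds `min_family` and `max_background` are all assumptions made for illustration.

```python
from itertools import permutations

# Assumed encoding: a graph is (labels, edges), where labels maps a node id
# to a one-letter residue type and edges is a set of frozenset({u, v}) contacts.
def make_graph(labels, edge_pairs):
    return labels, {frozenset(e) for e in edge_pairs}

def contains(graph, pattern):
    """Brute-force labeled subgraph test; only practical for tiny patterns."""
    g_labels, g_edges = graph
    p_labels, p_edges = pattern
    p_nodes = list(p_labels)
    for cand in permutations(g_labels, len(p_nodes)):
        mapping = dict(zip(p_nodes, cand))
        if any(p_labels[n] != g_labels[mapping[n]] for n in p_nodes):
            continue  # residue types must match
        if all(frozenset(mapping[n] for n in e) in g_edges for e in p_edges):
            return True  # every pattern contact is present in the graph
    return False

def support(pattern, graphs):
    """Fraction of graphs that contain the pattern."""
    return sum(contains(g, pattern) for g in graphs) / len(graphs)

def select_fingerprints(candidates, family, background,
                        min_family=0.8, max_background=0.05):
    """Keep patterns common in the family and rare in the background.
    Both thresholds are illustrative assumptions."""
    return [p for p in candidates
            if support(p, family) >= min_family
            and support(p, background) <= max_background]

def count_hits(query, fingerprints):
    """Number of fingerprints that occur in the query structure."""
    return sum(contains(query, fp) for fp in fingerprints)

# Toy candidate patterns (hypothetical): an H-D-S triangle and a G-G contact.
tri = make_graph({0: "H", 1: "D", 2: "S"}, [(0, 1), (1, 2), (0, 2)])
pair = make_graph({0: "G", 1: "G"}, [(0, 1)])

# Toy family members contain the triangle; toy background structures do not.
fam1 = make_graph({1: "H", 2: "D", 3: "S", 4: "G", 5: "G"},
                  [(1, 2), (2, 3), (1, 3), (4, 5), (3, 4)])
fam2 = make_graph({1: "S", 2: "H", 3: "D", 4: "G", 5: "G", 6: "A"},
                  [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6)])
bg1 = make_graph({1: "G", 2: "G", 3: "A"}, [(1, 2), (2, 3)])
bg2 = make_graph({1: "H", 2: "D", 3: "S", 4: "G", 5: "G"},
                 [(1, 2), (2, 3), (4, 5)])

fingerprints = select_fingerprints([tri, pair], [fam1, fam2], [bg1, bg2])
```

In this toy setup the G-G contact occurs in every structure, so it is rejected as uninformative, while the triangle occurs only in the family members and is kept. Calling `count_hits(query, fingerprints)` on a new structure then gives the raw count from which a confidence value, as described in the abstract, would be derived.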
Pages: 1537-1543
Page count: 7