DDOMAIN: Dividing structures into domains using a normalized domain-domain interaction profile

Times cited: 55
Authors
Zhou, Hongyi
Xue, Bin
Zhou, Yaoqi [1]
Affiliations
[1] Indiana Univ Purdue Univ, Sch Informat, Indianapolis, IN 46202 USA
[2] SUNY Buffalo, Dept Physiol & Biophys, Howard Hughes Med Inst, Ctr Computat Biol & Bioinformat, Buffalo, NY 14214 USA
[3] Indiana Univ, Sch Med, Ctr Computat Biol & Bioinformat, Indianapolis, IN 46202 USA
Keywords
structure/function studies; structural proteins; new methods; domain parser
DOI
10.1110/ps.062597307
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010 ; 081704 ;
Abstract
Dividing protein structures into domains has proven useful for more accurate structural and functional characterization of proteins. Here, we develop a method, called DDOMAIN, that divides a structure into DOMAINs using a normalized contact-based domain-domain interaction profile. Results of DDOMAIN are compared to AUTHORS annotations (domain definitions given by the authors who solved the protein structures), as well as to the popular SCOP and CATH annotations made by human experts and automatic programs. DDOMAIN's automatic annotations are most consistent with the AUTHORS annotations (90% agreement in number of domains, and 88% agreement in both number of domains and at least 85% overlap in domain assignment of residues) if its three adjustable parameters are trained on the AUTHORS annotations. By comparison, the agreement is 83% (81% with the at-least-85% overlap criterion) between SCOP-trained DDOMAIN and SCOP annotations, and 77% (73%) between CATH-trained DDOMAIN and CATH annotations. The agreement between DDOMAIN and AUTHORS annotations goes beyond single-domain proteins (97%, 82%, and 56% for single-, two-, and three-domain proteins, respectively). For an "easy" data set of proteins whose CATH and SCOP annotations agree with each other in number of domains, the agreement is 90% (89%) between "easy-set"-trained DDOMAIN and CATH/SCOP annotations. The consistency between SCOP-trained DDOMAIN and SCOP annotations is superior to that of two other recently developed, SCOP-trained automatic methods, PDP (protein domain parser) and DomainParser 2. We also tested a simple consensus method made of PDP, DomainParser 2, and DDOMAIN, and a different version of DDOMAIN based on a more sophisticated statistical energy function. The DDOMAIN server and its executable are available in the services section at http://sparks.informatics.iupui.edu.
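The two agreement criteria quoted in the abstract (same number of domains, and at least 85% overlap in the domain assignment of residues) can be illustrated with a minimal Python sketch. The abstract does not give the exact overlap definition, so this is an assumption: overlap is taken here as the fraction of residues that land in matching domains under the best one-to-one pairing of domain labels. The function names `domain_overlap` and `agrees` are hypothetical and are not part of the DDOMAIN software.

```python
from itertools import permutations


def domain_overlap(predicted, reference):
    """Fraction of residues in matching domains under the best label pairing.

    Both arguments are per-residue domain labels in the same residue order.
    ASSUMPTION: this overlap definition is illustrative; the abstract does
    not state the exact formula used in the paper.
    """
    if len(predicted) != len(reference):
        raise ValueError("assignments must cover the same residues")
    if not predicted:
        raise ValueError("assignments must not be empty")
    pred_labels = sorted(set(predicted))
    ref_labels = sorted(set(reference))
    if len(pred_labels) != len(ref_labels):
        # Different domain counts are treated as no overlap in this sketch.
        return 0.0
    best = 0
    # Domain labels are arbitrary, so try every one-to-one label pairing.
    for perm in permutations(ref_labels):
        mapping = dict(zip(pred_labels, perm))
        matches = sum(mapping[p] == r for p, r in zip(predicted, reference))
        best = max(best, matches)
    return best / len(predicted)


def agrees(predicted, reference, min_overlap=0.85):
    """True if the domain counts match and the overlap reaches min_overlap."""
    same_count = len(set(predicted)) == len(set(reference))
    return same_count and domain_overlap(predicted, reference) >= min_overlap


# Toy 100-residue protein: the predicted boundary is shifted by 5 residues.
reference = [1] * 50 + [2] * 50
predicted = ["A"] * 45 + ["B"] * 55
print(domain_overlap(predicted, reference), agrees(predicted, reference))
```

Trying every label permutation is affordable because proteins rarely have more than a handful of domains; for many domains an assignment solver would be the more scalable choice.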
Pages: 947-955
Number of pages: 9
Related papers
35 in total
[1]   PDP: protein domain parser [J].
Alexandrov, N ;
Shindyalov, I .
BIOINFORMATICS, 2003, 19 (03) :429-430
[2]  
Bateman A, 2002, NUCLEIC ACIDS RES, V30, P276, DOI [10.1093/nar/gkr1065, 10.1093/nar/gkp985, 10.1093/nar/gkh121]
[3]   Automated prediction of CASP-5 structures using the Robetta server [J].
Chivian, D ;
Kim, DE ;
Malmström, L ;
Bradley, P ;
Robertson, T ;
Murphy, P ;
Strauss, CEM ;
Bonneau, R ;
Rohl, CA ;
Baker, D .
PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2003, 53 :524-533
[4]   TREE STRUCTURAL ORGANIZATION OF PROTEINS [J].
CRIPPEN, GM .
JOURNAL OF MOLECULAR BIOLOGY, 1978, 126 (03) :315-332
[5]   Improving the performance of DomainParser for structural domain partition using neural network [J].
Guo, JT ;
Xu, D ;
Kim, D ;
Xu, Y .
NUCLEIC ACIDS RESEARCH, 2003, 31 (03) :944-952
[6]   Exhaustive enumeration of protein domain families [J].
Heger, A ;
Holm, L .
JOURNAL OF MOLECULAR BIOLOGY, 2003, 328 (03) :749-767
[7]   Partitioning protein structures into domains: Why is it so difficult? [J].
Holland, Timothy A. ;
Veretnik, Stella ;
Shindyalov, Ilya N. ;
Bourne, Philip E. .
JOURNAL OF MOLECULAR BIOLOGY, 2006, 361 (03) :562-590
[8]  
Holm L, 1998, PROTEINS, V33, P88, DOI 10.1002/(SICI)1097-0134(19981001)33:1<88::AID-PROT8>3.0.CO;2-H
[10]   PARSER FOR PROTEIN-FOLDING UNITS [J].
HOLM, L ;
SANDER, C .
PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 1994, 19 (03) :256-268