DAhunter: a web-based server that identifies homologous proteins by comparing domain architecture

Cited by: 11
Authors
Lee, Byungwook [1, 2]
Lee, Doheon [2]
Affiliations
[1] KRIBB, Korean BioInformat Ctr, Taejon 305806, South Korea
[2] Korea Adv Inst Sci & Technol, Dept Bio & Brain Engn, Taejon 305701, South Korea
DOI
10.1093/nar/gkn172
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We present DAhunter, a web-based server that identifies homologous proteins by comparing domain architectures, that is, the organization of protein domains within a protein. A major obstacle in comparing domain architectures is the existence of 'promiscuous' domains, which carry out auxiliary functions and appear in many unrelated proteins. To distinguish these promiscuous domains from other protein domains, we assigned a weight score to each domain extracted from RefSeq proteins, based on its abundance and versatility. A domain's score represents its importance in the 'protein world' and is used in the comparison of domain architectures. In scoring domains, DAhunter considers domain combinations as well as single domains. To measure the similarity of two domain architectures, we developed several methods based on algorithms used in information retrieval (the cosine similarity, the Goodman-Kruskal γ function, and a domain duplication index) and then combined these into a single similarity score. Compared with other domain architecture algorithms, DAhunter is better at identifying homology. The server is available at http://www.dahunter.kr and http://localodom.kobic.re.kr/dahunter/index.htm.
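The idea of down-weighting promiscuous domains before comparing architectures by cosine similarity can be illustrated with a minimal Python sketch. This is not DAhunter's actual scoring, which the abstract does not specify: the smoothed inverse-frequency weight is a stand-in assumption for the abundance-and-versatility score, the function names `domain_weights` and `weighted_cosine` are hypothetical, and the toy architectures are made up for illustration. The Goodman-Kruskal γ and domain duplication components are not covered.

```python
import math
from collections import Counter


def domain_weights(architectures):
    """Assign each domain a weight that shrinks as it appears in more architectures.

    Assumption: a smoothed inverse-frequency weight stands in for the
    abundance/versatility score described in the abstract.
    """
    n = len(architectures)
    arch_freq = Counter()
    for arch in architectures:
        arch_freq.update(set(arch))  # count each domain once per architecture
    return {d: math.log((1 + n) / (1 + f)) + 1.0 for d, f in arch_freq.items()}


def weighted_cosine(arch_a, arch_b, weights):
    """Cosine similarity between two architectures treated as weighted domain-count vectors."""
    vec_a = {d: c * weights.get(d, 1.0) for d, c in Counter(arch_a).items()}
    vec_b = {d: c * weights.get(d, 1.0) for d, c in Counter(arch_b).items()}
    dot = sum(vec_a[d] * vec_b[d] for d in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(x * x for x in vec_a.values()))
    norm_b = math.sqrt(sum(x * x for x in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy data (hypothetical): SH3 plays the role of a promiscuous domain
# because it occurs in three of the four architectures.
architectures = [
    ["SH3", "SH2", "Pkinase"],
    ["SH2", "Pkinase"],
    ["SH3", "PDZ"],
    ["SH3", "C2"],
]
weights = domain_weights(architectures)
query = architectures[0]
scores = [weighted_cosine(query, arch, weights) for arch in architectures]
```

In this toy set, the architecture sharing SH2 and Pkinase with the query scores higher than those sharing only the widespread SH3 domain, which is the qualitative behavior the weighting is meant to produce. Note that this vector representation ignores domain order.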
Pages: W60-W64
Number of pages: 5