Comparative homology agreement search: An effective combination of homology-search methods

Cited by: 16
Authors
Alam, I
Dress, A
Rehmsmeier, M
Fuellen, G
Affiliations
[1] AG Bioinformat, Dept Med, D-48149 Munster, Germany
[2] Max Planck Inst Math Sci, D-04103 Leipzig, Germany
[3] Univ Bielefeld, Ctr Biotechnol, Int NRW Grad Sch Bioinformat & Genome Res, D-33615 Bielefeld, Germany
[4] Univ Munster, Dept Biol, Div Bioinformat, D-48149 Munster, Germany
DOI
10.1073/pnas.0405612101
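Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes
07; 0710; 09
Abstract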
Many methods have been developed to search for homologous members of a protein family in databases, and the reliability of results and conclusions may be compromised if only one method is used, neglecting the others. Here we introduce a general scheme for combining such methods. Based on this scheme, we implemented a tool called comparative homology agreement search (CHASE) that integrates different search strategies to obtain a combined "E value." Our results show that a consensus method integrating distinct strategies easily outperforms any of its component algorithms. More specifically, an evaluation based on the Structural Classification of Proteins database reveals that, on average, a coverage of 47% can be obtained in searches for distantly related homologues (i.e., members of the same superfamily but not the same family, which is a very difficult task), accepting only 10 false positives, whereas the individual methods obtain a coverage of 28-38%.
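The evaluation measure quoted in the abstract (coverage when only a fixed number of false positives is accepted) can be illustrated with a minimal sketch. It assumes hits are already ranked best-first (for example by ascending E value) and labeled as true or false homologs against a reference such as SCOP. The function name `coverage_at_false_positives` and its parameters are hypothetical and not taken from CHASE, and the exact counting convention used in the paper may differ.

```python
from typing import Sequence


def coverage_at_false_positives(
    ranked_is_homolog: Sequence[bool],
    total_homologs: int,
    max_false_positives: int = 10,
) -> float:
    """Fraction of true homologs found before the false-positive budget is exceeded.

    ranked_is_homolog: hits sorted best-first; True marks a true homolog.
    total_homologs: number of true homologs present in the database.
    max_false_positives: number of false positives that are accepted.
    """
    if total_homologs <= 0:
        raise ValueError("total_homologs must be positive")

    true_positives = 0
    false_positives = 0
    for is_homolog in ranked_is_homolog:
        if is_homolog:
            true_positives += 1
        else:
            false_positives += 1
            # Stop as soon as one more false positive than allowed is seen.
            if false_positives > max_false_positives:
                break
    return true_positives / total_homologs


if __name__ == "__main__":
    # Illustrative labels only; not data from the paper.
    ranked = [True, False, True, False, False, True]
    print(coverage_at_false_positives(ranked, total_homologs=4, max_false_positives=2))
```

Averaging this value over many query sequences would give a single figure comparable in spirit to the percentages reported above, under the stated assumptions.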
Pages: 13814-13819
Page count: 6