Protein homology detection by HMM-HMM comparison

被引:1788
作者
Söding, J [1 ]
机构
[1] Max Planck Inst Dev Biol, Dept Prot Evolut, D-72076 Tubingen, Germany
关键词
D O I
10.1093/bioinformatics/bti125
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivation: Protein homology detection and sequence alignment are at the basis of protein structure prediction, function prediction and evolution. Results: We have generalized the alignment of protein sequences with a profile hidden Markov model (HMM) to the case of pairwise alignment of profile HMMs. We present a method for detecting distant homologous relationships between proteins based on this approach. The method (HHsearch) is benchmarked together with BLAST, PSI-BLAST, HMMER and the profile-profile comparison tools PROF_SIM and COMPASS, in an all-against-all comparison of a database of 3691 protein domains from SCOP 1.63 with pairwise sequence identities below 20%. Sensitivity: When the predicted secondary structure is included in the HMMs, HHsearch is able to detect between 2.7 and 4.2 times more homologs than PSI-BLAST or HMMER and between 1.44 and 1.9 times more than COMPASS or PROF_SIM for a rate of false positives of 10%. Approximately half of the improvement over the profile-profile comparison methods is attributable to the use of profile HMMs in place of simple profiles. Alignment quality: Higher sensitivity is mirrored by an increased alignment quality. HHsearch produced 1.2, 1.7 and 3.3 times more good alignments ('balanced' score > 0.3) than the next best method (COMPASS), and 1.6, 2.9 and 9.4 times more than PSI-BLAST, at the family, superfamily and fold level, respectively. Speed: HHsearch scans a query of 200 residues against 3691 domains in 33 s on an AMD64 2GHz PC. This is 10 times faster than PROF_SIM and 17 times faster than COMPASS.
引用
收藏
页码:951 / 960
页数:10
相关论文
共 53 条
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[3]  
Barrett C, 1997, COMPUT APPL BIOSCI, V13, P191
[4]   Predicting functions from protein sequences - where are the bottlenecks? [J].
Bork, P ;
Koonin, EV .
NATURE GENETICS, 1998, 18 (04) :313-318
[5]   The ASTRAL Compendium in 2004 [J].
Chandonia, JM ;
Hon, G ;
Walker, NS ;
Lo Conte, L ;
Koehl, P ;
Levitt, M ;
Brenner, SE .
NUCLEIC ACIDS RESEARCH, 2004, 32 :D189-D192
[6]   Predicting reliable regions in protein sequence alignments [J].
Cline, M ;
Hughey, R ;
Karplus, K .
BIOINFORMATICS, 2002, 18 (02) :306-314
[7]   SIMILAR AMINO-ACID-SEQUENCES - CHANCE OR COMMON ANCESTRY [J].
DOOLITTLE, RF .
SCIENCE, 1981, 214 (4517) :149-159
[8]  
Durbin R., 1998, Biological sequence analysis: Probabilistic models of proteins and nucleic acids
[9]   Profile hidden Markov models [J].
Eddy, SR .
BIOINFORMATICS, 1998, 14 (09) :755-763
[10]   A comparison of scoring functions for protein sequence profile alignment [J].
Edgar, RC ;
Sjölander, K .
BIOINFORMATICS, 2004, 20 (08) :1301-1308