Protein homology detection by HMM-HMM comparison

Cited by: 1799
Author
Söding, J [1]
Affiliation
[1] Max Planck Inst Dev Biol, Dept Prot Evolut, D-72076 Tubingen, Germany
Keywords
DOI
10.1093/bioinformatics/bti125
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Protein homology detection and sequence alignment are at the basis of protein structure prediction, function prediction and evolution.
Results: We have generalized the alignment of protein sequences with a profile hidden Markov model (HMM) to the case of pairwise alignment of profile HMMs. We present a method for detecting distant homologous relationships between proteins based on this approach. The method (HHsearch) is benchmarked together with BLAST, PSI-BLAST, HMMER and the profile-profile comparison tools PROF_SIM and COMPASS, in an all-against-all comparison of a database of 3691 protein domains from SCOP 1.63 with pairwise sequence identities below 20%.
Sensitivity: When the predicted secondary structure is included in the HMMs, HHsearch is able to detect between 2.7 and 4.2 times more homologs than PSI-BLAST or HMMER and between 1.44 and 1.9 times more than COMPASS or PROF_SIM for a rate of false positives of 10%. Approximately half of the improvement over the profile-profile comparison methods is attributable to the use of profile HMMs in place of simple profiles.
Alignment quality: Higher sensitivity is mirrored by an increased alignment quality. HHsearch produced 1.2, 1.7 and 3.3 times more good alignments ('balanced' score > 0.3) than the next best method (COMPASS), and 1.6, 2.9 and 9.4 times more than PSI-BLAST, at the family, superfamily and fold level, respectively.
Speed: HHsearch scans a query of 200 residues against 3691 domains in 33 s on an AMD64 2 GHz PC. This is 10 times faster than PROF_SIM and 17 times faster than COMPASS.
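The abstract does not spell out how two profile columns are compared, so the following is only an illustrative sketch of one way such a comparison can be scored: a log-sum-of-odds score between two columns of amino acid probabilities, relative to background frequencies. The four-letter alphabet, the `BACKGROUND` values and the function name `column_score` are assumptions made for brevity and are not taken from this record; a real profile would cover all 20 amino acids and would also model insertions and deletions.

```python
import math

# Hypothetical 4-letter alphabet and background frequencies (assumption,
# chosen only to keep the example short; real profiles use 20 amino acids).
BACKGROUND = {"A": 0.4, "C": 0.1, "D": 0.2, "E": 0.3}


def column_score(q, p, background=BACKGROUND):
    """Log-sum-of-odds score, in bits, between two profile columns.

    q and p map each letter to its probability in one column of each profile.
    The score is positive when both columns favour the same letters more
    than the background would predict, and negative otherwise.
    """
    odds = sum(q[a] * p[a] / background[a] for a in background)
    return math.log2(odds)


# Two columns that both favour the rare letter "C".
similar_q = {"A": 0.1, "C": 0.7, "D": 0.1, "E": 0.1}
similar_p = {"A": 0.1, "C": 0.6, "D": 0.2, "E": 0.1}

# A column that favours "A" instead.
different_p = {"A": 0.7, "C": 0.1, "D": 0.1, "E": 0.1}

score_similar = column_score(similar_q, similar_p)
score_different = column_score(similar_q, different_p)
```

In this sketch, `score_similar` is positive and `score_different` is negative, and two columns equal to the background score approximately zero. In a full method, such per-column scores would be combined with gap penalties in a dynamic programming alignment.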
Pages: 951-960
Number of pages: 10