A comparison of scoring functions for protein sequence profile alignment

被引:81
作者
Edgar, RC [1 ]
Sjölander, K [1 ]
机构
[1] Univ Calif Berkeley, Dept Bioengn, Berkeley, CA 94720 USA
关键词
D O I
10.1093/bioinformatics/bth090
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivation:In recent years, several methods have been proposed for aligning two protein sequence profiles, with reported improvements in alignment accuracy and homolog discrimination versus sequence-sequence methods (e.g. BLAST) and profile-sequence methods (e.g. PSI-BLAST). Profile-profile alignment is also the iterated step in progressive multiple sequence alignment algorithms such as CLUSTALW. However, little is known about the relative performance of different profile-profile scoring functions. In this work, we evaluate the alignment accuracy of 23 different profile-profile scoring functions by comparing alignments of 488 pairs of sequences with identity less than or equal to30% against structural alignments. We optimize parameters for all scoring functions on the same training set and use profiles of alignments from both PSI-BLAST and SAM-T99. Structural alignments are constructed from a consensus between the FSSP database and CE structural aligner. We compare the results with sequence-sequence and sequence-profile methods, including BLAST and PSI-BLAST. Results: We find that profile-profile alignment gives an average improvement over our test set of typically 2-3% over profile-sequence alignment and similar to40% over sequence-sequence alignment. No statistically significant difference is seen in the relative performance of most of the scoring functions tested. Significantly better results are obtained with profiles constructed from SAM-T99 alignments than from PSI-BLAST alignments.
引用
收藏
页码:1301 / 1308
页数:8
相关论文
共 41 条
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[3]  
[Anonymous], METHOD ENZYMOL
[4]   The Protein Data Bank [J].
Berman, HM ;
Westbrook, J ;
Feng, Z ;
Gilliland, G ;
Bhat, TN ;
Weissig, H ;
Shindyalov, IN ;
Bourne, PE .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :235-242
[5]   Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships [J].
Brenner, SE ;
Chothia, C ;
Hubbard, TJP .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (11) :6073-6078
[6]   Predicting reliable regions in protein sequence alignments [J].
Cline, M ;
Hughey, R ;
Karplus, K .
BIOINFORMATICS, 2002, 18 (02) :306-314
[7]  
CLINE M, 2000, THESIS U CALIFORNIA
[8]  
Dayhoff M.O., 1978, ATLAS PROTEIN SEQ ST, V5
[9]   SATCHMO:: sequence alignment and tree construction using hidden Markov models [J].
Edgar, RC ;
Sjölander, K .
BIOINFORMATICS, 2003, 19 (11) :1404-1411
[10]  
EDGAR RC, 2004, IN PRESS BIOINFORMAT