Use of multiple profiles corresponding to a sequence alignment enables effective detection of remote homologues

Cited by: 21
Authors
Anand, B [1 ]
Gowri, VS [1 ]
Srinivasan, N [1 ]
Affiliations
[1] Indian Inst Sci, Mol Biophys Unit, Bangalore 560012, Karnataka, India
Funding
Wellcome Trust (UK);
DOI
10.1093/bioinformatics/bti432
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Motivation: Position specific scoring matrices (PSSMs) corresponding to aligned sequences of homologous proteins are commonly used in homology detection. A PSSM is generated on the basis of one of the homologues as a reference sequence, which is the query in the case of PSI-BLAST searches. The reference sequence is chosen arbitrarily while generating PSSMs for reverse BLAST searches. In this work we demonstrate that the use of multiple PSSMs corresponding to a given alignment and variable reference sequences is more effective than using traditional single PSSMs and hidden Markov models.

Results: Searches for proteins with known 3-D structures have been made against three databases of protein family profiles corresponding to known structures: (1) one PSSM per family; (2) multiple PSSMs corresponding to an alignment and variable reference sequences for every family; and (3) hidden Markov models. A comparison of the performances of these three approaches suggests that the use of multiple PSSMs is most effective.
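The idea of deriving several PSSMs from one alignment by varying the reference sequence can be illustrated with a minimal sketch. This is not the authors' procedure or the PSI-BLAST implementation: the function names, the uniform background frequencies, the flat pseudocount, and the toy alignment are all assumptions made for illustration, and real tools add sequence weighting and substitution-matrix-based pseudocounts. The sketch only shows that each choice of reference keeps a different set of alignment columns, so each reference yields a different profile.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pssm_for_reference(alignment, ref_index, pseudocount=1.0):
    """Build a toy log-odds PSSM whose rows follow one reference sequence.

    Assumptions (illustrative only): uniform background frequencies and a
    flat pseudocount; no sequence weighting.
    """
    ref = alignment[ref_index]
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col, ref_residue in enumerate(ref):
        if ref_residue == "-":
            # Columns where the reference has a gap are not represented.
            continue
        counts = Counter(seq[col] for seq in alignment if seq[col] != "-")
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        row = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pssm.append(row)
    return pssm


def multiple_pssms(alignment):
    """One PSSM per aligned sequence, each sequence taken in turn as reference."""
    return [pssm_for_reference(alignment, i) for i in range(len(alignment))]


# Hypothetical three-sequence alignment ('-' marks a gap).
alignment = [
    "MKV-LA",
    "MRVGLA",
    "M-VGIA",
]
profiles = multiple_pssms(alignment)
print([len(p) for p in profiles])  # one profile length per reference sequence
```

In this toy alignment the three profiles differ in length because each reference has gaps in different columns, which is one reason a single arbitrarily chosen reference may represent a family less completely than the full set.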
Pages: 2821-2826
Page count: 6
Related papers
18 in total
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]   PALI - a database of Phylogeny and ALIgnment of homologous protein structures [J].
Balaji, S ;
Sujatha, S ;
Kumar, SSC ;
Srinivasan, N .
NUCLEIC ACIDS RESEARCH, 2001, 29 (01) :61-65
[3]   HIDDEN MARKOV-MODELS OF BIOLOGICAL PRIMARY SEQUENCE INFORMATION [J].
BALDI, P ;
CHAUVIN, Y ;
HUNKAPILLER, T ;
MCCLURE, MA .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1994, 91 (03) :1059-1063
[4]   Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI 10.1093/nar/gkh121
[5]   Profile hidden Markov models [J].
Eddy, SR .
BIOINFORMATICS, 1998, 14 (09) :755-763
[6]   SETOR - HARDWARE-LIGHTED 3-DIMENSIONAL SOLID MODEL REPRESENTATIONS OF MACROMOLECULES [J].
EVANS, SV .
JOURNAL OF MOLECULAR GRAPHICS, 1993, 11 (02) :134-&
[7]   Integration of related sequences with protein three-dimensional structural families in an updated version of PALI database [J].
Gowri, VS ;
Pandit, SB ;
Karthik, PS ;
Srinivasan, N ;
Balaji, S .
NUCLEIC ACIDS RESEARCH, 2003, 31 (01) :486-488
[8]   PROFILE ANALYSIS - DETECTION OF DISTANTLY RELATED PROTEINS [J].
GRIBSKOV, M ;
MCLACHLAN, AD ;
EISENBERG, D .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1987, 84 (13) :4355-4358
[9]   Hidden Markov models for detecting remote protein homologies [J].
Karplus, K ;
Barrett, C ;
Hughey, R .
BIOINFORMATICS, 1998, 14 (10) :846-856
[10]   HIDDEN MARKOV-MODELS IN COMPUTATIONAL BIOLOGY - APPLICATIONS TO PROTEIN MODELING [J].
KROGH, A ;
BROWN, M ;
MIAN, IS ;
SJOLANDER, K ;
HAUSSLER, D .
JOURNAL OF MOLECULAR BIOLOGY, 1994, 235 (05) :1501-1531