Domain enhanced lookup time accelerated BLAST

Cited by: 618
Authors
Boratyn, Grzegorz M. [1]
Schaeffer, Alejandro A. [1]
Agarwala, Richa [1]
Altschul, Stephen F. [1]
Lipman, David J. [1]
Madden, Thomas L. [1]
Affiliations
[1] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20894 USA
Source
BIOLOGY DIRECT | 2012 / Vol. 7
Keywords
DISTANTLY RELATED PROTEINS; DNA-BINDING SITES; PSI-BLAST; STRUCTURE ALIGNMENTS; REGULATORY PROTEINS; SEQUENCE ALIGNMENT; DATABASE SEARCHES; SCORE MATRICES; ACID SEQUENCES; INFORMATION;
DOI
10.1186/1745-6150-7-12
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Background: BLAST is a commonly used software package for comparing a query sequence to a database of known sequences; in this study, we focus on protein sequences. Position-specific-iterated BLAST (PSI-BLAST) iteratively searches a protein sequence database, using the matches in round i to construct a position-specific score matrix (PSSM) for searching the database in round i + 1. Biegert and Söding developed Context-sensitive BLAST (CS-BLAST), which combines information from searching the sequence database with information derived from a library of short protein profiles to achieve better homology detection than PSI-BLAST, which builds its PSSMs from scratch.
Results: We describe a new method, called domain enhanced lookup time accelerated BLAST (DELTA-BLAST), which searches a database of pre-constructed PSSMs before searching a protein-sequence database, to yield better homology detection. For its PSSMs, DELTA-BLAST employs a subset of NCBI's Conserved Domain Database (CDD). On a test set derived from ASTRAL, with one round of searching, DELTA-BLAST achieves a ROC5000 of 0.270 vs. 0.116 for CS-BLAST. The performance advantage diminishes in iterated searches, but DELTA-BLAST continues to achieve better ROC scores than CS-BLAST.
Conclusions: DELTA-BLAST is a useful program for the detection of remote protein homologs. It is available under the "Protein BLAST" link at http://blast.ncbi.nlm.nih.gov.
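The abstract reports results as ROC5000 scores. As a rough illustration of what a truncated ROC score measures, the sketch below computes ROC_n for a single ranked hit list using the commonly used definition: the sum, over the first n false positives, of the number of true positives ranked above each one, divided by n times the total number of true positives. The function name `roc_n`, its parameters, and the handling of lists with fewer than n false positives are assumptions made for this example; the paper's own benchmarking procedure (for instance, how hits from many queries are pooled) is not described in this record and is not reproduced here.

```python
from typing import Sequence


def roc_n(labels: Sequence[bool], n: int, total_true: int) -> float:
    """Truncated ROC score for a ranked hit list (best hit first).

    labels[i] is True for a true positive and False for a false positive.
    total_true is the number of true positives that exist in the benchmark,
    whether or not they appear in the hit list.
    """
    if n <= 0 or total_true <= 0:
        raise ValueError("n and total_true must be positive")

    true_seen = 0   # true positives ranked above the current position
    false_seen = 0  # false positives counted so far
    area = 0        # sum of true_seen at each counted false positive

    for is_true in labels:
        if is_true:
            true_seen += 1
        else:
            false_seen += 1
            area += true_seen
            if false_seen == n:
                break

    # Assumption: if the list ends before n false positives are seen, the
    # missing ones are treated as ranking below every retrieved true positive.
    area += (n - false_seen) * true_seen

    return area / (n * total_true)


if __name__ == "__main__":
    # Toy ranked list: T, T, F, T, F with 4 true positives in the benchmark.
    print(roc_n([True, True, False, True, False], n=2, total_true=4))
```

A score of 1.0 means every true positive ranks above the first false positive; lower scores mean false positives appear earlier or true positives are missed.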
Pages: 15
Related papers
44 entries in total
[1]   AMINO-ACID SUBSTITUTION MATRICES FROM AN INFORMATION THEORETIC PERSPECTIVE [J].
ALTSCHUL, SF .
JOURNAL OF MOLECULAR BIOLOGY, 1991, 219 (03) :555-565
[2]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[3]   A PROTEIN ALIGNMENT SCORING SYSTEM SENSITIVE AT ALL EVOLUTIONARY DISTANCES [J].
ALTSCHUL, SF .
JOURNAL OF MOLECULAR EVOLUTION, 1993, 36 (03) :290-300
[4]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[5]   PSI-BLAST pseudocounts and the minimum description length principle [J].
Altschul, Stephen F. ;
Gertz, E. Michael ;
Agarwala, Richa ;
Schaffer, Alejandro A. ;
Yu, Yi-Kuo .
NUCLEIC ACIDS RESEARCH, 2009, 37 (03) :815-824
[6]   Gleaning non-trivial structural, functional and evolutionary information about proteins by iterative database searches [J].
Aravind, L ;
Koonin, EV .
JOURNAL OF MOLECULAR BIOLOGY, 1999, 287 (05) :1023-1040
[7]   Periodic distributions of hydrophobic amino acids allows the definition of fundamental building blocks to align distantly related proteins [J].
Baussand, J. ;
Deremble, C. ;
Carbone, A. .
PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2007, 67 (03) :695-708
[8]   SELECTION OF DNA-BINDING SITES BY REGULATORY PROTEINS .2. THE BINDING-SPECIFICITY OF CYCLIC-AMP RECEPTOR PROTEIN TO RECOGNITION SITES [J].
BERG, OG ;
VONHIPPEL, PH .
JOURNAL OF MOLECULAR BIOLOGY, 1988, 200 (04) :709-723
[9]   SELECTION OF DNA-BINDING SITES BY REGULATORY PROTEINS - STATISTICAL-MECHANICAL THEORY AND APPLICATION TO OPERATORS AND PROMOTERS [J].
BERG, OG ;
VONHIPPEL, PH .
JOURNAL OF MOLECULAR BIOLOGY, 1987, 193 (04) :723-743
[10]   Sequence context-specific profiles for homology searching [J].
Biegert, A. ;
Soeding, J. .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2009, 106 (10) :3770-3775