Refining multiple sequence alignments with conserved core regions

Cited by: 26
Authors
Chakrabarti, Saikat [1]
Lanczycki, Christopher J. [1]
Panchenko, Anna R. [1]
Przytycka, Teresa M. [1]
Thiessen, Paul A. [1]
Bryant, Stephen H. [1]
Affiliations
[1] Natl Lib Med, Natl Ctr Biotechnol Informat, NIH, Bethesda, MD 20894 USA
Funding
National Institutes of Health (USA)
Keywords
DOI
10.1093/nar/gkl274
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Accurate multiple sequence alignments of proteins are essential to several areas of computational biology: they inform the phylogenetic history of domain families as well as their identification and classification. This article presents a new algorithm, REFINER, that refines a multiple sequence alignment by iteratively realigning individual sequences against the predetermined conserved core (block) model of a protein family. Realigning each sequence can correct misalignments between that sequence and the rest of the profile while preserving the family's overall block model. Large-scale benchmarking studies showed a noticeable improvement in alignment quality after refinement, as inferred from increased alignment scores and enhanced database-search sensitivity of sequence profiles derived from the refined alignments compared with the original alignments. A standalone version of the program is available by FTP (ftp://ftp.ncbi.nih.gov/pub/REFINER) and will be incorporated into the next release of the Cn3D structure/alignment viewer.
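The leave-one-out idea in the abstract can be illustrated with a minimal Python sketch. This is not the REFINER implementation: it assumes a single ungapped block, a simple frequency-based log-odds score against a uniform background, and hypothetical function names (`column_scores`, `score_at`, `refine_block`). Each sequence in turn is removed from the block, a profile is built from the remaining sequences, and the block is moved on the removed sequence only if a strictly better-scoring position exists.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_scores(block_rows, pseudocount=1.0):
    """Per-column log-odds scores (uniform background) from ungapped, equal-length rows."""
    width = len(block_rows[0])
    total = len(block_rows) + pseudocount * len(AMINO_ACIDS)
    scores = []
    for c in range(width):
        counts = Counter(row[c] for row in block_rows)
        scores.append({
            a: math.log((counts[a] + pseudocount) / total * len(AMINO_ACIDS))
            for a in AMINO_ACIDS
        })
    return scores

def score_at(seq, start, scores):
    """Score of placing the block on `seq` beginning at offset `start`."""
    return sum(col.get(seq[start + i], 0.0) for i, col in enumerate(scores))

def refine_block(seqs, starts, width, max_rounds=10):
    """Iteratively realign each sequence to a profile built from all the others.

    `starts[i]` is the offset of the block in `seqs[i]`; the block width is fixed,
    so the block model itself is preserved (assumption: one block, no gaps).
    """
    starts = list(starts)
    for _ in range(max_rounds):
        changed = False
        for i, seq in enumerate(seqs):
            # Profile from every sequence except the one being realigned.
            others = [s[st:st + width]
                      for j, (s, st) in enumerate(zip(seqs, starts)) if j != i]
            scores = column_scores(others)
            best = max(range(len(seq) - width + 1),
                       key=lambda p: score_at(seq, p, scores))
            # Accept the move only if it strictly improves the score.
            if score_at(seq, best, scores) > score_at(seq, starts[i], scores):
                starts[i] = best
                changed = True
        if not changed:
            break
    return starts

# Toy example: the last sequence starts one residue off the shared "WCHK" motif.
seqs = ["AAWCHKAA", "GWCHKGGG", "TTTWCHKT", "WCHKPPPP"]
print(refine_block(seqs, [2, 1, 3, 1], width=4))  # [2, 1, 3, 0]
```

In this toy input the first three sequences are already aligned on the motif, so only the fourth block position shifts. A multi-block version would additionally need to restrict each block's shift so that blocks stay in order and do not overlap.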
Pages: 2598-2606
Page count: 9