Upcoming challenges for multiple sequence alignment methods in the high-throughput era

Cited by: 138
Authors
Kemena, Carsten [1]
Notredame, Cedric [1]
Affiliations
[1] Pompeu Fabra Univ, Ctr Genom Regulat, Barcelona 08003, Spain
Keywords
PROTEIN-STRUCTURE ALIGNMENT; ACCURATE; ALGORITHM; BENCHMARK; CONSISTENCY; HOMOLOGY; COFFEE; IDENTIFICATION; PREDICTION; MUSCLE;
DOI
10.1093/bioinformatics/btp452
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
This review focuses on recent trends in multiple sequence alignment tools. It describes the latest algorithmic improvements including the extension of consistency-based methods to the problem of template-based multiple sequence alignments. Some results are presented suggesting that template-based methods are significantly more accurate than simpler alternative methods. The validation of existing methods is also discussed at length with the detailed description of recent results and some suggestions for future validation strategies. The last part of the review addresses future challenges for multiple sequence alignment methods in the genomic era, most notably the need to cope with very large sequences, the need to integrate large amounts of experimental data, the need to accurately align non-coding and non-transcribed sequences and finally, the need to integrate many alternative methods and approaches.
Pages: 2455-2465
Number of pages: 11