Multiple alignment of complete sequences (MACS) in the post-genomic era

Cited by: 44
Authors
Lecompte, O [1]
Thompson, JD [1]
Plewniak, F [1]
Thierry, JC [1]
Poch, O [1]
Affiliations
[1] ULP, INSERM, CNRS, Inst Genet & Biol Mol & Cellulaire, Lab Biol & Gen, F-67404 Illkirch Graffenstaden, France
Keywords
bioinformatics; sequence analysis; functional genomics; genome annotation
DOI
10.1016/S0378-1119(01)00461-9
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Multiple alignment, since its introduction in the early seventies, has become a cornerstone of modern molecular biology. It has traditionally been used to deduce structure/function by homology, to detect conserved motifs and in phylogenetic studies. There has recently been some renewed interest in the development of multiple alignment techniques, with current opinion moving away from a single all-encompassing algorithm to iterative and/or co-operative strategies. The exploitation of multiple alignments in genome annotation projects represents a qualitative leap in the functional analysis process, opening the way to the study of the co-evolution of validated sets of proteins and to reliable phylogenomic analysis. However, the alignment of the highly complex proteins detected by today's advanced database search methods is a daunting task. In addition, with the explosion of the sequence databases and with the establishment of numerous specialized biological databases, multiple alignment programs must evolve if they are to successfully rise to the new challenges of the post-genomic era. The way forward is clearly an integrated system bringing together sequence data, knowledge-based systems and prediction methods with their inherent unreliability. The incorporation of such heterogeneous, often non-consistent, data will require major changes to the fundamental alignment algorithms used to date. Such an integrated multiple alignment system will provide an ideal workbench for the validation, propagation and presentation of this information in a format that is concise, clear and intuitive. (C) 2001 Elsevier Science B.V. All rights reserved.
Pages: 17-30
Number of pages: 14