DETECTING SUBTLE SEQUENCE SIGNALS - A GIBBS SAMPLING STRATEGY FOR MULTIPLE ALIGNMENT

Cited by: 1222
Authors
LAWRENCE, CE
ALTSCHUL, SF
BOGUSKI, MS
LIU, JS
NEUWALD, AF
WOOTTON, JC
Affiliations
[1] HARVARD UNIV, DEPT STAT, CAMBRIDGE, MA 02138 USA
[2] NEW YORK STATE DEPT HLTH, WADSWORTH CTR LABS & RES, BIOMETR LAB, ALBANY, NY 12201 USA
DOI
10.1126/science.8211139
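Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract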
A wealth of protein and DNA sequence data is being generated by genome projects and other sequencing efforts. A crucial barrier to deciphering these sequences and understanding the relations among them is the difficulty of detecting subtle local residue patterns common to multiple sequences. Such patterns frequently reflect similar molecular structures and biological properties. A mathematical definition of this "local multiple alignment" problem suitable for full computer automation has been used to develop a new and sensitive algorithm, based on the statistical method of iterative sampling. This algorithm finds an optimized local alignment model for N sequences in N-linear time, requiring only seconds on current workstations, and allows the simultaneous detection and optimization of multiple patterns and pattern repeats. The method is illustrated as applied to helix-turn-helix proteins, lipocalins, and prenyltransferases.
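The iterative sampling idea in the abstract can be illustrated with a minimal sketch of a Gibbs-style site sampler. This is not the authors' implementation: it assumes exactly one pattern occurrence per sequence, a fixed pattern width, a DNA alphabet, and a simple pseudocount. The function name `gibbs_site_sampler` and all of its parameters are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter


def gibbs_site_sampler(seqs, width, iters=200, pseudocount=1.0,
                       alphabet="ACGT", seed=0):
    """Return one sampled pattern start per sequence (illustrative sketch)."""
    if any(len(s) < width for s in seqs):
        raise ValueError("every sequence must be at least `width` long")
    rng = random.Random(seed)
    # Start from a random alignment: one start position per sequence.
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]

    # Background residue frequencies estimated from all sequences.
    bg_counts = Counter("".join(seqs))
    bg_total = sum(bg_counts[a] for a in alphabet) + pseudocount * len(alphabet)
    bg = {a: (bg_counts[a] + pseudocount) / bg_total for a in alphabet}

    for _ in range(iters):
        for i, s in enumerate(seqs):
            # Update step: build a position-specific profile from the
            # current sites of every sequence except sequence i.
            counts = [{a: pseudocount for a in alphabet} for _ in range(width)]
            for j, t in enumerate(seqs):
                if j == i:
                    continue
                for k in range(width):
                    counts[k][t[starts[j] + k]] += 1
            denom = (len(seqs) - 1) + pseudocount * len(alphabet)

            # Sampling step: weight each candidate start in sequence i by
            # the ratio of pattern probability to background probability.
            weights = []
            for p in range(len(s) - width + 1):
                w = 1.0
                for k in range(width):
                    a = s[p + k]
                    w *= (counts[k][a] / denom) / bg[a]
                weights.append(w)
            # Draw the new start at random in proportion to its weight.
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts
```

Each pass over the N sequences costs time proportional to the total sequence length, which is in keeping with the N-linear behavior the abstract describes. The sketch omits the simultaneous detection of multiple patterns and pattern repeats mentioned above, and it will raise a `KeyError` on residues outside the assumed alphabet.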
Pages: 208-214
Page count: 7
Related papers
116 in total
[101]  
STADEN R, 1989, COMPUT APPL BIOSCI, V5, P293
[102]   IDENTIFYING PROTEIN-BINDING SITES FROM UNALIGNED DNA FRAGMENTS [J].
STORMO, GD ;
HARTZELL, GW .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1989, 86 (04) :1183-1187
[103]   A METHOD FOR MULTIPLE SEQUENCE ALIGNMENT WITH GAPS [J].
SUBBIAH, S ;
HARRISON, SC .
JOURNAL OF MOLECULAR BIOLOGY, 1989, 209 (04) :539-548
[104]   THE CALCULATION OF POSTERIOR DISTRIBUTIONS BY DATA AUGMENTATION [J].
TANNER, MA ;
WONG, WH .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1987, 82 (398) :528-540
[105]  
TAYLOR WR, 1990, METHOD ENZYMOL, V183, P456
[106]   THE HOMEODOMAIN - A NEW FACE FOR THE HELIX-TURN-HELIX [J].
TREISMAN, J ;
HARRIS, E ;
WILSON, D ;
DESPLAN, C .
BIOESSAYS, 1992, 14 (03) :145-150
[107]   WEIGHTING IN SEQUENCE SPACE - A COMPARISON OF METHODS IN TERMS OF GENERALIZED SEQUENCES [J].
VINGRON, M ;
SIBBALD, PR .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1993, 90 (19) :8777-8781
[108]   MOTIF RECOGNITION AND ALIGNMENT FOR MANY SEQUENCES BY COMPARISON OF DOT-MATRICES [J].
VINGRON, M ;
ARGOS, P .
JOURNAL OF MOLECULAR BIOLOGY, 1991, 218 (01) :33-43
[109]  
VINGRON M, 1989, COMPUT APPL BIOSCI, V5, P115
[110]   LINE GEOMETRIES FOR SEQUENCE COMPARISONS [J].
WATERMAN, MS ;
PERLWITZ, M .
BULLETIN OF MATHEMATICAL BIOLOGY, 1984, 46 (04) :567-577