ISSUES IN SEARCHING MOLECULAR SEQUENCE DATABASES

Cited by: 616
Authors
ALTSCHUL, SF [1 ]
BOGUSKI, MS [1 ]
GISH, W [1 ]
WOOTTON, JC [1 ]
Affiliation
[1] NIH,NATL LIB MED,NATL CTR BIOTECHNOL INFORMAT,BETHESDA,MD 20894
Keywords
DOI
10.1038/ng0294-119
Chinese Library Classification (CLC) number
Q3 [Genetics];
Subject classification code
071007 ; 090102 ;
Abstract
Sequence similarity search programs are versatile tools for the molecular biologist, frequently able to identify possible DNA coding regions and to provide clues to gene and protein structure and function. While much attention has been paid to the precise algorithms these programs employ and to their relative speeds, there is a constellation of associated issues that are equally important to realize the full potential of these methods. Here, we consider a number of these issues, including the choice of scoring systems, the statistical significance of alignments, the masking of uninformative or potentially confounding sequence regions, the nature and extent of sequence redundancy in the databases and network access to similarity search services.
Pages: 119-129
Number of pages: 11
Related papers
90 in total
[31]   THE MANY ROADS THAT LEAD TO RAS [J].
FEIG, LA .
SCIENCE, 1993, 260 (5109) :767-768
[32]   ALIGNING AMINO-ACID SEQUENCES - COMPARISON OF COMMONLY USED METHODS [J].
FENG, DF ;
JOHNSON, MS ;
DOOLITTLE, RF .
JOURNAL OF MOLECULAR EVOLUTION, 1985, 21 (02) :112-125
[33]   OPTIMAL SEQUENCE ALIGNMENTS [J].
FITCH, WM ;
SMITH, TF .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA-BIOLOGICAL SCIENCES, 1983, 80 (05) :1382-1386
[34]   IDENTIFICATION OF PROTEIN CODING REGIONS BY DATABASE SIMILARITY SEARCH [J].
GISH, W ;
STATES, DJ .
NATURE GENETICS, 1993, 3 (03) :266-272
[35]   PATTERN-RECOGNITION IN NUCLEIC-ACID SEQUENCES .1. A GENERAL-METHOD FOR FINDING LOCAL HOMOLOGIES AND SYMMETRIES [J].
GOAD, WB ;
KANEHISA, MI .
NUCLEIC ACIDS RESEARCH, 1982, 10 (01) :247-263
[36]   EXHAUSTIVE MATCHING OF THE ENTIRE PROTEIN-SEQUENCE DATABASE [J].
GONNET, GH ;
COHEN, MA ;
BENNER, SA .
SCIENCE, 1992, 256 (5062) :1443-1445
[37]   AN IMPROVED ALGORITHM FOR MATCHING BIOLOGICAL SEQUENCES [J].
GOTOH, O .
JOURNAL OF MOLECULAR BIOLOGY, 1982, 162 (03) :705-708
[38]   ANCIENT CONSERVED REGIONS IN NEW GENE-SEQUENCES AND THE PROTEIN DATABASES [J].
GREEN, P ;
LIPMAN, D ;
HILLIER, L ;
WATERSTON, R ;
STATES, D ;
CLAVERIE, JM .
SCIENCE, 1993, 259 (5102) :1711-1716
[39]  
Gumbel E J., 1958, STAT EXTREMES
[40]  
HANKS SK, 1991, METHOD ENZYMOL, V200, P38