The Genome Sequence DataBase (GSDB): improving data quality and data access

Cited by: 15
Authors
Harger, C [1]
Skupski, M [1]
Bingham, J [1]
Farmer, A [1]
Hoisie, S [1]
Hraber, P [1]
Kiphart, D [1]
Krakowski, L [1]
McLeod, M [1]
Schwertfeger, J [1]
Seluja, G [1]
Siepel, A [1]
Singh, G [1]
Stamper, D [1]
Steadman, P [1]
Thayer, N [1]
Thompson, R [1]
Wargo, P [1]
Waugh, M [1]
Zhuang, JJ [1]
Schad, PA [1]
Affiliation
[1] Natl Ctr Genome Resources, Santa Fe, NM 87505 USA
DOI
10.1093/nar/26.1.21
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
In 1997, the primary focus of the Genome Sequence DataBase (GSDB; www.ncgr.org/gsdb), located at the National Center for Genome Resources, was to improve data quality and accessibility. Efforts to increase the quality of data within the database included two major projects: one to identify and remove all vector contamination from sequences in the database, and one to create premier sequence sets (including both alignments and discontiguous sequences). Data accessibility was improved over the course of the year in several ways. First, a graphical database sequence viewer was made available to researchers. Second, an update process was implemented for the web-based query tool, Maestro. Third, a web-based tool, Excerpt, was developed to retrieve selected regions of any sequence in the database. Finally, a GSDB flatfile that contains annotation unique to GSDB (e.g., sequence analysis and alignment data) was developed. Additionally, the GSDB web site provides a tool for the detection of matrix attachment regions (MARs), which can be used to identify regions of high coding potential. The ultimate goal of this work is to make GSDB a more useful resource for genomic comparison studies and gene-level studies by improving data quality and by providing data access capabilities that are consistent with the needs of both types of studies.
Pages: 21-26
Page count: 6