Surviving in a sea of data: a survey of plant genome data resources and issues in building data management systems

Cited by: 13
Authors
Reiser, L [1]
Mueller, LA [1]
Rhee, SY [1]
Affiliations
[1] Carnegie Inst Washington, Dept Plant Biol, Stanford, CA 94305 USA
Funding
National Science Foundation (USA)
Keywords
controlled vocabulary; databases; data management; genomics; information systems; nomenclature
DOI
10.1023/A:1013726308559
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
Exponential growth of data, largely from whole-genome analyses, has changed the way biologists think about and handle data. Optimal use of these data requires effective methods to analyze and manage these data sets. Computers, software and the World Wide Web are now integral components of biological discovery. Understanding how information is obtained, processed and annotated in public databases allows researchers to effectively organize, analyze and export their own data into these databases. In this review we focus largely on two areas related to management of genomic data. We cite examples of resources available in the public domain and describe some of the software for data management systems currently available for plant research. In addition, we discuss a few concepts of data management from the perspective of an individual or group that wishes to provide data to the public databases, to use the information in the public databases more efficiently, or to develop a database to manage large data sets internally or for public access. These concepts include data descriptions, exchange format, curation, attribution, and database implementation.
Pages: 59-74
Page count: 16