BioProject and BioSample databases at NCBI: facilitating capture and organization of metadata

Cited by: 239
Authors
Barrett, Tanya [1]
Clark, Karen [1]
Gevorgyan, Robert [1]
Gorelenkov, Vyacheslav [1]
Gribov, Eugene [1]
Karsch-Mizrachi, Ilene [1]
Kimelman, Michael [1]
Pruitt, Kim D. [1]
Resenchuk, Sergei [1]
Tatusova, Tatiana [1]
Yaschenko, Eugene [1]
Ostell, James [1]
Affiliations
[1] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20892 USA
Funding
National Institutes of Health (USA);
Keywords
GENOMES;
DOI
10.1093/nar/gkr1163
Chinese Library Classification (CLC) codes
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
As the volume and complexity of data sets archived at NCBI grow rapidly, so does the need to gather and organize the associated metadata. Although metadata has been collected for some archival databases, previously, there was no centralized approach at NCBI for collecting this information and using it across databases. The BioProject database was recently established to facilitate organization and classification of project data submitted to NCBI, EBI and DDBJ databases. It captures descriptive information about research projects that result in high volume submissions to archival databases, ties together related data across multiple archives and serves as a central portal by which to inform users of data availability. Concomitantly, the BioSample database is being developed to capture descriptive information about the biological samples investigated in projects. BioProject and BioSample records link to corresponding data stored in archival repositories. Submissions are supported by a web-based Submission Portal that guides users through a series of forms for input of rich metadata describing their projects and samples. Together, these databases offer improved ways for users to query, locate, integrate and interpret the masses of data held in NCBI's archival repositories. The BioProject and BioSample databases are available at http://www.ncbi.nlm.nih.gov/bioproject and http://www.ncbi.nlm.nih.gov/biosample, respectively.
Pages: D57-D63
Page count: 7