DDBJ new system and service refactoring

Cited by: 29
Authors
Ogasawara, Osamu [1]
Mashima, Jun [1]
Kodama, Yuichi [1]
Kaminuma, Eli [1]
Nakamura, Yasukazu [1]
Okubo, Kousaku [1]
Takagi, Toshihisa [1, 2]
Affiliations
[1] Natl Inst Genet, DDBJ Ctr, Mishima, Shizuoka 4118540, Japan
[2] Japan Sci & Technol Agcy, Natl Biosci Database Ctr, Tokyo 1028666, Japan
Keywords
SEQUENCE; ARCHIVE;
DOI
10.1093/nar/gks1152
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
The DNA Data Bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp) maintains a primary nucleotide sequence database and provides analytical resources for biological information to researchers. The database content is exchanged with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). Resources provided by the DDBJ include traditional nucleotide sequence data, released in the form of 27 316 452 entries or 16 876 791 557 base pairs (as of June 2012), and raw reads from new-generation sequencers in the Sequence Read Archive (SRA). A Japanese researcher published his own genome sequence via DDBJ-SRA on 31 July 2012. To cope with the ongoing genomic data deluge, our previous computer system was completely replaced in March 2012 by a commodity cluster-based system that offers 122.5 TFlops of CPU capacity and 5 PB of storage space. During this upgrade, it was considered crucial to replace and refactor substantial portions of the DDBJ software systems as well. As a result of the replacement process, which took more than 2 years, we have achieved significant improvements in system performance.
Pages: D25-D29
Number of pages: 5