NCBI Reference Sequences: current status, policy and new initiatives

Cited by: 568

Authors:

Pruitt, Kim D. [1]

Tatusova, Tatiana [1]

Klimke, William [1]

Maglott, Donna R. [1]

Affiliations:

[1] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20892 USA

Source:

NUCLEIC ACIDS RESEARCH | 2009 / Vol. 37

Keywords:

ALIGNMENT; DATABASE; ENTREZ;

DOI:

10.1093/nar/gkn721

Chinese Library Classification:

Q5 [Biochemistry]; Q7 [Molecular Biology];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

NCBI's Reference Sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq/) is a curated non-redundant collection of sequences representing genomes, transcripts and proteins. RefSeq records integrate information from multiple sources and represent a current description of the sequence, the gene and sequence features. The database includes over 5300 organisms spanning prokaryotes, eukaryotes and viruses, with records for more than 5.5 × 10^6 proteins (RefSeq release 30). Feature annotation is applied by a combination of curation, collaboration, propagation from other sources and computation. We report here on the recent growth of the database, recent changes to feature annotations and record types for eukaryotic (primarily vertebrate) species and policies regarding species inclusion and genome annotation. In addition, we introduce RefSeqGene, a new initiative to support reporting variation data on a stable genomic coordinate system.

Pages: D32-D36

Page count: 5
