NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins

Cited by: 1390
Authors
Pruitt, KD [1]
Tatusova, T [1]
Maglott, DR [1]
Affiliations
[1] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20892 USA
DOI
10.1093/nar/gki025
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq/) provides a non-redundant collection of sequences representing genomic data, transcripts and proteins. Although the goal is to provide a comprehensive dataset representing the complete sequence information for any given species, the database pragmatically includes sequence data that are currently publicly available in the archival databases. The database incorporates data from over 2400 organisms and includes over one million proteins representing significant taxonomic diversity spanning prokaryotes, eukaryotes and viruses. Nucleotide and protein sequences are explicitly linked, and the sequences are linked to other resources including the NCBI Map Viewer and Gene. Sequences are annotated to include coding regions, conserved domains, variation, references, names, database cross-references, and other features using a combined approach of collaboration and other input from the scientific community, automated annotation, propagation from GenBank and curation by NCBI staff.
Pages: D501-D504
Page count: 4
Related papers
14 in total
  • [1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [2] Basic local alignment search tool
    Altschul, SF
    Gish, W
    Miller, W
    Myers, EW
    Lipman, DJ
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) : 403 - 410
  • [3] GenBank
    Benson, DA
    Karsch-Mizrachi, I
    Lipman, DJ
    Ostell, J
    Wheeler, DL
    [J]. NUCLEIC ACIDS RESEARCH, 2005, 33 : D34 - D38
  • [4] The Mouse Genome Database (MGD): integrating biology with the genome
    Bult, CJ
    Blake, JA
    Richardson, JE
    Kadin, JA
    Eppig, JT
    [J]. NUCLEIC ACIDS RESEARCH, 2004, 32 : D476 - D481
  • [5] Saccharomyces Genome Database (SGD) provides tools to identify and analyze sequences from Saccharomyces cerevisiae and related sequences from other organisms
    Christie, KR
    Weng, S
    Balakrishnan, R
    Costanzo, MC
    Dolinski, K
    Dwight, SS
    Engel, SR
    Feierbach, B
    Fisk, DG
    Hirschman, JE
    Hong, EL
    Issel-Tarver, L
    Nash, R
    Sethuraman, A
    Starr, B
    Theesfeld, CL
    Andrada, R
    Binkley, G
    Dong, Q
    Lane, C
    Schroeder, M
    Botstein, D
    Cherry, JM
    [J]. NUCLEIC ACIDS RESEARCH, 2004, 32 : D311 - D314
  • [6] Regulation of gene expression by stop codon recoding: selenocysteine
    Copeland, PR
    [J]. GENE, 2003, 312 : 17 - 25
  • [7] The FlyBase database of the Drosophila genome projects and community literature
    Gelbart, W
    Bayraktaroglu, L
    Bettencourt, B
    Campbell, K
    Crosby, M
    Emmert, D
    Hradecky, P
    Huang, Y
    Letovsky, S
    Matthews, B
    Russo, S
    Schroeder, A
    Smutniak, F
    Zhou, P
    Zytkovicz, M
    Ashburner, M
    Drysdale, R
    de Grey, A
    Foulger, R
    Millburn, G
    Yamada, C
    Kaufman, T
    Matthews, K
    Gilbert, D
    Grumbling, G
    Strelets, V
    Shemen, C
    Rubin, G
    Berman, B
    Frise, E
    Gibson, M
    Harris, N
    Kaminker, J
    Lewis, S
    Marshall, B
    Misra, S
    Mungall, C
    Prochnik, S
    Richter, J
    Smith, C
    Shu, S
    Tupy, J
    Wiel, C
    [J]. NUCLEIC ACIDS RESEARCH, 2003, 31 (01) : 172 - 175
  • [8] Entrez Gene: gene-centered information at NCBI
    Maglott, D
    Ostell, J
    Pruitt, KD
    Tatusova, T
    [J]. NUCLEIC ACIDS RESEARCH, 2005, 33 : D54 - D58
  • [9] CDD: a curated Entrez database of conserved domain alignments
    Marchler-Bauer, A
    Anderson, JB
    DeWeese-Scott, C
    Fedorova, ND
    Geer, LY
    He, SQ
    Hurwitz, DI
    Jackson, JD
    Jacobs, AR
    Lanczycki, CJ
    Liebert, CA
    Liu, CL
    Madej, T
    Marchler, GH
    Mazumder, R
    Nikolskaya, AN
    Panchenko, AR
    Rao, BS
    Shoemaker, BA
    Simonyan, V
    Song, JS
    Thiessen, PA
    Vasudevan, S
    Wang, YL
    Yamashita, RA
    Yin, JJ
    Bryant, SH
    [J]. NUCLEIC ACIDS RESEARCH, 2003, 31 (01) : 383 - 387
  • [10] Schuval S, 1996, PEDIATR AIDS HIV INF, V7, P266