CDD: a conserved domain database for protein classification

Cited by: 949
Authors
Marchler-Bauer, A [1]
Anderson, JB [1]
Cherukuri, PF [1]
DeWeese-Scott, C [1]
Geer, LY [1]
Gwadz, M [1]
He, SQ [1]
Hurwitz, DI [1]
Jackson, JD [1]
Ke, ZX [1]
Lanczycki, CJ [1]
Liebert, CA [1]
Liu, CL [1]
Lu, F [1]
Marchler, GH [1]
Mullokandov, M [1]
Shoemaker, BA [1]
Simonyan, V [1]
Song, JS [1]
Thiessen, PA [1]
Yamashita, RA [1]
Yin, JJ [1]
Zhang, DC [1]
Bryant, SH [1]
Affiliations
[1] Natl Lib Med, Natl Ctr Biotechnol Informat, Natl Inst Hlth, Bethesda, MD 20894 USA
Funding
National Institutes of Health (NIH), USA
Keywords
DOI
10.1093/nar/gki069
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The Conserved Domain Database (CDD) is the protein classification component of NCBI's Entrez query and retrieval system. CDD is linked to other Entrez databases such as Proteins, Taxonomy and PubMed®, and can be accessed at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=cdd. CD-Search, which is available at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi, is a fast, interactive tool to identify conserved domains in new protein sequences. CD-Search results for protein sequences in Entrez are pre-computed to provide links between proteins and domain models, and computational annotation visible upon request. Protein-protein queries submitted to NCBI's BLAST search service at http://www.ncbi.nlm.nih.gov/BLAST are scanned for the presence of conserved domains by default. While CDD started out as essentially a mirror of publicly available domain alignment collections, such as SMART, Pfam and COG, we have continued an effort to update, and in some cases replace these models with domain hierarchies curated at the NCBI. Here, we report on the progress of the curation effort and associated improvements in the functionality of the CDD information retrieval system.
Pages: D192-D196
Number of pages: 5
Related papers
9 items in total
  • [1] Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI 10.1093/nar/gkh121
  • [2] CDART: Protein homology by domain architecture
    Geer, LY
    Domrachev, M
    Lipman, DJ
    Bryant, SH
    [J]. GENOME RESEARCH, 2002, 12 (10) : 1619 - 1623
  • [3] SMART 4.0: towards genomic data integration
    Letunic, I
    Copley, RR
    Schmidt, S
    Ciccarelli, FD
    Doerks, T
    Schultz, J
    Ponting, CP
    Bork, P
    [J]. NUCLEIC ACIDS RESEARCH, 2004, 32 : D142 - D144
  • [4] Comparison of sequence and structure alignments for protein domains
    Marchler-Bauer, A
    Panchenko, AR
    Ariel, N
    Bryant, SH
    [J]. PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2002, 48 (03) : 439 - 446
  • [5] CDD: a database of conserved domain alignments with links to domain three-dimensional structure
    Marchler-Bauer, A
    Panchenko, AR
    Shoemaker, BA
    Thiessen, PA
    Geer, LY
    Bryant, SH
    [J]. NUCLEIC ACIDS RESEARCH, 2002, 30 (01) : 281 - 283
  • [6] CD-Search: protein domain annotations on the fly
    Marchler-Bauer, A
    Bryant, SH
    [J]. NUCLEIC ACIDS RESEARCH, 2004, 32 : W327 - W331
  • [7] The COG database: an updated version includes eukaryotes
    Tatusov, RL
    Fedorova, ND
    Jackson, JD
    Jacobs, AR
    Kiryutin, B
    Koonin, EV
    Krylov, DM
    Mazumder, R
    Mekhedov, SL
    Nikolskaya, AN
    Rao, BS
    Smirnov, S
    Sverdlov, AV
    Vasudevan, S
    Wolf, YI
    Yin, JJ
    Natale, DA
    [J]. BMC BIOINFORMATICS, 2003, 4 (1)
  • [8] Cn3D: sequence and structure views for Entrez
    Wang, YL
    Geer, LY
    Chappey, C
    Kans, JA
    Bryant, SH
    [J]. TRENDS IN BIOCHEMICAL SCIENCES, 2000, 25 (06) : 300 - 302
  • [9] Database resources of the National Center for Biotechnology Information: update
    Wheeler, DL
    Church, DM
    Edgar, R
    Federhen, S
    Helmberg, W
    Madden, TL
    Pontius, JU
    Schuler, GD
    Schriml, LM
    Sequeira, E
    Suzek, TO
    Tatusova, TA
    Wagner, L
    [J]. NUCLEIC ACIDS RESEARCH, 2004, 32 : D35 - D40