CDD: a Conserved Domain Database for the functional annotation of proteins

Cited by: 2450
Authors
Marchler-Bauer, Aron [1]
Lu, Shennan [1]
Anderson, John B. [1]
Chitsaz, Farideh [1]
Derbyshire, Myra K. [1]
DeWeese-Scott, Carol [1]
Fong, Jessica H. [1]
Geer, Lewis Y. [1]
Geer, Renata C. [1]
Gonzales, Noreen R. [1]
Gwadz, Marc [1]
Hurwitz, David I. [1]
Jackson, John D. [1]
Ke, Zhaoxi [1]
Lanczycki, Christopher J. [1]
Lu, Fu [1]
Marchler, Gabriele H. [1]
Mullokandov, Mikhail [1]
Omelchenko, Marina V. [1]
Robertson, Cynthia L. [1]
Song, James S. [1]
Thanki, Narmada [1]
Yamashita, Roxanne A. [1]
Zhang, Dachuan [1]
Zhang, Naigong [1]
Zheng, Chanjuan [1]
Bryant, Stephen H. [1]
Affiliation
[1] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20894 USA
Funding
National Institutes of Health (USA);
Keywords
SEARCH;
DOI
10.1093/nar/gkq1189
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
NCBI's Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints. CDD includes manually curated domain models that make use of protein 3D structure to refine domain models and provide insights into sequence/structure/function relationships. Manually curated models are organized hierarchically if they describe domain families that are clearly related by common descent. As CDD also imports domain family models from a variety of external sources, it is a partially redundant collection. To simplify protein annotation, redundant models and models describing homologous families are clustered into superfamilies. By default, domain footprints are annotated with the corresponding superfamily designation, on top of which specific annotation may indicate high-confidence assignment of family membership. Pre-computed domain annotation is available for proteins in the Entrez/Protein dataset, and a novel interface, Batch CD-Search, allows the computation and download of annotation for large sets of protein queries. CDD can be accessed via http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.
Pages: D225-D229
Number of pages: 5