CATH: expanding the horizons of structure-based functional annotations for genome sequences

Cited by: 90
Authors
Sillitoe, Ian [1]
Dawson, Natalie [1]
Lewis, Tony E. [1]
Das, Sayoni [1]
Lees, Jonathan G. [1]
Ashford, Paul [1]
Tolulope, Adeyelu [1]
Scholes, Harry M. [1]
Senatorov, Ilya [1]
Bujan, Andra [1]
Rodriguez-Conde, Fatima Ceballos [1]
Dowling, Benjamin [1]
Thornton, Janet [2]
Orengo, Christine A. [1]
Affiliations
[1] UCL, Struct & Mol Biol, London WC1E 6BT, England
[2] European Bioinformat Inst, Wellcome Trust Genome Campus, Hinxton CB10 1SD, Cambs, England
Funding
Wellcome Trust (UK); Biotechnology and Biological Sciences Research Council (UK)
Keywords
PROTEIN FUNCTION; LARGE-SCALE; CLASSIFICATION; EVOLUTION; RESOURCE
DOI
10.1093/nar/gky1097
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
This article provides an update of the latest data and developments within the CATH protein structure classification database (http://www.cathdb.info). The resource provides two levels of release: CATH-B, a daily snapshot of the latest structural domain boundaries and superfamily assignments, and CATH+, which adds layers of derived data, such as predicted sequence domains, functional annotations and functional clustering (known as Functional Families or FunFams). The most recent CATH+ release (version 4.2) provides a huge update in the coverage of structural data. This release increases the number of fully classified domains by over 40% (from 308 999 to 434 857 structural domains), corresponding to an almost two-fold increase in sequence data (from 53 million to over 95 million predicted domains) organised into 6119 superfamilies. The coverage of high-resolution, protein PDB chains that contain at least one assigned CATH domain is now 90.2% (increased from 82.3% in the previous release). A number of highly requested features have also been implemented in our web pages: allowing the user to view an alignment between their query sequence and a representative FunFam structure and providing tools that make it easier to view the full structural context (multi-domain architecture) of domains and chains.
Pages: D280-D284
Page count: 5