CATH: an expanded resource to predict protein function through structure and sequence

Cited by: 260
Authors
Dawson, Natalie L. [1]
Lewis, Tony E. [1]
Das, Sayoni [1]
Lees, Jonathan G. [1]
Lee, David [1]
Ashford, Paul [1]
Orengo, Christine A. [1]
Sillitoe, Ian [1]
Affiliations
[1] UCL, Inst Struct & Mol Biol, Gower St, London WC1E 6BT, England
Funding
Biotechnology and Biological Sciences Research Council (BBSRC), UK; Wellcome Trust, UK;
Keywords
CLASSIFICATION; SUPERFAMILIES; PROGRAM;
DOI
10.1093/nar/gkw1098
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010 ; 081704 ;
Abstract
The latest version of the CATH-Gene3D protein structure classification database has recently been released (version 4.1, http://www.cathdb.info). The resource comprises over 300 000 domain structures and over 53 million protein domains classified into 2737 homologous superfamilies, doubling the number of predicted protein domains in the previous version. The daily-updated CATH-B, which contains our very latest domain assignment data, provides putative classifications for over 100 000 additional protein domains. This article describes developments to the CATH-Gene3D resource over the last two years since the publication in 2015, including: significant increases to our structural and sequence coverage; expansion of the functional families in CATH; building a support vector machine (SVM) to automatically assign domains to superfamilies; improved search facilities to return alignments of query sequences against multiple sequence alignments; the redesign of the web pages and download site.
Pages: D289-D295
Page count: 7