Pfam: the protein families database

Cited by: 3554
Authors
Finn, Robert D. [1]
Bateman, Alex [2]
Clements, Jody [1]
Coggill, Penelope [2, 3]
Eberhardt, Ruth Y. [2, 3]
Eddy, Sean R. [1]
Heger, Andreas [4]
Hetherington, Kirstie [3]
Holm, Liisa [5, 6]
Mistry, Jaina [2]
Sonnhammer, Erik L. L. [7]
Tate, John [2, 3]
Punta, Marco [2, 3]
Affiliations
[1] HHMI Janelia Farm Res Campus, Ashburn, VA 20147 USA
[2] European Bioinformat Inst EMBL EBI, European Mol Biol Lab, Cambridge CB10 1SD, England
[3] Wellcome Trust Sanger Inst, Cambridge CB10 1SA, England
[4] Univ Oxford, MRC Funct Genom Unit, Dept Physiol Anat & Genet, Oxford OX1 3QX, England
[5] Univ Helsinki, Inst Biotechnol, FIN-00014 Helsinki, Finland
[6] Univ Helsinki, Dept Biol & Environm Sci, FIN-00014 Helsinki, Finland
[7] Stockholm Univ, Sci Life Lab, Dept Biochem & Biophys, Stockholm Bioinformat Ctr,Swedish eSci Res Ctr, SE-17121 Solna, Sweden
Funding
Wellcome Trust (UK); Biotechnology and Biological Sciences Research Council (BBSRC, UK);
Keywords
DATA-BANK; DOMAIN; SEQUENCES; LINKERS; CLASSIFICATION; PREDICTION; TAXONOMY; REGIONS; BIOLOGY;
DOI
10.1093/nar/gkt1223
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Pfam, available via servers in the UK (http://pfam.sanger.ac.uk/) and the USA (http://pfam.janelia.org/), is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0. Since the last update article 2 years ago, we have generated 1182 new families and maintained sequence coverage of the UniProt Knowledgebase (UniProtKB) at nearly 80%, despite a 50% increase in the size of the underlying sequence database. Since our 2012 article describing Pfam, we have also undertaken a comprehensive review of the features that are provided by Pfam over and above the basic family data. For each feature, we determined the relevance, computational burden, usage statistics and the functionality of the feature in a website context. As a consequence of this review, we have removed some features, enhanced others and developed new ones to meet the changing demands of computational biology. Here, we describe the changes to Pfam content. Notably, we now provide family alignments based on four different representative proteome sequence data sets and a new interactive DNA search interface. We also discuss the mapping between Pfam and known 3D structures.
Pages: D222-D230
Page count: 9