Pfam 10 years on: 10 000 families and still growing
Cited by: 89
Authors:

Sammut, Stephen John
Affiliations: Univ Malta, Msida, Malta; Wellcome Trust Sanger Inst, Cambridge CB10 1SA, England

Finn, Robert D.
Affiliation: Wellcome Trust Sanger Inst, Cambridge CB10 1SA, England

Bateman, Alex
Affiliation: Wellcome Trust Sanger Inst, Cambridge CB10 1SA, England
Affiliations:
[1] Wellcome Trust Sanger Inst, Cambridge CB10 1SA, England
[2] Univ Malta, Msida, Malta
Funding:
Wellcome Trust (UK);
Keywords:
Pfam;
protein families;
classification;
coverage;
hidden Markov model;
DOI:
10.1093/bib/bbn010
Chinese Library Classification (CLC) number:
Q5 [Biochemistry];
Discipline classification codes:
071010;
081704;
Abstract:
Classifications of proteins into groups of related sequences are in some respects like a periodic table for biology, allowing us to understand the underlying molecular biology of any organism. Pfam is a large collection of protein domains and families. Its scientific goal is to provide a complete and accurate classification of protein families and domains. The next release of the database will contain over 10 000 entries, which leads us to reflect on how far we are from completing this work. Currently Pfam matches 72% of known protein sequences, but for proteins with known structure Pfam matches 95%, which we believe represents the likely upper bound. Based on our analysis, a further 28 000 families would be required to achieve this level of coverage for the current sequence database. We also show that as more sequences are added to the sequence databases, the fraction of sequences that Pfam matches is reduced, suggesting that continued addition of new families is essential to maintain its relevance.
Pages: 210-219
Page count: 10