ProGMap: an integrated annotation resource for protein orthology

Cited by: 11
Authors
Kuzniar, Arnold [1]
Lin, Ke [1]
He, Ying [1]
Nijveen, Harm [1]
Pongor, Sandor [2, 3]
Leunissen, Jack A. M. [1]
Affiliations
[1] Univ Wageningen & Res Ctr, Lab Bioinformat, NL-6703 HA Wageningen, Netherlands
[2] Int Ctr Genet Engn & Biotechnol, Prot Struct & Bioinformat Grp, I-34012 Trieste, Italy
[3] Hungarian Acad Sci, Biol Res Ctr, H-6726 Szeged, Hungary
Keywords
DATABASE; GENE; FAMILIES; MBL2; TOOL
DOI
10.1093/nar/gkp462
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Current protein sequence databases employ different classification schemes that often provide conflicting annotations, especially for poorly characterized proteins. ProGMap (Protein Group Mappings, http://www.bioinformatics.nl/progmap) is a web-tool designed to help researchers and database annotators to assess the coherence of protein groups defined in various databases and thereby facilitate the annotation of newly sequenced proteins. ProGMap is based on a non-redundant dataset of over 6.6 million protein sequences which is mapped to 240 000 protein group descriptions collected from UniProt, RefSeq, Ensembl, COG, KOG, OrthoMCL-DB, HomoloGene, TRIBES and PIRSF. ProGMap combines the underlying classification schemes via a network of links constructed by a fast and fully automated mapping approach originally developed for document classification. The web interface enables queries to be made using sequence identifiers, gene symbols, protein functions or amino acid and nucleotide sequences. For the latter query type BLAST similarity search and QuickMatch identity search services have been incorporated, for finding sequences similar (or identical) to a query sequence. ProGMap is meant to help users of high throughput methodologies who deal with partially annotated genomic data.
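The idea of linking protein groups across classification schemes can be illustrated with a minimal sketch. It is not ProGMap's actual mapping algorithm, which the abstract does not specify beyond its origin in document classification; it assumes that two groups are linked when the Jaccard overlap of their member sequences reaches a threshold. The function names, the threshold of 0.5 and all group and sequence identifiers are hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard index of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def map_groups(groups_a, groups_b, threshold=0.5):
    """Link groups from two schemes that share member sequences.

    groups_a, groups_b: dict mapping group ID -> set of sequence IDs.
    Returns (group_a, group_b, score) tuples, highest score first.
    The overlap criterion and threshold are illustrative assumptions.
    """
    # Inverted index: sequence ID -> scheme-B groups that contain it.
    index = defaultdict(set)
    for gid_b, members in groups_b.items():
        for seq in members:
            index[seq].add(gid_b)

    links = []
    for gid_a, members_a in groups_a.items():
        # Only groups sharing at least one sequence can overlap.
        candidates = set()
        for seq in members_a:
            candidates |= index.get(seq, set())
        for gid_b in candidates:
            score = jaccard(members_a, groups_b[gid_b])
            if score >= threshold:
                links.append((gid_a, gid_b, score))
    return sorted(links, key=lambda t: (-t[2], t[0], t[1]))


# Toy data with made-up group and sequence identifiers.
scheme_a = {"COG_X": {"s1", "s2", "s3"}, "COG_Y": {"s4", "s5"}}
scheme_b = {
    "PIRSF_1": {"s1", "s2", "s3", "s6"},
    "PIRSF_2": {"s4"},
    "PIRSF_3": {"s7"},
}
links = map_groups(scheme_a, scheme_b)
for gid_a, gid_b, score in links:
    print(f"{gid_a} <-> {gid_b}: {score:.2f}")
```

A high score suggests that the two databases describe roughly the same set of proteins under different group names; a low score or a missing link flags groups whose annotations may conflict.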
Pages: W428-W434
Page count: 7