Babel's tower revisited: a universal resource for cross-referencing across annotation databases

Cited by: 28
Authors
Draghici, Sorin [1]
Sellamuthu, Sivakumar [1]
Khatri, Purvesh [1]
Affiliations
[1] Wayne State Univ, Dept Comp Sci, Detroit, MI 48202 USA
Funding
US National Institutes of Health (NIH)
DOI
10.1093/bioinformatics/btl372
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Annotation databases are widely used as public repositories of biological knowledge. However, most of these resources were developed by independent groups that used different designs and different identifiers for the same biological entities. As we show in this article, incoherent name spaces across databases are a serious impediment to using the existing annotations to their full potential. Navigating between such name spaces by mapping IDs from one database to another is an important problem that is not properly addressed at the moment.
Results: We have developed a web-based resource, Onto-Translate (OT), that addresses this problem. OT can map different types of biological entities onto each other across the following annotation databases: Swiss-Prot, TrEMBL, NREF, PIR, Gene Ontology, KEGG, Entrez Gene, GenBank, GenPept, IMAGE, RefSeq, UniGene, OMIM, PDB, Eukaryotic Promoter Database, HUGO Gene Nomenclature Committee and NetAffx. Currently, OT can perform 462 types of mappings between 29 different types of IDs from 17 databases concerning 53 organisms. Among these, over 300 types of translations and 15 types of IDs are not currently supported by any other tool or resource. On average, OT correctly maps between 96 and 99% of the biological entities provided as input. In terms of speed, sets of ~20 000 IDs can be translated in < 30 s in most cases.
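The kind of ID translation described in the abstract can be illustrated with a minimal Python sketch. This is not Onto-Translate's implementation, which the abstract does not describe; it only assumes that cross-references are available as (source type, source ID, target type, target ID) records. The names `XREFS`, `build_index` and `translate`, and every identifier in the table, are hypothetical placeholders rather than real database entries.

```python
from collections import defaultdict

# Hypothetical cross-reference records:
# (source_type, source_id, target_type, target_id).
# All identifiers are invented placeholders, not real database entries.
XREFS = [
    ("probe", "probe_001", "gene", "gene_A"),
    ("probe", "probe_002", "gene", "gene_A"),
    ("probe", "probe_003", "gene", "gene_B"),
    ("gene", "gene_A", "protein", "prot_X"),
    ("gene", "gene_A", "protein", "prot_Y"),
]


def build_index(xrefs):
    """Group target IDs by (source_type, target_type, source_id)."""
    index = defaultdict(set)
    for src_type, src_id, dst_type, dst_id in xrefs:
        index[(src_type, dst_type, src_id)].add(dst_id)
    return index


def translate(ids, src_type, dst_type, index):
    """Map each input ID to its target IDs; collect IDs with no mapping."""
    mapped, unmapped = {}, []
    for src_id in ids:
        hits = index.get((src_type, dst_type, src_id))
        if hits:
            mapped[src_id] = sorted(hits)  # one ID may map to several targets
        else:
            unmapped.append(src_id)
    return mapped, unmapped


INDEX = build_index(XREFS)
query = ["probe_001", "probe_003", "probe_999"]
mapped, unmapped = translate(query, "probe", "gene", INDEX)
coverage = len(mapped) / len(query)  # fraction of input IDs that were mapped
print(mapped, unmapped, round(coverage, 2))
```

In this sketch each (source type, target type) pair plays the role of one "type of mapping", and `coverage` is one simple way to express the fraction of input entities that could be mapped.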
Pages: 2934-2939
Page count: 6