Ontology-Based Querying with Bio2RDF's Linked Open Data

Cited by: 28
Authors
Callahan A. [1]
Cruz-Toledo J. [1]
Dumontier M. [1, 2, 3]
Affiliations
[1] Carleton University, Department of Biology, 1125 Colonel By Drive, Ottawa, ON
[2] Carleton University, Institute of Biochemistry, 1125 Colonel By Drive, Ottawa, ON
[3] Carleton University, School of Computer Science, 1125 Colonel By Drive, Ottawa, ON
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Resource Description Framework; Open Biomedical Ontology; Saccharomyces Genome Database; Comparative Toxicogenomics Database; Basic Formal Ontology
DOI
10.1186/2041-1480-4-S1-S1
Abstract
Background: A key activity for life scientists in this post-"omics" age is searching for and integrating biological data from a multitude of independent databases. However, our ability to find relevant data is hampered by non-standard web and database interfaces backed by an enormous variety of data formats. This heterogeneity is a major barrier to the discovery and reuse of resources that have been developed at great public expense. To address this issue, the open-source Bio2RDF project promotes a simple convention for integrating diverse biological data using Semantic Web technologies. However, querying Bio2RDF remains difficult because Bio2RDF datasets are not represented uniformly.

Results: We describe an update to Bio2RDF that includes tighter integration across 19 new and updated RDF datasets. All available open-source scripts were first consolidated into a single GitHub repository and then redeveloped using a common API that generates normalized IRIs from a centralized dataset registry. We then mapped dataset-specific types and relations to the Semanticscience Integrated Ontology (SIO) and demonstrate simplified federated queries across multiple Bio2RDF endpoints.

Conclusions: This coordinated release marks an important milestone for the Bio2RDF open-source linked data framework. Principally, it improves the quality of linked data in the Bio2RDF network and makes it easier to access or recreate the linked data locally. We hope to continue improving the Bio2RDF network of linked data by identifying priority databases and extending vocabulary coverage to additional dataset vocabularies beyond SIO.

© 2013 Callahan et al; licensee BioMed Central Ltd.
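The two technical ideas in the abstract, registry-backed IRI normalization and federated queries over a shared ontology type, can be illustrated with a minimal Python sketch. This is not the Bio2RDF API: the `REGISTRY` contents, the function names, and the endpoint URLs are hypothetical and chosen only for illustration. The sketch assumes the `http://bio2rdf.org/namespace:identifier` IRI pattern and uses the SPARQL 1.1 `SERVICE` keyword to address more than one endpoint in a single query.

```python
# Hypothetical mini-registry: maps namespace spellings to one preferred prefix.
# A real registry would be far larger; these entries are illustrative only.
REGISTRY = {
    "drugbank": "drugbank",
    "DrugBank": "drugbank",
    "sgd": "sgd",
    "SGD": "sgd",
}

BASE_IRI = "http://bio2rdf.org/"


def normalized_iri(namespace: str, identifier: str) -> str:
    """Build one canonical IRI regardless of how the namespace was spelled."""
    if namespace not in REGISTRY:
        raise ValueError(f"namespace not in registry: {namespace!r}")
    return f"{BASE_IRI}{REGISTRY[namespace]}:{identifier}"


def federated_query(endpoints: list[str], type_iri: str) -> str:
    """Compose a SPARQL 1.1 query asking each endpoint for instances of one type.

    Because every dataset maps its own types to the same ontology class,
    the same triple pattern can be reused inside each SERVICE block.
    """
    blocks = [
        f"  {{ SERVICE <{endpoint}> {{ ?s a <{type_iri}> }} }}"
        for endpoint in endpoints
    ]
    body = "\n  UNION\n".join(blocks)
    return f"SELECT ?s WHERE {{\n{body}\n}}"


if __name__ == "__main__":
    print(normalized_iri("DrugBank", "DB00001"))
    # Endpoint URLs and the type IRI below are placeholders, not real services.
    print(federated_query(
        ["http://example.org/sparql/a", "http://example.org/sparql/b"],
        "http://example.org/ontology/SomeType",
    ))
```

Keeping the registry lookup in one function means every script produces the same IRI for the same record, which is what lets links between datasets resolve without per-dataset special cases.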