Exploiting disjointness axioms to improve semantic similarity measures

Cited by: 13
Authors
Ferreira, Joao D. [1 ]
Hastings, Janna [2 ,3 ,4 ]
Couto, Francisco M. [1 ]
Affiliations
[1] Univ Lisbon, Dept Informat, Fac Ciencias, P-1749016 Lisbon, Portugal
[2] EMBL European Bioinformat Inst, Hinxton CB10 1SD, England
[3] Univ Geneva, Swiss Ctr Affect Sci, CH-1205 Geneva, Switzerland
[4] Swiss Inst Bioinformat, Evolutionary Bioinformat Grp, CH-1015 Lausanne, Switzerland
Keywords
ONTOLOGY; INFORMATION; DATABASE; CHEBI
DOI
10.1093/bioinformatics/btt491
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Representing domain knowledge in biology has traditionally been accomplished by creating simple hierarchies of classes with textual annotations. Recently, expressive ontology languages, such as the Web Ontology Language, have become more widely adopted, supporting axioms that express logical relationships other than class-subclass, e.g. disjointness. This is improving the coverage and validity of the knowledge contained in biological ontologies. However, current semantic tools still need to adapt to this more expressive information. In this article, we propose a method to integrate disjointness axioms, which are being incorporated in real-world ontologies such as the Gene Ontology and the Chemical Entities of Biological Interest (ChEBI) ontology, into semantic similarity, the measure that estimates the closeness in meaning between classes.
Results: We present a modification of the measure of shared information content, which extends the base measure to allow the incorporation of disjointness information. To evaluate our approach, we applied it to several randomly selected datasets extracted from the ChEBI ontology. In 93.8% of these datasets, our measure performed better than the base measure of shared information content. This supports the idea that semantic similarity is more accurate if it extends beyond the hierarchy of classes of the ontology.
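For orientation, the base measure named in the abstract, shared information content, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy ontology `PARENTS` is hypothetical (not taken from ChEBI), the probability of a class is estimated structurally as the fraction of classes it subsumes, and the function names are invented for this example. The disjointness-aware modification is not reproduced here, because the abstract does not specify it.

```python
import math

# Hypothetical toy ontology: class -> list of direct superclasses.
# These classes are illustrative only and are not taken from ChEBI.
PARENTS = {
    "entity": [],
    "acid": ["entity"],
    "base": ["entity"],
    "carboxylic acid": ["acid"],
    "amino acid": ["carboxylic acid", "base"],
    "amine": ["base"],
}


def ancestors(cls):
    """Return cls together with all of its direct and indirect superclasses."""
    seen = {cls}
    stack = [cls]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(cls):
    """IC(c) = -log p(c); p(c) is assumed to be the fraction of classes subsumed by c."""
    subsumed = sum(1 for other in PARENTS if cls in ancestors(other))
    return -math.log(subsumed / len(PARENTS))


def shared_ic(a, b):
    """Base measure: the highest information content among the common ancestors."""
    common = ancestors(a) & ancestors(b)
    return max(information_content(c) for c in common)


if __name__ == "__main__":
    print(shared_ic("amino acid", "amine"))  # most informative shared ancestor is "base"
    print(shared_ic("acid", "base"))         # only the root is shared
```

In this sketch two classes that share only the root get a shared information content of zero, and the value grows as their most informative common ancestor becomes more specific. It uses the class hierarchy alone, which is the limitation the abstract's proposal addresses.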
Pages: 2781-2787
Page count: 7