Metric-based stochastic conceptual clustering for ontologies

Cited by: 11
Authors
Fanizzi, Nicola [1]
d'Amato, Claudia [1]
Esposito, Floriana [1]
Affiliations
[1] Univ Bari, Dipartimento Informat, I-70125 Bari, Italy
Keywords
Conceptual clustering; Unsupervised learning; Metric learning; Genetic programming; Evolutionary algorithms; Description logics; Randomized optimization; Concept drift; DL
DOI
10.1016/j.is.2009.03.008
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A conceptual clustering framework is presented which can be applied to multi-relational knowledge bases storing resource annotations expressed in the standard languages for the Semantic Web. The framework adopts an effective and language-independent family of semi-distance measures defined for the space of individual resources. These measures are based on a finite number of dimensions corresponding to a committee of discriminating features represented by concept descriptions. The clustering algorithm expresses the possible clusterings as variable-length strings of central elements (medoids, w.r.t. the given metric). The method performs a stochastic search in the space of possible clusterings, exploiting a technique based on genetic programming. Moreover, the number of clusters is not required as a parameter: a natural number of clusters is determined autonomously, since the search spans a space of strings of different lengths. Experiments with real ontologies demonstrate the feasibility of the clustering method and its effectiveness in terms of standard validity indices. The framework is completed by a subsequent phase, in which a newly constructed intensional definition, expressed in the adopted concept language, can be assigned to each cluster. Finally, two possible extensions are proposed. One allows the induction of hierarchies of clusters. The other applies clustering to concept drift and novelty detection in the context of ontologies. © 2009 Elsevier B.V. All rights reserved.
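The abstract does not give formulas, so the following is only a minimal sketch of how a feature-committee semi-distance and a medoid-string clustering could look. Everything here is an assumption made for illustration: the projection values (1.0, 0.0, 0.5 for member, non-member, unknown), the Minkowski-style form with 1/m normalization, and the names `semi_distance`, `assign_to_medoids`, and `total_dispersion`. The genetic-programming search itself is not shown; the sketch only covers what such a search would evaluate for one candidate string of medoids.

```python
from typing import Dict, List, Sequence

# Hypothetical encoding: one value per feature concept in the committee.
# 1.0 = individual known to belong to the feature concept,
# 0.0 = known to belong to its complement,
# 0.5 = membership unknown (an assumption to model open-world reasoning).
Projections = Dict[str, List[float]]


def semi_distance(pa: Sequence[float], pb: Sequence[float], p: int = 2) -> float:
    """Minkowski-style dissimilarity over feature projections, scaled by 1/m.

    It is a semi-distance: two distinct individuals with identical
    projections get distance 0.
    """
    if len(pa) != len(pb) or not pa:
        raise ValueError("projection vectors must be non-empty and equal length")
    total = sum(abs(x - y) ** p for x, y in zip(pa, pb))
    return (total ** (1.0 / p)) / len(pa)


def assign_to_medoids(proj: Projections, medoids: Sequence[str],
                      p: int = 2) -> Dict[str, List[str]]:
    """Assign every individual to its nearest medoid.

    `medoids` plays the role of one variable-length string of central
    elements; its length is the number of clusters.
    """
    clusters: Dict[str, List[str]] = {m: [] for m in medoids}
    for name, vec in proj.items():
        nearest = min(medoids, key=lambda m: semi_distance(vec, proj[m], p))
        clusters[nearest].append(name)
    return clusters


def total_dispersion(proj: Projections, medoids: Sequence[str], p: int = 2) -> float:
    """Sum of distances from each individual to its medoid (lower is tighter)."""
    clusters = assign_to_medoids(proj, medoids, p)
    return sum(semi_distance(proj[i], proj[m], p)
               for m, members in clusters.items() for i in members)


# Toy data: four individuals projected on a committee of three features.
proj: Projections = {
    "a": [1.0, 0.0, 0.5],
    "b": [1.0, 0.0, 0.5],
    "c": [0.0, 1.0, 0.5],
    "d": [0.0, 1.0, 1.0],
}

if __name__ == "__main__":
    print(assign_to_medoids(proj, ["a", "c"]))
    print(total_dispersion(proj, ["a"]), total_dispersion(proj, ["a", "c"]))
```

Because candidate clusterings are just lists of medoid names, strings of different lengths can be compared with the same scoring function. A real fitness function would also need to penalize long strings, since dispersion alone always favors more medoids.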
Pages: 792-806
Number of pages: 15