Computing iceberg concept lattices with TITANIC

被引:270
作者
Stumme, G
Taouil, R
Bastide, Y
Pasquier, N
Lakhal, L
机构
[1] Univ Karlsruhe, Inst Angew Informat & Formale Beschreibungsverfah, D-76128 Karlsruhe, Germany
[2] Inst Natl Rech Informat & Automat Lorraine, LORIA, F-54506 Vandoeuvre Les Nancy, France
[3] Univ Clermont Ferrand, Lab Informat, LIMOS, F-63177 Aubiere, France
[4] Univ Nice, UNSA, CNRS, UPRESA 6070, F-06903 Sophia Antipolis, France
[5] Univ Mediterranee, LIM, CNRS, FRE 2246, F-13288 Marseille 9, France
关键词
knowledge discovery; database analysis; formal concept analysis; closure systems; lattices; algorithms;
D O I
10.1016/S0169-023X(02)00057-5
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We introduce the notion of iceberg concept lattices and show their use in knowledge discovery in databases. Iceberg lattices are a conceptual clustering method, which is well suited for analyzing very large databases. They also serve as a condensed representation of frequent itemsets, as starting point for computing bases of association rules, and as a visualization method for association rules. Iceberg concept lattices are based on the theory of Formal Concept Analysis, a mathematical theory with applications in data analysis, information retrieval, and knowledge discovery. We present a new algorithm called TITANIC for computing (iceberg) concept lattices. It is based on data mining techniques with a level-wise approach. In fact, TITANIC can be used for a more general problem: Computing arbitrary closure systems when the closure operator comes along with a so-called weight function. The use of weight functions for computing closure systems has not been discussed in the literature up to now. Applications providing such a weight function include association rule mining, functional dependencies in databases, conceptual clustering, and ontology engineering. The algorithm is experimentally evaluated and compared with Ganter's Next-Closure algorithm. The evaluation shows an important gain in efficiency, especially for weakly correlated data. (C) 2002 Elsevier Science B.V. All rights reserved.
引用
收藏
页码:189 / 222
页数:34
相关论文
共 51 条
[1]  
Agrawal R., 1993, SIGMOD Record, V22, P207, DOI 10.1145/170036.170072
[2]  
Agrawal R., 1994, P 20 INT C VER LARG, P478
[3]  
[Anonymous], P 1998 ACM SIGMOD IN
[4]  
[Anonymous], 1993, P ICML
[5]  
[Anonymous], 1991, MATHEMATIQUES INFORM
[6]  
ARNAULD A, 1668, LOGIQUE ART PENSER C
[7]  
Bastide I, 2000, LECT NOTES ARTIF INT, V1861, P972
[8]  
BASTIDE Y, 2000, SIGKDD EXPLORATIONS, V2, P71
[9]  
Becker K, 2000, LECT NOTES ARTIF INT, V1937, P352
[10]  
Cole R, 2000, LECT NOTES ARTIF INT, V1867, P438