An empirical study of domain knowledge and its benefits to substructure discovery

Cited by: 12
Authors
Djoko, S
Cook, DJ
Holder, LB
Affiliation
[1] Department of Computer Science and Engineering, University of Texas at Arlington, Box 19015, Arlington
Funding
National Aeronautics and Space Administration (NASA);
Keywords
data mining; minimum description length principle; data compression; inexact graph match; domain knowledge;
DOI
10.1109/69.617051
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Discovering repetitive, interesting, and functional substructures in a structural database improves the ability to interpret and compress the data. However, scientists working with a database in their area of expertise often search for predetermined types of structures or for structures exhibiting characteristics specific to the domain. This paper presents a method for guiding the discovery process with domain-specific knowledge. In this paper, the SUBDUE discovery system is used to evaluate the benefits of using domain knowledge to guide the discovery process. Domain knowledge is incorporated into SUBDUE following a single general methodology to guide the discovery process. Results show that domain-specific knowledge improves the search for substructures that are useful to the domain and leads to greater compression of the data. To illustrate these benefits, examples and experiments from the computer programming, computer-aided design circuit, and artificially generated domains are presented.
Pages: 575-586
Page count: 12