Substructure Discovery Using Minimum Description Length and Background Knowledge

被引:223
作者
Cook, Diane J. [1 ]
Holder, Lawrence B. [1 ]
机构
[1] Univ Texas Arlington, Dept Comp Sci Engn, Box 19015, Arlington, TX 76019 USA
关键词
Approximation theory - Computational methods - Constraint theory - Data acquisition - Data compression - Database systems - Graph theory - Knowledge based systems;
D O I
10.1613/jair.43
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The ability to identify interesting and repetitive substructures is an essential component to discovering knowledge in structural data. We describe a new version of our Subdue substructure discovery system based on the minimum description length principle. The Subdue system discovers substructures that compress the original data and represent structural concepts in the data. By replacing previously-discovered substructures in the data, multiple passes of Subdue produce a hierarchical description of the structural regularities in the data. Subdue uses a computationally-bounded inexact graph match that identifies similar, but not identical, instances of a substructure and finds an approximate measure of closeness of two substructures when under computational constraints. In addition to the minimum description length principle, other background knowledge can be used by Subdue to guide the search towards more appropriate substructures. Experiments in a variety of domains demonstrate Subdue's ability to find substructures capable of compressing the original data and to discover structural concepts important to the domain
引用
收藏
页码:231 / 255
页数:25
相关论文
共 26 条
[1]  
[Anonymous], 1988, P 5 INT C MACH LEARN
[2]  
Bharat Rao R., 1992, AAAI-92. Proceedings Tenth National Conference on Artificial Intelligence, P717
[3]   Inexact graph matching for structural pattern recognition [J].
Bunke, H. ;
Allermann, G. .
PATTERN RECOGNITION LETTERS, 1983, 1 (04) :245-253
[4]  
Conklin D., 1992, P 9 INT C MACH LEARN, P111
[5]  
Derthick M., 1991, AAAI-91. Proceedings Ninth National Conference on Artificial Intelligence, P565
[6]  
Fisher D. H., 1987, Machine Learning, V2, P139, DOI 10.1007/BF00114265
[7]  
Fu K.S., 2019, APPL PATTERN RECOGNI
[8]   DISCOVERY OF INEXACT CONCEPTS FROM STRUCTURAL DATA [J].
HOLDER, LB ;
COOK, DJ .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 1993, 5 (06) :992-994
[9]  
JELTSCH E, 1991, LECT NOTES COMPUT SC, V532, P461, DOI 10.1007/BFb0017406
[10]   CONSTRUCTING SIMPLE STABLE DESCRIPTIONS FOR IMAGE PARTITIONING [J].
LECLERC, YG .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 1989, 3 (01) :73-102