DIRECT: a system for mining data value conversion rules from disparate data sources

Cited by: 14
Authors
Fan, WG
Lu, HJ
Madnick, SE
Cheung, D
Affiliations
[1] Univ Michigan, Sch Business, Dept Comp & Informat Sci, Ann Arbor, MI 48109 USA
[2] Hong Kong Univ Sci & Technol, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
[3] MIT, Sloan Sch Management, Cambridge, MA 02139 USA
[4] Univ Hong Kong, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
Keywords
data integration; data mining; semantic conflicts; data visualization; statistical analysis; data value conversion;
DOI
10.1016/S0167-9236(02)00006-4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The successful integration of data from autonomous and heterogeneous systems calls for the resolution of semantic conflicts that may be present. Such conflicts are often reflected by discrepancies in attribute values of the same data object. In this paper, we describe a recently developed prototype system, Discovering and REconciling ConflicTs (DIRECT). The system mines data value conversion rules in the process of integrating business data from multiple sources. The system architecture and functional modules are described. The process of discovering conversion rules from sales data of a trading company is presented as an illustrative example. (C) 2002 Elsevier Science B.V. All rights reserved.
Pages: 19-39
Number of pages: 21