Mining constrained frequent itemsets from distributed uncertain data

Citations: 54
Authors
Cuzzocrea, Alfredo [1 ,2 ]
Leung, Carson Kai-Sang [3 ]
MacKinnon, Richard Kyle [3 ]
Affiliations
[1] ICAR CNR, I-87036 Arcavacata Di Rende, CS, Italy
[2] Univ Calabria, I-87036 Arcavacata Di Rende, CS, Italy
[3] Univ Manitoba, Dept Comp Sci, Winnipeg, MB R3T 2N2, Canada
Source
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2014 / Vol. 37
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Data mining; Frequent pattern mining; Advanced data-intensive computing algorithms; Constraints; Distributed computing; PARALLEL;
DOI
10.1016/j.future.2013.10.026
Chinese Library Classification (CLC) number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Nowadays, high volumes of massive data can be generated from various sources (e.g., sensor data from environmental surveillance). Many existing distributed frequent itemset mining algorithms do not allow users to express the itemsets to be mined according to their intention via the use of constraints. Consequently, these unconstrained mining algorithms can yield numerous itemsets that are not interesting to users. Moreover, due to inherited measurement inaccuracies and/or network latencies, the data are often riddled with uncertainty. These call for both constrained mining and uncertain data mining. In this journal article, we propose a data-intensive computer system for tree-based mining of frequent itemsets that satisfy user-defined constraints from a distributed environment such as a wireless sensor network of uncertain data. (C) 2013 Elsevier B.V. All rights reserved.
Pages: 117-126
Page count: 10