Data Mining with Big Data

Cited by: 1596
Authors
Wu, Xindong [1, 2]
Zhu, Xingquan [3 ]
Wu, Gong-Qing [1 ]
Ding, Wei [4 ]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei, Peoples R China
[2] Univ Vermont, Dept Comp Sci, Burlington, VT 05405 USA
[3] Florida Atlantic Univ, Dept Comp & Elect Engn & Comp Sci, Boca Raton, FL 33341 USA
[4] Univ Massachusetts, Dept Comp Sci, Boston, MA 02125 USA
Funding
National Natural Science Foundation of China; Australian Research Council; US National Science Foundation
Keywords
Big Data; data mining; heterogeneity; autonomous sources; complex and evolving associations; ALGORITHMS; KNOWLEDGE; PRIVACY; CLASSIFICATION; MAPREDUCE; BEHAVIOR;
DOI
10.1109/TKDE.2013.109
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Big Data concern large-volume, complex, growing data sets with multiple, autonomous sources. With the fast development of networking, data storage, and the data collection capacity, Big Data are now rapidly expanding in all science and engineering domains, including physical, biological and biomedical sciences. This paper presents a HACE theorem that characterizes the features of the Big Data revolution, and proposes a Big Data processing model, from the data mining perspective. This data-driven model involves demand-driven aggregation of information sources, mining and analysis, user interest modeling, and security and privacy considerations. We analyze the challenging issues in the data-driven model and also in the Big Data revolution.
Pages: 97-107
Page count: 11