An adaptive ensemble classifier for mining concept drifting data streams

Cited by: 123
Authors
Farid, Dewan Md. [1]
Zhang, Li [1]
Hossain, Alamgir [1]
Rahman, Chowdhury Mofizur [2]
Strachan, Rebecca [1]
Sexton, Graham [1]
Dahal, Keshav [3]
Affiliations
[1] Northumbria Univ, Dept Comp Sci & Digital Technol, Computat Intelligence Grp, Newcastle Upon Tyne NE1 8ST, Tyne & Wear, England
[2] United Int Univ, Dept Comp Sci & Engn, Dhaka, Bangladesh
[3] Univ Bradford, Sch Comp Informat & Media, Artificial Intelligence Res Grp, Bradford BD7 1DP, W Yorkshire, England
Keywords
Adaptive ensembles; Concept drift; Clustering; Data streams; Decision trees; Novel classes; Novelty detection
DOI
10.1016/j.eswa.2013.05.001
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
It is challenging to use traditional data mining techniques for real-time data stream classification. Existing mining classifiers need to be updated frequently to adapt to changes in data streams. To address this issue, we propose an adaptive ensemble approach for classification and novel class detection in concept-drifting data streams. The proposed approach uses traditional mining classifiers and updates the ensemble model automatically so that it represents the most recent concepts in the data stream. For novel class detection, we rely on the idea that data points belonging to the same class should be close to each other and far from data points belonging to other classes. If a data point is well separated from the existing data clusters, it is identified as a novel class instance. We tested the performance of the proposed stream classification model against that of existing mining algorithms using real benchmark datasets from the UCI (University of California, Irvine) machine learning repository. The experimental results indicate that our approach is flexible and robust in novel class detection under concept drift and outperforms traditional classification models in challenging real-life data stream applications. (C) 2013 Elsevier Ltd. All rights reserved.
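The separation idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes, purely for illustration, that each known class is summarised by a single cluster (centroid plus the maximum member distance as radius) and that a point is a novel-class candidate when it falls outside every cluster. The names `build_clusters`, `is_novel_candidate`, and the `slack` parameter are hypothetical.

```python
import math


def build_clusters(points_by_class):
    """Summarise each known class as one cluster: (centroid, radius).

    Assumption: one cluster per class, radius = farthest member distance.
    """
    clusters = {}
    for label, points in points_by_class.items():
        dim = len(points[0])
        centroid = [sum(p[i] for p in points) / len(points) for i in range(dim)]
        radius = max(math.dist(p, centroid) for p in points)
        clusters[label] = (centroid, radius)
    return clusters


def is_novel_candidate(point, clusters, slack=1.0):
    """Return True if the point lies outside every cluster's scaled radius."""
    return all(
        math.dist(point, centroid) > slack * radius
        for centroid, radius in clusters.values()
    )


# Two toy classes in 2-D (made-up data, for illustration only).
known = {
    "a": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    "b": [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0)],
}
clusters = build_clusters(known)

print(is_novel_candidate((0.5, 0.5), clusters))  # inside class "a"
print(is_novel_candidate((5.0, 5.0), clusters))  # far from both classes
```

In a streaming setting, a point flagged this way would typically be buffered rather than declared novel immediately, since a single distant point may be noise; that buffering step is left out of the sketch.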
Pages: 5895-5906
Page count: 12