Fast Perceptron Decision Tree Learning from Evolving Data Streams

Cited by: 63
Authors
Bifet, Albert [1]
Holmes, Geoff [1]
Pfahringer, Bernhard [1]
Frank, Eibe [1]
Affiliations
[1] Univ Waikato, Hamilton, New Zealand
Source
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT II, PROCEEDINGS | 2010 / Vol. 6119
DOI
10.1007/978-3-642-13672-6_30
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Mining of data streams must balance three evaluation dimensions: accuracy, time and memory. Excellent accuracy on data streams has been obtained with Naive Bayes Hoeffding Trees (Hoeffding Trees with naive Bayes models at the leaf nodes), albeit with increased runtime compared to standard Hoeffding Trees. In this paper, we show that runtime can be reduced by replacing naive Bayes with perceptron classifiers, while maintaining highly competitive accuracy. We also show that accuracy can be increased even further by combining majority vote, naive Bayes, and perceptrons. We evaluate four perceptron-based learning strategies and compare them against appropriate baselines: simple perceptrons, Perceptron Hoeffding Trees, hybrid Naive Bayes Perceptron Trees, and bagged versions thereof. We implement a perceptron that uses the sigmoid activation function instead of the threshold activation function and optimizes the squared error, with one perceptron per class value. We test our methods by performing an evaluation study on synthetic and real-world datasets comprising up to ten million examples.
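The perceptron described in the abstract (sigmoid activation, squared-error objective, one perceptron per class value) can be illustrated with a minimal online sketch. This is not the authors' implementation: the class name `SigmoidPerceptron`, the method `learn_one`, the bias term, and the default `learning_rate` are all assumptions made for illustration, and the update is the standard delta rule obtained by differentiating the squared error through the sigmoid.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class SigmoidPerceptron:
    """Hypothetical online learner: one sigmoid perceptron per class value.

    Each perceptron is trained by stochastic gradient descent on the
    squared error between its output and a 0/1 target for its class.
    """

    def __init__(self, n_features, n_classes, learning_rate=1.0):
        self.learning_rate = learning_rate  # assumed value, not from the paper
        # One weight vector per class; the extra slot is a bias weight.
        self.weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]

    def _outputs(self, x):
        xb = list(x) + [1.0]  # append constant bias input
        return [
            sigmoid(sum(w_i * x_i for w_i, x_i in zip(w, xb)))
            for w in self.weights
        ]

    def predict(self, x):
        """Return the class whose perceptron gives the largest output."""
        outs = self._outputs(x)
        return max(range(len(outs)), key=outs.__getitem__)

    def learn_one(self, x, label):
        """Update every per-class perceptron with a single example."""
        xb = list(x) + [1.0]
        outs = self._outputs(x)
        for c, w in enumerate(self.weights):
            target = 1.0 if c == label else 0.0
            o = outs[c]
            # Gradient of 0.5 * (target - o)^2 w.r.t. the weights is
            # -(target - o) * o * (1 - o) * x, so we step the other way.
            delta = self.learning_rate * (target - o) * o * (1.0 - o)
            for i, x_i in enumerate(xb):
                w[i] += delta * x_i
```

In a stream setting, each arriving example would first be passed to `predict` and then to `learn_one`, so the model is always tested on data it has not yet seen.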
Pages: 299-310
Page count: 12