A Semi-Random Multiple Decision-Tree Algorithm for Mining Data Streams

Cited by: 6
Authors
Xuegang Hu (胡学钢) [1]
Peipei Li (李培培) [1]
Xindong Wu (吴信东) [2]
Gongqing Wu (吴共庆) [1]
Affiliations
[1] School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China
[2] School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China; Department of Computer Science, University of Vermont, Burlington, VT, USA
Keywords
data streams; Naive Bayes; random decision trees
DOI
Not available
Chinese Library Classification number
TP311.13
Discipline classification code
1201
Abstract
Mining streaming data is an active topic in data mining. When classifying data streams, traditional decision-tree algorithms such as ID3 and C4.5 are relatively inefficient in both time and space because of the characteristics of streaming data. Random decision trees offer advantages in time and space. This paper proposes SRMTDS (Semi-Random Multiple decision Trees for Data Streams), an incremental algorithm for mining data streams based on random decision trees. SRMTDS uses the Hoeffding bound inequality to choose the minimum number of split examples, a heuristic method to compute the information gain for obtaining the split thresholds of numerical attributes, and a Naive Bayes classifier to estimate the class labels of tree leaves. Our extensive experimental study shows that SRMTDS improves on VFDTc, a state-of-the-art decision-tree algorithm for classifying data streams, in time, space, accuracy, and noise tolerance.
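The abstract does not spell out how SRMTDS applies the Hoeffding bound, so the following is only a minimal sketch of the standard inequality as it is commonly used in stream decision trees: with probability 1 - delta, the observed mean of n samples of a variable with range R lies within epsilon = sqrt(R^2 ln(1/delta) / (2n)) of the true mean. The function names and the parameter values in the usage note are hypothetical and are not taken from the paper.

```python
import math


def hoeffding_epsilon(value_range, delta, n):
    """Hoeffding bound: deviation epsilon guaranteed with probability 1 - delta
    after n independent observations of a variable whose range is value_range."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def min_examples(value_range, delta, epsilon):
    """Smallest n for which the Hoeffding bound is at most epsilon.

    Obtained by solving epsilon = sqrt(R^2 ln(1/delta) / (2n)) for n.
    This is an illustrative helper, not the paper's own procedure.
    """
    return math.ceil(value_range ** 2 * math.log(1.0 / delta) / (2.0 * epsilon ** 2))
```

For instance, calling `min_examples(1.0, 1e-7, 0.05)` gives the number of examples a leaf would need to observe before an estimate with range 1 is within 0.05 of its true value at confidence 1 - 1e-7, under the assumptions above.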
Pages: 711-724
Page count: 14