A parallel incremental extreme SVM classifier

Cited by: 40
Authors
He, Qing [1]
Du, Changying [1,2]
Wang, Qun [1,2]
Zhuang, Fuzhen [1,2]
Shi, Zhongzhi [1]
Affiliations
[1] Chinese Acad Sci, Key Lab Intelligent Informat Proc, Inst Comp Technol, Beijing 100190, Peoples R China
[2] Chinese Acad Sci, Grad Sch, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Parallel extreme SVM (PESVM); MapReduce; Incremental extreme SVM (IESVM); Parallel incremental extreme SVM (PIESVM); LEARNING-MACHINE; ALGORITHM; MAPREDUCE; NETWORKS;
DOI
10.1016/j.neucom.2010.11.036
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The recently proposed classification algorithm extreme SVM (ESVM) has been shown to provide very good generalization performance in relatively short time. However, its computation is highly intensive, which makes it unsuitable for large-scale data sets. We therefore propose an efficient parallel ESVM (PESVM) built on the MapReduce parallel programming framework. Furthermore, we observe that when new training data arrive, it is wasteful for ESVM to retrain a new model on all training data (both old and newly arrived). We therefore develop an incremental learning algorithm for ESVM (IESVM), which meets the online-learning requirement of updating the existing model. We also provide a parallel version of IESVM (PIESVM), which addresses the large-scale problem and the online problem at the same time. The experimental results show that the proposed parallel algorithms not only handle large-scale data sets but also scale well in terms of speedup, sizeup and scaleup. It is also worth mentioning that PESVM, IESVM and PIESVM are much more efficient than ESVM while obtaining exactly the same solutions as ESVM. (C) 2011 Elsevier B.V. All rights reserved.
Pages: 2532-2540
Number of pages: 9
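The abstract's claim that the parallel and incremental variants obtain exactly the same solution as batch training can be illustrated with a small sketch. It rests on an assumption not stated in the abstract: that ESVM reduces to a regularized least-squares system in a random hidden-layer feature space, so that the matrices H^T H and H^T y are sums over data chunks. All names here (`hidden_features`, `chunk_stats`, `solve`, `nu`) are hypothetical, and the code is not the authors' implementation or a MapReduce job.

```python
import numpy as np

def hidden_features(X, W, b):
    # Assumed random feature map: sigmoid(X W + b) plus a constant bias column.
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.hstack([H, np.ones((H.shape[0], 1))])

def chunk_stats(X, y, W, b):
    # "Map" step: per-chunk sufficient statistics, additive across chunks.
    H = hidden_features(X, W, b)
    return H.T @ H, H.T @ y

def solve(A, c, nu):
    # Assumed regularized least-squares solve: (I / nu + H^T H) beta = H^T y.
    return np.linalg.solve(A + np.eye(A.shape[0]) / nu, c)

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = np.where(X[:, 0] + 0.5 * X[:, 1] >= 0, 1.0, -1.0)
W = rng.normal(size=(5, 20))   # random hidden-layer weights, fixed once
b = rng.normal(size=20)
nu = 10.0

# Batch training on all 600 points.
A_full, c_full = chunk_stats(X, y, W, b)
beta_batch = solve(A_full, c_full, nu)

# "Parallel" training on the first 400 points: sum statistics of 4 chunks.
parts = [chunk_stats(Xc, yc, W, b)
         for Xc, yc in zip(np.array_split(X[:400], 4),
                           np.array_split(y[:400], 4))]
A_old = sum(p[0] for p in parts)
c_old = sum(p[1] for p in parts)

# Incremental update: add statistics of the 200 new points, then re-solve.
A_new, c_new = chunk_stats(X[400:], y[400:], W, b)
beta_inc = solve(A_old + A_new, c_old + c_new, nu)

max_diff = float(np.max(np.abs(beta_batch - beta_inc)))
```

Under these assumptions the old data never need to be revisited: only the small accumulated matrices are stored, and `beta_inc` agrees with `beta_batch` up to floating-point rounding.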