On the Accuracy of Meta-learning for Scalable Data Mining

Cited by: 66
Authors
Chan P.K. [1]
Stolfo S.J. [2]
Affiliations
[1] Computer Science, Florida Institute of Technology, Melbourne
[2] Department of Computer Science, Columbia University, New York
Funding
National Science Foundation (USA)
Keywords
Classifiers; Data mining; Machine learning; Meta-learning; Scalability;
DOI
10.1023/A:1008640732416
Abstract
In this paper, we describe a general approach to scaling data mining applications that we have come to call meta-learning. Meta-learning refers to a general strategy that seeks to learn how to combine a number of separate learning processes in an intelligent fashion. We desire a meta-learning architecture that exhibits two key behaviors. First, the meta-learning strategy must produce an accurate final classification system. This means that a meta-learning architecture must produce a final outcome that is at least as accurate as a conventional learning algorithm applied to all available data. Second, it must be fast, relative to an individual sequential learning algorithm when applied to massive databases of examples, and operate in a reasonable amount of time. This paper focuses primarily on issues related to the accuracy and efficacy of meta-learning as a general strategy. A number of empirical results are presented demonstrating that meta-learning is technically feasible in wide-area, network computing environments.
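The abstract describes the strategy only in general terms. As a rough illustration of the idea, the sketch below trains one base classifier per disjoint data partition and then learns, from a separate validation set, how to combine their predictions. Everything in it is an assumption made for illustration and is not taken from the paper: the names train_centroid, train_combiner and meta_learn, the nearest-centroid base learner, the lookup-table meta-learner, and the toy data.

```python
from collections import Counter, defaultdict


def train_centroid(rows):
    """Hypothetical base learner: nearest-centroid classifier on one numeric feature."""
    sums, counts = defaultdict(float), Counter()
    for x, y in rows:
        sums[y] += x
        counts[y] += 1
    centroids = {y: sums[y] / counts[y] for y in counts}
    # Predict the label whose centroid is closest to x.
    return lambda x: min(centroids, key=lambda y: abs(x - centroids[y]))


def train_combiner(base_classifiers, validation_rows):
    """Hypothetical meta-learner: map each tuple of base predictions
    to the true label most often seen with it on the validation set."""
    table = defaultdict(Counter)
    for x, y in validation_rows:
        key = tuple(clf(x) for clf in base_classifiers)
        table[key][y] += 1
    lookup = {key: seen.most_common(1)[0][0] for key, seen in table.items()}

    def predict(x):
        key = tuple(clf(x) for clf in base_classifiers)
        if key in lookup:
            return lookup[key]
        # Unseen combination of base predictions: fall back to plain voting.
        return Counter(key).most_common(1)[0][0]

    return predict


def meta_learn(rows, n_partitions, validation_rows):
    """Train one base classifier per disjoint partition, then learn to combine them."""
    partitions = [rows[i::n_partitions] for i in range(n_partitions)]
    base_classifiers = [train_centroid(part) for part in partitions]
    return train_combiner(base_classifiers, validation_rows)


# Toy data (made up for this sketch): label is "low" below 10, "high" otherwise.
train = [(float(i), "low" if i < 10 else "high") for i in range(20)]
valid = [(2.5, "low"), (8.5, "low"), (11.5, "high"), (17.5, "high")]
model = meta_learn(train, n_partitions=2, validation_rows=valid)
```

In this sketch each base learner sees only its own partition, so the base classifiers could in principle be trained independently and in parallel; only their predictions on the validation set are needed to build the combiner.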
Pages: 5-28
Page count: 23
Related papers
23 in total
[1] Ali K., Pazzani M., Error reduction through learning multiple descriptions, Machine Learning, (1996)
[2] Breiman L., Friedman J.H., Olshen R.A., Stone C.J., Classification and Regression Trees, (1984)
[3] Buntine W., Caruana R., Introduction to IND and Recursive Partitioning, (1991)
[4] Catlett J., Megainduction: A test flight, Proc. Eighth Intl. Work. Machine Learning, pp. 596-599, (1991)
[5] Chan P., Stolfo S., Experiments on multistrategy learning by meta-learning, Proc. Second Intl. Conf. Info. Know. Manag., pp. 314-323, (1993)
[6] Chan P., Stolfo S., Meta-learning for multistrategy and parallel learning, Proc. Second Intl. Work. on Multistrategy Learning, pp. 150-165, (1993)
[7] Chan P., Stolfo S., Toward parallel and distributed learning by meta-learning, Working Notes AAAI Work. Know. Disc. Databases, pp. 227-240, (1993)
[8] Chan P., Stolfo S., Scaling learning by meta-learning over disjoint and partially replicated data, Proc. Ninth Florida AI Research Symposium, pp. 151-155, (1996)
[9] Clark P., Niblett T., The CN2 induction algorithm, Machine Learning, 3, pp. 261-285, (1989)
[10] Cost S., Salzberg S., A weighted nearest neighbor algorithm for learning with symbolic features, Machine Learning, 10, pp. 57-78, (1993)