On comparing classifiers: Pitfalls to avoid and a recommended approach

Cited by: 573
Authors
Salzberg, SL [1]
Affiliations
[1] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
classification; comparative studies; statistical methods;
DOI
10.1023/A:1009752403260
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
An important component of many data mining projects is finding a good classification algorithm, a process that requires very careful thought about experimental design. If not done very carefully, comparative studies of classification and other types of algorithms can easily result in statistically invalid conclusions. This is especially true when one is using data mining techniques to analyze very large databases, which inevitably contain some statistically unlikely data. This paper describes several phenomena that can, if ignored, invalidate an experimental comparison. These phenomena and the conclusions that follow apply not only to classification but to computational experiments in almost any aspect of data mining. The paper also discusses why comparative analysis is more important in evaluating some types of algorithms than for others, and provides some suggestions about how to avoid the pitfalls suffered by many experimental studies.
Pages: 317-328
Page count: 12