Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power

Cited by: 1703
Authors
Garcia, Salvador [1]
Fernandez, Alberto [2]
Luengo, Julian [2]
Herrera, Francisco [2]
Affiliations
[1] Univ Jaen, Dept Comp Sci, Jaen, Spain
[2] Univ Granada, Dept Comp Sci & Artificial Intelligence, E-18071 Granada, Spain
Keywords
Statistical analysis; Computational intelligence; Data mining; Nonparametric statistics; Multiple comparisons procedures; Genetics-based machine learning; Fuzzy classification systems; Evolutionary algorithms; Statistical comparisons; Neural networks; Classifiers; Performance; Selection
DOI
10.1016/j.ins.2009.12.010
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Experimental analysis of the performance of a proposed method is a crucial and necessary task in an investigation. In this paper, we focus on the use of nonparametric statistical inference for analyzing the results obtained in an experiment design in the field of computational intelligence. We present a case study which involves a set of techniques in classification tasks and we study a set of nonparametric procedures useful to analyze the behavior of a method with respect to a set of algorithms, such as the framework in which a new proposal is developed. Particularly, we discuss some basic and advanced nonparametric approaches which improve the results offered by the Friedman test in some circumstances. A set of post hoc procedures for multiple comparisons is presented together with the computation of adjusted p-values. We also perform an experimental analysis for comparing their power, with the objective of detecting the advantages and disadvantages of the statistical tests described. We found that some aspects such as the number of algorithms, number of data sets and differences in performance offered by the control method are very influential in the statistical tests studied. Our final goal is to offer a complete guideline for the use of nonparametric statistical procedures for performing multiple comparisons in experimental studies. (C) 2009 Elsevier Inc. All rights reserved.
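As a rough illustration of the workflow the abstract describes (a Friedman-type ranking of several algorithms over several data sets, followed by post hoc comparisons against a control method with adjusted p-values), the following Python sketch may help. It is not taken from the paper: the function names and the toy accuracy table are hypothetical, the Friedman statistic is given without a tie correction, and Holm's step-down procedure is used only as one familiar example of a post hoc adjustment, since the abstract does not name the procedures studied.

```python
import math

def average_ranks(scores):
    """scores[d][a] = score of algorithm a on data set d (higher is better).
    Returns each algorithm's average rank (1 = best); ties share the mean rank."""
    n, k = len(scores), len(scores[0])
    totals = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda a: -row[a])
        i = 0
        while i < k:
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1
            mean_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
            for a in order[i:j + 1]:
                totals[a] += mean_rank
            i = j + 1
    return [t / n for t in totals]

def friedman_statistic(avg_ranks, n):
    """Friedman chi-square statistic (no tie correction), k - 1 degrees of freedom."""
    k = len(avg_ranks)
    return 12 * n / (k * (k + 1)) * (
        sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4
    )

def control_p_values(avg_ranks, n, control=0):
    """Two-sided normal-approximation p-values for each algorithm vs. the control."""
    k = len(avg_ranks)
    se = math.sqrt(k * (k + 1) / (6 * n))
    p = {}
    for a, r in enumerate(avg_ranks):
        if a != control:
            z = (r - avg_ranks[control]) / se
            p[a] = math.erfc(abs(z) / math.sqrt(2))
    return p

def holm_adjust(p_values):
    """Holm step-down adjusted p-values for a dict {hypothesis: p}."""
    items = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(items)
    adjusted, running = {}, 0.0
    for i, (name, p) in enumerate(items):
        running = max(running, (m - i) * p)  # enforce monotonicity
        adjusted[name] = min(1.0, running)
    return adjusted

# Hypothetical toy data: 4 data sets (rows) x 3 algorithms (columns);
# column 0 plays the role of the control method.
toy_scores = [
    [0.90, 0.80, 0.70],
    [0.85, 0.80, 0.60],
    [0.95, 0.70, 0.75],
    [0.90, 0.85, 0.80],
]
ranks = average_ranks(toy_scores)
stat = friedman_statistic(ranks, n=len(toy_scores))
adjusted = holm_adjust(control_p_values(ranks, n=len(toy_scores)))
print(ranks, stat, adjusted)
```

With so few data sets the normal approximation is crude; the toy table only shows how the pieces fit together, not a realistic experimental design.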
Pages: 2044-2064
Page count: 21