An accurate comparison of methods for quantifying variable importance in artificial neural networks using simulated data

Cited by: 734
Authors
Olden, JD [1]
Joy, MK
Death, RG
Affiliations
[1] Colorado State Univ, Dept Biol, Grad Degree Program Ecol, Ft Collins, CO 80523 USA
[2] Massey Univ, Nat Resources Inst, Palmerston North, New Zealand
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
statistical models; explanatory power; connection weights; Garson's algorithm; sensitivity analysis;
DOI
10.1016/j.ecolmodel.2004.03.013
Chinese Library Classification (CLC) number
Q14 [Ecology (Bioecology)];
Discipline classification code
071012 ; 0713 ;
Abstract
Artificial neural networks (ANNs) are receiving greater attention in the ecological sciences as a powerful statistical modeling technique; however, they have also been labeled a "black box" because they are believed to provide little explanatory insight into the contributions of the independent variables in the prediction process. A recent paper published in Ecological Modelling [Review and comparison of methods to study the contribution of variables in artificial neural network models, Ecol. Model. 160 (2003) 249-264] addressed this concern by providing a comprehensive comparison of eight different methodologies for estimating variable importance in neural networks that are commonly used in ecology. Unfortunately, comparisons of the different methodologies were based on an empirical dataset, which precludes the ability to establish generalizations regarding the true accuracy and precision of the different approaches because the true importance of the variables is unknown. Here, we provide a more appropriate comparison of the different methodologies by using Monte Carlo simulations with data exhibiting defined (and consequently known) numeric relationships. Our results show that a Connection Weight Approach that uses raw input-hidden and hidden-output connection weights in the neural network provides the best methodology for accurately quantifying variable importance and should be favored over the other approaches commonly used in the ecological literature. Average similarity between true and estimated ranked variable importance using this approach was 0.92, whereas, similarity coefficients ranged between 0.28 and 0.74 for the other approaches. Furthermore, the Connection Weight Approach was the only method that consistently identified the correct ranked importance of all predictor variables, whereas, the other methods either only identified the first few important variables in the network or no variables at all. 
The most notable result was that Garson's Algorithm was the poorest performing approach, yet is the most commonly used in the ecological literature. In conclusion, this study provides a robust comparison of different methodologies for assessing variable importance in neural networks that can be generalized to other data and from which valid recommendations can be made for future studies. (C) 2004 Elsevier B.V. All rights reserved.
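The abstract names the two methods but gives no formulas. The sketch below follows the way they are commonly described for a network with one hidden layer and one output neuron. That architecture, the function names, and the example weights are all illustrative assumptions, not taken from the paper. The Connection Weight Approach sums the raw (signed) products of input-hidden and hidden-output weights, while Garson's algorithm works with absolute values and normalizes them to relative shares.

```python
def connection_weight_importance(w_ih, w_ho):
    """Signed importance per input: sum over hidden units of w_ih[i][h] * w_ho[h].

    w_ih[i][h] is the weight from input i to hidden unit h;
    w_ho[h] is the weight from hidden unit h to the single output.
    """
    return [sum(w * v for w, v in zip(row, w_ho)) for row in w_ih]


def garson_importance(w_ih, w_ho):
    """Relative importance per input (shares summing to 1), from absolute weights.

    Assumes every hidden unit has at least one non-zero weight product,
    otherwise the per-hidden-unit normalization divides by zero.
    """
    n_in, n_hid = len(w_ih), len(w_ho)
    # Absolute contribution of input i routed through hidden unit h.
    contrib = [[abs(w_ih[i][h] * w_ho[h]) for h in range(n_hid)]
               for i in range(n_in)]
    # Normalize within each hidden unit, then sum across hidden units.
    col_sums = [sum(contrib[i][h] for i in range(n_in)) for h in range(n_hid)]
    shares = [sum(contrib[i][h] / col_sums[h] for h in range(n_hid))
              for i in range(n_in)]
    total = sum(shares)
    return [s / total for s in shares]


if __name__ == "__main__":
    # Hypothetical weights: 3 inputs, 2 hidden units, 1 output.
    w_ih = [[2.0, -1.0], [-1.0, 2.0], [0.5, 0.5]]
    w_ho = [1.0, -1.0]
    print(connection_weight_importance(w_ih, w_ho))
    print(garson_importance(w_ih, w_ho))
```

With these hypothetical weights, the third input's two pathways cancel exactly under the Connection Weight Approach, whereas Garson's algorithm still assigns it a positive share because taking absolute values discards the sign. This is only a toy illustration of how the two calculations differ, not a reproduction of the paper's simulations.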
Pages: 389-397
Page count: 9
Related papers
27 in total
[1]   Identifying the density-dependent structure underlying ecological time series [J].
Berryman, A ;
Turchin, P .
OIKOS, 2001, 92 (02) :265-270
[2]  
Bishop C. M., 1996, Neural networks for pattern recognition
[3]   Neural network models to study relationships between lead concentration in grasses and permanent urban descriptors in Athens city (Greece) [J].
Dimopoulos, I ;
Chronopoulos, J ;
Chronopoulou-Sereli, A ;
Lek, S .
ECOLOGICAL MODELLING, 1999, 120 (2-3) :157-165
[4]   USE OF SOME SENSITIVITY CRITERIA FOR CHOOSING NETWORKS WITH GOOD GENERALIZATION ABILITY [J].
DIMOPOULOS, Y ;
BOURRET, P ;
LEK, S .
NEURAL PROCESSING LETTERS, 1995, 2 (06) :1-4
[5]  
Garson GD., 1991, AI EXPERT, V6, P46, DOI 10.5555/129449.129452
[6]   Review and comparison of methods to study the contribution of variables in artificial neural network models [J].
Gevrey, M ;
Dimopoulos, I ;
Lek, S .
ECOLOGICAL MODELLING, 2003, 160 (03) :249-264
[7]   MULTILAYER FEEDFORWARD NETWORKS ARE UNIVERSAL APPROXIMATORS [J].
HORNIK, K ;
STINCHCOMBE, M ;
WHITE, H .
NEURAL NETWORKS, 1989, 2 (05) :359-366
[8]   STOPPING RULES IN PRINCIPAL COMPONENTS-ANALYSIS - A COMPARISON OF HEURISTIC AND STATISTICAL APPROACHES [J].
JACKSON, DA .
ECOLOGY, 1993, 74 (08) :2204-2214
[9]  
JOY MK, 2004, IN PRESS MODELING CO
[10]  
LEGENDRE L., 1983, NUMERICAL ECOLOGY, DOI 10.1017/CBO9781107415324.004