Assessing transferability of ecological models: an underappreciated aspect of statistical validation

Times cited: 453
Authors
Wenger, Seth J. [1]
Olden, Julian D. [2]
Affiliations
[1] Trout Unltd, Boise, ID 83702 USA
[2] Univ Washington, Sch Aquat & Fishery Sci, Seattle, WA 98195 USA
Source
METHODS IN ECOLOGY AND EVOLUTION | 2012 / Volume 3 / Issue 02
Keywords
cross-validation; generality; niche model; performance; species distribution model; statistical; SPECIES DISTRIBUTION; SPATIAL AUTOCORRELATION; FLOW REGIME; HABITAT; DISTRIBUTIONS; PREDICTION; CLIMATE; TROUT; CLASSIFICATION; TEMPERATURE;
DOI
10.1111/j.2041-210X.2011.00170.x
Chinese Library Classification number
Q14 [Ecology (Bioecology)];
Discipline classification code
071012 ; 0713 ;
Abstract
1. Ecologists have long sought to distinguish relationships that are general from those that are idiosyncratic to a narrow range of conditions. Conventional methods of model validation and selection assess in- or out-of-sample prediction accuracy but do not assess model generality or transferability, which can lead to overestimates of performance when predicting in other locations, time periods or data sets.
2. We propose an intuitive method for evaluating transferability based on techniques currently in use in the area of species distribution modelling. The method involves cross-validation in which data are assigned non-randomly to groups that are spatially, temporally or otherwise distinct, thus using heterogeneity in the data set as a surrogate for heterogeneity among data sets.
3. We illustrate the method by applying it to distribution modelling of brook trout (Salvelinus fontinalis Mitchill) and brown trout (Salmo trutta Linnaeus) in western United States. We show that machine-learning techniques such as random forests and artificial neural networks can produce models with excellent in-sample performance but poor transferability, unless complexity is constrained. In our example, traditional linear models have greater transferability.
4. We recommend the use of a transferability assessment whenever there is interest in making inferences beyond the data set used for model fitting. Such an assessment can be used both for validation and for model selection and provides important information beyond what can be learned from conventional validation and selection techniques.
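The non-random fold assignment described in point 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `random_folds` and `grouped_folds` and the six-site basin example are hypothetical, chosen only to contrast conventional random k-fold splitting with splitting by spatially or temporally distinct groups.

```python
import random
from collections import defaultdict


def random_folds(n_samples, k, seed=0):
    """Conventional k-fold CV: samples are assigned to folds at random."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        yield sorted(train), sorted(held_out)


def grouped_folds(group_labels):
    """Transferability-style CV: each distinct group (for example a region
    or a sampling period) is held out in turn, so the test fold is always
    separate from the training data along that grouping."""
    members = defaultdict(list)
    for index, label in enumerate(group_labels):
        members[label].append(index)
    for label in sorted(members):
        test = members[label]
        train = [i for i, g in enumerate(group_labels) if g != label]
        yield label, train, test


# Hypothetical data: six sampling sites in three river basins.
basins = ["A", "A", "B", "B", "C", "C"]
for basin, train, test in grouped_folds(basins):
    print(basin, train, test)
```

In a real analysis, a model would be fitted on each `train` set and scored on the matching `test` set. Comparing the scores from the two splitting schemes then gives a rough indication of how much performance is lost when predicting for an unseen group.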
Pages: 260-267
Page count: 8