A note on the validity of cross-validation for evaluating autoregressive time series prediction

Cited by: 324
Authors
Bergmeir, Christoph [1]
Hyndman, Rob J. [2]
Koo, Bonsoo [2]
Affiliations
[1] Monash Univ, Fac Informat Technol, POB 63, Melbourne, Vic 3800, Australia
[2] Monash Univ, Dept Econometr & Business Stat, Melbourne, Vic, Australia
Funding
Australian Research Council;
Keywords
Cross-validation; Time series; Autoregression;
DOI
10.1016/j.csda.2017.11.003
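Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
080201 [Mechanical manufacturing and automation];
Abstract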
One of the most widely used standard procedures for model evaluation in classification and regression is K-fold cross-validation (CV). However, when it comes to time series forecasting, because of the inherent serial correlation and potential non-stationarity of the data, its application is not straightforward, and practitioners often replace it with an out-of-sample (OOS) evaluation. It is shown that for purely autoregressive models, the use of standard K-fold CV is possible provided the models considered have uncorrelated errors. Such a setup occurs, for example, when the models nest a more appropriate model. This is very common when Machine Learning methods are used for prediction, and where CV can control for overfitting the data. Theoretical insights supporting these arguments are presented, along with a simulation study and a real-world example. It is shown empirically that K-fold CV performs favourably compared to both OOS evaluation and other time series-specific techniques such as non-dependent cross-validation. © 2017 Elsevier B.V. All rights reserved.
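The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or experimental setup: the AR(2) coefficients, the series length, the lag order, the linear least-squares model, and all function names (`simulate_ar`, `embed`, `kfold_cv_mse`, `oos_mse`) are assumptions chosen for illustration. The sketch embeds a simulated series into a lag matrix, then estimates the one-step-ahead error once with randomly shuffled K-fold CV and once with a single OOS split at the end of the series.

```python
import numpy as np

def simulate_ar(coefs, n, burn_in=200, seed=0):
    """Simulate an AR(p) series with standard normal innovations (illustrative)."""
    rng = np.random.default_rng(seed)
    p = len(coefs)
    y = np.zeros(n + burn_in)
    eps = rng.standard_normal(n + burn_in)
    for t in range(p, n + burn_in):
        # y[t-p:t][::-1] is (y[t-1], ..., y[t-p])
        y[t] = np.dot(coefs, y[t - p:t][::-1]) + eps[t]
    return y[burn_in:]

def embed(y, p):
    """Build the lag matrix: row i holds (y[t-1], ..., y[t-p]) for target y[t]."""
    n = len(y)
    X = np.column_stack([y[p - k:n - k] for k in range(1, p + 1)])
    return X, y[p:]

def fit_predict(X_train, y_train, X_test):
    """Fit a linear AR model with intercept by least squares and predict."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ beta

def kfold_cv_mse(X, y, k=5, seed=0):
    """Standard K-fold CV on the embedded rows, with random fold assignment."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        pred = fit_predict(X[train_mask], y[train_mask], X[test_idx])
        errors.append(np.mean((y[test_idx] - pred) ** 2))
    return float(np.mean(errors))

def oos_mse(X, y, test_fraction=0.2):
    """Out-of-sample evaluation: train on the start, test on the final block."""
    split = int(len(y) * (1 - test_fraction))
    pred = fit_predict(X[:split], y[:split], X[split:])
    return float(np.mean((y[split:] - pred) ** 2))

# Assumed example values: a stationary AR(2) process and a correctly sized lag order.
y = simulate_ar([0.5, -0.3], n=500)
X, target = embed(y, p=2)
cv_error = kfold_cv_mse(X, target)
oos_error = oos_mse(X, target)
print(f"5-fold CV MSE: {cv_error:.3f}, OOS MSE: {oos_error:.3f}")
```

Because the fitted lag order here is at least as large as the true order, the model's errors should be approximately uncorrelated, which is the condition the abstract gives for standard K-fold CV to be usable. In practice a residual autocorrelation check would be a natural companion to this sketch.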
Pages: 70-83
Number of pages: 14