A reality check for data snooping

Cited by: 822

Authors:

White, H

Affiliations:

[1] Univ Calif San Diego, San Diego, CA 92121 USA

[2] QuantMetr R&D Associates, LLC, San Diego, CA 92121 USA

Source:

ECONOMETRICA | 2000 / Vol. 68 / No. 5

Keywords:

data mining; multiple hypothesis testing; bootstrap; forecast evaluation; model selection; prediction;

DOI:

10.1111/1468-0262.00152

Chinese Library Classification:

F [Economics];

Discipline classification code:

02 ;

Abstract:

Data snooping occurs when a given set of data is used more than once for purposes of inference or model selection. When such data reuse occurs, there is always the possibility that any satisfactory results obtained may simply be due to chance rather than to any merit inherent in the method yielding the results. This problem is practically unavoidable in the analysis of time-series data, as typically only a single history measuring a given phenomenon of interest is available for analysis. It is widely acknowledged by empirical researchers that data snooping is a dangerous practice to be avoided, but in fact it is endemic. The main problem has been a lack of sufficiently simple practical methods capable of assessing the potential dangers of data snooping in a given situation. Our purpose here is to provide such methods by specifying a straightforward procedure for testing the null hypothesis that the best model encountered in a specification search has no predictive superiority over a given benchmark model. This permits data snooping to be undertaken with some degree of confidence that one will not mistake results that could have been generated by chance for genuinely good results.
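
To make the idea of testing the best of many models against a benchmark concrete, here is a minimal sketch of a bootstrap test of that kind. It is an illustration under stated assumptions, not the procedure as specified in the article: the abstract does not describe the resampling scheme, so a stationary (random block length) bootstrap is assumed here, and the names `stationary_bootstrap_indices`, `reality_check_pvalue`, `diffs`, `q`, and `n_boot` are all hypothetical.

```python
import math
import random


def stationary_bootstrap_indices(n, q, rng):
    """Draw n time indices; each step starts a new block with probability q,
    otherwise continues the current block (wrapping around at the end)."""
    idx = [rng.randrange(n)]
    while len(idx) < n:
        if rng.random() < q:
            idx.append(rng.randrange(n))
        else:
            idx.append((idx[-1] + 1) % n)
    return idx


def reality_check_pvalue(diffs, n_boot=500, q=0.1, seed=0):
    """Bootstrap p-value for the null that no model beats the benchmark.

    diffs[k][t] is the performance of model k minus the benchmark's
    performance at time t (assumed convention: higher is better).
    """
    rng = random.Random(seed)
    n = len(diffs[0])
    means = [sum(d) / n for d in diffs]
    # Observed statistic: scaled mean outperformance of the best model.
    observed = max(math.sqrt(n) * m for m in means)
    exceed = 0
    for _ in range(n_boot):
        # The same resampled time indices are applied to every model,
        # which preserves the dependence across models.
        idx = stationary_bootstrap_indices(n, q, rng)
        boot = max(
            math.sqrt(n) * (sum(d[t] for t in idx) / n - m)
            for d, m in zip(diffs, means)
        )
        if boot > observed:
            exceed += 1
    return exceed / n_boot
```

Calling `reality_check_pvalue(diffs)` with one row per candidate model returns the share of bootstrap maxima that exceed the observed maximum. Because the maximum is taken over all candidates in every bootstrap draw, the p-value accounts for the whole specification search rather than only for the winning model.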


Pages: 1097-1126

Page count: 30
