Dangers of data mining: The case of calendar effects in stock returns

Cited by: 115
Authors
Sullivan, R
Timmermann, A
White, H
Affiliations
[1] Univ Calif San Diego, Dept Econ, La Jolla, CA 92093 USA
[2] Bates White & Ballentine LLC, Washington, DC 20005 USA
Keywords
data mining; market efficiency; bootstrap testing; calendar effects
DOI
10.1016/S0304-4076(01)00077-X
Chinese Library Classification
F [Economics]
Subject classification code
02
Abstract
Economics is primarily a non-experimental science. Typically, we cannot generate new data sets on which to test hypotheses independently of the data that may have led to a particular theory. The common practice of using the same data set to formulate and test hypotheses introduces data-mining biases that, if not accounted for, invalidate the assumptions underlying classical statistical inference. A striking example of a data-driven discovery is the presence of calendar effects in stock returns. There appears to be very substantial evidence of systematic abnormal stock returns related to the day of the week, the week of the month, the month of the year, the turn of the month, holidays, and so forth. However, this evidence has largely been considered without accounting for the intensive search preceding it. In this paper we use 100 years of daily data and a new bootstrap procedure that allows us to explicitly measure the distortions in statistical inference induced by data mining. We find that although nominal p-values for individual calendar rules are extremely significant, once evaluated in the context of the full universe from which such rules were drawn, calendar effects no longer remain significant. (C) 2001 Elsevier Science S.A. All rights reserved.
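
The contrast the abstract draws, between a nominal p-value for one rule and a p-value judged against the whole universe of rules searched, can be illustrated with a minimal sketch. The record does not describe the paper's bootstrap procedure, so the code below is an assumption: it swaps in a plain i.i.d. bootstrap of the maximum test statistic across rules, which ignores serial dependence in returns. The function name `snooping_adjusted_pvalue` and the input layout `excess[k][t]` (excess return of rule `k` on day `t`) are hypothetical.

```python
import random
import statistics


def snooping_adjusted_pvalue(excess, n_boot=500, seed=0):
    """Bootstrap p-value for the best rule, judged against all rules searched.

    excess[k][t] is the (hypothetical) excess return of rule k on day t.
    Assumption: days are resampled independently (simple i.i.d. bootstrap).
    """
    rng = random.Random(seed)
    n_obs = len(excess[0])
    scale = n_obs ** 0.5
    means = [statistics.fmean(series) for series in excess]

    # Observed statistic: the best scaled mean over the full universe of rules.
    observed_max = max(m * scale for m in means)

    exceed = 0
    for _ in range(n_boot):
        # One set of resampled days, shared by every rule.
        idx = [rng.randrange(n_obs) for _ in range(n_obs)]
        # Recentre each rule on its own sample mean to impose the null,
        # then take the maximum across rules.
        boot_max = max(
            (statistics.fmean([series[i] for i in idx]) - mean) * scale
            for series, mean in zip(excess, means)
        )
        if boot_max >= observed_max:
            exceed += 1
    return exceed / n_boot
```

Because the comparison is made against the bootstrap distribution of the maximum over all rules, a rule that looks significant in isolation can lose significance as the number of rules searched grows, which is the effect the abstract reports for calendar rules.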
Pages: 249-286
Page count: 38