P-Value Precision and Reproducibility

Cited by: 119
Authors
Boos, Dennis D. [1]
Stefanski, Leonard A. [1]
Affiliations
[1] N Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
Funding
National Science Foundation (USA)
Keywords
Log p-value; Measure of evidence; Prediction interval; Reproducibility probability; BOOTSTRAP PREDICTION INTERVALS; SIGNIFICANCE LEVEL; RANDOM-VARIABLES; PROBABILITY;
DOI
10.1198/tas.2011.10129
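Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
070103 [Probability Theory and Mathematical Statistics]; 140311 [Social Design and Social Innovation]
Abstract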
P-values are useful statistical measures of evidence against a null hypothesis. In contrast to other statistical estimates, however, their sample-to-sample variability is usually not considered or estimated, and therefore not fully appreciated. Via a systematic study of log-scale p-value standard errors, bootstrap prediction bounds, and reproducibility probabilities for future replicate p-values, we show that p-values exhibit surprisingly large variability in typical data situations. In addition to providing context to discussions about the failure of statistical results to replicate, our findings shed light on the relative value of exact p-values vis-a-vis approximate p-values, and indicate that the use of *, **, and *** to denote levels 0.05, 0.01, and 0.001 of statistical significance in subject-matter journals is about the right level of precision for reporting p-values when judged by widely accepted rules for rounding statistical estimates.
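The idea of a log-scale p-value standard error can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a two-sided one-sample z-test with a normal approximation, a nonparametric bootstrap, and base-10 logs, and the function names `z_test_p_value` and `bootstrap_log10_p_se` are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def z_test_p_value(sample, mu0=0.0):
    """Two-sided p-value for H0: mean == mu0 (normal approximation, assumed here)."""
    n = len(sample)
    sd = stdev(sample)
    if sd == 0.0:
        raise ValueError("sample has zero variance")
    z = (mean(sample) - mu0) / (sd / math.sqrt(n))
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) and stays accurate in the far tail.
    return math.erfc(abs(z) / math.sqrt(2.0))


def bootstrap_log10_p_se(sample, n_boot=2000, mu0=0.0, seed=0):
    """Bootstrap standard error of -log10(p) over resamples of the data."""
    rng = random.Random(seed)
    n = len(sample)
    logs = []
    for _ in range(n_boot):
        resample = rng.choices(sample, k=n)  # draw n values with replacement
        try:
            p = z_test_p_value(resample, mu0)
        except ValueError:
            continue  # skip degenerate resamples with no variation
        # Clamp to avoid log10(0) if the p-value underflows.
        logs.append(-math.log10(max(p, 1e-300)))
    return stdev(logs)


if __name__ == "__main__":
    # Simulated data, for illustration only.
    rng = random.Random(1)
    data = [rng.gauss(0.5, 1.0) for _ in range(20)]
    print("observed p-value:", z_test_p_value(data))
    print("bootstrap SE of -log10(p):", bootstrap_log10_p_se(data, n_boot=500))
```

Comparing the bootstrap standard error with the observed value of -log10(p) gives a rough sense of how many digits of a reported p-value are stable from sample to sample.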
Pages: 213-221
Page count: 9
Related papers
27 items in total
[1]
[Anonymous], 1993, An introduction to the bootstrap
[2]
[Anonymous], 2010, Science News, DOI: 10.1002/scin.5591770721, 10.1002/scin.5591770721C
[3]
STOCHASTIC COMPARISON OF TESTS [J].
BAHADUR, RR .
ANNALS OF MATHEMATICAL STATISTICS, 1960, 31 (02) :276-295
[4]
Boos D. D., 1981, J AM STAT ASSOC, V76, p[633, 216]
[5]
Reproducibility probability estimation for testing statistical hypotheses [J].
De Martini, Daniele .
STATISTICS & PROBABILITY LETTERS, 2008, 78 (09) :1056-1061
[6]
EXPECTED SIGNIFICANCE LEVEL AS A SENSITIVITY INDEX FOR TEST STATISTICS [J].
DEMPSTER, AP ;
SCHATZOFF, M .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1965, 60 (310) :420-436
[7]
A note on information seldom reported via the P value [J].
Donahue, RMJ .
AMERICAN STATISTICIAN, 1999, 53 (04) :303-306
[8]
RUDIMENTS OF NUMERACY [J].
EHRENBERG, ASC .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES A-STATISTICS IN SOCIETY, 1977, 140 :277-+
[9]
The difference between "significant" and "not significant" is not itself statistically significant [J].
Gelman, Andrew ;
Stern, Hal .
AMERICAN STATISTICIAN, 2006, 60 (04) :328-331
[10]
A COMMENT ON REPLICATION, P-VALUES AND EVIDENCE [J].
GOODMAN, SN .
STATISTICS IN MEDICINE, 1992, 11 (07) :875-879