The p-Value Requires Context, Not a Threshold

Cited by: 117
Authors
Betensky, Rebecca A. [1]
Affiliation
[1] Harvard TH Chan Sch Publ Hlth, Dept Biostat, 655 Huntington Ave, Boston, MA 02115 USA
Keywords
Effect size; Sample size; Statistical significance
DOI
10.1080/00031305.2018.1529624
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
It is widely recognized by statisticians, though not as widely by other researchers, that the p-value cannot be interpreted in isolation, but rather must be considered in the context of certain features of the design and substantive application, such as sample size and meaningful effect size. I consider the setting of the normal mean and highlight the information contained in the p-value in conjunction with the sample size and meaningful effect size. The p-value and sample size jointly yield 95% confidence bounds for the effect of interest, which can be compared to the predetermined meaningful effect size to make inferences about the true effect. I provide simple examples to demonstrate that although the p-value is calculated under the null hypothesis, and thus seemingly may be divorced from the features of the study from which it arises, its interpretation as a measure of evidence requires its contextualization within the study. This implies that any proposal for improved use of the p-value as a measure of the strength of evidence cannot simply be a change to the threshold for significance.
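The abstract's central claim, that a p-value and a sample size together determine 95% confidence bounds for the effect, can be illustrated with a short sketch. This is not code from the paper. It assumes a two-sided z-test for a normal mean with known standard deviation `sigma`, and it assumes the observed effect is positive; the function name `ci_from_p_value` and the meaningful effect size `delta = 0.2` are hypothetical choices made only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def ci_from_p_value(p, n, sigma=1.0, level=0.95):
    """Recover confidence bounds for a normal mean from a two-sided p-value.

    Assumptions (not taken from the paper): known sigma, two-sided z-test
    of mean = 0, and a positive observed effect.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    std_normal = NormalDist()
    z_obs = std_normal.inv_cdf(1.0 - p / 2.0)               # |z| implied by p
    z_crit = std_normal.inv_cdf(1.0 - (1.0 - level) / 2.0)  # about 1.96 at 95%
    se = sigma / sqrt(n)                                    # standard error of the mean
    return (z_obs - z_crit) * se, (z_obs + z_crit) * se


# Same p-value, different sample sizes: compare the bounds to a
# hypothetical meaningful effect size delta.
delta = 0.2
for n in (100, 10_000):
    lower, upper = ci_from_p_value(0.05, n)
    verdict = "cannot rule out" if upper >= delta else "rules out"
    print(f"n={n}: CI=({lower:.4f}, {upper:.4f}) {verdict} an effect of {delta}")
```

Under these assumptions, the same p-value of 0.05 yields an interval that still includes the meaningful effect when n = 100 but excludes it when n = 10,000, which is the sense in which the p-value needs the study's context to be read as evidence.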
Pages: 115-117
Page count: 3