Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors

Cited by: 809
Authors
Gelman, Andrew [1, 2]
Carlin, John [3, 4, 5]
Affiliations
[1] Columbia Univ, Dept Stat, New York, NY 10027 USA
[2] Columbia Univ, Dept Polit Sci, New York, NY 10027 USA
[3] Murdoch Childrens Res Inst, Clin Epidemiol & Biostat Unit, Parkville, Vic, Australia
[4] Univ Melbourne, Dept Paediat, Melbourne, Vic 3010, Australia
[5] Univ Melbourne, Sch Populat & Global Hlth, Melbourne, Vic 3010, Australia
Funding
US National Science Foundation;
Keywords
design calculation; exaggeration ratio; power analysis; replication crisis; statistical significance; Type M error; Type S error; LIFE EXPECTANCY; SIZE;
DOI
10.1177/1745691614551642
Chinese Library Classification number
B84 [Psychology];
Subject classification code
04 ; 0402 ;
Abstract
Statistical power analysis provides the conventional approach to assess error rates when designing a research study. However, power analysis is flawed in that a narrow emphasis on statistical significance is placed as the primary focus of study design. In noisy, small-sample settings, statistically significant results can often be misleading. To help researchers address this problem in the context of their own studies, we recommend design calculations in which (a) the probability of an estimate being in the wrong direction (Type S [sign] error) and (b) the factor by which the magnitude of an effect might be overestimated (Type M [magnitude] error or exaggeration ratio) are estimated. We illustrate with examples from recent published research and discuss the largest challenge in a design calculation: coming up with reasonable estimates of plausible effect sizes based on external information.
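The design calculation described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes the estimate is normally distributed around a hypothesized true effect with a known standard error, and the function name `design_calc` and its parameters are hypothetical. It returns the power, the Type S error rate (the probability that a statistically significant estimate has the wrong sign), and the exaggeration ratio (the expected absolute size of a significant estimate divided by the true effect).

```python
import random
from statistics import NormalDist


def design_calc(true_effect, se, alpha=0.05, n_sims=100_000, seed=0):
    """Return (power, type_s, exaggeration) for a hypothesized effect.

    Assumptions (not taken from the source): the estimate follows
    Normal(true_effect, se), and true_effect is positive.
    """
    if true_effect <= 0 or se <= 0:
        raise ValueError("true_effect and se must be positive")

    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    lam = true_effect / se              # effect in standard-error units

    p_hi = 1 - z.cdf(z_crit - lam)      # significant with the correct sign
    p_lo = z.cdf(-z_crit - lam)         # significant with the wrong sign
    power = p_hi + p_lo
    type_s = p_lo / power               # wrong sign, given significance

    # Exaggeration ratio by simulation: average absolute size of the
    # estimates that reach significance, relative to the true effect.
    rng = random.Random(seed)
    threshold = se * z_crit
    significant = [
        abs(d)
        for d in (rng.gauss(true_effect, se) for _ in range(n_sims))
        if abs(d) > threshold
    ]
    exaggeration = (sum(significant) / len(significant)) / true_effect
    return power, type_s, exaggeration


if __name__ == "__main__":
    # Hypothetical inputs: a true effect of 0.5 measured with standard error 1.
    power, type_s, exaggeration = design_calc(0.5, 1.0)
    print(f"power={power:.3f}, Type S={type_s:.3f}, exaggeration={exaggeration:.2f}")
```

With a small hypothesized effect relative to the standard error, the sketch shows low power together with a non-trivial Type S rate and an exaggeration ratio well above 1, which is the pattern the abstract warns about in noisy, small-sample settings.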
Pages: 641-651
Page count: 11