Centring in regression analyses: a strategy to prevent errors in statistical inference

Cited by: 410
Authors
Kraemer, HC [1 ]
Blasey, CM [1 ]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
Keywords
regression; centring; multicollinearity
DOI
10.1002/mpr.170
Chinese Library Classification (CLC) number
R749 [Psychiatry]
Discipline classification code
100205
Abstract
Regression analyses are perhaps the most widely used statistical tools in medical research. Centring in regression analyses seldom appears to be covered in training and is not commonly reported in research papers. Centring is the process of selecting a reference value for each predictor and coding the data based on that reference value so that each regression coefficient that is estimated and tested is relevant to the research question. Using non-centred data in regression analysis, which refers to the common practice of entering predictors in their original score format, often leads to inconsistent and misleading results. There is very little cost to unnecessary centring, but the costs of not centring when it is necessary can be major. Thus, it would be better always to centre in regression analyses. We propose a simple default centring strategy: (1) code all binary independent variables ±1/2; (2) code all ordinal independent variables as deviations from their median; (3) code all 'dummy variables' for categorical independent variables having m possible responses as 1-1/m and -1/m instead of 1 and 0; (4) compute interaction terms from centred predictors. Using this default strategy when there is no compelling evidence to centre protects against most errors in statistical inference and its routine use sensitizes users to centring issues.
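The four-step default strategy in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy data, and the choice to drop the last category level as the reference are all assumptions made for illustration, and only the Python standard library is used.

```python
from statistics import median


def centre_binary(values, high):
    """Step 1: code a two-level predictor as +1/2 (equal to `high`) or -1/2."""
    return [0.5 if v == high else -0.5 for v in values]


def centre_ordinal(values):
    """Step 2: express an ordinal predictor as deviations from its sample median."""
    mid = median(values)
    return [v - mid for v in values]


def centre_dummies(values, levels):
    """Step 3: code a categorical predictor with m levels as centred dummies.

    Each dummy is 1 - 1/m when the observation falls in that level and -1/m
    otherwise. Returning m - 1 dummies and dropping the last level as the
    reference is an assumption of this sketch.
    """
    m = len(levels)
    return {
        level: [(1 - 1 / m) if v == level else (-1 / m) for v in values]
        for level in levels[:-1]
    }


def interaction(a, b):
    """Step 4: build an interaction term as the product of centred predictors."""
    return [x * y for x, y in zip(a, b)]


# Hypothetical toy data, invented purely for illustration.
treatment = ["drug", "placebo", "drug", "placebo"]
severity = [1, 2, 4, 5]
site = ["A", "B", "C", "A"]

t = centre_binary(treatment, high="drug")
s = centre_ordinal(severity)
d = centre_dummies(site, ["A", "B", "C"])
ts = interaction(t, s)

print(t)   # [0.5, -0.5, 0.5, -0.5]
print(s)   # [-2.0, -1.0, 1.0, 2.0]
print(ts)  # [-1.0, 0.5, 0.5, -1.0]
```

With this coding, the coefficient on each centred predictor in a model that includes the interaction term describes that predictor's effect at the reference value of the other (the midpoint of the two groups, or the median), rather than at an arbitrary raw-score zero.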
Pages: 141-151
Page count: 11