Indicator and stratification methods for missing explanatory variables in multiple linear regression

Cited by: 201
Author
Jones, MP
Affiliations
[1] UNIV IOWA, DEPT STAT & ACTUARIAL SCI, IOWA CITY, IA 52242 USA
[2] UNIV IOWA, DEPT PREVENT MED, IOWA CITY, IA 52242 USA
Keywords
epidemiology; incomplete data; missing data; psychology
DOI
10.2307/2291399
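Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714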
Abstract
The statistical literature and folklore contain many methods for handling missing explanatory variable data in multiple linear regression. One such approach is to incorporate into the regression model an indicator variable for whether an explanatory variable is observed. Another approach is to stratify the model based on the range of values for an explanatory variable, with a separate stratum for those individuals in which the explanatory variable is missing. For a least squares regression analysis using either of these two missing-data approaches, the exact biases of the estimators for the regression coefficients and the residual variance are derived and reported. The complete-case analysis, in which individuals with any missing data are omitted, is also investigated theoretically and is found to be free of bias in many situations, though often wasteful of information. A numerical evaluation of the bias of two missing-indicator methods and the complete-case analysis is reported. The missing-indicator methods show unacceptably large biases in practical situations and are not advisable in general.
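The contrast between the missing-indicator method and the complete-case analysis can be illustrated with a small simulation. The sketch below is not the paper's numerical evaluation; it is a minimal example under assumed settings: two correlated normal explanatory variables, the second one missing completely at random, and a common variant of the indicator method in which missing values are filled with 0 and an indicator column is added. The function names, the coefficient values, the correlation `rho`, and the missingness rate `p_missing` are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n=20000, rho=0.6, p_missing=0.3, seed=0):
    """Draw (x1, x2, y) from a linear model; flag x2 as missing completely at random.

    All parameter values here are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, 2.0, 3.0])  # intercept, x1 slope, x2 slope
    cov = [[1.0, rho], [rho, 1.0]]    # x1 and x2 are correlated
    x = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    y = beta[0] + x @ beta[1:] + rng.normal(size=n)
    missing = rng.random(n) < p_missing  # True where x2 is treated as unobserved
    return x, y, missing, beta


def ols(design, y):
    """Ordinary least squares coefficients for the given design matrix."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def complete_case(x, y, missing):
    """Omit every individual whose x2 is missing, then fit y ~ 1 + x1 + x2."""
    keep = ~missing
    design = np.column_stack([np.ones(keep.sum()), x[keep]])
    return ols(design, y[keep])


def missing_indicator(x, y, missing):
    """Fill missing x2 with 0 and add an indicator column for missingness."""
    x2_filled = np.where(missing, 0.0, x[:, 1])
    design = np.column_stack(
        [np.ones(len(y)), x[:, 0], x2_filled, missing.astype(float)]
    )
    return ols(design, y)


x, y, missing, beta = simulate()
print("true coefficients (intercept, x1, x2):", beta)
print("complete-case estimates:              ", complete_case(x, y, missing))
print("missing-indicator estimates:          ", missing_indicator(x, y, missing)[:3])
```

In this setup the complete-case slope for `x1` should land close to its true value, while the missing-indicator slope for `x1` is pulled upward, because in the stratum where `x2` is missing the model can no longer adjust for it. Setting `rho=0.0` should make the two methods agree much more closely, which is one way to see that the bias in this sketch comes from the correlation between the explanatory variables.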
Pages: 222-230
Number of pages: 9