Imputation of data values that are less than a detection limit

Cited by: 143
Authors
Succop, PA
Clark, S
Chen, M
Galke, W
Affiliations
[1] Univ Cincinnati, Dept Environm Hlth, Cincinnati, OH 45267 USA
[2] Natl Ctr Healthy Housing, Columbia, MD USA
Keywords
censored distributions; imputation; method detection limit; maximum likelihood; structural equation modeling
DOI
10.1080/15459620490462797
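Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification code
08 [Engineering]; 0830 [Environmental Science and Engineering]
Abstract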
Results of the analyses of occupational and environmental samples are frequently reported as "less than a specified value," a practice followed by many analytical laboratories. A left-censored distribution occurs when analytical laboratories do not report results that fall below their limits of detection or quantification. Approximately 37% of the household interior dust lead loadings collected in a large-scale, multisite, longitudinal study of lead-based paint hazard controls were reported to be below the "method detection limit." These unreported values are unusable in any statistical analysis of the data and must be replaced by a valid dust lead loading estimate, a process called data imputation. This investigation tested how well data imputed using a newly formulated procedure for estimating the data below the method detection limit were correlated with dust lead loadings reported by the participating laboratories after special request. These results were also compared with those obtained by imputing the method detection limit divided by the square root of 2. Imputation of the low lead loadings was accomplished by substituting the value associated with the median percentile below each laboratory's method detection limit. A correlation of r = 0.50 was calculated between the predicted and reported dust lead loadings, with only slight bias (2.9%) in the predicted values. An alternative imputation procedure that used the predicted value from structural equation models fit to the noncensored dust lead loadings performed about as well, although the predictions had to be "centered" to correspond to the censored data. An estimator that combined both of these imputation procedures only slightly improved the correlation between the predicted and laboratory values (r = 0.51). These results support the use of the new procedure rather than the commonly used imputed values of the method detection limit divided by 2 or by the square root of 2. Imputing values based on either of these common approaches may result in much more biased predictions for the censored data; in the case of these data, the dust lead loadings were overestimated by 348%. The results also suggest that analytical laboratories should provide a numerical result for all analyzed samples, with a "flag" on those values below their detection limit, since these results may be more accurate than any imputed value, particularly those provided by the commonly used method of dividing the method detection limit by the square root of 2.
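The imputation rules named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the dust lead loadings are lognormally distributed and that the log-scale mean and standard deviation (`mu`, `sigma`) have already been estimated elsewhere, for example by maximum likelihood for censored data; those parameters, the function names, and the example values are all hypothetical.

```python
import math
from statistics import NormalDist


def impute_mdl_over_2(mdl: float) -> float:
    """Common substitution: method detection limit divided by 2."""
    return mdl / 2.0


def impute_mdl_over_sqrt2(mdl: float) -> float:
    """Common substitution: method detection limit divided by sqrt(2)."""
    return mdl / math.sqrt(2.0)


def impute_median_percentile(mdl: float, mu: float, sigma: float) -> float:
    """Value at the median percentile below the MDL.

    Assumption (not from the source): loadings are lognormal with
    log-scale parameters mu and sigma supplied by the caller.
    """
    log_dist = NormalDist(mu, sigma)
    # Fraction of the assumed distribution that lies below the MDL.
    p_censored = log_dist.cdf(math.log(mdl))
    # The median of the censored region sits at half that percentile.
    return math.exp(log_dist.inv_cdf(p_censored / 2.0))


# Hypothetical example values, chosen only for illustration.
example_mdl = 10.0
example_mu = math.log(10.0)
example_sigma = 1.0

half = impute_mdl_over_2(example_mdl)
root2 = impute_mdl_over_sqrt2(example_mdl)
median_pct = impute_median_percentile(example_mdl, example_mu, example_sigma)
```

Unlike the two fixed substitutions, the percentile-based value depends on the shape of the fitted distribution, so it moves further below the detection limit as the spread of the data grows.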
Pages: 436-441
Number of pages: 6