EFFICIENCY LOSS FROM CATEGORIZING QUANTITATIVE EXPOSURES INTO QUALITATIVE EXPOSURES IN CASE-CONTROL STUDIES

Cited by: 79
Authors
ZHAO, LP [1 ]
KOLONEL, LN [1 ]
Affiliations
[1] UNIV HAWAII, SCH PUBL HLTH, BIOSTAT PROGRAM, HONOLULU, HI 96822 USA
Keywords
CASE-CONTROL STUDIES; EPIDEMIOLOGIC METHODS; MODELS, STATISTICAL; ODDS RATIO; STATISTICS
DOI
10.1093/oxfordjournals.aje.a116520
Chinese Library Classification (CLC) number
R1 [Preventive Medicine, Hygiene]
Discipline classification code
1004; 120402
Abstract
In the analysis of data from case-control studies, quantitative exposure variables are frequently categorized into qualitative exposure variables, such as quarters. The qualitative exposure variables may be scalar variables that take the median values of each quantile interval, or they may be vectors of indicator variables that represent each quantile interval. In a qualitative analysis, the scalar variables may be used to test the dose-response relation, while the indicator variables may be used to estimate odds ratios for each higher quantile interval versus the lowest. Qualitative analysis, implicitly and explicitly documented by many epidemiologists and biostatisticians, has several desirable advantages (including simple interpretation and robustness in the presence of a misspecified model or outlier values). In a quantitative analysis, the quantitative exposure variables may be directly regressed to test the dose-response relation, as well as to estimate odds ratios of interest. As this paper demonstrates, quantitative analysis is generally more efficient than qualitative analysis. Through a Monte Carlo simulation study, the authors estimated the loss of efficiency that results from categorizing a quantitative exposure variable by quartiles in case-control studies with a total of 200 cases and 200 controls. In the analysis of the dose-response relation, this loss is about 30% or more; the percentage may reach about 50% when the odds ratio for the fourth quartile interval versus the lowest is around 4. In estimating odds ratios, the loss of efficiency for the second, third, and fourth quartile intervals versus the lowest is around 90%, 75%, and 40%, respectively. 
The authors consider the pros and cons of each analytic approach, and they recommend that 1) qualitative analysis be used initially to estimate the odds ratios for each higher quantile interval versus the lowest to examine the dose-response relation and determine the appropriateness of the assumed underlying model; and 2) quantitative analysis be used to test the dose-response relation under a plausible log odds ratio model.
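The comparison described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design: the normal exposure distributions, the shift `delta`, the pooled-sample quartile cut points, and the use of a squared ratio of mean Wald z statistics as the "relative efficiency" of the trend test are all assumptions made here for illustration. Only the sample sizes (200 cases, 200 controls) and the idea of scoring each quartile interval by its median come from the abstract.

```python
import numpy as np


def fit_logistic(x, y, n_iter=25):
    """Newton-Raphson fit of logit P(y=1) = a + b*x; returns (b, se_b)."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)
        hess = X.T @ (X * w[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    # Variance of the estimates from the inverse information at the final fit.
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    w = p * (1.0 - p)
    cov = np.linalg.inv(X.T @ (X * w[:, None]))
    return beta[1], np.sqrt(cov[1, 1])


def simulate_relative_efficiency(n_rep=200, n_cases=200, n_controls=200,
                                 delta=0.5, seed=0):
    """Hypothetical design: controls ~ N(0, 1), cases ~ N(delta, 1).

    Returns (mean z of quartile-score trend test / mean z of continuous
    trend test) ** 2, an illustrative efficiency measure assumed here.
    """
    rng = np.random.default_rng(seed)
    y = np.concatenate([np.ones(n_cases), np.zeros(n_controls)])
    z_cont, z_cat = [], []
    for _ in range(n_rep):
        x = np.concatenate([rng.normal(delta, 1.0, n_cases),
                            rng.normal(0.0, 1.0, n_controls)])
        # Quantitative analysis: regress on the exposure itself.
        b, se = fit_logistic(x, y)
        z_cont.append(b / se)
        # Qualitative analysis: scalar score = median of each quartile interval.
        cuts = np.quantile(x, [0.25, 0.5, 0.75])
        group = np.searchsorted(cuts, x)  # 0..3
        scores = np.array([np.median(x[group == g]) for g in range(4)])
        b, se = fit_logistic(scores[group], y)
        z_cat.append(b / se)
    return (np.mean(z_cat) / np.mean(z_cont)) ** 2


rel_eff = simulate_relative_efficiency()
print(f"relative efficiency of quartile-score trend test: {rel_eff:.3f}")
```

A value below 1 indicates that the quartile-score test is less efficient than the test on the quantitative exposure under these assumed conditions; the size of the loss depends on `delta` and on the assumed exposure distribution, so it should not be expected to reproduce the percentages reported in the abstract.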
Pages: 464-474
Number of pages: 11