A two-stage regression model for epidemiological studies with multivariate disease classification data

Cited by: 39
Authors
Chatterjee, N [1]
Affiliation
[1] NCI, Div Canc Epidemiol & Genet, NIH, Rockville, MD 20852 USA
Keywords
colorectal adenoma; genetic marker; polytomous logistic regressions; protein expression; pseudo-conditional-likelihood; semiparametric inference
DOI
10.1198/016214504000000124
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
Polytomous logistic regression is commonly used to analyze epidemiological data with disease subtype information. In this approach, effects of exposures on different disease subtypes are studied through separate exposure odds ratios comparing different case groups to the common control group. This article considers the situation where disease subtypes can be defined using multiple characteristics of a disease. For efficient analysis of such data, a two-stage modeling approach is proposed. At the first stage, a standard polytomous logistic regression model is considered for all possible distinct disease subtypes that can be defined by the cross-classification of the different disease characteristics. At the second stage, the exposure odds ratio parameters for the first-stage disease subtypes are further modeled in terms of the defining characteristics of the subtypes. When the total number of first-stage disease subtypes is small, standard maximum likelihood methods can be used for inference in the proposed model. For dealing with a large number of disease subtypes, a novel semiparametric pseudo-conditional-likelihood approach is proposed that does not require any model assumption about the baseline probabilities for the different disease subtypes. This article develops the asymptotic theory for the estimator and studies its small-sample properties using simulation experiments. The proposed method is applied to study the effect of fiber on the risk of various forms of colorectal adenoma using data available from a large screening study, the Prostate, Lung, Colorectal and Ovarian Cancer (PLCO) Screening Trial.
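The second-stage idea in the abstract can be illustrated with a minimal sketch. It assumes an additive (main-effects-only) second-stage model with reference-level coding, which is one possible choice and is not stated in the abstract. The function names, the numbers of levels, and the parameter values are hypothetical and chosen only for illustration.

```python
from itertools import product


def second_stage_design(levels):
    """Build a design matrix linking second-stage parameters to subtypes.

    levels: number of levels of each disease characteristic (hypothetical input).
    Subtypes are the cross-classification of all characteristics.
    Columns: one intercept, then one contrast per non-reference level of each
    characteristic (level 0 is treated as the reference; this is an assumption).
    """
    subtypes = list(product(*[range(n) for n in levels]))
    n_params = 1 + sum(n - 1 for n in levels)
    Z = []
    for subtype in subtypes:
        row = [0.0] * n_params
        row[0] = 1.0  # intercept: log odds ratio for the all-reference subtype
        offset = 1
        for level, n in zip(subtype, levels):
            if level > 0:
                row[offset + level - 1] = 1.0
            offset += n - 1
        Z.append(row)
    return subtypes, Z


def first_stage_log_odds_ratios(Z, theta):
    """Map second-stage parameters theta to one log odds ratio per subtype."""
    return [sum(z * t for z, t in zip(row, theta)) for row in Z]


# Hypothetical example: two characteristics with 2 and 3 levels give 6 subtypes,
# described by 4 second-stage parameters instead of 6 free log odds ratios.
subtypes, Z = second_stage_design([2, 3])
theta = [0.10, -0.30, 0.20, 0.40]  # made-up values, not estimates from the article
beta = first_stage_log_odds_ratios(Z, theta)
for subtype, value in zip(subtypes, beta):
    print(subtype, round(value, 2))
```

Under these assumptions the number of exposure parameters grows with the sum of the numbers of levels and not with their product, which is the kind of efficiency gain the two-stage formulation aims for when the cross-classification produces many subtypes.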
Pages: 127-138
Page count: 12