Feature-specific penalized latent class analysis for genomic data

Cited by: 22
Authors
Houseman, E. Andres [1]
Coull, Brent A. [1]
Betensky, Rebecca A. [1]
Affiliations
[1] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
Keywords
constrained estimation; LASSO; loss of heterozygosity; mixture models; penalized likelihood; ridge regression
DOI
10.1111/j.1541-0420.2006.00566.x
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Genomic data are often characterized by a moderate to large number of categorical variables observed for relatively few subjects. Some of the variables may be missing or noninformative. An example of such data is loss of heterozygosity (LOH), a dichotomous variable, observed on a moderate number of genetic markers. We first consider a latent class model where, conditional on unobserved membership in one of k classes, the variables are independent with probabilities determined by a regression model of low dimension q. Using a family of penalties including the ridge and LASSO, we extend this model to address higher-dimensional problems. Finally, we present an orthogonal map that transforms marker space to a space of "features" for which the constrained model has better predictive power. We demonstrate these methods on LOH data collected at 19 markers from 93 brain tumor patients. For this data set, the existing unpenalized latent class methodology does not produce estimates. Additionally, we show that posterior classes obtained from this method are associated with survival for these patients.
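The objective described in the abstract can be sketched in code. The following is a minimal illustration, not the authors' implementation: it evaluates a latent class log-likelihood for binary markers (conditional independence within class, missing markers skipped) and subtracts a penalty of the form lam * sum(|beta|^gamma), where gamma = 1 corresponds to the LASSO and gamma = 2 to ridge. The function name, argument names, and the choice to pass the penalized coefficients as a separate array are all assumptions made for illustration; the paper's regression parameterization of the marker probabilities is not reproduced here.

```python
import numpy as np


def penalized_loglik(Y, class_probs, item_probs, coefs, lam, gamma):
    """Penalized log-likelihood of a latent class model for binary data.

    Hypothetical helper for illustration only.

    Y           : (n, m) array of 0/1 values, np.nan marks a missing marker
    class_probs : (k,) class membership probabilities (sum to 1)
    item_probs  : (k, m) P(Y_ij = 1 | subject i is in class c)
    coefs       : regression coefficients to penalize (any shape)
    lam, gamma  : penalty weight and exponent (1 = LASSO, 2 = ridge)
    """
    Y = np.asarray(Y, dtype=float)
    class_probs = np.asarray(class_probs, dtype=float)
    item_probs = np.asarray(item_probs, dtype=float)

    observed = ~np.isnan(Y)
    ones = np.where(observed, Y, 0.0)          # 1 where observed and Y == 1
    zeros = np.where(observed, 1.0 - Y, 0.0)   # 1 where observed and Y == 0

    # log P(y_i | class c): sum over observed markers only, shape (n, k)
    log_cond = ones @ np.log(item_probs).T + zeros @ np.log1p(-item_probs).T
    joint = log_cond + np.log(class_probs)

    # Marginalize over classes with a stable log-sum-exp
    mx = joint.max(axis=1, keepdims=True)
    loglik = np.sum(mx[:, 0] + np.log(np.exp(joint - mx).sum(axis=1)))

    penalty = lam * np.sum(np.abs(np.asarray(coefs, dtype=float)) ** gamma)
    return float(loglik - penalty)
```

In practice such an objective would be maximized iteratively (for example with an EM-type algorithm), with lam chosen by a model selection criterion; those steps are outside the scope of this sketch.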
Pages: 1062-1070
Page count: 9