Latent classes of objects and variable selection

Citations: 2
Authors
Galimberti, Giuliano [1]
Montanari, Angela [1]
Viroli, Cinzia [1]
Affiliation
[1] Univ Bologna, Dept Stat, I-40126 Bologna, Italy
Source
COMPSTAT 2008: PROCEEDINGS IN COMPUTATIONAL STATISTICS | 2008
Keywords
factor analysis; LASSO; finite Gaussian mixtures
DOI
10.1007/978-3-7908-2084-3_31
Chinese Library Classification number
F [Economics]
Discipline classification code
02
Abstract
In this paper we present a model-based clustering approach that simultaneously performs dimension reduction and variable selection. In particular, we assume that the data have been generated by a linear factor model with latent variables modeled as Gaussian mixtures (thus obtaining dimension reduction), and we shrink the factor loadings by resorting to a penalized likelihood method with an L1 penalty (thus realizing automatic variable selection). We derive an EM algorithm to obtain the penalized model estimates and a modified BIC criterion to select the penalization parameter. We evaluate the performance of the proposed method on simulated data.
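To illustrate why an L1 penalty on factor loadings can act as variable selection, here is a minimal sketch built on the soft-thresholding operator, the standard closed-form solution of a one-dimensional L1-penalized least-squares problem. It is not the paper's EM update; the function names, the toy loading matrix, and the penalty value 0.1 are all assumptions made for illustration. A variable whose loadings are all shrunk to exactly zero no longer depends on the latent factors and is treated as unselected.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; values within [-penalty, penalty] become 0.0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def shrink_loadings(loadings, penalty):
    """Apply soft-thresholding elementwise to a loading matrix (one row per observed variable)."""
    return [[soft_threshold(x, penalty) for x in row] for row in loadings]


def selected_variables(loadings):
    """Return indices of variables that keep at least one nonzero loading."""
    return [j for j, row in enumerate(loadings) if any(x != 0.0 for x in row)]


# Hypothetical 3-variable, 2-factor loading matrix (illustrative values only).
loadings = [
    [0.90, -0.40],
    [0.05, -0.08],   # weak loadings: both fall inside the threshold
    [-0.70, 0.20],
]
shrunk = shrink_loadings(loadings, penalty=0.1)
print(selected_variables(shrunk))  # → [0, 2]
```

In this toy case the second variable is dropped because both of its loadings fall below the penalty in absolute value. A larger penalty would remove more variables, which is why a criterion for choosing the penalization parameter, such as the modified BIC mentioned in the abstract, is needed.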
Pages: 373-383
Page count: 11