Multiple factor analysis and clustering of a mixture of quantitative, categorical and frequency data

Cited by: 140
Authors
Becue-Bertaut, Monica [1]
Pages, Jerome [1]
Affiliation
[1] Univ Politecn Cataluna, EIO, ES-08034 Barcelona, Spain
Keywords
mixed data; textual data; distance; multiple factor analysis; multiple factor analysis for contingency tables; clustering; survey
DOI
10.1016/j.csda.2007.09.023
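Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203 [Computer application technology]; 0835 [Software engineering]
Abstract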
Analysing and clustering units described by a mixture of sets of quantitative, categorical and frequency variables is a relevant challenge. Multiple factor analysis is extended to include these three types of variables in order to balance the influence of the different sets when a global distance between units is computed. Suitable coding is adopted to keep as close as possible to the approach offered by principal axes methods, that is, principal component analysis for quantitative sets, multiple correspondence analysis for categorical sets and correspondence analysis for frequency sets. In addition, the presence of frequency sets poses the problem of selecting the unit weighting, since this is fixed by the user (usually uniform) in principal component analysis and multiple correspondence analysis, but imposed by the table margin in correspondence analysis. The method's main steps are presented and illustrated by an example extracted from a survey that aimed to cluster respondents to a questionnaire that included both closed and open-ended questions. (c) 2007 Elsevier B.V. All rights reserved.
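The balancing idea mentioned in the abstract can be illustrated with a minimal sketch. It covers only the standard multiple factor analysis weighting for quantitative sets, where each set is rescaled so that its first eigenvalue equals 1 before the sets are concatenated. The coding of categorical and frequency sets and the choice of unit weights described in the paper are not reproduced here. The function names, the toy data, and the power-iteration settings are assumptions made for illustration.

```python
import math
import random


def standardize(columns):
    """Center each column and scale it to unit (population) variance."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
        out.append([(v - mean) / sd for v in col])
    return out


def first_eigenvalue(columns, iters=500):
    """Estimate the largest eigenvalue of the cross-product matrix
    (1/n) * X'X by power iteration (uniform unit weights assumed)."""
    n = len(columns[0])
    p = len(columns)
    C = [[sum(a * b for a, b in zip(ci, cj)) / n for cj in columns]
         for ci in columns]
    # Asymmetric start vector, chosen so that it is unlikely to be
    # orthogonal to the leading eigenvector (illustrative choice).
    v = [1.0 / (i + 1) for i in range(p)]
    lam = 0.0
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(p)) for i in range(p)]
        lam = math.sqrt(sum(x * x for x in w))
        if lam == 0.0:
            return 0.0
        v = [x / lam for x in w]
    return lam


def balance_sets(sets):
    """Rescale every set so that its first eigenvalue is 1, which keeps a
    set with many correlated variables from dominating the global distance."""
    balanced = []
    for columns in sets:
        z = standardize(columns)
        scale = math.sqrt(first_eigenvalue(z)) or 1.0
        balanced.append([[v / scale for v in col] for col in z])
    return balanced


# Toy data (hypothetical): 30 units, a small set of 2 variables and a
# larger set of 8 variables, each driven by its own latent factor.
random.seed(0)
n_units = 30
latent_a = [random.gauss(0, 1) for _ in range(n_units)]
latent_b = [random.gauss(0, 1) for _ in range(n_units)]
set_a = [[x + 0.5 * random.gauss(0, 1) for x in latent_a] for _ in range(2)]
set_b = [[x + 0.5 * random.gauss(0, 1) for x in latent_b] for _ in range(8)]

balanced = balance_sets([set_a, set_b])
# Concatenated table on which a global principal axes analysis
# (and then a clustering of the units) would be run.
global_table = [col for s in balanced for col in s]
```

After this rescaling, no single set can contribute more than 1 to the inertia of any one direction, so the first global axis cannot be generated by the larger set alone.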
Pages: 3255-3268
Number of pages: 14