Selection of Variables for Cluster Analysis and Classification Rules

Cited by: 46
Authors
Fraiman, Ricardo [1, 2]
Justel, Ana [3]
Svarc, Marcela [1]
Affiliations
[1] Univ San Andres, Buenos Aires, DF, Argentina
[2] Univ Republica, Ctr Matemat, Montevideo, Uruguay
[3] Univ Autonoma Madrid, Madrid, Spain
Keywords
Finding relevant variables; Forward-backward algorithm; Pattern recognition
DOI
10.1198/016214508000000544
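Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract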
In this article we introduce two procedures for variable selection in cluster analysis and classification rules. One is mainly aimed at detecting the "noisy" noninformative variables, while the other also deals with multicollinearity and general dependence. Both methods are designed to be used after a "satisfactory" grouping procedure has been carried out. A forward-backward algorithm is proposed to make such procedures feasible in large datasets. A small simulation is performed and some real data examples are analyzed.
Pages: 1294-1303
Number of pages: 10