Variable selection in qualitative models via an entropic explanatory power

Cited by: 39
Authors
Dupuis, JA
Robert, CP
Affiliations
[1] Univ Paris 09, CEREMADE, F-75775 Paris 16, France
[2] Univ Toulouse 3, F-31062 Toulouse, France
Keywords
additivity property; entropy; Kullback-Leibler distance; logit model; transitivity
DOI
10.1016/S0378-3758(02)00286-0
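Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract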
The variable selection method proposed in the paper is based on evaluating the Kullback-Leibler distance between the full (or encompassing) model and its submodels. The Bayesian implementation of the method does not require separate prior modeling of the submodels, since the parameters of each submodel are defined as the Kullback-Leibler projections of the full model parameters. The result of the selection procedure is the submodel with the smallest number of covariates that lies at an acceptable distance from the full model. We introduce the notion of explanatory power of a model and scale the maximal acceptable distance in terms of the explanatory power of the full model. Moreover, an additivity property between embedded submodels shows that our selection procedure is equivalent to selecting the submodel with the smallest number of covariates that has sufficient explanatory power. We illustrate the performance of this method on a breast cancer dataset. (C) 2002 Elsevier Science B.V. All rights reserved.
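The selection rule described in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation: it assumes a binary response, a handful of binary covariates, fixed (not posterior-sampled) full-model probabilities, and saturated submodels in place of logit submodels, so that the Kullback-Leibler projection reduces to a weighted average. The names `project`, `distance`, `select`, the threshold `eps`, and the toy coefficients are all hypothetical.

```python
import itertools
import numpy as np

def bernoulli_kl(p, q):
    """Elementwise KL(Bernoulli(p) || Bernoulli(q)) for p, q strictly in (0, 1)."""
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def project(cells, weights, p_full, subset):
    """KL projection of the full model onto the (saturated) submodel that may
    only depend on the covariates indexed by `subset`. Under this assumption
    the projection is the weighted mean of p_full over cells sharing x_subset."""
    q = np.empty_like(p_full)
    keys = [tuple(c[j] for j in subset) for c in cells]
    for key in set(keys):
        idx = [i for i, k in enumerate(keys) if k == key]
        q[idx] = np.average(p_full[idx], weights=weights[idx])
    return q

def distance(weights, p, q):
    """Covariate-averaged KL distance between two conditional models."""
    return float(np.sum(weights * bernoulli_kl(p, q)))

def select(cells, weights, p_full, n_cov, eps):
    """Smallest subset whose distance to the full model is at most eps times
    the explanatory power of the full model (its distance to the null model)."""
    q_null = project(cells, weights, p_full, ())
    power_full = distance(weights, p_full, q_null)
    for size in range(n_cov + 1):
        ok = []
        for subset in itertools.combinations(range(n_cov), size):
            loss = distance(weights, p_full, project(cells, weights, p_full, subset))
            if loss <= eps * power_full:
                ok.append((loss, subset))
        if ok:
            return min(ok)[1]  # closest acceptable subset of this size
    return tuple(range(n_cov))  # fallback: the full model

# Toy full model (assumed values): strong effect of x0, weak x1, no x2.
n_cov = 3
cells = list(itertools.product([0, 1], repeat=n_cov))
weights = np.full(len(cells), 1.0 / len(cells))
logits = np.array([-0.5 + 2.0 * c[0] + 0.1 * c[1] for c in cells])
p_full = 1.0 / (1.0 + np.exp(-logits))

chosen = select(cells, weights, p_full, n_cov, eps=0.1)
print("selected covariates:", chosen)
```

In this simplified setting the additivity property mentioned in the abstract can be checked directly: the distance from the full model to the null model equals the distance from the full model to a submodel plus the distance from that submodel to the null model.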
Pages: 77-94
Number of pages: 18
Related papers
21 in total
[1] ALBERT, JH; CHIB, S. BAYESIAN-ANALYSIS OF BINARY AND POLYCHOTOMOUS RESPONSE DATA [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1993, 88 (422): 669-679
[2] BERNARDO JM, 1979, J R STAT SOC B, V41, P113
[3] Bernardo JM, 1999, BAYESIAN STATISTICS 6, P101
[4] Christensen R., 1997, Log-linear models and logistic regression, V2nd
[5] Clyde MA, 1999, BAYESIAN STATISTICS 6, P157
[6] Dupuis, JA. Bayesian test of homogeneity for Markov chains [J]. STATISTICS & PROBABILITY LETTERS, 1997, 31 (04): 333-338
[7] DUPUIS JA, 1994, 9457 CREST
[8] GELFAND AE, 1994, J ROY STAT SOC B MET, V56, P501
[9] Gelman A., 1996, BAYESIAN STAT, P599
[10] GODSILL S, 2001, J COMPUT GRAPHICAL S, V10