Some theory for Fisher's linear discriminant function, 'naive Bayes', and some alternatives when there are many more variables than observations

Cited by: 379
Authors
Bickel, PJ [1]
Levina, E [2]
Affiliations
[1] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
[2] Univ Michigan, Dept Stat, Ann Arbor, MI 48109 USA
Keywords
Fisher's linear discriminant; Gaussian coloured noise; minimax regret; naive Bayes
DOI
10.3150/bj/1106314847
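Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract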
We show that the 'naive Bayes' classifier, which assumes independent covariates, greatly outperforms the Fisher linear discriminant rule under broad conditions when the number of variables grows faster than the number of observations, in the classical problem of discriminating between two normal populations. We also introduce a class of rules spanning the range between independence and arbitrary dependence. These rules are shown to achieve Bayes consistency for the Gaussian 'coloured noise' model and to adapt to a spectrum of convergence rates, which we conjecture to be minimax.
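For orientation, the two rules compared in the abstract can be sketched as follows. The notation is assumed for illustration and is not taken from this record: equal prior probabilities for the two classes, sample class means \hat\mu_0 and \hat\mu_1, pooled sample covariance \hat\Sigma, and its diagonal part \hat D.

```latex
% Assumed notation (not from the record): \hat\mu_0, \hat\mu_1 are sample class means,
% \hat\Sigma is the pooled sample covariance, \hat D = \mathrm{diag}(\hat\Sigma).
% Both rules assign x to class 1 when the indicator equals 1 (equal priors assumed).
\begin{align*}
\text{Fisher's rule:} \quad
  & \delta_F(x) = \mathbf{1}\!\left\{ (\hat\mu_1 - \hat\mu_0)^{\top} \hat\Sigma^{-1}
      \left( x - \tfrac{1}{2}(\hat\mu_0 + \hat\mu_1) \right) > 0 \right\}, \\
\text{naive Bayes (independence) rule:} \quad
  & \delta_I(x) = \mathbf{1}\!\left\{ (\hat\mu_1 - \hat\mu_0)^{\top} \hat D^{-1}
      \left( x - \tfrac{1}{2}(\hat\mu_0 + \hat\mu_1) \right) > 0 \right\}.
\end{align*}
```

When the number of variables exceeds the number of observations, \hat\Sigma is singular, so Fisher's rule has to use a generalized inverse in place of \hat\Sigma^{-1}. The independence rule needs only one variance estimate per variable.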
Pages: 989-1010
Number of pages: 22