Robust learning with missing data

Cited by: 107
Authors
Ramoni, M [1]
Sebastiani, P
Affiliations
[1] Harvard Univ, Sch Med, Childrens Hosp, Informat Program, Boston, MA 02115 USA
[2] Univ Massachusetts, Dept Math & Stat, Amherst, MA 01002 USA
Keywords
Bayesian learning; Bayesian networks; Bayesian classifiers; probability intervals; missing data
DOI
10.1023/A:1010968702992
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces a new method, called the robust Bayesian estimator (RBE), to learn conditional probability distributions from incomplete data sets. The intuition behind the RBE is that, when no information about the pattern of missing data is available, an incomplete database constrains the set of all possible estimates, and this paper provides a characterization of these constraints. An experimental comparison with two popular methods to estimate conditional probability distributions from incomplete data (Gibbs sampling and the EM algorithm) shows a gain in robustness. An application of the RBE to quantify a naive Bayesian classifier from an incomplete data set illustrates its practical relevance.
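The intuition that an incomplete database constrains the set of possible estimates can be illustrated with a minimal sketch. This is not the RBE as defined in the paper. It is a simplified relative-frequency version that assumes a single categorical variable, no prior, and nothing known about why entries are missing. The function name `estimate_bounds` and the example counts are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def estimate_bounds(observed_counts, n_missing):
    """Bound each value's relative frequency over all completions of the data.

    Assumption (illustrative, not taken from the paper): every missing entry
    could turn out to be any of the listed values. The lowest possible
    frequency of a value occurs when no missing entry takes that value,
    the highest when all of them do.
    """
    total = sum(observed_counts.values()) + n_missing
    if total == 0:
        raise ValueError("no cases to estimate from")
    return {
        value: (Fraction(count, total), Fraction(count + n_missing, total))
        for value, count in observed_counts.items()
    }


# Hypothetical data: 8 observed cases (6 "yes", 2 "no") and 2 missing entries.
bounds = estimate_bounds({"yes": 6, "no": 2}, n_missing=2)
for value, (low, high) in bounds.items():
    print(f"{value}: [{low}, {high}]")
```

With no missing entries the interval collapses to the usual point estimate, and it widens as the share of missing entries grows, which matches the "probability intervals" keyword of the record.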
Pages: 147-170
Number of pages: 24