Learning Bayesian network parameters under incomplete data with domain knowledge

Cited by: 90
Authors
Liao, Wenhui [1]
Ji, Qiang [2]
Affiliations
[1] Thomson Reuters, Eagan, MN 55123 USA
[2] Rensselaer Polytech Inst, ECSE, Troy, NY 12180 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA);
Keywords
Bayesian network parameter learning; Missing data; EM algorithm; Facial action unit (AU) recognition;
DOI
10.1016/j.patcog.2009.04.006
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Bayesian networks (BNs) have gained increasing attention in recent years, and parameter learning is one of their key issues. When training data are incomplete or sparse, or when multiple hidden nodes exist, learning BN parameters becomes extremely difficult: the learning algorithm must operate in a high-dimensional search space and can easily get trapped in one of many local maxima. This paper presents a learning algorithm that incorporates domain knowledge into learning in order to regularize the otherwise ill-posed problem, limit the search space, and avoid local optima. Unlike conventional approaches, which typically exploit quantitative domain knowledge such as prior probability distributions, our method systematically incorporates qualitative constraints on some of the parameters into the learning process. Specifically, the problem is formulated as a constrained optimization problem whose objective function combines the likelihood function with penalty functions constructed from the qualitative domain knowledge. A gradient-descent procedure is then integrated with the E-step and M-step of the EM algorithm to estimate the parameters iteratively until convergence. Experiments with both synthetic data and real data for facial action recognition show that our algorithm significantly improves the accuracy of the learned BN parameters over the conventional EM algorithm. (C) 2009 Elsevier Ltd. All rights reserved.
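The idea summarized in the abstract (an EM loop whose M-step optimizes the likelihood plus a penalty built from a qualitative constraint) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-node network A -> B, the constraint P(B=1|A=1) >= P(B=1|A=0), the quadratic penalty, the step size, the toy records in `DATA`, and the function names. The sketch performs gradient ascent on the penalized expected log-likelihood, which is equivalent to gradient descent on its negative.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def expected_counts(data, p, q):
    """E-step: expected counts N[a][b] under the current parameters.

    p = P(A=1), q[a] = P(B=1 | A=a). A missing value is stored as None.
    """
    N = [[0.0, 0.0], [0.0, 0.0]]
    for a_obs, b_obs in data:
        # Joint probability of each completion consistent with the record.
        w = {}
        for a in (0, 1):
            for b in (0, 1):
                if a_obs in (None, a) and b_obs in (None, b):
                    pa = p if a == 1 else 1.0 - p
                    pb = q[a] if b == 1 else 1.0 - q[a]
                    w[(a, b)] = pa * pb
        total = sum(w.values())
        for (a, b), v in w.items():
            N[a][b] += v / total
    return N

def penalized_em(data, weight, em_iters=50, grad_steps=50):
    """EM with a gradient-based M-step on a penalized objective.

    Hypothetical objective (an assumption, not the paper's):
        E[log-likelihood] - weight * n * max(0, q0 - q1)**2
    which softly encodes the qualitative constraint q1 >= q0.
    weight = 0 reduces to plain (generalized) EM.
    """
    n = len(data)
    p = 0.5
    t = [0.0, 0.0]            # t[a] is the logit of q[a], keeping q[a] in (0, 1)
    lr = 0.2 / n              # illustrative step size
    sign = (-1.0, 1.0)        # penalty pushes q0 down and q1 up
    for _ in range(em_iters):
        q = [sigmoid(t[0]), sigmoid(t[1])]
        N = expected_counts(data, p, q)
        p = (N[1][0] + N[1][1]) / n      # closed form: p is unconstrained
        for _ in range(grad_steps):
            q = [sigmoid(t[0]), sigmoid(t[1])]
            v = max(0.0, q[0] - q[1])    # current constraint violation
            for a in (0, 1):
                # Gradient of the penalized objective with respect to t[a].
                g = N[a][1] * (1.0 - q[a]) - N[a][0] * q[a]
                g += sign[a] * 2.0 * weight * n * v * q[a] * (1.0 - q[a])
                t[a] += lr * g
    return p, sigmoid(t[0]), sigmoid(t[1])

# Toy records (a, b); None marks a missing value. The few complete
# records happen to contradict the assumed constraint q1 >= q0.
DATA = (
    [(1, 0)] * 3 + [(1, 1)] + [(0, 1)] * 3 + [(0, 0)]
    + [(None, 1)] * 2 + [(1, None)] * 2 + [(None, 0)]
)

if __name__ == "__main__":
    print("plain EM:     p, q0, q1 =", penalized_em(DATA, weight=0.0))
    print("penalized EM: p, q0, q1 =", penalized_em(DATA, weight=5.0))
```

With a soft penalty the constraint is not enforced exactly; a larger `weight` pulls the estimates closer to the feasible region, at the cost of a stiffer optimization problem that may need a smaller step size.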
Pages: 3046-3056
Number of pages: 11