Adaptive probabilistic networks with hidden variables

Cited by: 196
Authors
Binder, J [1 ]
Koller, D
Russell, S
Kanazawa, K
Affiliations
[1] Univ Calif Berkeley, Div Comp Sci, Berkeley, CA 94720 USA
[2] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[3] Microsoft Corp, Redmond, WA 98052 USA
Keywords
Bayesian networks; gradient descent; prior knowledge; dynamic networks; hybrid networks;
DOI
10.1023/A:1007421730016
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Probabilistic networks (also known as Bayesian belief networks) allow a compact description of complex stochastic relationships among several random variables. They are used widely for uncertain reasoning in artificial intelligence. In this paper, we investigate the problem of learning probabilistic networks with known structure and hidden variables. This is an important problem, because structure is much easier to elicit from experts than numbers, and the world is rarely fully observable. We present a gradient-based algorithm and show that the gradient can be computed locally, using information that is available as a byproduct of standard inference algorithms for probabilistic networks. Our experimental results demonstrate that using prior knowledge about the structure, even with hidden variables, can significantly improve the learning rate of probabilistic networks. We extend the method to networks in which the conditional probability tables are described using a small number of parameters. Examples include noisy-OR nodes and dynamic probabilistic networks. We show how this additional structure can be exploited by our algorithm to speed up the learning even further. We also outline an extension to hybrid networks, in which some of the nodes take on values in a continuous domain.
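The abstract's central point is that the gradient of the log-likelihood with respect to a conditional-probability-table entry can be computed locally from the posterior over that node and its parents, which standard inference already provides. The following is a minimal illustrative sketch of that idea (not the authors' implementation), assuming a hypothetical toy network with one hidden binary node H and one observed binary child X, so exact posteriors reduce to Bayes' rule:

```python
import math

# Toy network: hidden H -> observed X, both binary (illustrative assumption).
# Parameters: pi = P(H=1), cpt[h] = P(X=1 | H=h).
# The log-likelihood gradient for each parameter uses only the posterior
# P(H=h | x) -- the "local" information a standard inference pass yields.

def log_likelihood(data, pi, cpt):
    ll = 0.0
    for x in data:
        px = sum((pi if h else 1 - pi) * (cpt[h] if x else 1 - cpt[h])
                 for h in (0, 1))
        ll += math.log(px)
    return ll

def gradient_step(data, pi, cpt, lr=0.05):
    """One step of gradient ascent on the log-likelihood."""
    g_pi, g_cpt = 0.0, [0.0, 0.0]
    for x in data:
        joint = [(pi if h else 1 - pi) * (cpt[h] if x else 1 - cpt[h])
                 for h in (0, 1)]
        px = sum(joint)
        post = [j / px for j in joint]  # posterior P(H=h | x) via inference
        # d log P(x) / d pi = P(x|H=1)/P(x) - P(x|H=0)/P(x)
        g_pi += post[1] / pi - post[0] / (1 - pi)
        for h in (0, 1):
            # Local gradient: posterior divided by the CPT entry itself
            g_cpt[h] += post[h] / cpt[h] if x else -post[h] / (1 - cpt[h])
    clip = lambda p: min(max(p, 1e-6), 1 - 1e-6)  # keep probabilities valid
    pi = clip(pi + lr * g_pi / len(data))
    cpt = [clip(cpt[h] + lr * g_cpt[h] / len(data)) for h in (0, 1)]
    return pi, cpt

data = [1, 1, 1, 0, 1, 0, 1, 1]   # observations of X only; H is never seen
pi, cpt = 0.5, [0.3, 0.6]
for _ in range(200):
    pi, cpt = gradient_step(data, pi, cpt)
```

In a full network the same per-parameter form applies (posterior probability of the node/parent configuration divided by the CPT entry), with posteriors supplied by a general-purpose inference algorithm rather than this two-node Bayes' rule.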
Pages: 213-244
Page count: 32
References
55 in total
[1] ANDERSEN SK, 1989, P 11 INT JOINT C ART
[2] [Anonymous], NEUROCOMPUTING ALGOR
[3] Apolloni, B.; de Falco, D. Learning by asymmetric parallel Boltzmann machines. Neural Computation, 1991, 3(3): 402-408
[4] BAUM EB, 1988, NEURAL INFORMATION P, P52
[5] Bishop, C. M., 1995, Neural Networks for Pattern Recognition
[6] Buntine, W. A guide to the literature on learning probabilistic networks from data. IEEE Transactions on Knowledge and Data Engineering, 1996, 8(2): 195-210
[7] Buntine, W. L. Operations for learning with graphical models. Journal of Artificial Intelligence Research, 1994, 2: 159-225
[8] Cooper, G. F.; Herskovits, E. A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 1992, 9(4): 309-347
[9] Daganzo, C., 2014, MULTINOMIAL PROBIT T
[10] Dasgupta, S. The sample complexity of learning fixed-structure Bayesian networks. Machine Learning, 1997, 29(2-3): 165-180