Factorial hidden Markov models

Cited by: 511
Authors
Ghahramani, Z [1]
Jordan, MI
Affiliations
[1] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 3H5, Canada
[2] MIT, Dept Brain & Cognit Sci, Cambridge, MA 02139 USA
Keywords
Hidden Markov models; time series; EM algorithm; graphical models; Bayesian networks; mean field theory
DOI
10.1023/A:1007425814087
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Hidden Markov models (HMMs) have proven to be one of the most widely used tools for learning probabilistic models of time series data. In an HMM, information about the past is conveyed through a single discrete variable, the hidden state. We discuss a generalization of HMMs in which this state is factored into multiple state variables and is therefore represented in a distributed manner. We describe an exact algorithm for inferring the posterior probabilities of the hidden state variables given the observations, and relate it to the forward-backward algorithm for HMMs and to algorithms for more general graphical models. Due to the combinatorial nature of the hidden state representation, this exact algorithm is intractable. As in other intractable systems, approximate inference can be carried out using Gibbs sampling or variational methods. Within the variational framework, we present a structured approximation in which the state variables are decoupled, yielding a tractable algorithm for learning the parameters of the model. Empirical comparisons suggest that these approximations are efficient and provide accurate alternatives to the exact methods. Finally, we use the structured approximation to model Bach's chorales and show that factorial HMMs can capture statistical structure in this data set which an unconstrained HMM cannot.
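The distributed state representation described in the abstract can be sketched as a generative model: M independent Markov chains whose joint state determines each observation. The dimensions, random parameters, and linear-Gaussian output model below are illustrative assumptions for the sketch, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions): M chains, K states per chain,
# T time steps, D-dimensional observations.
M, K, T, D = 3, 2, 10, 4

# Each chain evolves independently under its own transition matrix.
A = rng.dirichlet(np.ones(K), size=(M, K))   # A[m, i, j] = P(s_t^m = j | s_{t-1}^m = i)
pi = rng.dirichlet(np.ones(K), size=M)       # per-chain initial state distributions
W = rng.normal(size=(M, K, D))               # each chain's state contributes to the observation mean

states = np.zeros((T, M), dtype=int)
obs = np.zeros((T, D))
for t in range(T):
    for m in range(M):
        p = pi[m] if t == 0 else A[m, states[t - 1, m]]
        states[t, m] = rng.choice(K, p=p)
    # Linear-Gaussian output: the mean is the sum of per-chain contributions.
    mean = sum(W[m, states[t, m]] for m in range(M))
    obs[t] = mean + rng.normal(scale=0.1, size=D)
```

Note that the joint hidden state at each step takes K**M values (here 2**3 = 8), which is why exact posterior inference becomes combinatorially expensive as M grows, motivating the structured variational approximation that decouples the chains.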
Pages: 245-273
Number of pages: 29