Accelerating EM for large databases

Cited by: 65
Authors
Thiesson, B [1 ]
Meek, C [1 ]
Heckerman, D [1 ]
Affiliations
[1] Microsoft Corp, Res, Redmond, WA 98052 USA
Keywords
Expectation Maximization algorithm; incremental EM; lazy EM; online EM; data blocking; mixture models; clustering
DOI
10.1023/A:1017986506241
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
The EM algorithm is a popular method for parameter estimation in a variety of problems involving missing data. However, the EM algorithm often requires significant computational resources and has been dismissed as impractical for large databases. We present two approaches that significantly reduce the computational cost of applying the EM algorithm to databases with a large number of cases, including databases with large dimensionality. Both approaches are based on partial E-steps, for which we can use the results of Neal and Hinton (In Jordan, M. (Ed.), Learning in Graphical Models, pp. 355-371. The Netherlands: Kluwer Academic Publishers) to obtain the standard convergence guarantees of EM. The first approach is a version of the incremental EM algorithm, described in Neal and Hinton (1998), which cycles through the data cases in blocks. The number of cases in each block dramatically affects the efficiency of the algorithm, and we provide a method for selecting a near-optimal block size. The second approach, which we call lazy EM, evaluates the significance of each data case at scheduled iterations and then proceeds for several iterations actively using only the significant cases. We demonstrate that both methods can significantly reduce computational costs through their application to high-dimensional real-world and synthetic mixture modeling problems for large databases.
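The blocked incremental EM idea from the abstract can be illustrated with a minimal sketch: each block keeps its own sufficient statistics, a partial E-step refreshes only the current block's contribution, and an M-step follows immediately from the global totals. This is a toy illustration for a two-component 1-D Gaussian mixture with an arbitrary block size and synthetic data, not the paper's experimental setup or block-size selection method.

```python
# Sketch of blocked incremental EM (in the spirit of Neal and Hinton, 1998)
# for a two-component 1-D Gaussian mixture. Data, block size, and sweep
# count are illustrative assumptions, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 500), rng.normal(3, 1, 500)])
rng.shuffle(x)

K, block_size = 2, 100
blocks = np.array_split(x, len(x) // block_size)

# Parameters: mixing weights, component means, component variances.
pi, mu, var = np.full(K, 1.0 / K), np.array([-1.0, 1.0]), np.ones(K)

# Per-block sufficient statistics, rows: responsibility sums (n_k),
# weighted sums (s_k), weighted sums of squares (ss_k).
stats = [np.zeros((3, K)) for _ in blocks]
total = np.zeros((3, K))

for sweep in range(20):
    for b, xb in enumerate(blocks):
        # Partial E-step: responsibilities for this block only.
        logp = (-0.5 * (xb[:, None] - mu) ** 2 / var
                - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
        r = np.exp(logp - logp.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        new = np.stack([r.sum(axis=0), r.T @ xb, r.T @ (xb ** 2)])
        # Swap this block's old contribution for its new one.
        total += new - stats[b]
        stats[b] = new
        # M-step from the up-to-date global sufficient statistics.
        n_k, s_k, ss_k = total
        pi = n_k / n_k.sum()
        mu = s_k / n_k
        var = ss_k / n_k - mu ** 2
```

Because the M-step runs after every block rather than after a full pass over the data, parameter updates propagate much sooner, which is the source of the speedup the abstract describes; the block size trades per-update cost against update frequency.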
Pages: 279-299
Page count: 21
References
22 items total
[1] Agresti A., 1990, Analysis of Categorical Data
[2] Bradley PS, 1998, Scaling Expectation
[3] Cheeseman P, 1995, Adv Knowledge Discov, p. 153
[4] Chickering DM, Heckerman D. Efficient approximations for the marginal likelihood of Bayesian networks with hidden variables. Machine Learning, 1997, 29(2-3): 181-212
[5] Chickering DM, 1999, Uncertainty in Artificial Intelligence, Proceedings, p. 109
[6] Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological), 1977, 39(1): 1-38
[7] Green PJ, 1990, J Roy Stat Soc B Met, V52, p. 443
[8] Huang X, 1995, IEEE Int Conf Acoust Speech, V1, p. 93
[9] Jamshidian M, Jennrich RI. Conjugate-gradient acceleration of the EM algorithm. Journal of the American Statistical Association, 1993, 88(421): 221-228
[10] Louis TA, 1982, J Roy Stat Soc B Met, V44, p. 226