Singularities in mixture models and upper bounds of stochastic complexity

Cited by: 72
Authors
Yamazaki, K
Watanabe, S
Affiliations
[1] Tokyo Inst Technol, Precis Intelligence Lab, Midori Ku, Yokohama, Kanagawa 2268503, Japan
[2] Tokyo Inst Technol, Dept Computat Intelligence & Syst Sci, Midori Ku, Yokohama, Kanagawa 2268503, Japan
Keywords
learning machine; stochastic complexity; Gaussian mixture; algebraic geometry
DOI
10.1016/S0893-6080(03)00005-4
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
A learning machine that is a mixture of several distributions, for example, a Gaussian mixture or a mixture of experts, has a wide range of applications. However, such a machine is a non-identifiable statistical model with many singularities in its parameter space, so its generalization properties have remained unknown. Recently, an algebraic-geometrical method has been developed that makes it possible to treat such learning machines mathematically. Based on this method, this paper rigorously proves that a mixture learning machine has a smaller Bayesian stochastic complexity than regular statistical models. Since the generalization error of a learning machine equals the increase in its stochastic complexity, this result shows that, when Bayesian estimation is used for statistical inference, a mixture model attains more precise prediction than regular statistical models. (C) 2003 Elsevier Science Ltd. All rights reserved.
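For readers outside the field, the quantities in the abstract admit a standard formulation in Watanabe's singular learning theory; the following sketch is background rather than material from this record, and the symbols F(n), G(n), \lambda, and d are assumed notation. Given n i.i.d. samples X_1, \dots, X_n from a true density q(x), a model p(x \mid w), and a prior \varphi(w), the (normalized) stochastic complexity is

F(n) = -\log \int \prod_{i=1}^{n} \frac{p(X_i \mid w)}{q(X_i)} \, \varphi(w) \, dw,

and the average generalization error is its expected increase,

G(n) = \mathbb{E}[F(n+1)] - \mathbb{E}[F(n)].

An upper bound of the asymptotic form

\mathbb{E}[F(n)] \le \lambda \log n + C

therefore gives G(n) \lesssim \lambda / n. A regular model with d parameters has \lambda = d/2, so the paper's claim that mixture models attain \lambda < d/2 implies a faster decay of the generalization error under Bayesian estimation.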
Pages: 1029-1038
Page count: 10