Stability of stochastic approximation under verifiable conditions

被引:141
作者
Andrieu, C
Moulines, É
Priouret, P
机构
[1] Univ Bristol, Sch Math, Bristol BS8 1TW, Avon, England
[2] Ecole Natl Super Telecommun Bretagne, CNRS, URA 820, F-75634 Paris, France
[3] Univ Paris 06, Lab Probabil & Modelisat Aleatoire, CNRS, URA 224, F-75252 Paris, France
关键词
stochastic approximation; state-dependent noise; randomly varying truncation; adaptive Markov chain Monte Carlo;
D O I
10.1137/S0363012902417267
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
In this paper we address the problem of the stability and convergence of the stochastic approximation procedure theta(n+1) = theta(n) + gamma(n+1)[h(theta(n)) + xi(n+1)]. The stability of such sequences {theta(n)} is known to heavily rely on the behavior of the mean field h at the boundary of the parameter set and the magnitude of the stepsizes used. The conditions typically required to ensure convergence, and in particular the boundedness or stability of {theta(n)}, are either too difficult to check in practice or not satisfied at all. This is the case even for very simple models. The most popular technique for circumventing the stability problem consists of constraining {theta(n)} to a compact subset K in the parameter space. This is obviously not a satisfactory solution, as the choice of K is a delicate one. In this paper we first prove a "deterministic" stability result, which relies on simple conditions on the sequences {xi(n)} and {gamma(n)}. We then propose and analyze an algorithm based on projections on adaptive truncation sets, which ensures that the aforementioned conditions required for stability are satisfied. We focus in particular on the case where {xi(n)} is a so-called Markov state-dependent noise. We establish both the stability and convergence with probability 1 (w.p.1) of the algorithm under a set of simple and verifiable assumptions. We illustrate our results with an example related to adaptive Markov chain Monte Carlo algorithms.
引用
收藏
页码:283 / 312
页数:30
相关论文
共 31 条
[1]   Stochastic approximation or nonexpansive maps:: Application to Q-learning algorithms [J].
Abounadi, J ;
Bertsekas, DP ;
Borkar, V .
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2002, 41 (01) :1-22
[2]  
ANDRIEU C, 2001, CAHIERS CEREMADE, V125
[3]  
ANDRIEU C, IN PRESS ANN APPL PR
[4]  
[Anonymous], 1997, APPL MATH
[5]  
BARTUSEK J, 2000, THESIS U MARYLAND CO
[6]  
Benveniste A, 1990, Adaptive algorithms and stochastic approximations
[7]   Stability of annealing schemes and related processes [J].
Borkar, VS .
SYSTEMS & CONTROL LETTERS, 2000, 41 (05) :325-331
[8]   The ODE method for convergence of stochastic approximation and reinforcement learning [J].
Borkar, VS ;
Meyn, SP .
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2000, 38 (02) :447-469
[9]  
Buche R, 2001, SIAM J CONTROL OPTIM, V40, P1011, DOI 10.1137/S0363012999361639
[10]  
Chen H.-F., 2002, NONCON OPTIM ITS APP, V64