ADAPTIVE CONTROL OF CONSTRAINED MARKOV CHAINS: CRITERIA AND POLICIES

被引:23
作者
Altman, Eitan [1 ]
Shwartz, Adam [1 ]
机构
[1] Technion Israel Inst Technol, IL-32000 Haifa, Israel
关键词
D O I
10.1007/BF02055577
中图分类号
C93 [管理学]; O22 [运筹学];
学科分类号
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
摘要
We consider the constrained optimization of a finite-state, finite action Markov chain. In the adaptive problem, the transition probabilities are assumed to be unknown, and no prior distribution on their values is given. We consider constrained optimization problems in terms of several cost criteria which are asymptotic in nature. For these criteria we show that it is possible to achieve the same optimal cost as in the non-adaptive case. We first formulate a constrained optimization problem under each of the cost criteria and establish the existence of optimal stationary policies. Since the adaptive problem is inherently non-stationary, we suggest a class of Asymptotically Stationary (AS) policies, and show that, under each of the cost criteria, the costs of an AS policy depend only on its limiting behavior. This property implies that there exist optimal AS policies. A method for generating adaptive policies is then suggested, which leads to strongly consistent estimators for the unknown transition probabilities. A way to guarantee that these policies are also optimal is to couple them with the adaptive algorithm of [3]. This leads to optimal policies for each of the adaptive constrained optimization problems under discussion.
引用
收藏
页码:101 / 134
页数:34
相关论文
共 25 条
[1]  
Altman E., 1991, IEEE T AUTO IN PRESS
[2]  
Altman E., 1991, SIAM J CONT IN PRESS, V29
[3]  
Altman E., 1987, EE PUB, V633
[4]   OPTIMAL POLICIES FOR CONTROLLED MARKOV-CHAINS WITH A CONSTRAINT [J].
BEUTLER, FJ ;
ROSS, KW .
JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 1985, 112 (01) :236-252
[5]   A CONVEX ANALYTIC APPROACH TO MARKOV DECISION-PROCESSES [J].
BORKAR, VS .
PROBABILITY THEORY AND RELATED FIELDS, 1988, 78 (04) :583-602
[6]   ON MINIMUM COST PER UNIT TIME CONTROL OF MARKOV-CHAINS [J].
BORKAR, VS .
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1984, 22 (06) :965-978
[7]   ON CONTINUITY OF MINIMUM SET OF A CONTINUOUS FUNCTION [J].
DANTZIG, GB ;
FOLKMAN, J ;
SHAPIRO, N .
JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 1967, 17 (03) :519-&
[8]  
DERMAN C, 1970, FINITE STATE MARKOVI
[9]  
Hall P., 1980, MARTINGALE LIMIT THE
[10]  
Hernandez-Lerma O., J MATH ANAL IN PRESS