A Span Seminorm Approach to Controlled Markov Set-Chains
Abstract
In a controlled Markov set-chain with finite state and action spaces, we seek a policy, called average-optimal, that maximizes the Cesàro sums of the per-period rewards over all stationary policies with respect to a partial order. Under uniform scrambling conditions, the dynamic programming operator for our model is proved to be a contraction in the span seminorm. By analysing the behavior of the expected total rewards over the T-horizon as T approaches ∞ via the fixed point of this span-contraction operator, we give a constructive proof of the existence of an average-optimal policy.
- Paper from Chiba University
- 1998-02-28
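The span-contraction argument summarized in the abstract can be illustrated on a much simpler object than the paper's model. The sketch below is an assumption-laden toy: it uses an ordinary Markov decision process with a single transition law per state-action pair (not a set-chain with interval transition matrices and a partial order), and the data `P` and `r`, along with all function names, are hypothetical. It shows the span seminorm sp(v) = max v − min v, a contraction modulus derived from the overlap of transition rows (a scrambling-type condition), and relative value iteration converging to a fixed point of the dynamic programming operator up to an additive constant.

```python
def span(v):
    """Span seminorm: sp(v) = max(v) - min(v)."""
    return max(v) - min(v)


def bellman(v, r, P):
    """DP operator (Uv)(s) = max_a [ r[s][a] + sum_t P[s][a][t] * v[t] ]."""
    return [
        max(
            r[s][a] + sum(p * x for p, x in zip(P[s][a], v))
            for a in range(len(r[s]))
        )
        for s in range(len(v))
    ]


def contraction_modulus(P):
    """1 minus the smallest overlap between any two transition rows.

    If every pair of rows overlaps (a scrambling-type condition), the
    result is below 1 and U contracts in the span seminorm.
    """
    rows = [row for per_state in P for row in per_state]
    overlap = min(
        sum(min(p, q) for p, q in zip(x, y)) for x in rows for y in rows
    )
    return 1.0 - overlap


def relative_value_iteration(r, P, tol=1e-10, max_iter=10_000):
    """Iterate v <- Uv - (Uv)[0] until sp(Uv - v) < tol.

    Returns an estimate of the optimal average reward (gain) and a
    relative value vector h normalized so that h[0] == 0.
    """
    v = [0.0] * len(r)
    for _ in range(max_iter):
        w = bellman(v, r, P)
        diff = [wi - vi for wi, vi in zip(w, v)]
        if span(diff) < tol:
            gain = 0.5 * (max(diff) + min(diff))
            return gain, [wi - w[0] for wi in w]
        v = [wi - w[0] for wi in w]  # renormalize; the span is unchanged
    raise RuntimeError("relative value iteration did not converge")


# Hypothetical 2-state, 2-action data; P[s][a] is a next-state distribution.
P = [
    [[0.9, 0.1], [0.5, 0.5]],  # state 0: action 0, action 1
    [[0.4, 0.6], [0.2, 0.8]],  # state 1: action 0, action 1
]
r = [[1.0, 0.5], [0.0, 2.0]]  # r[s][a]

beta = contraction_modulus(P)
gain, h = relative_value_iteration(r, P)
print("contraction modulus:", beta)
print("estimated gain:", gain, "relative values:", h)
```

Because sp(Uv − Uw) ≤ β · sp(v − w) with β < 1 here, the iterates converge geometrically in span, and a maximizing action in each state at the fixed point gives a stationary policy that is average-optimal for this toy model.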
Authors
-
Kurano Masami
Faculty of Education, Chiba University
-
Hosaka Masanori
Graduate School of Science and Technology
-
Song Jinjie
Graduate School of Science and Technology
-
Hosaka Masanori
Faculty of Education, Chiba University