Profit Sharing Based Reinforcement Learning Systems in Continuous State Spaces

Abstract

Reinforcement learning is a kind of machine learning. Profit Sharing, the Rational Policy Making algorithm (RPM), the Penalty Avoiding Rational Policy Making algorithm, and PS-r* are known to guarantee rationality in a typical class of Partially Observable Markov Decision Processes. However, they cannot handle continuous state spaces. In this paper, we present a method for adapting them to continuous state spaces. Previously, we gave RPM a prototype mechanism for handling continuous state spaces, but that mechanism cannot handle penalties. Here, we extend it to environments that contain both a reward and a penalty. We show the effectiveness of the proposed method through numerical examples.
Paper of the Japan Society for Fuzzy Theory and Intelligent Informatics