Parallel Reinforcement Learning Systems Including Exploration-Oriented Agents
Abstract
We propose a new strategy for parallel reinforcement learning that constructs the optimal value function and policy more quickly than traditional strategies. We define two types of agents: the main agent and the exploration-oriented agent. The main agent selects actions mainly for exploitation, while the exploration-oriented agent concentrates on exploration using the k-certainty exploration method. These agents learn in the same environment in parallel and take turns updating a shared value function. With this strategy, the optimal value function is expected to be constructed quickly, so the main agent can begin selecting optimal actions early. Experimental results on n-armed bandit problems demonstrated the effectiveness of our method.
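The arrangement described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `run_parallel_bandit`, the epsilon-greedy rule for the main agent, the Gaussian reward model, the sample-average update, and all parameter values are assumptions made for this example. The k-certainty rule is simplified here to "an arm is k-certain once it has been selected at least k times".

```python
import random


def run_parallel_bandit(true_means, steps=500, k=10, epsilon=0.05, seed=0):
    """Hypothetical sketch: a main agent and an exploration-oriented agent
    alternately update one shared value table on an n-armed bandit."""
    rng = random.Random(seed)
    n = len(true_means)
    q = [0.0] * n      # shared value estimates, one per arm
    counts = [0] * n   # shared selection counts, one per arm

    def pull(arm):
        # Assumed reward model: Gaussian noise around the arm's true mean.
        return rng.gauss(true_means[arm], 1.0)

    def update(arm, reward):
        # Incremental sample-average update of the shared value table.
        counts[arm] += 1
        q[arm] += (reward - q[arm]) / counts[arm]

    def main_action():
        # Main agent: mostly exploits the shared estimates (epsilon-greedy
        # is an assumption; the abstract only says "mainly for exploitation").
        if rng.random() < epsilon:
            return rng.randrange(n)
        return max(range(n), key=lambda a: q[a])

    def explorer_action():
        # Exploration-oriented agent: prefer arms that are not yet k-certain,
        # i.e. arms selected fewer than k times (simplified k-certainty rule).
        uncertain = [a for a in range(n) if counts[a] < k]
        if uncertain:
            return min(uncertain, key=lambda a: counts[a])
        return rng.randrange(n)

    main_reward = 0.0
    for _ in range(steps):
        # The two agents take turns updating the shared table.
        arm = main_action()
        reward = pull(arm)
        update(arm, reward)
        main_reward += reward

        arm = explorer_action()
        update(arm, pull(arm))

    return q, counts, main_reward


if __name__ == "__main__":
    q, counts, total = run_parallel_bandit([0.2, 0.5, 3.0, 0.8])
    print("estimates:", [round(v, 2) for v in q])
    print("counts:", counts)
    print("main agent reward:", round(total, 1))
```

Only the main agent's reward is accumulated, since the exploration-oriented agent exists to improve the shared estimates rather than to earn reward itself.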
Papers published by the Japan Society for Fuzzy Theory and Intelligent Informatics (日本知能情報ファジィ学会)
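- Behavior Control and Behavior Analysis of Autonomous Agents Using FCN: Application to the Tartarus Problem
- Conflict, Indecision, and Decision Making (Decision Making)
- Similarity Research in Cognitive Psychology (Similarity Measures and Information Retrieval)
- An Account of Studying Abroad in the United States
- A Dialogue System Using the Positioning of Meaning in Context and Its Evaluation (Intelligent Information Processing of Language and Text)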