2B1-4 A Study of State Representations in Computer Shogi Using Reinforcement Learning (OS7: Agent Learning and Evolution)

Abstract

Recently, evaluation functions for computer Shogi have attracted much attention owing to Bonanza, a program based on machine learning. Bonanza has become one of the strongest computer Shogi players and often defeats human players. To learn its evaluation function, Bonanza uses a considerable number of game records. Reinforcement learning, in contrast, can learn evaluation values from experience. However, reinforcement learning has not yet succeeded in learning with a large number of fine-grained features. In this paper, we investigate how the state representation used in the evaluation function affects the learning results, where the state representations are derived from those of Bonanza.
Paper published by The Japan Society of Mechanical Engineers
2011-09-01