Model-Based Policy Gradients with Parameter-Based Exploration by Least-Squares Conditional Density Estimation
Abstract
The goal of reinforcement learning (RL) is to let an agent acquire the optimal control policy in an unknown environment so that the expected future rewards are maximized. The model-free RL approach learns the policy directly from data samples. Although using many samples tends to improve the accuracy of policy learning, collecting a large number of samples is often expensive in practice. On the other hand, the model-based RL approach first estimates the transition model of the environment and then learns the policy based on the estimated transition model. Thus, if the transition model is accurately learned from a small amount of data, the model-based approach can perform better than the model-free approach. In this paper, we propose a novel model-based RL method that combines a recently proposed model-free policy search method called policy gradients with parameter-based exploration and a state-of-the-art transition model estimator called least-squares conditional density estimation. Through experiments, we demonstrate the usefulness of the proposed method.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2013-02-25
Authors
- Morimoto Jun (ATR Computational Neuroscience Laboratories)
- Sugiyama Masashi (Tokyo Inst. of Technol.)
- Zhao Tingting (Tokyo Institute of Technology)
- Mori Syogo (Tokyo Institute of Technology)
- Tangkaratt Voot (Tokyo Institute of Technology)
Related papers
- Statistical active learning for efficient value function approximation in reinforcement learning (Neurocomputing)
- Lighting Condition Adaptation for Perceived Age Estimation
- Computationally Efficient Multi-task Learning with Least-squares Probabilistic Classifiers
- Learning to Acquire Whole-Body Humanoid Center of Mass Movements to Achieve Dynamic Tasks
- CB : a humanoid research platform for exploring neuroscience
- A Unified Framework of Density Ratio Estimation under Bregman Divergence
- Adaptive importance sampling with automatic model selection in value function approximation (Neurocomputing)
- Reinforcement learning with via-point representation
- Improving Model-based Reinforcement Learning with Multitask Learning
- Least-Squares Conditional Density Estimation
- Direct Importance Estimation with a Mixture of Probabilistic Principal Component Analyzers
- Statistical Analysis of Kernel Density-Ratio Estimation (Analysis of Learning Problems, Text and Web Mining, General)
- A Semi-Supervised Approach to Perceived Age Prediction from Face Images
- Conditional Density Estimation Based on Density Ratio Estimation
- A density ratio approach to two-sample test (Pattern Recognition and Media Understanding)
- A density ratio approach to two-sample test (Information-Based Induction Sciences and Machine Learning)
- Theoretical Analysis of Density Ratio Estimation
- FOREWORD
- Superfast-Trainable Multi-Class Probabilistic Classifier by Least-Squares Posterior Fitting
- Direct Importance Estimation with Gaussian Mixture Models
- Improving the Accuracy of Least-Squares Probabilistic Classifiers
- Artist agent A[2]: stroke painterly rendering based on reinforcement learning (Pattern Recognition and Media Understanding)
- Artist agent A[2]: stroke painterly rendering based on reinforcement learning (Information-Based Induction Sciences and Machine Learning)
- Least-Squares Independence Test
- Density Difference Estimation
- Winning the Kaggle Algorithmic Trading Challenge with the Composition of Many Models and Feature Engineering
- Artist Agent: A Reinforcement Learning Approach to Automatic Stroke Generation in Oriental Ink Painting
- Early stopping Heuristics in Pool-Based Incremental Active Learning for Least-Squares Probabilistic Classifier
- Computationally Efficient Multi-Label Classification by Least-Squares Probabilistic Classifiers
- Multi-Task Approach to Reinforcement Learning for Factored-State Markov Decision Problems
- Constrained Least-Squares Density-Difference Estimation
- A Density-ratio Framework for Statistical Data Processing
- Computationally Efficient Multi-task Learning with Least-squares Probabilistic Classifiers
- A Density-ratio Framework for Statistical Data Processing
- FOREWORD
- On Kernel Parameter Selection in Hilbert-Schmidt Independence Criterion