Introduction
- Agent and Environment, History and state, Agent state, Environment state, Information state, Fully observable environments, Partially observable environments
  - Fully observable environment: the agent directly observes the environment state.
  - Partially observable environment: the agent observes the environment only indirectly.
- Policy: the agent's behaviour function, a map from state to action
- Value Function: how good each state/action is; a prediction of future reward
- Model: the agent's representation of the environment; predicts what the environment will do next, i.e. the next state and reward
- Categories: Value-based, Policy-based, Actor-Critic; Model-free, Model-based
- E&E: Exploration finds more information about the environment; Exploitation exploits known information to maximise reward.
- Prediction: evaluate the future given a policy. Control: optimise the future by finding the best policy.
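The policy, value function, and model bullets above can be sketched together in a few lines. This is a minimal illustration on a hypothetical 3-state chain; the states, actions, and rewards are made up for the example, not from any particular library.

```python
# Model: the agent's representation of the environment --
# for each (state, action) it predicts the next state and reward.
model = {
    (0, "right"): (1, 0.0),
    (1, "right"): (2, 1.0),  # entering terminal state 2 pays reward 1
}

# Policy: a map from state to action (state 2 is terminal, no action).
policy = {0: "right", 1: "right"}

def rollout_return(state, gamma=0.9):
    """Value of `state` under `policy`: the discounted sum of the
    rewards the model predicts along the policy's trajectory."""
    g, discount = 0.0, 1.0
    while state in policy:
        state, reward = model[(state, policy[state])]
        g += discount * reward
        discount *= gamma
    return g

print(rollout_return(0))  # reward 1 arrives one step later: 0.9 * 1 = 0.9
```

Note how the value function is a *prediction*: it summarises all future reward in a single number per state, which is what makes comparing states (and hence acting) tractable.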
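The exploration/exploitation trade-off is most often illustrated with an epsilon-greedy rule: with probability epsilon take a random action (explore), otherwise take the action with the highest estimated value (exploit). A small sketch, with illustrative Q-value estimates:

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index: explore with prob. epsilon, else be greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))          # explore: random action
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit

print(epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0))  # always greedy -> 1
```

With epsilon = 0 the agent never gathers new information; with epsilon = 1 it never uses what it knows. Practical agents often decay epsilon over time.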
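Prediction versus control can be made concrete with iterative policy evaluation, which solves the prediction problem for a fixed policy. The sketch below evaluates the always-move-right policy on the same kind of hypothetical 3-state chain (reward 1 on entering the terminal state, 0 otherwise); control would instead maximise over actions in each backup.

```python
def evaluate_policy(num_states=3, gamma=0.5, sweeps=50):
    """Prediction: compute V for the fixed 'move right' policy by
    repeatedly applying the Bellman expectation backup
    V(s) = r(s, right) + gamma * V(s + 1)."""
    V = [0.0] * num_states  # V of the terminal state stays 0
    for _ in range(sweeps):
        for s in range(num_states - 1):
            reward = 1.0 if s + 1 == num_states - 1 else 0.0
            V[s] = reward + gamma * V[s + 1]
    return V

print(evaluate_policy())  # [0.5, 1.0, 0.0]
```

Here state 1 is worth 1 (immediate terminal reward) and state 0 is worth gamma * 1 = 0.5. Swapping the expectation backup for a max over actions turns this prediction loop into value iteration, i.e. control.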