Reinforcement Learning: An Introduction - Richard S. Sutton Part 0: Introduction

chitoseyono

Category: MachineLearning. Tags: reinforcement learning

Article link: https://blog.csdn.net/chitoseyono/article/details/86624436

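This book is only said to be a painless introduction… Reading it through to the end does pay off, but reading alone is not enough.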

Chapter 1 Introduction

Distinguishing features

trial-and-error search
delayed reward

The agent has to exploit what it has already experienced in order to
obtain reward, but it also has to explore in order to make better
action selections in the future.
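
One common way to balance the two is epsilon-greedy action selection, sketched below. This is only an illustration and is not taken from these notes: the function name `epsilon_greedy`, the list `q_values` of estimated action values, and the parameter `epsilon` are all assumptions made for the example.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index: explore with probability epsilon, else exploit.

    q_values is a hypothetical list of current value estimates, one per action.
    """
    if rng.random() < epsilon:
        # Explore: try a uniformly random action to learn more about it.
        return rng.randrange(len(q_values))
    # Exploit: take the action with the highest current estimate
    # (the first such action if several are tied).
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With `epsilon = 0` the agent only exploits what it already knows; with `epsilon = 1` it only explores. Values in between trade one off against the other.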

Challenges

trade-off problem between exploration and exploitation
consider the whole problem of a goal-directed agent interacting with an uncertain environment

Elements

policy
reward
value function
model of the environment

Without rewards there could be no values, and the only purpose of
estimating values is to achieve more reward. Nevertheless, it is
values with which we are most concerned when making and evaluating
decisions. Action choices are made based on value judgments. We seek
actions that bring about states of highest value, not highest reward,
because these actions obtain the greatest amount of reward for us over
the long run.
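
A small numeric sketch may make the difference between reward and value concrete. The two reward sequences and the discount factor `gamma` below are invented for illustration (discounting is not discussed in this part of the notes); the point is only that the action with the smaller immediate reward can lead to the larger long-run total.

```python
def discounted_return(rewards, gamma=1.0):
    """Sum of rewards, each discounted by gamma per step into the future."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future sum.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical reward sequences that follow two different first actions.
greedy_path = [1.0, 0.0, 0.0]   # high immediate reward, nothing afterwards
patient_path = [0.0, 0.0, 5.0]  # no immediate reward, larger payoff later

greedy_value = discounted_return(greedy_path, gamma=0.9)
patient_value = discounted_return(patient_path, gamma=0.9)

# Choosing by immediate reward prefers greedy_path (1.0 vs 0.0);
# choosing by long-run value prefers patient_path.
print(greedy_value, patient_value)
```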

Tic-Tac-Toe Example
https://blog.csdn.net/JerryLife/article/details/81385766
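
In the book's tic-tac-toe example, the estimated value of a state is moved a fraction of the way toward the value of the state that follows it: V(S_t) ← V(S_t) + α [V(S_{t+1}) − V(S_t)], where α is a small step-size parameter. Below is a minimal sketch of that update. The dictionary-based value table, the function name `td_update`, and the string state keys are assumptions made for the example; the default of 0.5 follows the book's initial guess for states that are neither wins nor losses.

```python
def td_update(values, state, next_state, alpha=0.1, default=0.5):
    """Move the estimate for `state` a fraction alpha toward that of `next_state`.

    values is a hypothetical dict mapping a state key to its estimated
    probability of winning; unseen states start at `default`.
    """
    v = values.get(state, default)
    v_next = values.get(next_state, default)
    values[state] = v + alpha * (v_next - v)
    return values[state]


# Usage: a winning state has value 1.0, so the state before it is nudged upward.
values = {"win": 1.0}
new_estimate = td_update(values, "before_win", "win", alpha=0.1)
```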

Reinforcement learning uses the formal framework of Markov decision
processes to define the interaction between a learning agent and its
environment in terms of states, actions, and rewards. This framework
is intended to be a simple way of representing essential features of
the artificial intelligence problem.
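
The interaction in terms of states, actions, and rewards can be sketched as a simple loop. Everything in the block below is hypothetical: `CorridorEnv` is a made-up toy environment, and the `reset`/`step` method names are just a convenient convention for the example, not an API from the book.

```python
class CorridorEnv:
    """Hypothetical toy environment: states 0..3, reaching state 3 ends the episode."""

    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the position cannot go below 0.
        self.state = max(0, self.state + action)
        done = self.state == 3
        reward = 1.0 if done else 0.0  # reward only on reaching the goal
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode: the agent observes a state, acts, and receives a reward."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # policy maps state to action
        state, reward, done = env.step(action)  # environment responds
        total_reward += reward
        if done:
            break
    return total_reward
```

For instance, `run_episode(CorridorEnv(), lambda s: 1)` follows a policy that always moves right and collects the single goal reward.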
