Reinforcement Learning


1 Introduction

 In the machine learning area, there are three types of learning: supervised learning, reinforcement learning, and unsupervised learning. The differences between them are described below:

 1) Supervised Learning: In supervised learning, a teacher provides a desired response (output) for a given situation (input), and the learner is supposed to learn a mapping from each situation to its best response.

2) Reinforcement Learning (RL): In RL, the learner observes a reward for taking an action in a given situation (state). Based on these rewards, the learner is supposed to learn the best action for each situation.

3) Unsupervised Learning: In unsupervised learning, the learner tries to discover patterns in data representing different situations (inputs), with no teacher and no rewards.

2 The model of the environment

  The environment is modeled as a Markov Decision Process (MDP), which consists of the following parts:

 1) States: S = {s1, s2, ..., sn}

 2) Actions: A = {a1, a2, ..., am}

 3) Immediate reward: R(s)

 4) State-transition probability: Pr(s'|s, a)

 Our goal is to find an optimal policy, where a policy π is a mapping from states to actions: a = π(s).
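 For concreteness, here is a minimal sketch of how such an MDP could be represented in Python; the states, actions, rewards, and probabilities are made-up illustrative values, not taken from any particular problem.

# A minimal MDP representation (all values are illustrative).
states = ["s1", "s2", "s3"]
actions = ["left", "right"]

# Immediate reward R(s) for each state.
R = {"s1": 0.0, "s2": 0.0, "s3": 1.0}

# State-transition probabilities Pr(s'|s, a): P[s][a] maps s' -> probability.
P = {
    "s1": {"left": {"s1": 1.0}, "right": {"s2": 1.0}},
    "s2": {"left": {"s1": 1.0}, "right": {"s3": 1.0}},
    "s3": {"left": {"s2": 1.0}, "right": {"s3": 1.0}},
}

# A policy maps each state to an action: a = pi(s).
pi = {"s1": "right", "s2": "right", "s3": "right"}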

3 How to find a policy in an environment

 Our goal is to find an optimal policy given a start state and a goal state in an environment.

RL methods come in two flavors: in the first, the agent knows the environment fully, i.e., the MDP (states, actions, rewards, and transition probabilities) is given; in the second, the agent knows the environment only partially.

3.1 The environment (model) is fully known to the agent

 1) Value iteration

Strength: it can find the optimal policy.

Weakness: convergence can be slow and take many iterations, and without a discount factor it may not converge in some situations.
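 A minimal sketch of value iteration in Python, assuming the dictionary-based MDP representation from section 2 (states, actions, R, P) and a discount factor gamma; the Bellman update is V(s) <- R(s) + gamma * max_a sum_{s'} Pr(s'|s, a) V(s').

def value_iteration(states, actions, R, P, gamma=0.9, tol=1e-6):
    # Start from all-zero value estimates.
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman update: best expected next-state value over actions.
            best = max(sum(p * V[s2] for s2, p in P[s][a].items())
                       for a in actions)
            new_v = R[s] + gamma * best
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < tol:   # stop when no state value moves by more than tol
            break
    # Extract the greedy policy from the converged values.
    pi = {s: max(actions,
                 key=lambda a: sum(p * V[s2] for s2, p in P[s][a].items()))
          for s in states}
    return V, pi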


 2) Policy iteration


Modified policy iteration:

  Because value iteration can take a long time to converge, we can run only a few evaluation sweeps per improvement step and then act greedily on the resulting values, as in the sketch below.
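 A minimal sketch of (modified) policy iteration under the same assumed MDP representation; keeping eval_sweeps at a small fixed number gives the modified variant, while iterating evaluation to convergence would give classic policy iteration.

def policy_iteration(states, actions, R, P, gamma=0.9, eval_sweeps=20):
    pi = {s: actions[0] for s in states}   # arbitrary initial policy
    V = {s: 0.0 for s in states}
    while True:
        # Policy evaluation: a fixed number of sweeps (the "modified" part).
        for _ in range(eval_sweeps):
            for s in states:
                V[s] = R[s] + gamma * sum(
                    p * V[s2] for s2, p in P[s][pi[s]].items())
        # Policy improvement: act greedily with respect to V.
        stable = True
        for s in states:
            best_a = max(actions,
                         key=lambda a: sum(p * V[s2]
                                           for s2, p in P[s][a].items()))
            if best_a != pi[s]:
                pi[s] = best_a
                stable = False
        if stable:          # policy unchanged, so we are done
            return V, pi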



3.2 The environment (model) is only partially known to the agent

 1) Q-Learning


   Q-learning is model-free: we do not need a model of the environment; we learn directly from the rewards the environment returns.

Strength: it does not need to know the transition probabilities Pr(s'|s, a) or the reward function.

Weaknesses:

1) it may fail to find the optimal policy;

2) it can take a long time to converge;

3) it does not scale to problems with many states.

Note: we can use an epsilon-greedy strategy for state s, taking the greedy action most of the time and a random action occasionally, so that the agent keeps exploring.
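 A minimal tabular Q-learning sketch with epsilon-greedy action selection; env is a hypothetical environment whose reset() returns a start state and whose step(a) returns (next_state, reward, done), so this interface is an assumption for illustration, not a fixed API.

import random
from collections import defaultdict

def q_learning(env, actions, episodes=500, alpha=0.1, gamma=0.9, eps=0.1):
    Q = defaultdict(float)                 # Q[(s, a)], defaults to 0.0
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Epsilon-greedy: mostly the greedy action, sometimes random.
            if random.random() < eps:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda a2: Q[(s, a2)])
            s2, r, done = env.step(a)      # assumed environment interface
            # Update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            target = r if done else r + gamma * max(Q[(s2, a2)] for a2 in actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q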


 2) Prioritized Sweeping

   Prioritized sweeping learns a model of the environment (the transition probabilities Pr and the rewards R) and prioritizes value updates for the states whose estimates would change the most.
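 A rough sketch of the idea in the same tabular setting, assuming for simplicity a deterministic learned model (each (s, a) maps to one observed reward and next state); real prioritized sweeping would keep counts to estimate Pr and R. All names here are illustrative.

import heapq

def sweep(s, a, r, s2, Q, model, actions, gamma=0.9, theta=1e-4, n_updates=5):
    # Q is a defaultdict(float) as in the Q-learning sketch above.
    # Record the observed transition in the learned model.
    model[(s, a)] = (r, s2)
    # Priority = magnitude of the would-be change to Q(s, a).
    p = abs(r + gamma * max(Q[(s2, b)] for b in actions) - Q[(s, a)])
    pq = [(-p, s, a)]                      # max-heap via negated priorities
    for _ in range(n_updates):
        if not pq:
            break
        _, s, a = heapq.heappop(pq)
        r, s2 = model[(s, a)]
        Q[(s, a)] = r + gamma * max(Q[(s2, b)] for b in actions)
        # Re-queue predecessors of s whose values would change noticeably.
        for (ps, pa), (pr, ps2) in model.items():
            if ps2 == s:
                p = abs(pr + gamma * max(Q[(s, b)] for b in actions) - Q[(ps, pa)])
                if p > theta:
                    heapq.heappush(pq, (-p, ps, pa))
    return Q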


 

