Introduction to Reinforcement Learning -- 甄景贤

土豆西瓜大芝麻

Original link: https://www.zhihu.com/question/41775291/answer/93276779

Demo: https://studywolf.wordpress.com/2012/11/25/reinforcement-learning-q-learning-and-exploration/

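I have written a number of articles on artificial intelligence on a foreign blog, and some of them have since been moved to 博客园 (cnblogs). The one transcribed above is one of them:
What is reinforcement learning?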

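The AI architecture I am working on uses reinforcement learning to control a recurrent neural network. I believe this setup can perform logical reasoning and question answering, which is essentially strong AI. Some details remain unresolved, though. The paper, titled "Wandering in the Labyrinth of Thinking" (《游荡在思考的迷宫中》), is forthcoming.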

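Addendum: one more point. A supervised learning problem can easily be recast as a reinforcement learning problem (although doing so adds complexity without any benefit), but there is no general way to go in the opposite direction. See: Reinforcement Learning and its Relationship to Supervised Learning, Barto and Dietterich, 2004.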
"But is it possible to do this the other way around: to convert a reinforcement learning task into a supervised learning task?
"In general, there is no way to do this. The key difficulty is that whereas in supervised learning, the goal is to reconstruct the unknown function f that assigns output values y to data points x, in reinforcement learning, the goal is to find the input x* that gives the maximum reward R(x*).
"Nonetheless, is there a way that we could apply ideas from supervised learning to perform reinforcement learning? Suppose, for example, that we are given a set of training examples of the form (xi, R(xi)), where the xi are points and the R(xi) are the corresponding observed rewards. In supervised learning, we would attempt to find a function h that approximates R well. If h were a perfect approximation of R, then we could find x* by applying standard optimization algorithms to h."
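
The last idea in the quotation (fit a function h to observed rewards, then optimize h to find x*) can be sketched in a few lines. This is only an illustration under stated assumptions, not anything taken from the paper: the reward function `reward`, the choice of a quadratic model for h, and the helper names `fit_quadratic` and `argmax_quadratic` are all hypothetical.

```python
import random

def reward(x):
    """Hypothetical reward R(x), unknown to the learner (assumed for illustration)."""
    return 3.0 - (x - 2.0) ** 2

def fit_quadratic(xs, rs):
    """Least-squares fit of h(x) = a*x^2 + b*x + c via the normal equations."""
    # Build the 3x3 system (A^T A) w = A^T r for features [x^2, x, 1].
    feats = [[x * x, x, 1.0] for x in xs]
    ata = [[sum(f[i] * f[j] for f in feats) for j in range(3)] for i in range(3)]
    atr = [sum(f[i] * r for f, r in zip(feats, rs)) for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    m = [row + [rhs] for row, rhs in zip(ata, atr)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(3):
            if k != col:
                factor = m[k][col] / m[col][col]
                m[k] = [v - factor * p for v, p in zip(m[k], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def argmax_quadratic(a, b, c):
    """Maximizer of a concave quadratic h (requires a < 0)."""
    if a >= 0:
        raise ValueError("h is not concave; no finite maximum")
    return -b / (2.0 * a)

random.seed(0)
# Training examples of the form (x_i, R(x_i)), as in the quotation.
xs = [random.uniform(-5.0, 5.0) for _ in range(50)]
rs = [reward(x) for x in xs]

a, b, c = fit_quadratic(xs, rs)       # supervised step: h approximates R
x_star = argmax_quadratic(a, b, c)    # optimization step: maximize h
print(f"estimated x* = {x_star:.4f}")
```

The sketch works only because the assumed reward happens to be exactly representable by the chosen model and the samples cover the region around the maximum. When h is a poor approximation of R, the maximizer of h may be far from the true x*.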
