How to Structure a Reinforcement Learning Project (Part 1)

Ten months ago, I started working as an undergraduate researcher. One thing I can say for certain: working on a research project is hard, but working on a Reinforcement Learning (RL) research project is even harder!

What made such a project challenging was the lack of good online resources on how to structure this type of project:

  • Structuring a Web Development project? Check!

  • Structuring a Mobile Development project? Check!

  • Structuring a Machine Learning project? Check!

  • Structuring a Reinforcement Learning project? Not really!

To help future novice researchers, beginner machine learning engineers, and amateur software developers get started with their RL projects, I put together this non-exhaustive, step-by-step guide to structuring an RL project. It is divided as follows:

  1. Start the Journey: Frame your Problem as an RL Problem

  2. Choose your Weapons: All the Tools You Need to Build a Working RL Environment

  3. Face the Beast: Pick your RL (or Deep RL) Algorithm

  4. Tame the Beast: Test the Performance of the Algorithm

  5. Set it Free: Prepare your Project for Deployment/Publishing

In this post, we will discuss the first part of this series:

Start the Journey: Frame your Problem as an RL Problem

This step is the most crucial in the whole project. First, we need to determine whether Reinforcement Learning can actually be used to solve your problem.

1. Framing the Problem as a Markov Decision Process (MDP)

For a problem to be framed as an RL problem, it must first be modeled as a Markov Decision Process (MDP).

A Markov Decision Process (MDP) models the sequence of actions an agent takes in an environment, along with the consequences of those actions: each action affects not only the immediate reward but also the future states and rewards.
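To make this definition concrete, here is a minimal Python sketch of a toy MDP. Everything in it is my own illustrative assumption rather than part of any library: the two battery states, the action names, the probabilities, the rewards, and the helper names `TRANSITIONS`, `step`, and `rollout`. It only shows how an action determines both the immediate reward and the next state, which in turn shapes future rewards.

```python
import random

# Hypothetical toy MDP: a robot with a battery. All states, actions,
# probabilities, and rewards below are made up for illustration.
# TRANSITIONS[state][action] -> list of (probability, next_state, reward)
TRANSITIONS = {
    "low": {
        "recharge": [(1.0, "high", 0.0)],
        "search": [(0.6, "low", 2.0), (0.4, "high", -3.0)],
    },
    "high": {
        "wait": [(1.0, "high", 0.5)],
        "search": [(0.8, "high", 2.0), (0.2, "low", 2.0)],
    },
}


def step(state, action, rng):
    """Sample one transition: returns (next_state, reward)."""
    outcomes = TRANSITIONS[state][action]
    weights = [prob for prob, _, _ in outcomes]
    _, next_state, reward = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reward


def rollout(policy, start_state, n_steps, gamma=0.9, seed=0):
    """Follow `policy` for n_steps and return the discounted sum of rewards."""
    rng = random.Random(seed)
    state, discounted_return = start_state, 0.0
    for t in range(n_steps):
        action = policy(state)
        state, reward = step(state, action, rng)
        discounted_return += (gamma ** t) * reward
    return discounted_return


def cautious_policy(state):
    """A hand-written policy: recharge when low, otherwise wait."""
    return "recharge" if state == "low" else "wait"


if __name__ == "__main__":
    print(rollout(cautious_policy, "low", n_steps=3, gamma=0.5))
```

Note that the next state depends only on the current state and the chosen action, not on the earlier history. That is the Markov property, and checking whether your own problem can be described this way is a reasonable first test of whether it fits the MDP framing.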
