CS285 Lecture 1 Notes

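This article introduces the basic concepts of reinforcement learning, explores how deep learning addresses decision-making problems through end-to-end learning, highlights the advantages of deep reinforcement learning in complex environments, and discusses current research topics such as going beyond learning from rewards, learning from demonstrations, and learning to predict.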


Lecture 1 Introduction and Course Overview

What is RL? And why should we care?

  • Deep learning helps us handle unstructured environments

    • learns from large amounts of data
    • usually applied to recognition problems, which are passive, rather than to decision-making problems
  • Reinforcement learning provides a formalism for behavior

    • is essentially a mathematical formalization of a decision-making problem.
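To make the "formalism for behavior" concrete, here is a minimal sketch of the agent-environment loop that an RL problem formalizes. Everything in it is an illustrative assumption rather than material from the lecture: `CorridorEnv`, `run_episode`, and `always_right` are hypothetical names, and the reward scheme is made up for the example.

```python
class CorridorEnv:
    """Hypothetical toy environment: positions 0..length-1; reaching the last one pays reward 1."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        # Start every episode at the left end of the corridor.
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clamped to the corridor.
        self.state = max(0, min(self.length - 1, self.state + action))
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode: the policy picks actions, the environment returns states and rewards."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # the decision
        state, reward, done = env.step(action)  # the consequence
        total_reward += reward
        if done:
            break
    return total_reward


def always_right(state):
    # A fixed, hand-written policy; an RL algorithm would learn this mapping instead.
    return 1


episode_return = run_episode(CorridorEnv(), always_right)
```

The point of the sketch is the interface: a policy maps states to actions, and the quantity being judged is the reward accumulated over the whole sequence of decisions.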


What is Deep RL? And why should we care?

Deep RL

  • One of the main advantages of using deep learning is that the computer can learn features through end-to-end training, rather than relying on humans to design or discover them.
  • For standard RL:
    • tasks are unlikely to share features, so features probably need to be designed for each task
    • this is the main limiting factor for applying RL
  • For Deep RL:
    • features don’t have to be designed by hand; it is an automated process

What does end-to-end learning mean for sequential decision making?

  • Traditional decision-making system:

    • consists of many different parts
    • perception first, then action
    • designing the parts by hand is sometimes very difficult: some features may be very helpful and relevant to the task but not apparent to the designer


  • End-to-end:

    • maps perception to action directly
    • allows the model to discover for itself what is valuable to pay attention to in the image
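As a rough illustration of "perception to action directly", the sketch below maps raw pixel values to an action with a single linear layer and no hand-designed feature stage in between. It is a toy under stated assumptions: `make_linear_policy` is a hypothetical helper, the weights are random rather than trained, and a real deep RL system would use a deep network whose weights are learned from reward.

```python
import random


def make_linear_policy(num_pixels, num_actions, seed=0):
    """Build a hypothetical policy that scores each action directly from raw pixels."""
    rng = random.Random(seed)
    # One weight row per action; in deep RL these would be learned, not sampled.
    weights = [
        [rng.uniform(-0.1, 0.1) for _ in range(num_pixels)]
        for _ in range(num_actions)
    ]

    def policy(pixels):
        # Score every action as a weighted sum of the raw pixel intensities.
        scores = [sum(w * p for w, p in zip(row, pixels)) for row in weights]
        # Act greedily: return the index of the highest-scoring action.
        return max(range(num_actions), key=lambda a: scores[a])

    return policy


policy = make_linear_policy(num_pixels=16, num_actions=3)
action = policy([0.5] * 16)  # a flattened 4x4 "image" with uniform intensity
```

Because nothing between the pixels and the action is specified by hand, training such a mapping end to end lets the model decide which parts of the input matter for the task.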


Reinforcement Learning: algorithmic foundation

Deep models: allow reinforcement learning algorithms to solve complex problems end to end, so RL can be applied to general problems

Deep = can process complex sensory input

Reinforcement learning = can choose complex actions

Why should we study this now?

  1. Advances in deep learning
  2. Advances in reinforcement learning
  3. Advances in computational capability

Beyond learning from reward

  • Basic reinforcement learning deals with maximizing rewards
  • Advanced topics:
    • Learning reward functions from example (inverse reinforcement learning)
    • Transferring knowledge between domains (transfer learning, meta-learning)
    • Learning to predict and using prediction to act
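For reference, "maximizing rewards" in the basic setting is commonly written as the objective below. The notation is an assumption and does not appear in these notes: θ denotes the policy parameters, τ a trajectory of states s_t and actions a_t generated by running the policy, r the reward function, and T the episode length.

```latex
\theta^{\star} = \arg\max_{\theta} \; \mathbb{E}_{\tau \sim p_{\theta}(\tau)} \left[ \sum_{t=1}^{T} r(s_t, a_t) \right]
```

The advanced topics listed above relax one piece of this setup: for example, inverse reinforcement learning treats r as unknown and infers it from examples.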

Are there other forms of supervision?

  • Learning from demonstrations
    • Directly copying observed behavior
    • Inferring rewards from observed behavior (inverse reinforcement learning)
  • Learning from observing the world
    • Learning to predict
    • Unsupervised learning
  • Learning from other tasks
    • Transfer learning
    • Meta-learning: learning to learn