Deep Reinforcement Learning
Adam婷
The author quietly explores the field of artificial intelligence and machine learning, sometimes lost, sometimes delighted.
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
Abstract: We explore deep reinforcement learning methods for multi-agent domains. We begin by analyzing the difficulty of traditional algorithms in the multi-agent case: Q-learning is challenged by an inherent non-stationarity of the environment, while polic... Original · 2020-09-28 16:18:27 · 2415 views · 0 comments
Hybrid Reward Architecture for Reinforcement Learning
Hybrid Reward Architecture for Reinforcement Learning. 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA. Abstract: One of the main challenges in reinforcemen... Original · 2019-07-01 14:21:27 · 1250 views · 0 comments
THE BODY IS NOT A GIVEN: JOINT AGENT POLICY LEARNING AND MORPHOLOGY EVOLUTION
Abstract: Reinforcement learning (RL) has proven to be a powerful paradigm for deriving complex behaviors from simple reward signals in a wide range of environments. When applying RL to continuous cont... Original · 2019-06-30 17:16:12 · 1119 views · 0 comments
POLICY GENERALIZATION IN CAPACITY-LIMITED REINFORCEMENT LEARNING
Abstract: Motivated by the study of generalization in biological intelligence, we examine reinforcement learning (RL) in settings where there are information-theoretic constraints plac... Original · 2019-06-30 12:53:32 · 480 views · 0 comments
Transfer Value or Policy? A Value-Centric Framework Towards Transferrable Continuous Reinforcement Learning
TRANSFER VALUE OR POLICY? A VALUE-CENTRIC FRAMEWORK TOWARDS TRANSFERRABLE CONTINUOUS REINFORCEMENT LEARNING. Abstract: Transferring learned knowledge from one environment to another is an important ... Original · 2019-06-30 11:14:59 · 3753 views · 0 comments
Learning to Control Visual Abstractions for Structured Exploration in Deep Reinforcement Learning
LEARNING TO CONTROL VISUAL ABSTRACTIONS FOR STRUCTURED EXPLORATION IN DEEP REINFORCEMENT LEARNING. Abstract: Exploration in environments with sparse rewards is a key challenge for reinforcement learn... Original · 2019-06-30 09:59:03 · 1320 views · 0 comments
LEARNING GOAL-CONDITIONED VALUE FUNCTIONS WITH ONE-STEP PATH REWARDS RATHER THAN GOAL-REWARDS
Abstract: Multi-goal reinforcement learning (MGRL) addresses tasks where the desired goal state can change for every trial. State-of-the-art algorithms model these problems such that the reward formula... Original · 2019-06-30 08:22:47 · 758 views · 0 comments
Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives
Anirudh Goyal¹, Shagun Sodhani¹, Jonathan Binas¹, Xue Bin Peng², Sergey Levine², Yoshua Bengio¹†. ¹Mila, Université de Montréal; ²University of California, Berkeley; †CIFAR Senior Fello... Original · 2019-06-27 19:50:29 · 1560 views · 0 comments
THE WISDOM OF THE CROWD: RELIABLE DEEP REINFORCEMENT LEARNING THROUGH ENSEMBLES OF Q-FUNCTIONS
Abstract: Reinforcement learning agents learn by exploring the environment and then exploiting what they have learned. This frees the human trainers from having to know the preferred action or intrins... Original · 2019-06-27 11:20:52 · 1087 views · 0 comments
TARMAC: TARGETED MULTI-AGENT COMMUNICATION
Abstract: We explore a collaborative multi-agent reinforcement learning setting where a team of agents attempts to solve cooperative tasks in partially-observable environments. In this scenario, learn... Original · 2019-06-27 10:16:27 · 1864 views · 1 comment
LEARNING TO SCHEDULE COMMUNICATION IN MULTI-AGENT REINFORCEMENT LEARNING
Abstract: Many real-world reinforcement learning tasks require multiple agents to make sequential decisions under the agents’ interaction, where well-coordinated actions among the agents are crucial ... Original · 2019-06-24 10:56:55 · 3962 views · 0 comments
UNIVERSAL SUCCESSOR FEATURES FOR TRANSFER REINFORCEMENT LEARNING
Abstract: Transfer in Reinforcement Learning (RL) refers to the idea of applying knowledge gained from previous tasks to solving related tasks. Learning a universal value function (Schaul et al., 2015)... Original · 2019-06-27 08:23:17 · 1621 views · 0 comments
Learning to Communicate with Deep Multi-Agent Reinforcement Learning
Abstract: We consider the problem of multiple agents sensing and acting in environments with the goal of maximising their shared utility. In these environments, agents must learn communication protocol... Original · 2019-06-23 23:59:28 · 2181 views · 0 comments
REINFORCEMENT LEARNING USING QUANTUM BOLTZMANN MACHINES
REINFORCEMENT LEARNING USING QUANTUM BOLTZMANN MACHINES. Abstract: We investigate whether quantum annealers with select chip layouts can outperform classical computers in reinforcement learning tasks. ... Original · 2019-07-07 20:43:04 · 1750 views · 0 comments
Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning
Abstract: Instability and variability of Deep Reinforcement Learning (DRL) algorithms tend to adversely affect their performance. Averaged-DQN is a simple extension to th... Original · 2019-07-01 19:40:39 · 1936 views · 0 comments
Notes on Path Planning with Deep Reinforcement Learning
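MazePathFinder using deep Q Networks. This program takes as input an image made up of several obstacles (indicated by block color), with the starting point shown in blue and the destination in green. It outputs an image containing one of the possible paths from the start to the destination. The program's input and output are shown below. The input image is fed to a model consisting of 2 conv and 2 fc layers, whose outputs correspond to the Q-values of the down and right actions. The agent moves right or down depending on which Q-value is larger, and using the agent's new position... Original · 2019-04-24 20:57:31 · 29693 views · 19 comments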
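The action-selection rule described in the entry above (compare the two Q-values, then step right or down) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the original program: `toy_q` is a hypothetical stand-in for the 2-conv/2-fc network, and the grid size and coordinates are invented for the example.

```python
from typing import Callable, List, Tuple

Position = Tuple[int, int]  # (row, col) on a square grid


def greedy_rollout(
    q_values: Callable[[Position], Tuple[float, float]],
    start: Position,
    goal: Position,
    size: int,
) -> List[Position]:
    """Follow the larger of (Q_down, Q_right) until the goal or the grid edge."""
    path = [start]
    row, col = start
    while (row, col) != goal:
        q_down, q_right = q_values((row, col))
        if q_right > q_down:
            col += 1  # move right
        else:
            row += 1  # move down
        if row >= size or col >= size:
            break  # stepped off the grid without reaching the goal
        path.append((row, col))
    return path


def toy_q(pos: Position) -> Tuple[float, float]:
    # Hypothetical Q-function (assumption): prefer "right" until column 2, then "down".
    _, col = pos
    return (0.0, 1.0) if col < 2 else (1.0, 0.0)


path = greedy_rollout(toy_q, start=(0, 0), goal=(2, 2), size=3)
print(path)
```

In a real setup the callable passed as `q_values` would presumably wrap a forward pass of the trained network on the current maze image; because every step increases either the row or the column, the rollout always terminates.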
Reinforcement Learning: An Introduction to Monte Carlo Learning Using the OpenAI Gym Toolkit
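Introduction. What is the first thing that comes to mind when you hear the words "reinforcement learning"? The most common thought is: too complex and too much math. But I assure you, this is a fascinating field of study, and my goal is to break these techniques down into easy-to-understand concepts in my articles. I am sure you have heard of OpenAI and DeepMind. These are two leading AI organizations that have made significant progress in this field. A team of OpenAI bots was able to defeat a team of amateur players in Dota 2... Original · 2019-04-27 21:07:07 · 2244 views · 1 comment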
Nuts and Bolts of Reinforcement Learning: An Introduction to Temporal Difference (TD) Learning
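Introduction. Q-learning became a household name in data science when DeepMind came up with an algorithm that reached superhuman levels on ATARI games. It is one of the core components of reinforcement learning (RL). Whenever I read about RL, I regularly come across Q-learning. But what does Q-learning have to do with our topic of temporal difference learning? Let me use an example to illustrate what temporal difference learning is. Rajesh is planning to travel from Delhi to Jaipur in his car. A quick look at Google Ma... Original · 2019-04-27 19:14:28 · 822 views · 0 comments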
A Hands-On Introduction to Deep Q-Learning Using OpenAI Gym in Python
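Introduction. I have always been fascinated by games. The seemingly infinite options available for performing an action under a tight timeline make for a thrilling experience. There is nothing quite like it. So when I read about the incredible algorithms DeepMind was coming up with (such as AlphaGo and AlphaStar), I was hooked. I wanted to learn how to build these systems on my own machine. That led me into the world of deep reinforcement learning (Deep RL). Deep RL matters even if you are not into gaming. Just check... Original · 2019-04-27 18:45:37 · 2856 views · 0 comments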