Adam婷
The author quietly explores the field of artificial intelligence and machine learning, at times lost, at times delighted.
Multi-Agent Reinforcement Learning: Learning to Cooperate, Compete, and Communicate
Multiagent environments where agents compete for resources are stepping stones on the path to AGI. Multiagent environments have two useful properties: first, there is a natural curriculum: the difficulty of the environment is determined by the skill of your...
Original · 2020-09-28 17:06:07 · 1386 views · 2 comments
UNIVERSAL SUCCESSOR FEATURES FOR TRANSFER REINFORCEMENT LEARNING
ABSTRACT: Transfer in Reinforcement Learning (RL) refers to the idea of applying knowledge gained from previous tasks to solving related tasks. Learning a universal value function (Schaul et al., 2015)...
Original · 2019-06-27 08:23:17 · 1642 views · 0 comments
LEARNING TO SCHEDULE COMMUNICATION IN MULTI-AGENT REINFORCEMENT LEARNING
ABSTRACT: Many real-world reinforcement learning tasks require multiple agents to make sequential decisions under the agents’ interaction, where well-coordinated actions among the agents are crucial ...
Original · 2019-06-24 10:56:55 · 4002 views · 0 comments
TARMAC: TARGETED MULTI-AGENT COMMUNICATION
ABSTRACT: We explore a collaborative multi-agent reinforcement learning setting where a team of agents attempts to solve cooperative tasks in partially-observable environments. In this scenario, learn...
Original · 2019-06-27 10:16:27 · 1914 views · 1 comment
THE WISDOM OF THE CROWD: RELIABLE DEEP REINFORCEMENT LEARNING THROUGH ENSEMBLES OF Q-FUNCTIONS
ABSTRACT: Reinforcement learning agents learn by exploring the environment and then exploiting what they have learned. This frees the human trainers from having to know the preferred action or intrins...
Original · 2019-06-27 11:20:52 · 1101 views · 0 comments
Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives
Anirudh Goyal¹, Shagun Sodhani¹, Jonathan Binas¹, Xue Bin Peng², Sergey Levine², Yoshua Bengio¹†. ¹Mila, Université de Montréal; ²University of California, Berkeley; †CIFAR Senior Fello...
Original · 2019-06-27 19:50:29 · 1598 views · 0 comments
LEARNING GOAL-CONDITIONED VALUE FUNCTIONS WITH ONE-STEP PATH REWARDS RATHER THAN GOAL-REWARDS
ABSTRACT: Multi-goal reinforcement learning (MGRL) addresses tasks where the desired goal state can change for every trial. State-of-the-art algorithms model these problems such that the reward formula...
Original · 2019-06-30 08:22:47 · 766 views · 0 comments
LEARNING TO CONTROL VISUAL ABSTRACTIONS FOR STRUCTURED EXPLORATION IN DEEP REINFORCEMENT LEARNING
ABSTRACT: Exploration in environments with sparse rewards is a key challenge for reinforcement learn...
Original · 2019-06-30 09:59:03 · 1327 views · 0 comments
TRANSFER VALUE OR POLICY? A VALUE-CENTRIC FRAMEWORK TOWARDS TRANSFERRABLE CONTINUOUS REINFORCEMENT LEARNING
ABSTRACT: Transferring learned knowledge from one environment to another is an important ...
Original · 2019-06-30 11:14:59 · 3872 views · 0 comments
POLICY GENERALIZATION IN CAPACITY-LIMITED REINFORCEMENT LEARNING
ABSTRACT: Motivated by the study of generalization in biological intelligence, we examine reinforcement learning (RL) in settings where there are information-theoretic constraints plac...
Original · 2019-06-30 12:53:32 · 490 views · 0 comments
THE BODY IS NOT A GIVEN: JOINT AGENT POLICY LEARNING AND MORPHOLOGY EVOLUTION
ABSTRACT: Reinforcement learning (RL) has proven to be a powerful paradigm for deriving complex behaviors from simple reward signals in a wide range of environments. When applying RL to continuous cont...
Original · 2019-06-30 17:16:12 · 1145 views · 0 comments
Hybrid Reward Architecture for Reinforcement Learning
31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA. Abstract: One of the main challenges in reinforcemen...
Original · 2019-07-01 14:21:27 · 1281 views · 0 comments
Learning to Communicate with Deep Multi-Agent Reinforcement Learning
Abstract: We consider the problem of multiple agents sensing and acting in environments with the goal of maximising their shared utility. In these environments, agents must learn communication protocol...
Original · 2019-06-23 23:59:28 · 2217 views · 0 comments
REINFORCEMENT LEARNING USING QUANTUM BOLTZMANN MACHINES
Abstract. We investigate whether quantum annealers with select chip layouts can outperform classical computers in reinforcement learning tasks. ...
Original · 2019-07-07 20:43:04 · 1792 views · 0 comments
Nuts and Bolts of Reinforcement Learning: An Introduction to Temporal Difference (TD) Learning
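Introduction: Q-learning became a household name in data science when DeepMind came up with an algorithm that reached superhuman levels on ATARI games. It is one of the core components of reinforcement learning (RL). Whenever I read about RL, I regularly come across Q-learning. But what does Q-learning have to do with our topic of temporal difference learning? Let me use an example to illustrate what temporal difference learning is. Rajesh is planning to drive from Delhi to Jaipur in his car. A quick look at Google...
Original · 2019-04-27 19:14:28 · 833 views · 0 comments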
Reinforcement Learning: An Introduction to Monte Carlo Learning Using the OpenAI Gym Toolkit
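Introduction: What is the first thing that comes to mind when you hear the words "reinforcement learning"? The most common thought is that it is too complex and involves too much math. But I assure you that this is a fascinating field of study, and my goal is to break these techniques down into easy-to-understand concepts in my articles. I am sure you have heard of OpenAI and DeepMind. These are two leading AI organizations that have made significant progress in this field. A team of OpenAI bots was able to defeat a team of amateur players in Dota 2...
Original · 2019-04-27 21:07:07 · 2252 views · 1 comment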
A Guide to Reinforcement Learning: Solving the Multi-Armed Bandit Problem in Python
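Introduction: Do you have a favorite coffee place in town? When you want coffee, you probably go there because you are almost sure you will get the best coffee. But this means you miss out on the coffee served by its cross-town competitor. And if you try every coffee place one by one, the probability of tasting the worst coffee of your life is pretty high! Then again, there is a chance you will find an even better coffee brewer. So what does all this have to do with reinforcement learning? I'm glad you asked. In our coffee-tasting experiment, the...
Original · 2019-04-28 08:42:46 · 4092 views · 2 comments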
Learning to play snake at 1 million FPS: Playing snake with advantage actor-critic
In this blog post I’ll guide you through my most recent project, which combines two things I find fascinating: computer games and machine learning. For quite a while now I’ve wanted to get to grips w...
Translation · 2019-05-21 23:29:42 · 303 views · 0 comments
Reading Through Reinforcement Learning Code
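TensorFlow: placeholders. We start building the computation graph by creating nodes for the input images and the target output classes.
x = tf.placeholder("float", shape=[None, 784])
y = tf.placeholder("float", shape=[None, 10])
Here x and y are not specific values. Rather, each is only a placeholder: when TensorFlow runs a computation, the placeholder is fed a concrete...
Original · 2019-06-10 09:07:53 · 368 views · 0 comments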
Notes on Path Planning Based on Deep Reinforcement Learning
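MazePathFinder using deep Q Networks. The program takes as input an image containing several blockades (indicated by the block color), with the starting point shown in blue and the destination shown in green. It outputs an image showing one of the possible paths from the starting point to the destination. The program's input and output are shown below. The input image is fed to a model consisting of 2 conv layers and 2 fc layers, whose outputs correspond to the Q-values of the down and right actions. The agent moves right or down depending on which Q-value is larger, and the agent's new position is used...
Original · 2019-04-24 20:57:31 · 29968 views · 19 comments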
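The greedy move selection described in the MazePathFinder excerpt can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `greedy_step`, `rollout`, and `toy_q` are hypothetical and not taken from the original program, `toy_q` is a hand-written stand-in for the conv/fc network, and ties are broken toward moving down, which is an arbitrary choice.

```python
# Hypothetical sketch: all names here are illustrative, not from the original project.

def greedy_step(position, q_down, q_right):
    """Move one cell down or right, whichever action has the larger Q-value."""
    row, col = position
    if q_right > q_down:
        return (row, col + 1)  # move right
    return (row + 1, col)      # move down (also chosen on ties, an arbitrary assumption)


def rollout(start, goal, q_fn, max_steps=100):
    """Follow the greedy policy from start until the goal is reached or steps run out."""
    path = [start]
    position = start
    for _ in range(max_steps):
        if position == goal:
            break
        q_down, q_right = q_fn(position)
        position = greedy_step(position, q_down, q_right)
        path.append(position)
    return path


def toy_q(position):
    """Stand-in for the network: prefers right until column 2, then prefers down."""
    _, col = position
    return (0.2, 0.8) if col < 2 else (0.9, 0.1)


path = rollout((0, 0), (2, 2), toy_q)
print(path)
```

In the real program the two Q-values would come from the model evaluated on the current image rather than from a fixed rule like `toy_q`.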
A Unified Game-Theoretic Approach to Multi-agent Reinforcement Learning
Today we will dig into the paper A Unified Game-Theoretic Approach to Multi-agent Reinforcement Learning, one of the core ideas that has been used for the development of #AlphaStar. There a...
Original · 2019-06-22 10:15:59 · 1414 views · 0 comments
About communication in Multi-Agent Reinforcement Learning
Communication is one of the components of MARL and an active area of research itself, as it might influence the final performance of agents, and it affects coordination or negotiation directly. Effect...
Original · 2019-06-22 10:28:52 · 946 views · 0 comments
Game Theory and Multi-Agent Reinforcement Learning
Ann Nowé, Peter Vrancx, and Yann-Michaël De Hauwere. Abstract. Reinforcement Learning was originally developed for Markov Decision Processes (MDPs). It allows a single agent to learn a policy that ma...
Original · 2019-06-22 11:22:50 · 12043 views · 2 comments
Modern Game Theory and Multi-Agent Reinforcement Learning Systems
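Today, most artificial intelligence (AI) systems are based on a single agent handling a task or, in the case of adversarial models, on a few agents that compete with each other to improve the overall behavior of the system. However, many cognition problems in the real world are the result of knowledge built by large groups of people. Take a self-driving car scenario as an example: the decisions of any agent are the result of the behavior of many other agents in the scene. Many scenarios in financial markets or economics are also the result of coordinated actions among large entities. How can we mimic that behavior in AI agents? Multi-agent reinforcement learning (MARL) is the deep learning discipline...
Original · 2019-06-22 17:50:30 · 4754 views · 1 comment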
Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning
Abstract: Instability and variability of Deep Reinforcement Learning (DRL) algorithms tend to adversely affect their performance. Averaged-DQN is a simple extension to th...
Original · 2019-07-01 19:40:39 · 1963 views · 0 comments
A Hands-On Introduction to Deep Q-Learning Using OpenAI Gym in Python
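Introduction: I have always been fascinated by games. The seemingly infinite options for performing an action under a tight timeline make for a thrilling experience. Nothing quite compares to it. So when I read about the incredible algorithms DeepMind was coming up with (such as AlphaGo and AlphaStar), I was hooked. I wanted to learn how to build these systems on my own machine. That led me into the world of deep reinforcement learning (Deep RL). Deep RL matters even if you are not into gaming. Just look...
Original · 2019-04-27 18:45:37 · 2899 views · 0 comments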