Roundup of Reinforcement Learning Papers (100+) at NIPS over the Past Decade (2008–2018)

Deep Reinforcement Learning Report

Source: nips.cc

NIPS (now NeurIPS), formally the Conference and Workshop on Neural Information Processing Systems, is an international conference on machine learning and computational neuroscience. It is held every December and organized by the NIPS Foundation, and it is one of the top venues in machine learning: in the China Computer Federation's (CCF) ranking of international academic conferences, NIPS is rated Class A in artificial intelligence. From 1987 to 2000 the conference was held in Denver, USA, and from 2001 to 2010 in Vancouver, Canada. It then moved to Granada, Spain (2011), Lake Tahoe (2012–2013), and Montreal, Canada (2014–2015).

Since deep learning took off several years ago, NIPS has become one of the most closely watched conferences in academia and industry alike, with attendance jumping from roughly 2,000 five years earlier to more than 8,000 in 2018. Paper submissions also hit a record high in 2018, reaching 3,240, and the latest statistics put NIPS 2019 submissions at about 5,800, more than 1,700 above the previous year. With papers piling up across every area in recent years, this post collects the 108 reinforcement learning papers accepted at NIPS over the past decade, summarized as follows:

From the year-by-year counts, 2012 stands out as a watershed for reinforcement learning at NIPS: after that peak the paper count dropped off, then climbed steadily from 2015 onward, reaching 38 accepted papers in 2018. The accepted paper titles for each year are listed below.

2008 (3)

  • Near-optimal Regret Bounds for Reinforcement Learning

  • Stress, noradrenaline, and realistic prediction of mouse behaviour using reinforcement learning

  • Optimization on a Budget: A Reinforcement Learning Approach

2009 (3)

  • Manifold Embeddings for Model-Based Reinforcement Learning under Partial Observability

  • Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining

  • Training Factor Graphs with Reinforcement Learning for Efficient MAP Inference

2010 (5)

  • Nonparametric Bayesian Policy Priors for Reinforcement Learning

  • Constructing Skill Trees for Reinforcement Learning Agents from Demonstration Trajectories

  • Feature Construction for Inverse Reinforcement Learning

  • PAC-Bayesian Model Selection for Reinforcement Learning

  • Interval Estimation for Reinforcement-Learning Algorithms in Continuous-State Domains

2011 (7)

  • Nonlinear Inverse Reinforcement Learning with Gaussian Processes

  • A Reinforcement Learning Theory for Homeostatic Regulation

  • Action-Gap Phenomenon in Reinforcement Learning

  • Optimal Reinforcement Learning for Gaussian Systems

  • Reinforcement Learning using Kernel-Based Stochastic Factorization

  • MAP Inference for Bayesian Inverse Reinforcement Learning

  • Selecting the State-Representation in Reinforcement Learning

2012 (11)

  • Bayesian Hierarchical Reinforcement Learning

  • Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress

  • Nonparametric Bayesian Inverse Reinforcement Learning for Multiple Reward Functions

  • Inverse Reinforcement Learning through Structured Classification

  • Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search

  • On-line Reinforcement Learning Using Incremental Kernel-Based Stochastic Factorization

  • Online Regret Bounds for Undiscounted Continuous Reinforcement Learning

  • Neurally Plausible Reinforcement Learning of Working Memory Tasks

  • Transferring Expectations in Model-based Reinforcement Learning

  • Efficient Reinforcement Learning for High Dimensional Linear Quadratic Systems

  • Cost-Sensitive Exploration in Bayesian Reinforcement Learning

2013 (3)

  • Reinforcement Learning in Robust Markov Decision Processes

  • Policy Shaping: Integrating Human Feedback with Reinforcement Learning

  • (More) Efficient Reinforcement Learning via Posterior Sampling

2014 (5)

  • Model-based Reinforcement Learning and the Eluder Dimension

  • Sparse Multi-Task Reinforcement Learning

  • Difference of Convex Functions Programming for Reinforcement Learning

  • Near-optimal Reinforcement Learning in Factored MDPs

  • RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning

2015 (2)

  • Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning

  • Inverse Reinforcement Learning with Locally Consistent Reward Functions

2016 (7)

  • Tree-Structured Reinforcement Learning for Sequential Object Localization

  • Safe and Efficient Off-Policy Reinforcement Learning

  • Contextual-MDPs for PAC Reinforcement Learning with Rich Observations

  • Learning to Communicate with Deep Multi-Agent Reinforcement Learning

  • Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation

  • Cooperative Inverse Reinforcement Learning

  • Linear Feature Encoding for Reinforcement Learning

2017 (24)

  • Hybrid Reward Architecture for Reinforcement Learning

  • Shallow Updates for Deep Reinforcement Learning

  • Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning

  • Optimistic posterior sampling for reinforcement learning: worst-case regret bounds

  • Cold-Start Reinforcement Learning with Softmax Policy Gradient

  • Log-normality and Skewness of Estimated State/Action Values in Reinforcement Learning

  • Safe Model-based Reinforcement Learning with Stability Guarantees

  • Data-Efficient Reinforcement Learning in Continuous State-Action Gaussian-POMDPs

  • Deep Reinforcement Learning from Human Preferences

  • EX2: Exploration with Exemplar Models for Deep Reinforcement Learning

  • #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning

  • Compatible Reward Inverse Reinforcement Learning

  • Bridging the Gap Between Value and Policy Based Reinforcement Learning

  • Online Reinforcement Learning in Stochastic Games

  • Reinforcement Learning under Model Mismatch

  • A multi-agent reinforcement learning model of common-pool resource appropriation

  • Imagination-Augmented Agents for Deep Reinforcement Learning

  • Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation

  • Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning

  • Repeated Inverse Reinforcement Learning

  • A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning

  • Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning

  • Agent Reinforcement Learning

2018 (38)

  • The Importance of Sampling in Meta-Reinforcement Learning

  • Learning Temporal Point Processes via Reinforcement Learning

  • Data-Efficient Hierarchical Reinforcement Learning

  • Fast deep reinforcement learning using online adjustments from the past

  • Inference Aided Reinforcement Learning for Incentive Mechanism Design in Crowdsourcing

  • A Lyapunov-based Approach to Safe Reinforcement Learning

  • Reinforcement Learning of Theorem Proving

  • Simple random search of static linear policies is competitive for reinforcement learning

  • Meta-Gradient Reinforcement Learning

  • Reinforcement Learning for Solving the Vehicle Routing Problem

  • Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning

  • REFUEL: Exploring Sparse Features in Deep Reinforcement Learning for Fast Disease Diagnosis

  • Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents

  • Distributed Multitask Reinforcement Learning with Quadratic Convergence

  • Constrained Cross-Entropy Method for Safe Reinforcement Learning

  • Reinforcement Learning with Multiple Experts: A Bayesian Model Combination Approach

  • Verifiable Reinforcement Learning via Policy Extraction

  • Deep Reinforcement Learning of Marked Temporal Point Processes

  • Evolution-Guided Policy Gradient in Reinforcement Learning

  • Meta-Reinforcement Learning of Structured Exploration Strategies

  • Diversity-Driven Exploration Strategy for Deep Reinforcement Learning

  • Genetic-Gated Networks for Deep Reinforcement Learning

  • Visual Reinforcement Learning with Imagined Goals

  • Unsupervised Video Object Segmentation for Deep Reinforcement Learning

  • Total stochastic gradient algorithms and applications in reinforcement learning

  • Fighting Boredom in Recommender Systems with Linear Reinforcement Learning

  • Randomized Prior Functions for Deep Reinforcement Learning

  • Scalable Coordinated Exploration in Concurrent Reinforcement Learning

  • Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization

  • Negotiable Reinforcement Learning for Pareto Optimal Sequential Decision-Making

  • Teaching Inverse Reinforcement Learners via Features and Demonstrations

  • Hierarchical Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies

  • Lifelong Inverse Reinforcement Learning

  • Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning

  • Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

  • Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion

(How to get the papers)

1. Follow this official account;

2. Reply "NIPS" in the background to receive the NIPS 2019 conference papers.

