PS: quick notes from the ICML 2018 paper reading group
2018.7.23
Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations. http://proceedings.mlr.press/v80/wang18d/wang18d.pdf
- Zero-sum games (the inspiration behind GANs) combined with inverse reinforcement learning
Learning to Explore via Meta-Policy Gradient. http://proceedings.mlr.press/v80/xu18d/xu18d.pdf
- Meta-policy gradient
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. http://proceedings.mlr.press/v80/rashid18a/rashid18a.pdf
- Value function factorisation: per-agent Q-values are mixed into a joint Q_tot under a monotonicity constraint
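A minimal sketch of the monotonic mixing network, assuming per-agent Q-values are already computed; the single-layer hypernetworks and the sizes here are my simplifications, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class QMixer(nn.Module):
    """Sketch of QMIX's mixing network: hypernetworks map the global
    state to mixing weights; taking abs() of those weights enforces
    dQ_tot/dQ_i >= 0, i.e. the monotonicity constraint."""
    def __init__(self, n_agents, state_dim, embed_dim=32):
        super().__init__()
        self.n_agents, self.embed_dim = n_agents, embed_dim
        self.w1 = nn.Linear(state_dim, n_agents * embed_dim)
        self.b1 = nn.Linear(state_dim, embed_dim)
        self.w2 = nn.Linear(state_dim, embed_dim)
        self.b2 = nn.Linear(state_dim, 1)

    def forward(self, agent_qs, state):
        # agent_qs: (batch, n_agents), state: (batch, state_dim)
        batch = agent_qs.size(0)
        w1 = torch.abs(self.w1(state)).view(batch, self.n_agents, self.embed_dim)
        b1 = self.b1(state).view(batch, 1, self.embed_dim)
        hidden = torch.relu(torch.bmm(agent_qs.unsqueeze(1), w1) + b1)
        w2 = torch.abs(self.w2(state)).view(batch, self.embed_dim, 1)
        b2 = self.b2(state).view(batch, 1, 1)
        return (torch.bmm(hidden, w2) + b2).view(batch, 1)  # Q_tot
```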
2018.7.24
Ray: A Distributed Framework for Emerging AI Applications. arXiv. https://arxiv.org/pdf/1712.05889.pdf
- Berkeley's distributed computing framework; the presenter never made it clear how to actually deploy it across machines
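For reference, a minimal sketch of Ray's task API (the rollout body is a placeholder). As far as I understand, deployment amounts to `ray start --head` on one machine, `ray start --address=<head-ip:port>` on the others, and `ray.init(address="auto")` in the driver:

```python
import ray

ray.init()  # local mode; on a cluster use ray.init(address="auto")

@ray.remote
def rollout(seed):
    # placeholder for an environment rollout worker
    import random
    random.seed(seed)
    return sum(random.random() for _ in range(1000))

# schedule tasks in parallel across the cluster and gather the results
futures = [rollout.remote(s) for s in range(8)]
print(ray.get(futures))
```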
Katyusha X: Practical Momentum Method for Stochastic Sum-of-Nonconvex Optimization. ICML 2018. http://proceedings.mlr.press/v80/allen-zhu18a/allen-zhu18a.pdf
- Non-convex optimization
Self-Imitation Learning. ICML 2018. http://proceedings.mlr.press/v80/oh18b/oh18b.pdf
- Self-imitation learning: the agent imitates its own past decisions, but only those that turned out better than expected
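A sketch of the objective as I understood it; the (R - V)_+ weighting is the key point, the beta coefficient is illustrative:

```python
import torch

def sil_loss(log_prob, value, returns, beta=0.01):
    # Imitate past transitions only when the observed return beat the
    # current value estimate, i.e. weight the log-likelihood by (R - V)_+
    advantage = (returns - value).clamp(min=0)
    policy_loss = -(log_prob * advantage.detach()).mean()
    value_loss = 0.5 * (advantage ** 2).mean()
    return policy_loss + beta * value_loss
```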
2018.7.27
Mix & Match - Agent Curricula for Reinforcement learning. ICML 2018. http://proceedings.mlr.press/v80/czarnecki18a/czarnecki18a.pdf
- Applies transfer learning to reinforcement learning
- The larger k is, the more the model absorbs from the earlier models, and the higher the training complexity
Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings. ICML 2018. http://proceedings.mlr.press/v80/co-reyes18a/co-reyes18a.pdf
- A VAE-like trajectory autoencoder applied to hierarchical reinforcement learning
State Abstractions for Lifelong Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/abel18a/abel18a.pdf
- Lifelong reinforcement learning; in effect, tasks become transferable
2018.7.28
Efficient Neural Architecture Search via Parameter Sharing. ICML 2018. http://proceedings.mlr.press/v80/pham18a/pham18a.pdf
- Improves on NAS: for a given neural network module it builds a DAG and shares parameters among the sampled child models; the concrete algorithm deserves further study (toy sampling sketch below)
- From Google Brain; the insight is good, but the method is still weak
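A toy sketch of the DAG sampling. The ops and structure are illustrative; the real method trains an RNN controller rather than sampling uniformly, and every sampled child reuses the same per-edge shared weights:

```python
import random

# Each possible edge (i -> j) owns one persistent shared weight tensor;
# the controller picks, for each node, one incoming edge and one op,
# so a child model is just a subgraph of the full DAG.
OPS = ["conv3x3", "conv5x5", "maxpool"]

def sample_architecture(n_nodes):
    arch = []
    for j in range(1, n_nodes):
        prev = random.randrange(j)   # which earlier node feeds node j
        op = random.choice(OPS)      # which op the edge applies
        arch.append((prev, j, op))
    return arch

print(sample_architecture(4))  # e.g. [(0, 1, 'conv3x3'), (1, 2, 'maxpool'), ...]
```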
Implicit Quantile Networks for Distributional Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/dabney18a/dabney18a.pdf
- The paper combines quantile regression with distributional RL; worth reading the authors' two earlier works first (presumably C51 and QR-DQN)
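A sketch of the quantile embedding, the piece that lets the network take a sampled fraction tau as input (sizes here are illustrative):

```python
import math
import torch
import torch.nn as nn

class QuantileEmbedding(nn.Module):
    """IQN-style embedding: expand tau with cosine features,
    project, then merge multiplicatively with the state features."""
    def __init__(self, feature_dim, n_cos=64):
        super().__init__()
        self.register_buffer("i_pi", math.pi * torch.arange(n_cos, dtype=torch.float32))
        self.proj = nn.Linear(n_cos, feature_dim)

    def forward(self, state_features, tau):
        # state_features: (batch, feature_dim); tau: (batch, 1) in (0, 1)
        phi = torch.relu(self.proj(torch.cos(tau * self.i_pi)))
        return state_features * phi
```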
2018.7.29
Bayesian Optimization of Combinatorial Structures. ICML 2018. http://proceedings.mlr.press/v80/baptista18a/baptista18a.pdf
- Didn't really follow this one; only partial progress is reported
Visualizing and Understanding Atari Agents. ICML 2018. http://proceedings.mlr.press/v80/greydanus18a/greydanus18a.pdf
- Gaussian-blur a patch of the input and observe how that region affects the Q-values
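A sketch of that perturbation saliency, assuming a hypothetical `policy` function mapping a 2-D grayscale frame to an output vector:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def saliency_map(policy, frame, stride=5, sigma=5):
    base = policy(frame)
    blurred = gaussian_filter(frame, sigma=sigma)
    ys, xs = np.mgrid[0:frame.shape[0], 0:frame.shape[1]]
    rows = range(0, frame.shape[0], stride)
    cols = range(0, frame.shape[1], stride)
    scores = np.zeros((len(rows), len(cols)))
    for a, i in enumerate(rows):
        for b, j in enumerate(cols):
            # Gaussian mask blends the blurred copy in around (i, j)
            mask = np.exp(-((ys - i) ** 2 + (xs - j) ** 2) / (2.0 * sigma ** 2))
            perturbed = frame * (1 - mask) + blurred * mask
            # score: how much the output moved when this region was blurred
            scores[a, b] = 0.5 * np.sum((policy(perturbed) - base) ** 2)
    return scores
```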
Policy Optimization with Demonstrations. ICML 2018. http://proceedings.mlr.press/v80/kang18a/kang18a.pdf
- Didn't really listen to this one
2018.7.30
Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents. ICML 2018. http://proceedings.mlr.press/v80/zhang18n/zhang18n.pdf
- Communication between agents over randomly selected subgraphs; the theory had long been worked out, and the experiments are simply designed
Structured Evolution with Compact Architectures for Scalable Policy Optimization. ICML 2018. http://proceedings.mlr.press/v80/choromanski18a/choromanski18a.pdf
- From Google Brain; presented a pile of matrix concepts, the theory wasn't explained clearly, but the experiments are thorough
Using Reward Machines for High-Level Task Specification and Decomposition in Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/icarte18a/icarte18a.pdf
- Uses a state machine so that a single pass of interaction with the environment yields reward computations for multiple tasks
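A minimal sketch of a reward machine; the states and events below are made up for illustration:

```python
class RewardMachine:
    """A finite-state machine whose transitions fire on high-level
    events and emit the reward. Replaying one environment trajectory
    through several machines gives rewards for several tasks."""
    def __init__(self, transitions, initial_state="u0"):
        # transitions: {(state, event): (next_state, reward)}
        self.transitions = transitions
        self.state = initial_state

    def step(self, event):
        if (self.state, event) in self.transitions:
            self.state, reward = self.transitions[(self.state, event)]
            return reward
        return 0.0  # event is irrelevant in the current machine state

# illustrative task: "reach the coffee machine, then the office"
coffee_task = RewardMachine({
    ("u0", "at_coffee"): ("u1", 0.0),
    ("u1", "at_office"): ("u_done", 1.0),
})
```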
2018.7.31
Essentially No Barriers in Neural Network Energy Landscape. ICML 2018. http://proceedings.mlr.press/v80/draxler18a/draxler18a.pdf
- Connecting local minima: paths between them along which the loss stays essentially flat
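A toy probe of the idea, assuming a hypothetical `loss_fn` that evaluates the network at a given parameter list. A straight line between two minima typically does show a barrier; the paper finds curved low-loss paths with a nudged-elastic-band method:

```python
import torch

def loss_on_segment(loss_fn, params_a, params_b, n=11):
    # params_a / params_b: parameter lists of two independently trained nets
    losses = []
    for t in torch.linspace(0, 1, n):
        theta = [(1 - t) * pa + t * pb for pa, pb in zip(params_a, params_b)]
        losses.append(loss_fn(theta))
    return losses
```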
Time Limits in Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/pardo18a/pardo18a.pdf
- Considers the finite-horizon case: whether a time-limit termination should be treated as truly terminal
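One reading of the practical takeaway (partial-episode bootstrapping), in a single function:

```python
# When an episode ends only because the time limit was hit, the state is
# not truly terminal, so keep bootstrapping from the next state's value.
def td_target(reward, v_next, done, timeout, gamma=0.99):
    truly_terminal = done and not timeout
    return reward if truly_terminal else reward + gamma * v_next
```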
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. ICML 2018. http://proceedings.mlr.press/v80/athalye18a/athalye18a.pdf
- The best-paper work; covered which of the seven attacked adversarial-defense papers were not broken
2018.8.1
Learning with Abandonment. ICML 2018. http://proceedings.mlr.press/v80/schmit18a/schmit18a.pdf
- Applies reinforcement learning to recommender systems; designs a user-tolerance parameter theta
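A toy version of the setting as I understood it; the linear reward and the uniformly drawn theta are my illustrative assumptions, not the paper's model:

```python
import random

def simulate_session(recommend, theta, horizon=100):
    # The platform picks an action each step; once an action exceeds the
    # user's (unknown) tolerance theta, the user abandons the platform
    # permanently and no further reward is earned.
    total = 0.0
    for t in range(horizon):
        a = recommend(t)   # action in [0, 1]; larger = more aggressive
        if a > theta:      # user abandons
            break
        total += a         # aggressive actions earn more while tolerated
    return total

print(simulate_session(lambda t: 0.3, theta=random.uniform(0, 1)))
```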
Latent Space Policies for Hierarchical Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/haarnoja18a/haarnoja18a.pdf
- Hierarchical reinforcement learning mainly targets sparse rewards or otherwise complex settings
- The paper's content doesn't quite live up to the "hierarchical reinforcement learning" in its title
Coordinated Exploration in Concurrent Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/dimakopoulou18a/dimakopoulou18a.pdf
- Proposes a seed-sampling algorithm and compares it with earlier UCB and Thompson sampling; didn't clearly explain how the concurrent agents coordinate
2018.8.2
Clipped Action Policy Gradient. ICML 2018. http://proceedings.mlr.press/v80/fujita18a/fujita18a.pdf
- When computing the policy gradient, clips the action to [alpha, beta]; the resulting estimator remains unbiased
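A sketch of the idea: when the executed action was clipped at a bound, use the log probability *mass* beyond that bound (via the Gaussian CDF) instead of the density at the raw sample; the epsilon is just for numerical safety:

```python
import torch
from torch.distributions import Normal

def clipped_log_prob(dist, action, low, high):
    log_pdf = dist.log_prob(action)
    log_cdf_low = torch.log(dist.cdf(torch.as_tensor(low)) + 1e-12)
    log_sf_high = torch.log(1.0 - dist.cdf(torch.as_tensor(high)) + 1e-12)
    return torch.where(action <= low, log_cdf_low,
                       torch.where(action >= high, log_sf_high, log_pdf))

dist = Normal(torch.tensor(0.0), torch.tensor(1.0))
print(clipped_log_prob(dist, torch.tensor(1.5), -1.0, 1.0))
```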
An Inference-Based Policy Gradient Method for Learning Options. ICML 2018. http://proceedings.mlr.press/v80/smith18a/smith18a.pdf
- A paper in hierarchical reinforcement learning
- Algorithmically similar to "A Laplacian Framework for Option Discovery in Reinforcement Learning" (ICML 2017), and the experiments compare against it
2018.8.3
Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control. ICML 2018. http://proceedings.mlr.press/v80/srinivas18b/srinivas18b.pdf
- Motivates state abstraction; builds a model-based component and combines model-based with model-free learning
Investigating Human Priors for Playing Video Games. ICML 2018. http://proceedings.mlr.press/v80/dubey18a/dubey18a.pdf
2018.8.4
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. ICML 2018. http://proceedings.mlr.press/v80/athalye18a/athalye18a.pdf
- ICML 2018 best paper
- The gradient-based defenses from ICLR fall into three types: shattered gradients, stochastic gradients, and gradients that explode or vanish over many iterations. Types one and three are circumvented by substituting a differentiable approximation at the non-differentiable point (BPDA / reparameterization); type two is circumvented by taking the expectation over the randomness (EOT)
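A sketch of BPDA, the attack used against shattered gradients: run the non-differentiable defense on the forward pass but treat it as the identity on the backward pass, so the attacker still gets usable gradients:

```python
import torch

class BPDA(torch.autograd.Function):
    # Backward Pass Differentiable Approximation (sketch)
    @staticmethod
    def forward(ctx, x, defense):
        return defense(x)  # the non-differentiable preprocessing defense
    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None  # identity gradient w.r.t. x; none for defense

# usage inside an attack loop, with a hypothetical defense function:
#   logits = model(BPDA.apply(x_adv, jpeg_compress))
```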
Addressing Function Approximation Error in Actor-Critic Methods. ICML 2018. http://proceedings.mlr.press/v80/fujimoto18a/fujimoto18a.pdf
Efficient Model-Based Deep Reinforcement Learning with Variational State Tabulation. ICML 2018. http://proceedings.mlr.press/v80/corneil18a/corneil18a.pdf
- State abstraction, similar to a VAE: abstract the state, reconstruct it, and finally minimize the loss the authors propose
To be continued...