Weekly AI Digest (Trial Issue #3)

(Cover image: Almost nothing | Photo: Dave Hoefler)

Hello, and welcome to the third (trial) issue of the Weekly AI Digest.

This is week 27 of 2020. As usual, I have curated the week's most worthwhile items for you, 9 in total. Most of them are useful papers; there should be at least one for you.

  1. [News] ACM's former president (informally) announced: the ACM Digital Library will become open access within 5 years

  2. [Video] From the IEEE Information Theory Society: a documentary on Claude Shannon, the father of information theory

  3. [Video] The Machine Learning Summer School is underway; if you hurry, you can still catch Yoshua Bengio's lectures live

  4. [Project] Hugging Face released transformers 3.0

  5. [Paper] On the training dynamics of deep networks with L2 regularization

  6. [Paper] Smooth Adversarial Training

  7. [Paper] Theoretical Limitations of Self-Attention in Neural Sequence Models

  8. [Paper] Involutive MCMC: a Unifying Framework

  9. [Paper] Model-based Reinforcement Learning: A Survey

BIG NEWS

ACM's former president (informally) announced: the ACM Digital Library will become open access within 5 years

Recommended by: @Cherri Pancake (former ACM president)

Tags: ACM

Former ACM president Cherri Pancake announced that the ACM Digital Library will become open access within 5 years. Once it does, all of its digital resources will be freely downloadable. As Cherri Pancake put it, this is something many people have long been waiting for. One commenter quipped: five years is too long, tomorrow would be better!

I'll keep following this story for you. Once open access officially arrives, please do look for resources on the ACM Digital Library and support the official versions. For papers with multiple versions, the ACM copy may not be the latest (authors sometimes add material long after publication and post the update to arXiv), but it will be the highest-quality one, in both completeness of content and typesetting.


VIDEOS

From the IEEE Information Theory Society: a documentary on Claude Shannon, the father of information theory

Link: https://www.bilibili.com/video/BV1754y1z71b

Produced by: @IEEE Information Theory Society

Tags: information theory, Shannon

The IEEE Information Theory Society has produced a documentary about Claude Shannon, the father of information theory: The Bit Player. The film has a rather dreamlike style, which suits Shannon's playful temperament. The one drawback is that Shannon is portrayed by an actor rather than interviewed in person (Shannon passed away in 2001).

If we had to put a name on our era, "the information age" would surely be one of them. As the father of information theory, Shannon's contribution can hardly be overstated; in the history of science he stands alongside Newton and Einstein. As some scholars put it, Einstein founded relativity but did not bring us into an age of relativity, whereas Shannon founded information theory and did bring humanity into the information age. That is a bit of an exaggeration, and since they worked in different fields Shannon and Einstein are not directly comparable, but it does convey the scale of Shannon's contribution.

To learn more about Shannon the person, I also recommend the biography A Mind at Play (published in Chinese as 《香农传》). As both the book title and the film title suggest, one of Shannon's defining traits is play, a playful way of going through life. I really do adore Shannon: a man who could have coasted on his looks yet chose to rely on his talent, a juggler whose career was "derailed" by humanity's scientific enterprise.

The documentary was free to watch on Vimeo last weekend, and is reportedly heading to Amazon Prime Video afterwards. Someone beat me to uploading it to Bilibili, so go take a look.

The Machine Learning Summer School is underway; if you hurry, you can still catch Yoshua Bengio's lectures live

Link: http://mlss.tuebingen.mpg.de/2020/index.html

Speaker: @Yoshua Bengio (one of the "big three" of deep learning)

Tags: ML, DL

The Machine Learning Summer School (MLSS) is held every summer; this is its 19th year. In format it sits somewhere between an open course and a series of invited talks: each topic gets one or two lectures, and the speakers are leading experts in their respective fields.

There are two summer schools this year; the one currently underway is in Tübingen, Germany. Yoshua Bengio is one of MLSS's regulars, and he is back again this time. If you hurry, you can catch his deep learning lectures streamed live on the 6th and 7th. For sessions you have missed, click Video to watch the recordings.


PROJECTS

Hugging Face released transformers 3.0

Link: https://github.com/huggingface/transformers/releases/tag/v3.0.0

Produced by: @Hugging Face

Tags: NLP

Hugging Face and transformers probably need no introduction in the NLP world.

Without further ado: Hugging Face recently shipped a major version bump of transformers (v2.X → v3.X). The highlights include:

  1. A new tokenizer API

  2. Improved TensorFlow models

  3. Updated documentation and tutorials

See the GitHub releases page linked above for the full details; a minimal sketch of the new tokenizer call follows below.
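
To make the first highlight concrete, here is a minimal, hedged sketch of the unified tokenizer call introduced in v3.0. The checkpoint name and the example sentences are illustrative choices of mine, not something taken from the release notes.

```python
# A minimal sketch of the v3.0 tokenizer API (assumes transformers >= 3.0;
# "bert-base-uncased" is just an illustrative checkpoint).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# In v3.0, padding, truncation, and tensor conversion are controlled directly
# in a single call, for single sentences and batches alike.
batch = tokenizer(
    ["Hugging Face released transformers 3.0", "A shorter sentence"],
    padding=True,         # pad to the longest sequence in the batch
    truncation=True,      # truncate to the model's maximum length
    return_tensors="pt",  # return PyTorch tensors
)
print(batch["input_ids"].shape)  # torch.Size([2, sequence_length])
```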


PAPERS

Smooth Adversarial Training

Link: https://arxiv.org/abs/2006.14536

Tags: activation functions, DL

Why read it: The paper finds that, in adversarial training, smooth activation functions improve robustness without hurting accuracy (I'm personally curious how robustness is measured here). The authors believe that smooth activations may be a good network design principle beyond adversarial training as well. Best read alongside last week's SIREN. A minimal sketch of the core recipe follows the abstract below.

Caveat: preprint, possibly under review at NeurIPS

It is commonly believed that networks cannot be both accurate and robust, that gaining robustness means losing accuracy. It is also generally believed that, unless making networks larger, network architectural elements would otherwise matter little in improving adversarial robustness. Here we present evidence to challenge these common beliefs by a careful study about adversarial training. Our key observation is that the widely-used ReLU activation function significantly weakens adversarial training due to its non-smooth nature. Hence we propose smooth adversarial training (SAT), in which we replace ReLU with its smooth approximations to strengthen adversarial training. The purpose of smooth activation functions in SAT is to allow it to find harder adversarial examples and compute better gradient updates during adversarial training. Compared to standard adversarial training, SAT improves adversarial robustness for "free", i.e., no drop in accuracy and no increase in computational cost. For example, without introducing additional computations, SAT significantly enhances ResNet-50's robustness from 33.0% to 42.3%, while also improving accuracy by 0.9% on ImageNet. SAT also works well with larger networks: it helps EfficientNet-L1 to achieve 82.2% accuracy and 58.6% robustness on ImageNet, outperforming the previous state-of-the-art defense by 9.5% for accuracy and 11.6% for robustness.
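
For intuition, here is a minimal PyTorch sketch of the core recipe: recursively swap the non-smooth ReLU for a smooth activation before running (adversarial) training. The helper function and the choice of SiLU are my own illustrative simplifications; the paper studies several smooth approximations and combines this swap with standard adversarial training, which is omitted here.

```python
# Hedged sketch: replace ReLU with a smooth activation (needs PyTorch >= 1.7
# for nn.SiLU and torchvision for the ResNet-50 used in the paper's headline
# result). The adversarial training loop itself (e.g. PGD) is not shown.
import torch.nn as nn
import torchvision.models as models

def replace_relu_with_smooth(module: nn.Module, smooth=nn.SiLU) -> nn.Module:
    """Recursively swap every nn.ReLU submodule for a smooth activation."""
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, smooth())
        else:
            replace_relu_with_smooth(child, smooth)
    return module

model = replace_relu_with_smooth(models.resnet50())
```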

Theoretical Limitations of Self-Attention in Neural Sequence Models

Link: https://www.mitpressjournals.org/doi/full/10.1162/tacl_a_00306

Tags: TACL, attention mechanisms

Why read it: The paper examines the theoretical limitations of self-attention: it cannot model periodic finite-state languages or hierarchical structure, unless the number of layers or heads grows with the input length.

Transformers are emerging as the new workhorse of NLP, showing great success across tasks. Unlike LSTMs, transformers process input sequences entirely through self-attention. Previous work has suggested that the computational capabilities of self-attention to process hierarchical structures are limited. In this work, we mathematically investigate the computational power of self-attention to model formal languages. Across both soft and hard attention, we show strong theoretical limitations of the computational abilities of self-attention, finding that it cannot model periodic finite-state languages, nor hierarchical structure, unless the number of layers or heads increases with input length. These limitations seem surprising given the practical success of self-attention and the prominent role assigned to hierarchical structure in linguistics, suggesting that natural language can be approximated well with models that are too weak for the formal languages typically assumed in theoretical linguistics.

On the training dynamics of deep networks with L2 regularization

Link: https://arxiv.org/abs/2006.08643

Tags: DL, L2 regularization

Why read it: The paper studies how L2 regularization affects the performance of deep networks and, based on the observed empirical relations, proposes two practical uses (a hedged sketch follows the abstract below):

  1. Predicting the optimal L2 coefficient for a given model (somewhat like the LR Finder some tools offer for finding a good learning rate);

  2. AUTO L2, which adjusts the L2 coefficient automatically during training (in the spirit of an LR scheduler).

The paper's experiments find that both methods speed up training while also yielding better model performance.

Caveat: preprint, possibly under review at NeurIPS

We study the role of L2 regularization in deep learning, and uncover simple relations between the performance of the model, the L2 coefficient, the learning rate, and the number of training steps. These empirical relations hold when the network is overparameterized. They can be used to predict the optimal regularization parameter of a given model. In addition, based on these observations we propose a dynamical schedule for the regularization parameter that improves performance and speeds up training. We test these proposals in modern image classification settings. Finally, we show that these empirical relations can be understood theoretically in the context of infinitely wide networks. We derive the gradient flow dynamics of such networks, and compare the role of L2 regularization in this context with that of linear models.
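
As a rough frame of reference for the second item above, here is a tiny hedged sketch of where a dynamical L2 schedule plugs into an ordinary PyTorch training loop. The linear decay shown is purely a placeholder of mine and is not the paper's AUTO L2 rule; it only marks the point where such a rule would update the coefficient.

```python
# Hedged sketch: L2 regularization via SGD's weight_decay, updated each step.
# The model, data, and the decay schedule are all illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(128, 10)
opt = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)

for step in range(1000):
    x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))  # dummy batch
    loss = F.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    # An AUTO-L2-style rule would set the coefficient here based on training
    # statistics; this linear decay is only a stand-in.
    for group in opt.param_groups:
        group["weight_decay"] = 1e-4 * (1 - step / 1000)
```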

Involutive MCMC: a Unifying Framework

Link: https://arxiv.org/abs/2006.16653

Tags: MCMC, ICML

Why read it: Drawing on the shared structure of many MCMC algorithms, the authors propose a unifying framework, iMCMC. It can express existing MCMC algorithms and also serves as a convenient basis for exploring new ones. A toy sketch of a single involutive transition follows the abstract below.

Markov Chain Monte Carlo (MCMC) is a computational approach to fundamental problems such as inference, integration, optimization, and simulation. The field has developed a broad spectrum of algorithms, varying in the way they are motivated, the way they are applied and how efficiently they sample. Despite all the differences, many of them share the same core principle, which we unify as the Involutive MCMC (iMCMC) framework. Building upon this, we describe a wide range of MCMC algorithms in terms of iMCMC, and formulate a number of "tricks" which one can use as design principles for developing new MCMC algorithms. Thus, iMCMC provides a unified view of many known MCMC algorithms, which facilitates the derivation of powerful extensions. We demonstrate the latter with two examples where we transform known reversible MCMC algorithms into more efficient irreversible ones.
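
To see what an involutive transition looks like in the simplest case, here is a toy, hedged sketch: with the involution f(x, v) = (x + v, -v) and a symmetric auxiliary distribution, a single iMCMC step reduces to random-walk Metropolis. The target, step size, and chain length are all illustrative choices of mine.

```python
# Toy sketch of one involutive-MCMC transition. f(x, v) = (x + v, -v) is an
# involution (applying it twice returns (x, v)) with |det Jacobian| = 1, and
# because the auxiliary density is symmetric, the acceptance ratio collapses
# to target(x') / target(x), i.e. random-walk Metropolis.
import numpy as np

def log_target(x):
    return -0.5 * x**2  # unnormalized standard normal (illustrative target)

def imcmc_step(x, rng, step=0.5):
    v = rng.normal(0.0, step)   # sample the auxiliary variable
    x_new, v_new = x + v, -v    # deterministic involution; v_new unused since
                                # the auxiliary density is symmetric
    log_alpha = log_target(x_new) - log_target(x)
    return x_new if np.log(rng.uniform()) < log_alpha else x

rng = np.random.default_rng(0)
samples, x = [], 0.0
for _ in range(10_000):
    x = imcmc_step(x, rng)
    samples.append(x)
print(np.mean(samples), np.var(samples))  # should be close to 0 and 1
```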

Model-based Reinforcement Learning: A Survey

Link: https://arxiv.org/abs/2006.16712

Tags: RL, survey

Why read it: A survey of model-based reinforcement learning. It has a companion paper, "A Framework for Reinforcement Learning and Planning" (https://arxiv.org/abs/2006.15009), which is worth reading alongside it.

Sequential decision making, commonly formalized as Markov Decision Process (MDP) optimization, is a key challenge in artificial intelligence. Two key approaches to this problem are reinforcement learning (RL) and planning. This paper presents a survey of the integration of both fields, better known as model-based reinforcement learning. Model-based RL has two main steps. First, we systematically cover approaches to dynamics model learning, including challenges like dealing with stochasticity, uncertainty, partial observability, and temporal abstraction. Second, we present a systematic categorization of planning-learning integration, including aspects like: where to start planning, what budgets to allocate to planning and real data collection, how to plan, and how to integrate planning in the learning and acting loop. After these two key steps, we also discuss the potential benefits of model-based RL, like enhanced data efficiency, targeted exploration, and improved stability. Along the survey, we also draw connections to several related RL fields, like hierarchical RL and transfer, and other research disciplines, like behavioural psychology. Altogether, the survey presents a broad conceptual overview of planning-learning combinations for MDP optimization.


If you found this issue worthwhile, a like, a "Wow" (在看), or a share would be much appreciated~
