Weekly AI Digest (Trial Issue No. 2)

Hello, and welcome to the second trial issue of the Weekly AI Digest. This is week 26 of 2020. Half of this surreal year is nearly over; are you ready for the other half?

As usual, I have curated 12 worthwhile items for you this week.

  1. [News] ACL 2020 accepted papers announced

  2. [Blog] A top-journal reviewer's reviewing experience, in one word: enjoyable

  3. [Blog] Lessons from building a trading system with reinforcement learning

  4. [Video] Sine as an activation function: unmatched at representing natural signals

  5. [Video] The future of computer vision is self-supervised learning

  6. [Project] An NLP learning roadmap

  7. [Project] An NLP database collecting 545 datasets

  8. [Project] NLE, a reinforcement learning game environment

  9. [Paper] How Context Affects Language Models' Factual Predictions

  10. [Paper] Deep Learning Based Text Classification: A Comprehensive Review

  11. [Paper] Stolen Probability: A Structural Weakness of Neural Language Models

  12. [Paper] Building One-Shot Semi-supervised (BOSS) Learning up to Fully Supervised Performance

PS: Some of these items may have been published earlier; they are included because they took off over the past week.


NEWS

ACL 2020 accepted papers announced

Link: https://www.aclweb.org/anthology/events/acl-2020/

Tags: NLP, ACL

Every top conference is a feast, and every release of the accepted-paper list is a celebration.

ACL 2020 received 3,429 submissions and accepted 779 papers (a 22.7% acceptance rate). Two years earlier, ACL 2018 received 1,544 submissions and accepted 384. Just as I argued in 《存量阅读+增量阅读,助你在学术的海洋上乘风破浪》, both the number of submissions and the number of accepted papers at top conferences roughly double every two years. May I name this observation Zhao Xuandian's law?


BLOGS

A top-journal reviewer's reviewing experience, in one word: enjoyable

Link: https://ehudreiter.com/2020/06/22/i-enjoy-reviewing-for-tacl/

Recommended by: @Felix Hill (DeepMind researcher)

Tags: NLP, TACL

Ehud Reiter reviews for TACL, a top journal in NLP. He thoroughly enjoys reviewing for it, for reasons he summarizes as follows:

  1. Through appropriate communication with the authors, reviewers can help make a paper even better

  2. The other reviewers are also senior experts, so there is a lot to learn from them

  3. Reviewing a few papers spread across a year is more pleasant than reviewing a batch of conference papers all at once

Judging from the third point, reviewers really are overwhelmed by the explosion of scientific papers. NeurIPS 2018 recruiting undergraduates as reviewers was one compromise; many conferences lengthening the reviewing window from submission deadline to notification since last year is another.

At the end, the author calls for the TACL model to be adopted more widely, which to me reads as a little bittersweet. Is it feasible?

I mostly read conference papers, though I have read some journal papers as well, including TACL ones. Not that it needs my endorsement, but the quality of the papers TACL accepts is genuinely high; every one reads like a full-marks essay. The few TACL papers I have read happen to include Felix Hill's work, and reading it is a real pleasure. That is also the direct reason I follow him.

Lessons from building a trading system with reinforcement learning

Link: https://dennybritz.com/blog/ai-trading/

Author: @Denny Britz (former Google Brain researcher)

Tags: RL, AI

In this post, Denny summarizes his experience building a trading system with reinforcement learning. I will not say more; if you are interested, take a look, but be cautious about trying it yourself.


VIDEOS

Sine as an activation function: unmatched at representing natural signals

Link: https://www.youtube.com/watch?v=Q2fLWGBeaiI&feature=youtu.be

Recommended by: @Geoffrey Hinton (one of the three pioneers of deep learning)

Produced by: @Stanford Computational Imaging Lab

Tags: activation functions

Length: 10 minutes

Note: I really want everyone to watch this. I left a comment under the original video asking for permission to re-upload it to Bilibili but have not heard back, so I went ahead and downloaded it first; I will take it down on request.

This is the companion video for the paper Implicit Neural Representations with Periodic Activation Functions (https://arxiv.org/abs/2006.09661), in which the Stanford authors explain their own work. The accompanying project page is here: https://vsitzmann.github.io/siren, and the companion Colab notebook is here: https://colab.research.google.com/github/vsitzmann/siren/blob/master/explore_siren.ipynb

They propose neural networks that use the sine function (sin) as the activation, called SInusoidal REpresentation Networks (SIREN). SIREN's ability to represent images, audio, video, and other signals is astonishing. Hinton praised it: "A non-linearity that works much better than ReLUs. The work described in this video might also be relevant to understanding grid cells" (grid cells are neurons, found in the brains of many species, that let an animal know where it is in space).
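
To make the idea concrete, here is a minimal PyTorch sketch of a SIREN-style layer. It is my own illustrative rendering of the paper's description (a sine activation with a frequency factor w0 and a fan-in-scaled uniform initialization), not the authors' reference implementation, which is linked from the project page above.

```python
# A sketch of a SIREN-style sine layer, assuming the setup described in the paper:
# sin(w0 * (Wx + b)) with w0 = 30 by default, first-layer weights drawn from
# U(-1/fan_in, 1/fan_in), and hidden-layer weights from U(-sqrt(6/fan_in)/w0, +sqrt(6/fan_in)/w0).
import torch
import torch.nn as nn

class SineLayer(nn.Module):
    def __init__(self, in_features, out_features, w0=30.0, is_first=False):
        super().__init__()
        self.w0 = w0
        self.linear = nn.Linear(in_features, out_features)
        with torch.no_grad():
            if is_first:
                # First layer: spread enough sine periods over the input domain.
                bound = 1.0 / in_features
            else:
                # Hidden layers: scale by 1/w0 to keep activations well distributed.
                bound = (6.0 / in_features) ** 0.5 / w0
            self.linear.weight.uniform_(-bound, bound)

    def forward(self, x):
        return torch.sin(self.w0 * self.linear(x))

# Example: an implicit image representation that maps 2D pixel coordinates to RGB values.
siren = nn.Sequential(
    SineLayer(2, 256, is_first=True),
    SineLayer(256, 256),
    SineLayer(256, 256),
    nn.Linear(256, 3),
)
coords = torch.rand(1024, 2) * 2 - 1   # coordinates normalized to [-1, 1]
rgb = siren(coords)                    # predicted colors at those coordinates
```

Fitting such a network to a single image (coordinates in, colors out, mean squared error) is exactly the implicit-neural-representation setting the video demonstrates.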

The future of computer vision is self-supervised learning

Link: https://www.facebook.com/wdeepvision2020/videos/2301388736824154/

Speaker: @Yann LeCun (one of the three pioneers of deep learning)

Tags: CV, Self-Supervised Learning (Unsupervised Learning)

Length: 30 minutes

If self-supervised learning sounds unfamiliar to you, you can probably blame Yann LeCun himself. After pre-trained language models took off in NLP, LeCun revised his framing and renamed unsupervised learning to self-supervised learning: it is not that there is no supervision, but that "the system learns to predict one part of its input from another part". Compared with supervised learning, which usually requires human annotation, self-supervised learning looks far more "self-sufficient" (the rendering of LeCun's phrase here follows @张正's translation).
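
To illustrate the "predict one part of the input from another part" recipe, here is a tiny PyTorch sketch of my own (it is not from the talk): hide a random portion of each unlabeled input and train a small network to reconstruct only the hidden part.

```python
# Toy self-supervised objective: mask random entries of the input and predict them
# from the visible rest. No human labels are involved; the data supervises itself.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 64))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(32, 64)                  # a batch of unlabeled inputs
mask = torch.rand_like(x) < 0.25         # hide about 25% of each input
corrupted = x.masked_fill(mask, 0.0)     # the part the model is allowed to see

pred = model(corrupted)
loss = ((pred - x)[mask] ** 2).mean()    # score only the hidden part
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Masked language modeling in NLP and masked image modeling in vision are the same idea at scale, with tokens or image patches in place of vector entries.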

I do not work in computer vision, and after one pass through the video I was still somewhat lost. Still, a LeCun talk is always worth recommending.


PROJECTS

graykode/nlp-roadmap

Link: https://github.com/graykode/nlp-roadmap

Tags: ML, NLP

In this repository, the author lays out a learning roadmap for NLP covering four major areas: probability and statistics, machine learning, text mining, and natural language processing. Interested readers can take a look; it also works as a way to find and fill gaps in your own knowledge.

The Big Bad NLP Database

Link: https://datasets.quantumstat.com/

Tags: NLP

This is a database of natural language processing datasets. As of June 25, 2020, it lists 545 datasets across many languages, including 11 Chinese ones. Many entries point to GitHub repositories, and a single repository may host several datasets, so the number of datasets you can actually find here is larger than 545.

Even in the age of powerful search engines, every "yellow pages" like this (for lack of a better name) is worth bookmarking; it can save a great deal of time.

The NetHack Learning Environment (NLE)

Link: https://github.com/facebookresearch/nle

Produced by: @Facebook Research

Tags: RL

Facebook Research has built a new reinforcement learning environment on top of the video game NetHack. Reinforcement learning researchers have a new playground.
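
For a feel of the interface, here is a minimal usage sketch following the examples in the repository README; environment IDs and the exact observation contents may vary across versions, so check the repo for the current ones.

```python
# Minimal NLE usage sketch (based on the repository README; install with `pip install nle`).
import gym
import nle  # importing nle registers the NetHack environments with Gym

env = gym.make("NetHackScore-v0")
obs = env.reset()                        # a dict of arrays: dungeon glyphs, agent stats, messages, ...
done = False
while not done:
    action = env.action_space.sample()   # random policy, just to exercise the loop
    obs, reward, done, info = env.step(action)
env.render()
```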


PAPERS

How Context Affects Language Models' Factual Predictions

Link: https://openreview.net/forum?id=025X0zPfn

Tags: LM, NLP

Why it is recommended: Best paper at AKBC. An intuition you may not notice until it is spelled out, but which then seems obvious: language models pre-trained on large corpora absorb some factual knowledge. This paper augments a pre-trained language model with retrieved context in a purely unsupervised way to draw out that factual knowledge, achieving results competitive with supervised machine reading without any supervised training.

When pre-trained on large unsupervised textual corpora, language models are able to store and retrieve factual knowledge to some extent, making it possible to use them directly for zero-shot cloze-style question answering. However, storing factual knowledge in a fixed number of weights of a language model clearly has limitations. Previous approaches have successfully provided access to information outside the model weights using supervised architectures that combine an information retrieval system with a machine reading component. In this paper, we go one step further and integrate information from a retrieval system with a pre-trained language model in a purely unsupervised way. We report that augmenting pre-trained language models in this way dramatically improves performance and that it is competitive with a supervised machine reading baseline without requiring any supervised training. Furthermore, processing query and context with different segment tokens allows BERT to utilize its Next Sentence Prediction pre-trained classifier to determine whether the context is relevant or not, substantially improving BERT's zero-shot cloze-style question-answering performance and making its predictions robust to noisy contexts.
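
As a simplified illustration of the idea (not the paper's exact setup, which also uses BERT's next-sentence-prediction head to decide whether the retrieved context is relevant), here is a sketch using the Hugging Face fill-mask pipeline: the same cloze-style query is answered with and without a retrieved passage prepended. The example sentences are made up for illustration.

```python
# Zero-shot cloze-style factual prediction with a masked language model,
# with and without retrieved context prepended to the query.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

query = "The capital of Norway is [MASK]."
context = "Oslo is the capital and most populous city of Norway."

print(fill_mask(query)[0])                   # query alone
print(fill_mask(context + " " + query)[0])   # query augmented with context
```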

Deep Learning Based Text Classification: A Comprehensive Review

Link: https://arxiv.org/abs/2004.03705

Tags: text classification, survey, NLP

Why it is recommended: This survey gives a thorough treatment of deep learning based text classification, examining more than 150 models. Strongly recommended to anyone interested in text classification. (Preprint, under journal review.)

Deep learning based models have surpassed classical machine learning based approaches in various text classification tasks, including sentiment analysis, news categorization, question answering, and natural language inference. In this work, we provide a detailed review of more than 150 deep learning based models for text classification developed in recent years, and discuss their technical contributions, similarities, and strengths. We also provide a summary of more than 40 popular datasets widely used for text classification. Finally, we provide a quantitative analysis of the performance of different deep learning models on popular benchmarks, and discuss future research directions.

Stolen Probability: A Structural Weakness of Neural Language Models

Link: https://www.aclweb.org/anthology/2020.acl-main.198/

Tags: LM, NLP, ACL 2020

Why it is recommended: A short paper showing, both theoretically and empirically, a structural weakness of neural language models with a softmax output layer: in the embedding space, the predicted probability of words inside the convex hull is bounded by that of words on the hull, which the authors playfully call stolen probability, and this limits those words' expressiveness. (The paper only characterizes the problem; a solution is left to future work.)

Neural Network Language Models (NNLMs) generate probability distributions by applying a softmax function to a distance metric formed by taking the dot product of a prediction vector with all word vectors in a high-dimensional embedding space. The dot-product distance metric forms part of the inductive bias of NNLMs. Although NNLMs optimize well with this inductive bias, we show that this results in a sub-optimal ordering of the embedding space that structurally impoverishes some words at the expense of others when assigning probability. We present numerical, theoretical and empirical analyses which show that words on the interior of the convex hull in the embedding space have their probability bounded by the probabilities of the words on the hull.
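
Here is a tiny numerical illustration of the bound under the dot-product softmax setup the paper analyzes (a toy construction of my own, with no output bias): a word whose embedding is the centroid of the others can never receive more probability than the best word on the hull, no matter what the prediction vector is.

```python
# "Stolen probability" in miniature: an interior embedding's softmax probability
# is always bounded by the probability of some word on the convex hull.
import numpy as np

rng = np.random.default_rng(0)

# Hull words: embeddings on a circle. Interior word: their centroid.
angles = np.linspace(0, 2 * np.pi, 8, endpoint=False)
hull = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # shape (8, 2)
interior = hull.mean(axis=0, keepdims=True)                 # shape (1, 2)
E = np.concatenate([hull, interior], axis=0)                # a vocabulary of 9 "words"

for _ in range(10_000):
    h = rng.normal(size=2) * rng.uniform(0.1, 10.0)         # random prediction vector
    logits = E @ h
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    assert probs[-1] <= probs[:-1].max() + 1e-12             # interior word never wins
print("The interior word's probability is always bounded by a hull word's.")
```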

Building One-Shot Semi-supervised (BOSS) Learning up to Fully Supervised Performance

Link: https://arxiv.org/abs/2006.09363

Tags: One-Shot Learning

Recommended by: Jeremy Howard (fast.ai co-founder)

Why it is recommended: On CIFAR-10 and SVHN, the paper demonstrates the power of one-shot semi-supervised learning (unlabeled data plus a single labeled example per class), reaching results comparable to fully supervised learning. (Preprint, under review at NeurIPS 2020.)

Reaching the performance of fully supervised learning with unlabeled data and only labeling one sample per class might be ideal for deep learning applications. We demonstrate for the first time the potential for building one-shot semi-supervised (BOSS) learning on Cifar-10 and SVHN up to attain test accuracies that are comparable to fully supervised learning. Our method combines class prototype refining, class balancing, and self-training. A good prototype choice is essential and we propose a practical technique for obtaining iconic examples. In addition, we demonstrate that class balancing methods substantially improve accuracy results in semi-supervised learning to levels that allow self-training to reach the level of fully supervised learning performance. Rigorous empirical evaluations provide evidence that labeling large datasets is not necessary for training deep neural networks. We made our code available at GitHub (https://github.com/lnsmith54/BOSS) to facilitate replication and for use with future real-world applications.
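
Of the three ingredients, self-training is the most generic; here is a schematic sketch of my own of one pseudo-labeling round (it is not the authors' code, which is at the GitHub link above): predict on the unlabeled pool, keep only high-confidence predictions as pseudo-labels, and fold them into the labeled set for the next round of training.

```python
# One round of self-training via confident pseudo-labels (generic sketch).
import torch
import torch.nn.functional as F

def self_training_round(model, labeled_x, labeled_y, unlabeled_x, threshold=0.95):
    """Return an enlarged labeled set that includes confident pseudo-labels."""
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(unlabeled_x), dim=1)
        confidence, pseudo_y = probs.max(dim=1)
        keep = confidence >= threshold          # keep only confident predictions
    new_x = torch.cat([labeled_x, unlabeled_x[keep]])
    new_y = torch.cat([labeled_y, pseudo_y[keep]])
    return new_x, new_y                         # retrain on this set, then repeat
```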
