Generalizing from a Few Examples: A Survey on Few-Shot Learning

Introduction

Machine learning has been highly successful in data-intensive applications, but is often hampered when the data set is small. Recently, Few-Shot Learning (FSL) has been proposed to tackle this problem. Using prior knowledge, FSL can rapidly generalize to new tasks containing only a few samples with supervised information. In this paper, we conduct a thorough survey to fully understand FSL. Starting from a formal definition of FSL, we distinguish FSL from several relevant machine learning problems. We then point out that the core issue in FSL is that the empirical risk minimizer is unreliable. Based on how prior knowledge can be used to handle this core issue, we categorize FSL methods from three perspectives: (i) data, which uses prior knowledge to augment the supervised experience; (ii) model, which uses prior knowledge to reduce the size of the hypothesis space; and (iii) algorithm, which uses prior knowledge to alter the search for the best hypothesis in the given hypothesis space. With this taxonomy, we review and discuss the pros and cons of each category. Promising directions, in the aspects of the FSL problem setups, techniques, applications and theories, are also proposed to provide insights for future research.

“Can machines think?” This is the question raised in Alan Turing’s seminal paper entitled “Computing Machinery and Intelligence” in 1950. He made the statement that “The idea behind digital computers may be explained by saying that these machines are intended to carry out any operations which could be done by a human computer”. In other words, the ultimate goal of machines is to be as intelligent as humans. In recent years, due to the emergence of powerful computing devices (e.g., GPU and distributed platforms), large data sets (e.g., ImageNet data with 1000 classes), advanced models and algorithms (e.g., convolutional neural networks (CNN) and long short-term memory (LSTM)), AI has sped up its pace toward human-level intelligence and now defeats humans in many fields. To name a few, AlphaGo defeats human champions in the ancient game of Go; and residual network (ResNet) obtains better classification performance than humans on ImageNet. AI also supports the development of intelligent tools in many aspects of daily life, such as voice assistants, search engines, autonomous driving cars, and industrial robots.

Despite its prosperity, current AI techniques cannot rapidly generalize from a few examples. The aforementioned successful AI applications rely on learning from large-scale data. In contrast, humans are capable of learning new tasks rapidly by utilizing what they have learned in the past. For example, a child who has learned how to add can rapidly transfer this knowledge to learn multiplication given a few examples (e.g., 2 × 3 = 2 + 2 + 2 and 1 × 3 = 1 + 1 + 1). Another example is that, given a few photos of a stranger, a child can easily identify the same person from a large number of photos.

Bridging this gap between AI and humans is an important direction. It can be tackled by machine learning, which is concerned with the question of how to construct computer programs that automatically improve with experience. In order to learn from a limited number of examples with supervised information, a new machine learning paradigm called Few-Shot Learning (FSL) has been proposed. A typical example is character generation, in which computer programs are asked to parse and generate new handwritten characters given a few examples. To handle this task, one can decompose the characters into smaller parts transferable across characters, and then aggregate these smaller components into new characters. This is a human-like way of learning. Naturally, FSL can also advance robotics, which develops machines that can replicate human actions. Examples include one-shot imitation, multi-armed bandits, visual navigation, and continuous control.

Another classic FSL scenario is one where examples with supervised information are hard or impossible to acquire due to privacy, safety or ethical issues. A typical example is drug discovery, which tries to discover the properties of new molecules so as to identify useful ones as new drugs. Due to possible toxicity, low activity, and low solubility, new molecules do not have many real biological records on clinical candidates. Hence, it is important to learn effectively from a small number of samples. Similar examples where the target tasks do not have many examples include FSL translation and cold-start item recommendation. Through FSL, learning suitable models for these rare cases can become possible.

FSL can also help relieve the burden of collecting large-scale supervised data. For example, although ResNet outperforms humans on ImageNet, each class needs to have sufficient labeled images which can be laborious to collect. FSL can reduce the data gathering effort for data-intensive applications. Examples include image classification, image retrieval, object tracking, gesture recognition, image captioning, visual question answering, video event detection, language modeling, and neural architecture search.

Driven by the academic goal for AI to approach humans and the industrial demand for inexpensive learning, FSL has drawn much recent attention and is now a hot topic. Many related machine learning approaches have been proposed, such as meta-learning, embedding learning and generative modeling. However, currently, there is no work that provides an organized taxonomy to connect these FSL methods, explains why some methods work while others fail, or discusses the pros and cons of different approaches. Therefore, in this paper, we conduct a survey on the FSL problem. In contrast, an existing survey only focuses on concept learning and experience learning for small samples.

Contributions of this survey can be summarized as follows:
• We give a formal definition of FSL, which naturally connects to the classic machine learning definition in [92, 94]. The definition is not only general enough to include existing FSL works, but also specific enough to clarify what the goal of FSL is and how it can be solved. This definition is helpful for setting future research targets in the FSL area.
• We list the learning problems relevant to FSL with concrete examples, clarifying their relatedness to and differences from FSL. These discussions can help better discriminate and position FSL among various learning problems.
• We point out that the core issue of the FSL supervised learning problem is the unreliable empirical risk minimizer, which is analyzed based on error decomposition in machine learning. This provides insights for improving FSL methods in a more organized and systematic way.
• We perform an extensive literature review and organize the works in a unified taxonomy from the perspectives of data, model and algorithm. We also present a summary of insights and a discussion of the pros and cons of each category. These can help establish a better understanding of FSL methods.
• We propose promising future directions for FSL in the aspects of problem setup, techniques, applications and theories. These insights are based on the weaknesses of the current development of FSL, with possible improvements to be made in the future.

Organization of the Survey

The remainder of this survey is organized as follows. Section 2 provides an overview of FSL, including its formal definition, relevant learning problems, core issue, and a taxonomy of existing works in terms of data, model and algorithm. Section 3 is for methods that augment data to solve the FSL problem. Section 4 is for methods that reduce the size of the hypothesis space so as to make FSL feasible. Section 5 is for methods that alter the search strategy of the algorithm to deal with the FSL problem. In Section 6, we propose future directions for FSL in terms of problem setup, techniques, applications and theories. Finally, the survey closes with a conclusion in Section 7.

Notation and Terminology

Consider a learning task T. FSL deals with a data set D = {D_{train}, D_{test}} consisting of a training set D_{train} = {(x_i, y_i)}_{i=1}^{I}, where I is small, and a testing set D_{test} = {x^{test}}. Let p(x, y) be the ground-truth joint probability distribution of input x and output y, and ĥ be the optimal hypothesis from x to y. FSL learns to discover ĥ by fitting D_{train} and testing on D_{test}. To approximate ĥ, the FSL model determines a hypothesis space H of hypotheses h(·; θ), where θ denotes all the parameters used by h. Here, a parametric h is used, as a nonparametric model often requires large data sets and is thus not suitable for FSL. An FSL algorithm is an optimization strategy that searches H in order to find the θ that parameterizes the best h* ∈ H. The FSL performance is measured by a loss function ℓ(ŷ, y) defined over the prediction ŷ = h(x; θ) and the observed output y.
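To make this setup concrete, the following is a minimal sketch (not from the survey) under the assumption that the hypothesis space H is the set of linear models and ℓ is the squared loss; the synthetic data and variable names are illustrative only.

```python
import numpy as np

# Illustrative only: a tiny training set D_train with I = 5 supervised examples (x_i, y_i),
# as in the few-shot regime, plus an unlabeled testing set D_test.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(5, 3))                 # I = 5 inputs x_i, each 3-dimensional
y_train = X_train @ np.array([1.0, -2.0, 0.5])    # outputs y_i from an (unknown) ground truth
X_test = rng.normal(size=(20, 3))                 # D_test = {x_test}

def h(x, theta):
    """A parametric hypothesis h(.; theta) from the chosen hypothesis space H (linear models here)."""
    return x @ theta

def loss(y_pred, y):
    """The loss l(y_hat, y); the squared error is assumed here."""
    return np.mean((y_pred - y) ** 2)

# The "algorithm": search H for the theta that minimizes the empirical risk on D_train
# (plain gradient descent here; any optimization strategy could be substituted).
theta = np.zeros(3)
for _ in range(500):
    grad = 2.0 * X_train.T @ (h(X_train, theta) - y_train) / len(y_train)
    theta -= 0.1 * grad

y_hat_test = h(X_test, theta)                      # predictions y_hat = h(x; theta) on D_test
print("empirical risk on D_train:", loss(h(X_train, theta), y_train))
```

Because I is small, the empirical risk computed on D_{train} can be an unreliable estimate of the expected risk under p(x, y), which is the core issue discussed in Section 2.3.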

OVERVIEW

In this section, we first provide a formal definition of the FSL problem in Section 2.1 with concrete examples. To differentiate the FSL problem from relevant machine learning problems, we discuss their relatedness and differences in Section 2.2. In Section 2.3, we discuss the core issue that makes FSL difficult. Section 2.4 then presents a unified taxonomy according to how existing works handle the core issue.

Problem Definition

As FSL is a sub-area of machine learning, before giving the definition of FSL, let us recall how machine learning is defined in the literature.

Definition 2.1 (Machine Learning [92, 94]). A computer program is said to learn from experience E with respect to some classes of tasks T and performance measure P if its performance can improve with E on T as measured by P.

For example, consider an image classification task (T): a machine learning program can improve its classification accuracy (P) through E obtained by training on a large number of labeled images (e.g., the ImageNet data set). Another example is the recent computer program AlphaGo, which has defeated the human champion in playing the ancient game of Go (T). It improves its winning rate (P) against opponents by training on a database (E) of around 30 million recorded moves of human experts, as well as by playing against itself repeatedly. These are summarized in Table 1.

Table 1. Examples of machine learning problems based on Definition 2.1

Task T | Experience E | Performance P
Image classification | Large-scale labeled images for each class | Classification accuracy
The game of Go | A database of around 30 million recorded moves of human experts and of self-play | Winning rate
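As a small, hedged illustration of Definition 2.1 (not from the original survey): the sketch below trains a toy nearest-centroid classifier on a binary classification task (T) and shows how its test accuracy (P) tends to improve as the amount of labeled experience (E) grows; the synthetic Gaussian data and helper names are assumptions made only to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n_per_class):
    """Two Gaussian classes: a toy stand-in for a classification task T."""
    x0 = rng.normal(loc=-1.0, size=(n_per_class, 2))
    x1 = rng.normal(loc=+1.0, size=(n_per_class, 2))
    X = np.vstack([x0, x1])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y

X_test, y_test = make_data(500)

def accuracy(X_train, y_train):
    """Train a nearest-centroid classifier on the experience E and report the performance P (accuracy)."""
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])
    dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=-1)
    return float((dists.argmin(axis=1) == y_test).mean())

# Performance P tends to improve as the experience E (number of labeled examples) grows.
for n in (1, 5, 50, 500):
    X_train, y_train = make_data(n)
    print(f"E = {2 * n:4d} labeled examples -> accuracy P = {accuracy(X_train, y_train):.3f}")
```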

Typical machine learning applications, as in the examples mentioned above, require a lot of examples with supervised information. However, as mentioned in the introduction, this may be difficult or even impossible. FSL is a special case of machine learning, which targets obtaining good learning performance given the limited supervised information provided in the training set D_{train}, which consists of examples of inputs x_i along with their corresponding outputs y_i. Formally, we define FSL in Definition 2.2.

Definition 2.2. Few-Shot Learning (FSL) is a type of machine learning problem (specified by E, T and P), where E contains only a limited number of examples with supervised information for the target T.
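To make "a limited number of examples with supervised information" concrete, below is a hedged sketch (not from the paper) that samples a small D_{train} in the commonly used N-way K-shot form, where each of N classes contributes only K labeled examples; the function name and the (input, label) data format are assumptions for illustration.

```python
import random
from collections import defaultdict

def sample_few_shot_train_set(labeled_data, n_way=5, k_shot=1, seed=0):
    """Build a few-shot D_train: N classes with K supervised examples each, so I = N * K is small.

    labeled_data: iterable of (x, y) pairs; this format is an assumption made for the sketch.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in labeled_data:
        by_class[y].append(x)

    classes = rng.sample(sorted(by_class), n_way)          # pick N target classes
    d_train = [(x, y) for y in classes
               for x in rng.sample(by_class[y], k_shot)]   # K supervised examples per class
    return d_train

# Example: 5-way 1-shot -> D_train holds only I = 5 supervised examples in total.
toy_data = [(f"image_{i}", i % 10) for i in range(1000)]   # hypothetical (input, label) pairs
print(len(sample_few_shot_train_set(toy_data, n_way=5, k_shot=1)))   # 5
```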
