Taskonomy: Disentangling Task Transfer Learning
Amir R. Zamir, Alexander Sax, William Shen, Leonidas Guibas, Jitendra Malik, Silvio Savarese (instructor of Computer Vision: from 3D reconstruction to recognition (CS 231A))
Abstract
Visual tasks are related: for example, having surface normals simplifies estimating the depth of an image ⇒ a structure exists among visual tasks.
Problem: how do we discover this structure?
Solution: a computational taxonomic map for task transfer learning, i.e. a fully computational approach for modeling the structure of the space of visual tasks, finding (first- and higher-order) transfer-learning dependencies across a dictionary of twenty-six 2D, 2.5D, 3D, and semantic tasks in a latent space.
Use: nontrivial relationships emerge, which can be exploited to reduce the demand for labeled data.
Introduction
Some tasks, such as object detection, depth estimation, and edge detection, have obvious relationships with one another: surface normals are related to depth estimation, and (as shown in the figure) 3D edges help point matching. Other task pairs have no obvious connection. Existing computer vision has gone ever further down the road of ignoring these cross-task relationships; clearly, if we could exploit them, we could reduce the amount of data required for learning.
Difficulty: this task-space structure and its effects are still largely unknown.
Solution: a framework for mapping the space of visual tasks, using neural networks as the adopted computational function class; each layer successively forms more abstract representations of the input, containing the information needed for mapping the input to the output.
Computes an affinity matrix among tasks based on whether the solution for one task can be sufficiently easily read out of the representation trained for another task.
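The affinity idea above can be sketched in a few lines. This is a hypothetical toy, not the paper's actual pipeline: the task names, loss values, and the best-source normalization are illustrative stand-ins (the paper uses an AHP-based ordinal normalization over real transfer networks).

```python
# Toy sketch of a task-affinity matrix built from transfer test losses.
# transfer_loss[s][t]: test loss of a small readout network mapping the
# frozen representation of source task s to target task t (lower is better).
tasks = ["depth", "normals", "edges"]

transfer_loss = {
    "depth":   {"depth": 0.10, "normals": 0.30, "edges": 0.50},
    "normals": {"depth": 0.20, "normals": 0.08, "edges": 0.45},
    "edges":   {"depth": 0.60, "normals": 0.55, "edges": 0.12},
}

def affinity_matrix(tasks, transfer_loss):
    """Affinity of s->t: loss of the best source for t divided by the loss
    of s, so 1.0 marks the best source for each target and lower is worse."""
    A = {}
    for t in tasks:
        best = min(transfer_loss[s][t] for s in tasks)
        for s in tasks:
            A[(s, t)] = best / transfer_loss[s][t]
    return A

A = affinity_matrix(tasks, transfer_loss)
print(A[("normals", "depth")])  # normals is half as good a source for depth
```

With per-target normalization, affinities are comparable across targets even though the raw losses of different tasks live on different scales.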
Binary Integer Programming (BIP) formulation: extracts a globally efficient transfer policy from these pairwise affinities.
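As a stand-in for the BIP step, the selection problem can be illustrated by brute force: choose at most γ source tasks so that, when every target reads from its best chosen source, total affinity is maximized. This exhaustive search is only feasible for toy dictionaries; the paper solves the real instance as a BIP.

```python
from itertools import combinations

def best_transfer_policy(sources, targets, affinity, gamma):
    """Exhaustively search source subsets of size <= gamma; each target
    reads from the subset's best source. Returns (chosen sources, plan)."""
    best_score, best_plan = float("-inf"), None
    for k in range(1, gamma + 1):
        for subset in combinations(sources, k):
            plan = {t: max(subset, key=lambda s: affinity[(s, t)])
                    for t in targets}
            score = sum(affinity[(plan[t], t)] for t in targets)
            if score > best_score:
                best_score, best_plan = score, (set(subset), plan)
    return best_plan

# Illustrative affinities (made-up numbers, higher is better).
affinity = {("depth", "depth"): 1.0, ("depth", "normals"): 0.6,
            ("normals", "depth"): 0.5, ("normals", "normals"): 1.0}
chosen, plan = best_transfer_policy(["depth", "normals"],
                                    ["depth", "normals"], affinity, gamma=1)
print(chosen, plan)  # with budget 1, training only "depth" serves both targets
```

The point of the global (BIP) view is visible even here: the best single source is not necessarily the best per-target source for every target, but the one with the highest collective payoff.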
Related Work
Self-supervised learning: leverages the inherent relationships between tasks to learn a desired expensive one (e.g., object detection) via a cheap surrogate; these methods use a manually-specified local part of the structure of the task space.
Meta-learning: performing the learning at a level higher than where conventional learning occurs
Domain adaptation: renders a function developed on one domain applicable to another.
Method
Maximize the collective performance on a set of target tasks T = {t1, ..., tn}, subject to the constraint that we have a limited supervision budget γ: the maximum allowable number of source tasks S that we are willing to train from scratch.
The task dictionary is T ∪ S: T − S are the tasks that we want solved but cannot train ("target-only"), T ∩ S are the tasks that we want solved and that could also serve as sources, and S − T are the "source-only" tasks which we may not directly care to solve (e.g., jigsaw puzzle) but which can optionally be used if they increase the performance on T.
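The set notation above maps directly onto set operations; a toy illustration (task names are made up):

```python
# Target tasks T (what we want solved) vs. source tasks S (what fits the budget).
T = {"depth", "normals", "segmentation"}
S = {"depth", "normals", "jigsaw"}

target_only = T - S   # want solved, but cannot train from scratch
both = T & S          # want solved, and can also serve as sources
source_only = S - T   # trained only to help transfers (e.g., jigsaw puzzle)

print(target_only, both, source_only)
```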