Paper Information
Title: Meta-Learning with Implicit Gradients
Authors: Aravind Rajeswaran, Chelsea Finn, Sham M. Kakade, Sergey Levine
Venue: NeurIPS
Year: 2019
Background
Content
Motivation:
- Recent work has studied how meta-learning algorithms can acquire such a capability by learning to efficiently learn a range of tasks, thereby enabling learning of a new task with as little as a single example. We focus on this class of optimization-based methods, and in particular the model-agnostic meta-learning (MAML) formulation. MAML has been shown to be as expressive as black-box approaches, is applicable to a broad range of settings, and recovers a convergent and consistent optimization procedure.
- Despite its appealing properties, meta-learning an initialization requires backpropagation through the inner optimization process. As a result, the meta-learning process requires higher-order derivatives, imposes a non-trivial computational and memory burden, and can suffer from vanishing gradients (a minimal sketch of this issue follows the list).
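To make the higher-order-derivative and memory issue concrete, here is a minimal sketch of the standard MAML meta-gradient on a toy quadratic task, assuming PyTorch; `inner_loss`, `outer_loss`, and the step counts are illustrative placeholders, not the paper's setup.

```python
import torch

theta = torch.randn(5, requires_grad=True)  # meta-learned initialization

def inner_loss(phi):  # hypothetical task-specific training loss
    return ((phi - 1.0) ** 2).sum()

def outer_loss(phi):  # hypothetical task-specific test loss
    return ((phi - 2.0) ** 2).sum()

# Inner loop: create_graph=True keeps every step in the autograd graph,
# so memory grows linearly with the number of inner steps.
phi, lr = theta, 0.1
for _ in range(5):
    g = torch.autograd.grad(inner_loss(phi), phi, create_graph=True)[0]
    phi = phi - lr * g

# Meta-gradient: backpropagation through the entire inner optimization
# path, which involves second-order derivatives of inner_loss.
meta_grad = torch.autograd.grad(outer_loss(phi), theta)[0]
```

With many inner steps, both the stored graph and the chain of second-order terms grow with the path length, which is exactly the burden the paper sets out to remove.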
Novelty:
- These limitations make it harder to scale optimization-based meta-learning methods to tasks involving medium or large datasets, or those that require many inner-loop optimization steps. Our goal is to develop an algorithm that addresses these limitations.
- Our approach is agnostic to the choice of inner-loop optimizer and can gracefully handle many gradient steps without vanishing gradients or memory constraints.

Our approach of decoupling the meta-gradient computation from the choice of inner-level optimizer has a number of appealing properties:
- First, the inner optimization path need not be stored nor differentiated through, thereby making implicit MAML memory efficient and scalable to a large number of inner optimization steps.
- Second, implicit MAML is agnostic to the inner optimization method used, as long as it can find an approximate solution to the inner-level optimization problem (the formulation is sketched after this list). This permits the use of higher-order methods, and in principle even non-differentiable optimization methods or components like sample-based optimization, line search, or those provided by proprietary software.
- Finally, we also provide the first (to our knowledge) non-asymptotic theoretical analysis of bi-level optimization.
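For reference, the bi-level problem the paper studies takes the following form, writing $\hat{\mathcal{L}}_i$ for the training loss of task $i$, $\mathcal{L}_i$ for its test loss, and $\lambda$ for the strength of the proximal regularizer:

$$
\theta^{\star} \in \arg\min_{\theta} \; \frac{1}{M}\sum_{i=1}^{M} \mathcal{L}_i\!\big(\phi_i^{\star}(\theta)\big),
\qquad
\phi_i^{\star}(\theta) \in \arg\min_{\phi} \; \hat{\mathcal{L}}_i(\phi) + \frac{\lambda}{2}\,\lVert \phi - \theta \rVert^{2}.
$$

Any procedure that returns an approximate minimizer $\phi_i^{\star}$ can serve as the inner loop, which is what makes the method optimizer-agnostic.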
Method
- Our algorithm aims to learn a set of parameters such that an optimization algorithm that is initialized at, and regularized to, this parameter vector leads to good generalization for a variety of learning tasks. By leveraging the implicit differentiation approach, we derive an analytical expression for the meta (or outer-level) gradient that depends only on the solution to the inner optimization and not on the path taken by the inner optimization algorithm, as depicted in Figure 1.
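Concretely, implicit differentiation of the regularized inner problem at its solution $\phi_i^{\star}$ gives, as in the paper:

$$
\frac{d\phi_i^{\star}}{d\theta} = \Big( I + \tfrac{1}{\lambda}\,\nabla^{2}_{\phi}\hat{\mathcal{L}}_i(\phi_i^{\star}) \Big)^{-1},
$$

so the meta-gradient is obtained by solving a single linear system at $\phi_i^{\star}$; no intermediate iterates are stored or differentiated through. Below is a minimal NumPy sketch of this computation on a toy quadratic task where the inner solution and Hessian are available in closed form; `lam`, the losses, and the dimensions are illustrative assumptions, and the linear solve uses conjugate gradient so that in general only Hessian-vector products would be required.

```python
import numpy as np

lam = 2.0              # strength of the proximal regularizer
theta = np.zeros(5)    # meta-parameters

# Toy inner problem: min_phi 0.5*||phi - a||^2 + (lam/2)*||phi - theta||^2
a = np.ones(5)
phi_star = (a + lam * theta) / (1.0 + lam)  # closed-form inner solution

# Toy outer loss L(phi) = 0.5*||phi - b||^2; its gradient at phi_star.
b = 2.0 * np.ones(5)
outer_grad = phi_star - b

# Implicit meta-gradient: solve (I + H/lam) x = outer_grad, where H is
# the Hessian of the inner training loss at phi_star (here H = I).
H = np.eye(5)
A = np.eye(5) + H / lam

def conjugate_gradient(A, g, iters=20, tol=1e-10):
    """Solve A x = g for symmetric positive-definite A."""
    x = np.zeros_like(g)
    r = g - A @ x
    p = r.copy()
    for _ in range(iters):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        if np.linalg.norm(r_new) < tol:
            break
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return x

meta_grad = conjugate_gradient(A, outer_grad)  # d(outer loss)/d(theta)
```

As a sanity check on the toy problem: $\phi^{\star} = (a + \lambda\theta)/(1+\lambda)$ gives $d\phi^{\star}/d\theta = \frac{\lambda}{1+\lambda} I$, which matches $(I + H/\lambda)^{-1}$ with $H = I$, so the implicit gradient agrees with direct differentiation here while never touching the optimization path.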