Paper reading: Deep Meta Learning for Real-Time Visual Tracking based on Target-Specific Feature Space...

a1424262219

Tags: artificial intelligence

Original link: http://www.cnblogs.com/wangxiaocvpr/p/8193880.html

Deep Meta Learning for Real-Time Visual Tracking based on Target-Specific Feature Space

2018-01-04 15:58:15

Paper: https://arxiv.org/pdf/1712.09153.pdf

Updated version: Deep Meta Learning for Real-Time Target-Aware Visual Tracking

Foreword: why read this paper? It appears to be the first to apply meta-learning to visual tracking, and it achieves a good balance between speed and accuracy.

Introduction:

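As we know, two things matter most in tracking: learning the features of the target object, and handling changes in the object's appearance. Many algorithms have kept improving on these two points. Neural networks have recently provided a good solution for feature representation, but appearance changes are still not handled well: most methods build a set of target-object samples from the tracking results and update the model from time to time. With this strategy, however, the classifier inevitably tends to overfit and loses its generalization capability because of the insufficient number of training samples.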

　　Based on this background and motivation, the paper proposes an end-to-end visual tracking network structure that consists of two main parts:

　　one is a Siamese matching network for target search,

　　and the other is a meta-learning network for an adaptive feature space.

　　The focus here is the meta-learning network, which the authors propose as a parameter prediction network; it borrows from recent work on meta-learning for the few-shot learning problem.

　　The proposed meta-learner network is trained to provide the matching network with additional convolutional kernels so that the feature space of the matching network can be modified adaptively to adopt new appearance templates obtained in the course of tracking. The meta-learner network only sees the gradients from the last layer of the matching network, given new training samples for the appearance.

　　We also employ a novel training scheme for the meta-learner network to maintain the generalization capability of the feature space by preventing the meta-learner network from generating new parameters that cause overfitting of the matching network. By incorporating our meta-learner network, the target-specific feature space can be constructed instantly with a single forward pass, without any iterative computation and optimization, and free from the innate overfitting. Fig.1 illustrates the motivation of the proposed visual tracking algorithm.
　　

　　Tracking with Meta-Learner:

　　1. Overview of Proposed Method

　　1.1. Components

　　The network structure in this paper consists of two parts: the matching network and the meta-learning network.

　　Siamese Matching Network: this network computes the response map between two image patches.

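　　The feature-extraction CNN in this part is a fully convolutional network, and the loss function measures the difference between the predicted response map and the ground-truth response map.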
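To make the matching step concrete, here is a minimal NumPy sketch of how a response map can be computed by sliding target features over search-region features. It is only an illustration under stated assumptions: the feature maps are random arrays standing in for the CNN output, the function names `cross_correlate` and `response_loss` are hypothetical, and the mean squared error is an assumed choice, since the text above only says that the loss measures the difference between the two maps.

```python
import numpy as np

def cross_correlate(search_feat, target_feat):
    """Slide target_feat (C, h, w) over search_feat (C, H, W).

    Returns a response map of shape (H - h + 1, W - w + 1).
    """
    _, H, W = search_feat.shape
    _, h, w = target_feat.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            window = search_feat[:, i:i + h, j:j + w]
            out[i, j] = np.sum(window * target_feat)  # inner product over all channels
    return out

def response_loss(pred, gt):
    """Assumed loss: mean squared difference between predicted and ground-truth maps."""
    return float(np.mean((pred - gt) ** 2))

# Stand-in features (assumption): random arrays instead of real CNN outputs.
rng = np.random.default_rng(0)
search_feat = rng.standard_normal((8, 12, 12))
target_feat = search_feat[:, 3:7, 5:9].copy()  # target cropped at row 3, column 5

resp = cross_correlate(search_feat, target_feat)
peak = tuple(int(v) for v in np.unravel_index(np.argmax(resp), resp.shape))

# Ground-truth map with a single positive location where the target was cropped.
gt = np.zeros_like(resp)
gt[3, 5] = 1.0
loss = response_loss(resp / resp.max(), gt)
print("peak location:", peak, "loss:", loss)
```

Because the target features were cropped from the search features, the strongest response should land at the crop location, which is the behaviour the tracker relies on when it searches for the target in a new frame.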

　　Meta-learning Network: this network provides the matching network with target-specific weights, given an image patch of the target together with context patches z = {z1, ..., zM}.
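The idea of producing target-specific weights in a single forward pass can be sketched as follows. This is a heavily simplified illustration, not the authors' implementation: features are plain vectors, the last layer of the matching network is modelled as a 1x1 kernel matrix `W`, the loss is an assumed squared error on a scalar matching score, and the meta-learner is a hypothetical untrained linear map `P` (in the paper it is a trained network). The only parts taken from the text are that the meta-learner sees the last-layer gradients and returns additional kernels that are appended to the matching network.

```python
import numpy as np

def response(W, x, z):
    """Matching score of feature vectors x and z after the last layer W of shape (K, C)."""
    return float((W @ x) @ (W @ z))

def last_layer_gradient(W, z, contexts, labels):
    """Gradient of 0.5 * mean((response - label)^2) with respect to W."""
    grad = np.zeros_like(W)
    for x, y in zip(contexts, labels):
        err = response(W, x, z) - y
        # d/dW of x^T W^T W z is W (x z^T + z x^T)
        grad += err * (W @ (np.outer(x, z) + np.outer(z, x)))
    return grad / len(contexts)

def meta_learner(grad, P, k_new):
    """Hypothetical meta-learner: one linear map from the gradient to k_new extra kernels."""
    return (P @ grad.ravel()).reshape(k_new, grad.shape[1])

rng = np.random.default_rng(0)
C, K, K_NEW, M = 6, 4, 2, 5                        # channels, kernels, new kernels, context patches
W = rng.standard_normal((K, C)) * 0.1              # last-layer kernels of the matching network
z = rng.standard_normal(C)                         # target feature (stand-in)
contexts = rng.standard_normal((M, C))             # context patch features (stand-in)
labels = rng.integers(0, 2, size=M).astype(float)  # 1 = target, 0 = background (assumed labels)
P = rng.standard_normal((K_NEW * C, K * C)) * 0.1  # untrained stand-in for the meta-learner weights

grad = last_layer_gradient(W, z, contexts, labels)
W_target = meta_learner(grad, P, K_NEW)            # single forward pass, no iterative optimization
W_adapt = np.concatenate([W, W_target], axis=0)    # original kernels plus target-specific ones
print("adapted kernel shape:", W_adapt.shape)
```

Appending the new kernels leaves the original ones untouched, so the adapted matching score is the original score plus a target-specific term contributed by `W_target`.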