Paper Notes | FaceNet: A Unified Embedding for Face Recognition and Clustering

bea_tree

Article link: https://blog.csdn.net/bea_tree/article/details/52040609

Authors

Florian Schroff, Dmitry Kalenichenko, James Philbin

Abstract

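The paper presents the FaceNet system, which learns a mapping directly from face images to a compact Euclidean space in which distances correspond to face similarity. Once this embedding is available, face recognition, verification, and clustering become straightforward. The method uses a deep convolutional network to produce the embedding directly, instead of the intermediate bottleneck layer used by earlier approaches, and it is trained with an online triplet mining method. The representation is also more efficient: 128 bytes per face.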

1 Introduction

verification: is this the same person? (thresholding)
recognition: who is this person? (k-NN classification)
clustering: find common people among these faces (k-means or agglomerative clustering)
The paper uses the squared L2 distance in the embedding space directly as the measure of face similarity.
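To make the three tasks concrete, here is a minimal sketch of verification and recognition on top of precomputed embeddings. The 2-D vectors, the names `alice` and `bob`, and the threshold of 0.5 are made-up placeholders (real FaceNet embeddings are 128-D, and a threshold would be tuned on validation data); only the use of the squared L2 distance comes from the paper.

```python
def squared_l2(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def same_person(u, v, threshold):
    """Verification: accept the pair if the squared distance is below the threshold."""
    return squared_l2(u, v) < threshold


def nearest_identity(query, gallery):
    """Recognition as 1-NN: return the label of the closest gallery embedding."""
    label, _ = min(gallery, key=lambda item: squared_l2(query, item[1]))
    return label


# Toy 2-D embeddings standing in for 128-D outputs (hypothetical values).
gallery = [("alice", [1.0, 0.0]), ("bob", [0.0, 1.0])]
query = [0.9, 0.1]

print(same_person(query, gallery[0][1], threshold=0.5))
print(nearest_identity(query, gallery))
```

Clustering works on the same distances: any standard algorithm such as k-means or agglomerative clustering can be run on the embedding vectors without further face-specific processing.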
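Earlier approaches take a bottleneck layer as the representation, which is indirect and produces a very large representation. This paper instead uses a triplet-based loss function (see the reference below) and needs only a 128-D embedding. Each triplet consists of two matching face thumbnails and one non-matching face thumbnail.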

K. Q. Weinberger, J. Blitzer, and L. K. Saul. Distance metric learning for large margin nearest neighbor classification. In NIPS. MIT Press, 2006.

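The choice of triplets matters a great deal. Inspired by curriculum learning, the paper uses an online negative exemplar mining strategy, which ensures that the triplets keep getting harder as training proceeds.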

Y. Bengio, J. Louradour, R. Collobert, and J. Weston. Curriculum learning. In Proc. of ICML, New York, NY, USA, 2009.


3 Method

3.1 Triplet Loss

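After L2 normalization the embedding lies on a unit hypersphere, $\|f(x)\|_2 = 1$. For every triplet made of an anchor $x_i^a$, a positive $x_i^p$ (same identity), and a negative $x_i^n$ (different identity), the paper requires:
$$\|f(x_i^a) - f(x_i^p)\|_2^2 + \alpha < \|f(x_i^a) - f(x_i^n)\|_2^2$$
where $\alpha$ is the margin enforced between positive and negative pairs. The loss is:
$$L = \sum_{i=1}^{N} \left[ \|f(x_i^a) - f(x_i^p)\|_2^2 - \|f(x_i^a) - f(x_i^n)\|_2^2 + \alpha \right]_+$$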
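The triplet loss can be sketched in a few lines of plain Python: each triplet contributes a hinge term on the anchor-positive squared distance minus the anchor-negative squared distance plus a margin. This is only an illustration under stated assumptions: the function names are hypothetical, the 2-D vectors are toy values, and the margin of 0.2 is used here as an illustrative setting.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so it lies on the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_l2(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(triplets, alpha):
    """Sum of hinge terms over (anchor, positive, negative) embedding triplets."""
    total = 0.0
    for anchor, positive, negative in triplets:
        d_ap = squared_l2(anchor, positive)
        d_an = squared_l2(anchor, negative)
        # A triplet that already satisfies d_ap + alpha < d_an contributes zero.
        total += max(d_ap - d_an + alpha, 0.0)
    return total


# Toy embeddings (hypothetical values), normalized to unit length.
a = l2_normalize([1.0, 0.0])
p = l2_normalize([1.0, 0.1])
n_easy = l2_normalize([-1.0, 0.0])  # far from the anchor
n_hard = l2_normalize([1.0, 0.3])   # close to the anchor

print(triplet_loss([(a, p, n_easy)], alpha=0.2))
print(triplet_loss([(a, p, n_hard)], alpha=0.2))
```

The easy negative sits on the opposite side of the hypersphere, so its triplet already satisfies the margin and adds nothing to the loss; the hard negative violates the margin and produces a positive term. This is why triplet selection, discussed next, matters so much.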

3.2 Triplet selection

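If $x_i^p$ and $x_i^n$ are picked at random, many triplets satisfy the constraint above trivially and contribute little to convergence. What we want are the negative closest to the anchor (hard negative) and the positive farthest from it (hard positive). How should they be selected? Searching the whole training set is not an option: the number of candidates is enormous, and a few individual points can easily dominate training. The paper proposes two approaches:
1. Offline selection: every n steps, use the most recent network to pick the required samples from a subset of the data;
2. Online selection: pick them from within the mini-batch.
The paper adopts the second approach. For positives, it uses all anchor-positive pairs in the mini-batch (this is more stable and converges slightly faster at the start of training). Negatives are chosen to satisfy
$$\|f(x_i^a) - f(x_i^p)\|_2^2 < \|f(x_i^a) - f(x_i^n)\|_2^2$$
The paper calls these semi-hard negatives: they are farther from the anchor than the positive, yet close enough that the triplet is still hard.
The batch size is around 1,800.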
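The online selection described above can be sketched as follows. The sketch takes every anchor-positive pair in a mini-batch and looks for a negative that is farther from the anchor than the positive. Two details are assumptions of this sketch rather than something these notes specify: among the qualifying negatives it picks the closest one, and it skips pairs that have no qualifying negative. The function name `mine_triplets` and the 1-D toy embeddings are hypothetical.

```python
def squared_l2(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mine_triplets(embeddings, labels):
    """Return (anchor, positive, negative) index triplets for one mini-batch.

    Uses every anchor-positive pair. For each pair, picks the closest negative
    that is still farther from the anchor than the positive (semi-hard).
    Pairs with no such negative are skipped (an assumption of this sketch).
    """
    n = len(embeddings)
    triplets = []
    for a in range(n):
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue
            d_ap = squared_l2(embeddings[a], embeddings[p])
            # Distances from the anchor to every sample of a different identity.
            candidates = [
                (squared_l2(embeddings[a], embeddings[k]), k)
                for k in range(n)
                if labels[k] != labels[a]
            ]
            semi_hard = [(d, k) for d, k in candidates if d > d_ap]
            if semi_hard:
                _, neg = min(semi_hard)
                triplets.append((a, p, neg))
    return triplets


# Toy mini-batch: 1-D embeddings, two identities (hypothetical values).
embeddings = [[0.0], [1.0], [0.5], [3.0]]
labels = ["x", "x", "y", "y"]
print(mine_triplets(embeddings, labels))
```

In the toy batch, the sample at 0.5 is closer to anchor 0 than its positive is, so it is not semi-hard and the sample at 3.0 is selected instead. A real implementation would compute the pairwise distance matrix once per batch on the GPU rather than looping in Python.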