CurricularFace[2020-CVPR]

ReLuJie

Article link: https://blog.csdn.net/On_theway10/article/details/105295545

Prior works

MV-Softmax[2020-AAAI] [strongly recommended]

Motivation

The problem with MV-Softmax: it emphasizes semi-hard/hard samples from the very start of training, which may cause the model to have trouble converging!

Insight: easy samples first, hard samples later!

Code

CurricularFace [PyTorch version]

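import math

import torch
import torch.nn as nn
from torch.nn import Parameter


# Helper used below but not shown in the original post; assumed here to be
# plain L2 normalization along the given axis.
def l2_norm(input, axis=1):
    norm = torch.norm(input, 2, axis, True)
    return torch.div(input, norm)


class CurricularFace(nn.Module):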
    def __init__(self, in_features, out_features, m = 0.5, s = 64.):
        super(CurricularFace, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.m = m
        self.s = s
        self.cos_m = math.cos(m)
        self.sin_m = math.sin(m)
        self.threshold = math.cos(math.pi - m)
        self.mm = math.sin(math.pi - m) * m
        self.kernel = Parameter(torch.Tensor(in_features, out_features))
        self.register_buffer('t', torch.zeros(1))
        nn.init.normal_(self.kernel, std=0.01)

    def forward(self, embbedings, label):
        embbedings = l2_norm(embbedings, axis = 1)
        kernel_norm = l2_norm(self.kernel, axis = 0)
        cos_theta = torch.mm(embbedings, kernel_norm)
        cos_theta = cos_theta.clamp(-1, 1)  # for numerical stability
        with torch.no_grad():
            origin_cos = cos_theta.clone()
        target_logit = cos_theta[torch.arange(0, embbedings.size(0)), label].view(-1, 1)

        sin_theta = torch.sqrt(1.0 - torch.pow(target_logit, 2))
        cos_theta_m = target_logit * self.cos_m - sin_theta * self.sin_m #cos(target+margin)
        mask = cos_theta > cos_theta_m
        final_target_logit = torch.where(target_logit > self.threshold, cos_theta_m, target_logit - self.mm)

        hard_example = cos_theta[mask]
        with torch.no_grad():
            self.t = target_logit.mean() * 0.01 + (1 - 0.01) * self.t
        cos_theta[mask] = hard_example * (self.t + hard_example)
        cos_theta.scatter_(1, label.view(-1, 1).long(), final_target_logit)
        output = cos_theta * self.s
        return output, origin_cos * self.s

Details

Curricular Loss

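Here T(cos(θ_y)) = cos(θ_y + m) is the margin-adjusted positive logit, I(t, cos(θ_j)) is the sample weighting function, and N(t, cos(θ_j)) is defined as follows: N(t, cos(θ_j)) = cos(θ_j) if cos(θ_j) ≤ T(cos(θ_y)) (easy negative), and cos(θ_j) · (t + cos(θ_j)) otherwise (hard negative), matching the `mask` logic in the code above.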
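
The two functions can be sketched on plain scalars to make the case split easier to see. This is a minimal illustration read off the `forward` code above, not the author's implementation: the names `positive_logit` and `negative_logit` are hypothetical, and the fallback branch for θ_y + m > π (`target_logit - self.mm` in the class) is deliberately left out.

```python
import math


def positive_logit(cos_y: float, m: float = 0.5) -> float:
    """T(cos(θ_y)) = cos(θ_y + m), expanded with the angle-addition formula."""
    sin_y = math.sqrt(max(0.0, 1.0 - cos_y * cos_y))
    return cos_y * math.cos(m) - sin_y * math.sin(m)


def negative_logit(cos_j: float, target: float, t: float) -> float:
    """N(t, cos(θ_j)) for one negative class.

    `target` is T(cos(θ_y)) for the same sample. A negative is treated as
    hard when its cosine exceeds `target`, as in `mask = cos_theta > cos_theta_m`.
    """
    if cos_j <= target:
        return cos_j                # easy negative: left unchanged
    return cos_j * (t + cos_j)      # hard negative: re-weighted by (t + cos_j)


# Hypothetical sample: positive cosine 0.9, one easy and one hard negative.
target = positive_logit(0.9)
easy = negative_logit(0.1, target, t=0.3)
hard = negative_logit(0.7, target, t=0.3)
```

With m = 0.5 the margin pulls the positive logit well below the raw cosine, so a negative with cosine 0.7 already counts as hard for this sample even though it is smaller than the raw positive cosine of 0.9.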

Training Curve

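x-axis: iterations; y-axis: modulation coefficient applied to hard samples.
t: adaptive parameter; M(MV-Arc-Softmax): the coefficient for MV-Arc-Softmax; M(ours): the gradient modulation coefficient of CurricularFace.
Early in training, t → 0, so I(t, cos(θ_j)) does not exceed 1 (it is 1 for easy negatives and t + cos(θ_j) for hard ones, per the code above), and the model can rely on easy samples to speed up convergence. In the middle and late stages t keeps growing until I(t, cos(θ_j)) > 1 for hard negatives, so the model pays more attention to hard samples.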
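
A small numeric sketch of this schedule, assuming the hard-negative weight is t + cos(θ_j) as in the code above. The function name `hard_negative_weight` and the similarity value 0.6 are hypothetical, chosen only for illustration.

```python
def hard_negative_weight(t: float, cos_j: float) -> float:
    """Weight multiplied onto a hard negative's cosine: I = t + cos(θ_j)."""
    return t + cos_j


cos_j = 0.6  # hypothetical cosine similarity of one hard negative

# The same negative is down-weighted early (small t) and up-weighted later.
weights = {t: hard_negative_weight(t, cos_j) for t in (0.0, 0.2, 0.5)}

# The weight exceeds 1 exactly when t > 1 - cos_j.
crossover_t = 1.0 - cos_j
```

For this negative the weight stays below 1 until t passes roughly 0.4, which is the point where the loss starts to emphasize it instead of suppressing it.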

Early, later

Note: in each pair (a, b), a is the ratio of the curricular loss to the ArcFace loss at a given moment during training, and b is max{cos(θ_j), j ≠ y_i}.

Adaptive Estimation of t

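r^(k) is the mean positive cosine similarity in the k-th mini-batch, with r^(0) = 0; t is likewise initialized to 0 (the buffer `t` in the code).
In the form used by the code above, t^(k) = (1 − α) · r^(k) + α · t^(k−1), with α = 0.99. [Something to think about: why does t^(k) show a monotonically increasing trend as k grows?]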
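
One way to see the trend is to run the update on a made-up sequence of r^(k). The sketch below assumes the update form used in the `forward` code (`self.t = target_logit.mean() * 0.01 + (1 - 0.01) * self.t`); the linearly rising, then saturating r^(k) is a hypothetical stand-in for the positive similarity improving as the model trains.

```python
def update_t(t_prev: float, r_k: float, alpha: float = 0.99) -> float:
    """Exponential moving average: t_k = (1 - alpha) * r_k + alpha * t_prev."""
    return (1.0 - alpha) * r_k + alpha * t_prev


t = 0.0          # t starts at zero, like the registered buffer
history = []
for k in range(1, 2001):
    # Hypothetical batch mean of positive cosine similarity: rises, then plateaus.
    r_k = min(0.8, 0.0005 * k)
    t = update_t(t, r_k)
    history.append(t)
```

Each step moves t by (1 − α) · (r^(k) − t^(k−1)). As long as the batch mean r^(k) keeps rising, the lagging average stays below it, so every step is positive and t climbs smoothly toward r. With real, noisy batches the trend is the same but individual steps can dip slightly.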
Different strategies or values of t

Experiment

Benchmark

Challenge

Reference

[1]. CurricularFace: Adaptive Curriculum Learning Loss for Deep Face Recognition[2020-CVPR]
