Proxy-NCA Loss and Proxy Anchor Loss

大坡山小霸王

Category: Metric Learning. Tags: deep learning, computer vision, machine learning

Article link: https://blog.csdn.net/weixin_44742887/article/details/125259305

17-ICCV-No Fuss Distance Metric Learning using Proxies

Proxy Ranking Loss

20-CVPR-Proxy Anchor Loss for Deep Metric Learning

Proxy Anchor Loss

17-ICCV-No Fuss Distance Metric Learning using Proxies

Heaviside step function

The gradient of the ranking loss L is zero wherever it is defined, so a surrogate loss is optimized instead.

margin-based triplet loss

Hinge function (hinge loss): [x]+ = max(x, 0)

Neighborhood Component Analysis (NCA)
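The three losses named above can be compared on a single triplet (anchor x, positive y, negative z). The following is a minimal sketch rather than the paper's code: it assumes squared Euclidean distance for d, takes H(0) = 0 for the Heaviside step, and all function names are made up for illustration.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def ranking_loss(x, y, z):
    """Heaviside step of d(x, y) - d(x, z): 1 if the triplet is violated, else 0."""
    return 1.0 if sq_dist(x, y) - sq_dist(x, z) > 0 else 0.0

def triplet_loss(x, y, z, margin=1.0):
    """Hinge surrogate: [d(x, y) + margin - d(x, z)]+."""
    return max(sq_dist(x, y) + margin - sq_dist(x, z), 0.0)

def nca_loss(x, y, negatives):
    """NCA surrogate: -log( exp(-d(x, y)) / sum_z exp(-d(x, z)) )."""
    denom = sum(math.exp(-sq_dist(x, z)) for z in negatives)
    return sq_dist(x, y) + math.log(denom)

x, y = (0.0, 0.0), (1.0, 0.0)
z_far, z_near = (2.0, 0.0), (1.2, 0.0)
```

With `z_near` the ranking loss is already 0 (the triplet is ordered correctly), yet the hinge loss is still positive because the margin is not met, so the surrogate keeps providing a training signal where the step function is flat.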

Sampling: k classes and |D| = n samples, from which triplets (x, y, z) are drawn.

Proxy Ranking Loss

|P| << |D|: the proxy set P serves as an approximation of all data points.
The proxy of x is the p in P closest to x.

Proxy approximation error: the worst approximation over all data points.
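The nearest-proxy rule and the approximation error can be written in a few lines. This is a minimal sketch that assumes Euclidean distance and reads the error as the largest distance from any data point to its nearest proxy; the function names and the toy data are hypothetical.

```python
import math

def nearest_proxy(x, proxies):
    """Index of the proxy closest to x (Euclidean distance)."""
    return min(range(len(proxies)), key=lambda i: math.dist(x, proxies[i]))

def proxy_approximation_error(data, proxies):
    """Worst-case distance from a data point to its nearest proxy."""
    return max(math.dist(x, proxies[nearest_proxy(x, proxies)]) for x in data)

# Toy data: three points on a line approximated by two proxies.
data = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
proxies = [(0.0, 0.0), (9.0, 0.0)]
assignment = [nearest_proxy(x, proxies) for x in data]
error = proxy_approximation_error(data, proxies)
```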

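Static proxy assignment

Each proxy is associated with a semantic label, and a data point x is assigned its proxy according to its label.

No triplets need to be sampled; only the anchor x is sampled.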

Dynamic proxy assignment

There are no semantic labels; x is assigned its nearest proxy.
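The two assignment schemes differ only in how the positive proxy is chosen; the loss on top is the same. Below is a minimal sketch under stated assumptions: one proxy per class for the static case, squared Euclidean distance, and the NCA form with the positive proxy left out of the denominator. The function names are illustrative and not taken from any library.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def static_assignment(label):
    """Static: proxy i belongs to class i, so the label is the proxy index."""
    return label

def dynamic_assignment(x, proxies):
    """Dynamic: proxies carry no labels, so pick the closest one."""
    return min(range(len(proxies)), key=lambda i: sq_dist(x, proxies[i]))

def proxy_nca_loss(x, pos, proxies):
    """-log( exp(-d(x, p_pos)) / sum over the other proxies p of exp(-d(x, p)) )."""
    neg = sum(math.exp(-sq_dist(x, p))
              for i, p in enumerate(proxies) if i != pos)
    return sq_dist(x, proxies[pos]) + math.log(neg)

proxies = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
x, label = (0.9, 0.1), 0
loss_static = proxy_nca_loss(x, static_assignment(label), proxies)
loss_dynamic = proxy_nca_loss(x, dynamic_assignment(x, proxies), proxies)
```

Only the anchor x and the proxies enter the loss, which is why no triplet sampling over the data set is needed.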

The proxy-based loss is a tight upper bound of the triplet loss; it needs no triplet sampling and converges quickly.

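import torch
import torch.nn.functional as F
from torch.nn import Parameter

# binarize_and_smooth_labels is assumed to be the label-smoothing helper
# from the linked repository; it is not shown in this listing.
class ProxyNCA(torch.nn.Module):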
    def __init__(self, 
        nb_classes,
        sz_embedding,
        smoothing_const = 0.1,
        scaling_x = 1,
        scaling_p = 3
    ):
        torch.nn.Module.__init__(self)
        # initialize proxies s.t. norm of each proxy ~1 through div by 8
        # i.e. proxies.norm(2, dim=1)) should be close to [1,1,...,1]
        # TODO: use norm instead of div 8, because of embedding size
        self.proxies = Parameter(torch.randn(nb_classes, sz_embedding) / 8)
        self.smoothing_const = smoothing_const
        self.scaling_x = scaling_x
        self.scaling_p = scaling_p

    def forward(self, X, T):
        P = F.normalize(self.proxies, p = 2, dim = -1) * self.scaling_p
        X = F.normalize(X, p = 2, dim = -1) * self.scaling_x
        D = torch.cdist(X, P) ** 2
        T = binarize_and_smooth_labels(T, len(P), self.smoothing_const)
        # note that compared to proxy nca, positive included in denominator
        loss = torch.sum(-T * F.log_softmax(-D, -1), -1)
        return loss.mean()
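The forward pass calls binarize_and_smooth_labels, which is not defined in the listing. A plausible reading, assuming standard label smoothing, is that the true class receives 1 - smoothing_const and the remaining mass is spread evenly over the other classes. The pure-Python `smooth_one_hot` below is a hypothetical stand-in that only illustrates the target values; it is not the repository's function.

```python
def smooth_one_hot(label, nb_classes, smoothing_const=0.1):
    """Smoothed one-hot row: 1 - s on the true class, s / (K - 1) elsewhere."""
    off_value = smoothing_const / (nb_classes - 1)
    row = [off_value] * nb_classes
    row[label] = 1.0 - smoothing_const
    return row

# Target row for class 1 out of 3 classes.
row = smooth_one_hot(1, 3, smoothing_const=0.1)
```

Under this reading, each row sums to 1, so `-T * log_softmax(-D)` summed over classes is a cross-entropy between the smoothed target and the softmax over negative squared distances.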

Open question: what are scaling_x and scaling_p for?
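One way to look at this, based only on the listing above: X and P are L2-normalized and then multiplied by scaling_x and scaling_p, so the squared distance depends only on the cosine between them and on the two scales. The sketch below is an interpretation and not a statement from the paper or the repository; `scaled_sq_dist` is a hypothetical helper.

```python
def scaled_sq_dist(cos_sim, scaling_x=1.0, scaling_p=3.0):
    """Squared distance between vectors of norm scaling_x and scaling_p
    whose cosine similarity is cos_sim."""
    return scaling_x ** 2 + scaling_p ** 2 - 2.0 * scaling_x * scaling_p * cos_sim

# Range of the squared distance with the defaults from the listing (1 and 3).
d_min = scaled_sq_dist(1.0)    # vectors pointing the same way
d_max = scaled_sq_dist(-1.0)   # vectors pointing in opposite directions
```

With both scales equal to 1 the squared distance spans [0, 4]; with the defaults it spans a range three times as wide, so the logits `-D` fed to the softmax are more spread out and the scales act much like an inverse temperature.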

Code: https://github.com/dichotomies/proxy-nca

20-CVPR-Proxy Anchor Loss for Deep Metric Learning

N-pair loss and Lifted Structure loss: they do not use all the data in a batch, and they rely on tuple sampling, which brings extra hyperparameters to tune.

Proxy-NCA loss: it does not exploit data-to-data relations; each data point is associated only with proxies.

s(x, p): cosine similarity between x and p

LSE Log-Sum-Exp function

Computing it in a shifted form avoids overflow and underflow.
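The usual way to avoid overflow and underflow is to subtract the maximum before exponentiating, using LSE(v) = m + log(sum(exp(v_i - m))) with m = max(v). A minimal pure-Python sketch, assuming a non-empty list of finite values:

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    # After the shift the largest exponent is exp(0) = 1, so nothing overflows
    # and the sum is at least 1, so the log never sees 0.
    return m + math.log(sum(math.exp(v - m) for v in values))

large = logsumexp([1000.0, 1000.0])    # naive math.exp(1000.0) would overflow
small = logsumexp([-1000.0, -1000.0])  # naive sum would underflow to 0.0
```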

See also: 关于LogSumExp (Zhihu article)

Proxy Anchor Loss

Each proxy serves as an anchor and is associated with all data points in the batch.

P denotes the set of all proxies; P+ denotes the positive proxies.

Embedding vectors are pulled together or pushed apart with a strength that depends on how hard each sample is.
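The hardness-dependent strength can be made concrete by differentiating the per-proxy terms log(1 + sum(exp(-alpha * (s - mrg)))) and log(1 + sum(exp(alpha * (s + mrg)))) with respect to each similarity s. The sketch below computes the magnitude of those derivatives for one proxy, leaving out the 1/|P+| and 1/|P| averaging factors. The function names are hypothetical, and the defaults alpha = 32 and mrg = 0.1 are the ones used in this post's Proxy_Anchor listing.

```python
import math

def positive_pull_weights(sims, alpha=32.0, mrg=0.1):
    """|d loss / d s| for each positive sample of one proxy."""
    h = [math.exp(-alpha * (s - mrg)) for s in sims]
    denom = 1.0 + sum(h)
    return [alpha * hi / denom for hi in h]

def negative_push_weights(sims, alpha=32.0, mrg=0.1):
    """|d loss / d s| for each negative sample of one proxy."""
    h = [math.exp(alpha * (s + mrg)) for s in sims]
    denom = 1.0 + sum(h)
    return [alpha * hi / denom for hi in h]

# An easy positive (0.9) and a hard positive (0.2) of the same proxy.
pull = positive_pull_weights([0.9, 0.2])
# A hard negative (0.8) and an easy negative (-0.5) of the same proxy.
push = negative_push_weights([0.8, -0.5])
```

The hard positive and the hard negative receive by far the larger weight. Because the denominator sums over every sample in the batch, each weight also depends on the other samples attached to that proxy.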

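import torch
import torch.nn as nn
import torch.nn.functional as F

# l2_norm and binarize are assumed to be the helper functions from the
# linked repository; they are not shown in this listing.
class Proxy_Anchor(torch.nn.Module):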
    def __init__(self, nb_classes, sz_embed, mrg = 0.1, alpha = 32):
        torch.nn.Module.__init__(self)
        # Proxy Anchor Initialization
        self.proxies = torch.nn.Parameter(torch.randn(nb_classes, sz_embed).cuda())
        nn.init.kaiming_normal_(self.proxies, mode='fan_out')

        self.nb_classes = nb_classes
        self.sz_embed = sz_embed
        self.mrg = mrg
        self.alpha = alpha
        
    def forward(self, X, T):
        P = self.proxies

        cos = F.linear(l2_norm(X), l2_norm(P))  # Calculate cosine similarity
        P_one_hot = binarize(T = T, nb_classes = self.nb_classes)
        N_one_hot = 1 - P_one_hot
    
        pos_exp = torch.exp(-self.alpha * (cos - self.mrg))
        neg_exp = torch.exp(self.alpha * (cos + self.mrg))

        with_pos_proxies = torch.nonzero(P_one_hot.sum(dim = 0) != 0).squeeze(dim = 1)   # The set of positive proxies of data in the batch
        num_valid_proxies = len(with_pos_proxies)   # The number of positive proxies
        
        P_sim_sum = torch.where(P_one_hot == 1, pos_exp, torch.zeros_like(pos_exp)).sum(dim=0) 
        N_sim_sum = torch.where(N_one_hot == 1, neg_exp, torch.zeros_like(neg_exp)).sum(dim=0)
        
        pos_term = torch.log(1 + P_sim_sum).sum() / num_valid_proxies
        neg_term = torch.log(1 + N_sim_sum).sum() / self.nb_classes
        loss = pos_term + neg_term     
        
        return loss
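For checking small cases by hand, the same computation can be written without tensors. This is a sketch that mirrors the forward pass above under the assumption that `cos[i][c]` already holds the cosine similarity between sample i and proxy c; `proxy_anchor_loss` is an illustrative name and not part of the repository.

```python
import math

def proxy_anchor_loss(cos, labels, nb_classes, mrg=0.1, alpha=32.0):
    """Proxy Anchor loss from a batch-by-class cosine similarity table."""
    pos_sum = [0.0] * nb_classes
    neg_sum = [0.0] * nb_classes
    for row, label in zip(cos, labels):
        for c in range(nb_classes):
            if c == label:
                pos_sum[c] += math.exp(-alpha * (row[c] - mrg))
            else:
                neg_sum[c] += math.exp(alpha * (row[c] + mrg))
    # Proxies with at least one positive sample in the batch.
    num_valid_proxies = len(set(labels))
    pos_term = sum(math.log(1.0 + s) for s in pos_sum) / num_valid_proxies
    neg_term = sum(math.log(1.0 + s) for s in neg_sum) / nb_classes
    return pos_term + neg_term

# Two samples, two classes, each sample perfectly aligned with one proxy.
cos = [[1.0, 0.0], [0.0, 1.0]]
good = proxy_anchor_loss(cos, labels=[0, 1], nb_classes=2)
bad = proxy_anchor_loss(cos, labels=[1, 0], nb_classes=2)
```

Proxies without any positive sample in the batch contribute log(1 + 0) = 0 to the positive term, which is why that term is averaged over the valid proxies only while the negative term is averaged over all classes.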

Code: https://github.com/tjddus9597/Proxy-Anchor-CVPR2020/tree/51db57031e38f75c03f69bbdfad1a3233afd9787