ReID Metric Learning: Triplet Loss Code Walkthrough

This article supplements the earlier theoretical post on ReID loss functions. It studies the triplet loss from the code side, including how TriHard (hard triplet mining) finds the hardest positive and negative samples.

Contents

TripletLoss

Computing the distance matrix

Hard example mining


score, feat = model(img)

Here score has shape [8, 751], where 8 is the batch_size and 751 is the number of identity classes. It contains raw logits (i.e., softmax has not been applied yet):

tensor([[-1.3245e-02, -2.3512e-02, -1.8136e-02,  ...,  3.5739e-05,
         -2.3923e-02,  1.6440e-02],
        [-1.4504e-02, -1.9184e-02, -2.0377e-02,  ...,  4.1178e-03,
         -2.3450e-02,  1.9983e-02],
        [-1.0439e-02, -1.7005e-02, -1.7642e-02,  ...,  2.7252e-03,
         -2.0623e-02,  1.6570e-02],
        ...,
        [-1.6970e-02, -1.6319e-02, -1.7090e-02,  ...,  3.0741e-03,
         -2.4214e-02,  1.7379e-02],
        [-1.0949e-02, -2.1738e-02, -1.7383e-02,  ...,  2.5572e-03,
         -2.6021e-02,  1.7932e-02],
        [-1.7376e-02, -2.4241e-02, -1.1886e-02,  ...,  3.6473e-03,
         -2.8584e-02,  2.1102e-02]], device='cuda:0', grad_fn=<MmBackward>)

feat has shape [8, 512]: 8 is the batch size and 512 is the dimension of the feature embedding the model outputs:

tensor([[1.0856, 1.1822, 1.0917,  ..., 0.8223, 1.0282, 1.0947],
        [1.0087, 0.8157, 0.9607,  ..., 0.9637, 0.9081, 1.1780],
        [0.8724, 0.9109, 0.8726,  ..., 0.8570, 0.9679, 0.8008],
        ...,
        [0.7935, 0.9719, 1.0124,  ..., 0.7361, 1.1128, 0.8532],
        [1.0150, 0.9681, 0.9360,  ..., 1.0709, 0.9801, 1.0442],
        [1.0555, 1.1638, 0.8265,  ..., 1.2366, 1.1049, 1.0189]],
       device='cuda:0', grad_fn=<ViewBackward>)

target has shape [8]; each entry is a person ID:

tensor([119, 119, 119, 119, 714, 714, 714, 714], device='cuda:0')
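
In a typical ReID baseline these three tensors are consumed by two losses: score feeds a cross-entropy ID loss and feat feeds the triplet loss discussed below. The following is only a minimal sketch of that combination (the weighting and the total_loss helper are assumed for illustration; TripletLoss is the class defined at the end of this article):

import torch.nn.functional as F

triplet = TripletLoss(margin=0.3)  # margin value assumed for illustration

def total_loss(score, feat, target):
    id_loss = F.cross_entropy(score, target)   # uses the [8, 751] logits
    tri_loss, _, _ = triplet(feat, target)      # uses the [8, 512] features
    return id_loss + tri_loss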

TripletLoss

Computing the distance matrix

Here we compute Euclidean distances.

x and y are feature matrices, both with shape [batch_size, 512].

m and n are both the batch size, here 8.

xx is the a² term: pow squares every element, and the 1 passed to sum means summing over dimension 1 (the 512-dim feature dimension), so torch.pow(x, 2).sum(1, keepdim=True) has shape [8, 1] (one sum per row). expand(m, n) then broadcasts it to shape [8, 8].

torch.pow(x, 2).sum(1, keepdim=True):

tensor([[539.8054],
        [457.5028],
        [474.0969],
        [557.4810],
        [778.4174],
        [474.9417],
        [503.8409],
        [625.3312]], device='cuda:0', grad_fn=<SumBackward1>)

torch.pow(x, 2).sum(1, keepdim=True).expand(m, n):

tensor([[539.8054, 539.8054, 539.8054, 539.8054, 539.8054, 539.8054, 539.8054, 539.8054],
        [457.5028, 457.5028, 457.5028, 457.5028, 457.5028, 457.5028, 457.5028, 457.5028],
        [474.0969, 474.0969, 474.0969, 474.0969, 474.0969, 474.0969, 474.0969, 474.0969],
        [557.4810, 557.4810, 557.4810, 557.4810, 557.4810, 557.4810, 557.4810, 557.4810],
        [778.4174, 778.4174, 778.4174, 778.4174, 778.4174, 778.4174, 778.4174, 778.4174],
        [474.9417, 474.9417, 474.9417, 474.9417, 474.9417, 474.9417, 474.9417, 474.9417],
        [503.8409, 503.8409, 503.8409, 503.8409, 503.8409, 503.8409, 503.8409, 503.8409],
        [625.3312, 625.3312, 625.3312, 625.3312, 625.3312, 625.3312, 625.3312, 625.3312]],
       device='cuda:0', grad_fn=<ExpandBackward>)

Similarly for yy, except that the result is additionally transposed.

dist = xx + yy then gives a² + b².

Note the difference between addmm_ and addmm: the trailing underscore denotes an in-place operation. (The positional signature addmm_(beta, alpha, mat1, mat2) used below is deprecated in newer PyTorch versions, which prefer dist.addmm_(x, y.t(), beta=1, alpha=-2).)

Finally, dist is the pairwise distance matrix we want.
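
The whole computation is just the term-by-term expansion of the squared Euclidean distance:

$$\|x_i - y_j\|^2 = \|x_i\|^2 + \|y_j\|^2 - 2\,x_i^\top y_j$$

xx supplies the first term, yy the second, and addmm_ subtracts twice the inner-product matrix x @ y.t().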

def euclidean_dist(x, y):
    """
    Args:
      x: pytorch Variable, with shape [m, d]
      y: pytorch Variable, with shape [n, d]
    Returns:
      dist: pytorch Variable, with shape [m, n]
    """
    m, n = x.size(0), y.size(0)
    xx = torch.pow(x, 2).sum(1, keepdim=True).expand(m, n)      # a^2, broadcast to [m, n]
    yy = torch.pow(y, 2).sum(1, keepdim=True).expand(n, m).t()  # b^2, broadcast and transposed to [m, n]
    dist = xx + yy  # a^2 + b^2
    dist.addmm_(1, -2, x, y.t())  # a^2 + b^2 - 2ab  (old positional form of addmm_)
    dist = dist.clamp(min=1e-12).sqrt()  # clamp away zeros for numerical stability of sqrt's gradient
    return dist

The resulting distance matrix, of shape [batch_size, batch_size], is:

tensor([[1.0000e-06, 4.3200e+00, 4.1502e+00, 3.7251e+00, 7.3499e+00, 3.9080e+00,
         3.6081e+00, 4.4757e+00],
        [4.3200e+00, 7.8125e-03, 3.5319e+00, 4.8321e+00, 9.8512e+00, 3.2775e+00,
         3.6503e+00, 6.6219e+00],
        [4.1502e+00, 3.5319e+00, 1.0000e-06, 4.3095e+00, 9.1650e+00, 3.4574e+00,
         3.3446e+00, 5.8928e+00],
        [3.7251e+00, 4.8321e+00, 4.3095e+00, 1.0000e-06, 6.7865e+00, 4.4078e+00,
         3.6200e+00, 4.1992e+00],
        [7.3499e+00, 9.8512e+00, 9.1650e+00, 6.7865e+00, 1.0000e-06, 9.0147e+00,
         8.0675e+00, 5.2237e+00],
        [3.9080e+00, 3.2775e+00, 3.4574e+00, 4.4078e+00, 9.0147e+00, 1.0000e-06,
         3.1999e+00, 5.9490e+00],
        [3.6081e+00, 3.6503e+00, 3.3446e+00, 3.6200e+00, 8.0675e+00, 3.1999e+00,
         1.0000e-06, 5.0834e+00],
        [4.4757e+00, 6.6219e+00, 5.8928e+00, 4.1992e+00, 5.2237e+00, 5.9490e+00,
         5.0834e+00, 1.1049e-02]], device='cuda:0', grad_fn=<SqrtBackward>)
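
As a quick sanity check (not part of the original code), the same matrix can be reproduced with the built-in torch.cdist; the tiny discrepancy on the diagonal comes from the clamp used for numerical stability:

import torch

x = torch.randn(8, 512)
d1 = euclidean_dist(x, x)                  # function defined above
d2 = torch.cdist(x, x, p=2)                # PyTorch built-in pairwise L2 distance
print(torch.allclose(d1, d2, atol=1e-4))   # True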

Hard example mining

dist_mat: the distance matrix.

labels: the ground-truth labels.

return_inds: whether to also return the indices of the mined samples.

In the code, N is the batch size.

labels.expand(N, N) repeats labels to shape [8, 8].

labels.expand(N,N):

tensor([[119, 119, 119, 119, 714, 714, 714, 714],
        [119, 119, 119, 119, 714, 714, 714, 714],
        [119, 119, 119, 119, 714, 714, 714, 714],
        [119, 119, 119, 119, 714, 714, 714, 714],
        [119, 119, 119, 119, 714, 714, 714, 714],
        [119, 119, 119, 119, 714, 714, 714, 714],
        [119, 119, 119, 119, 714, 714, 714, 714],
        [119, 119, 119, 119, 714, 714, 714, 714]], device='cuda:0')

labels.expand(N, N).t() is the transpose of the above:

tensor([[119, 119, 119, 119, 119, 119, 119, 119],
        [119, 119, 119, 119, 119, 119, 119, 119],
        [119, 119, 119, 119, 119, 119, 119, 119],
        [119, 119, 119, 119, 119, 119, 119, 119],
        [714, 714, 714, 714, 714, 714, 714, 714],
        [714, 714, 714, 714, 714, 714, 714, 714],
        [714, 714, 714, 714, 714, 714, 714, 714],
        [714, 714, 714, 714, 714, 714, 714, 714]], device='cuda:0')

labels.expand(N, N).eq(labels.expand(N, N).t()): eq compares element-wise and marks positions where the two tensors agree. True means the same ID (the same person), i.e., a positive pair; False means a different person, i.e., a negative pair.

tensor([[ True,  True,  True,  True, False, False, False, False],
        [ True,  True,  True,  True, False, False, False, False],
        [ True,  True,  True,  True, False, False, False, False],
        [ True,  True,  True,  True, False, False, False, False],
        [False, False, False, False,  True,  True,  True,  True],
        [False, False, False, False,  True,  True,  True,  True],
        [False, False, False, False,  True,  True,  True,  True],
        [False, False, False, False,  True,  True,  True,  True]],
       device='cuda:0')

tensor.ne() is the element-wise "not equal" comparison, the opposite of the above: the same ID gives False, different IDs give True.

tensor([[False, False, False, False,  True,  True,  True,  True],
        [False, False, False, False,  True,  True,  True,  True],
        [False, False, False, False,  True,  True,  True,  True],
        [False, False, False, False,  True,  True,  True,  True],
        [ True,  True,  True,  True, False, False, False, False],
        [ True,  True,  True,  True, False, False, False, False],
        [ True,  True,  True,  True, False, False, False, False],
        [ True,  True,  True,  True, False, False, False, False]],
       device='cuda:0') 

    dist_ap, relative_p_inds = torch.max(
        dist_mat[is_pos].contiguous().view(N, -1), 1, keepdim=True)

dist_mat is the distance matrix computed earlier. Masking it with is_pos keeps only the positive-pair distances; taking the row-wise max then gives, for each anchor, the largest positive distance together with its index. The largest distance is the hardest positive. Note that dist_mat[is_pos] returns a flat 1-D tensor, so view(N, -1) only works because every anchor has the same number of positives (the equal-samples-per-ID assumption stated in the docstring).

dist_ap:

tensor([[4.3200],
        [4.8321],
        [4.3095],
        [4.8321],
        [9.0147],
        [9.0147],
        [8.0675],
        [5.9490]], device='cuda:0', grad_fn=<MaxBackward0>)

Likewise, the row-wise minimum over the negative pairs gives the hardest negative:

    dist_an, relative_n_inds = torch.min(
        dist_mat[is_neg].contiguous().view(N, -1), 1, keepdim=True)

 

dist_an:

tensor([[3.6081],
        [3.2775],
        [3.3446],
        [3.6200],
        [6.7865],
        [3.2775],
        [3.3446],
        [4.1992]], device='cuda:0', grad_fn=<MinBackward0>)

 

y = dist_an.new().resize_as_(dist_an).fill_(1)

dist_an.new() creates an empty tensor with the same dtype and device as dist_an (it holds no data yet).

resize_as_ then gives it the same shape as dist_an, and fill_(1) fills it with ones, so y is simply a vector of ones. (In newer PyTorch the same line can be written as torch.ones_like(dist_an).)

Compute the loss. Three arguments are passed in: dist_an holds the hardest-negative distances, dist_ap holds the hardest-positive distances, and y is the target sign. The triplet loss here is computed with nn.MarginRankingLoss.

loss = self.ranking_loss(dist_an, dist_ap, y)

 loss=tensor(2.6602, device='cuda:0', grad_fn=<MeanBackward0>)
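
With y filled with ones, nn.MarginRankingLoss(dist_an, dist_ap, y) is exactly the familiar triplet hinge mean(max(0, dist_ap - dist_an + margin)). A small sketch to verify this equivalence (the margin of 0.3 and the distance values are assumed for illustration):

import torch
import torch.nn as nn
import torch.nn.functional as F

margin = 0.3
dist_ap = torch.tensor([4.32, 4.83, 9.01])   # hardest-positive distances (made up)
dist_an = torch.tensor([3.61, 3.28, 6.79])   # hardest-negative distances (made up)
y = torch.ones_like(dist_an)

loss1 = nn.MarginRankingLoss(margin=margin)(dist_an, dist_ap, y)
loss2 = F.relu(dist_ap - dist_an + margin).mean()   # max(0, d_ap - d_an + margin)
print(torch.allclose(loss1, loss2))                 # True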

def hard_example_mining(dist_mat, labels, return_inds=False):
    """For each anchor, find the hardest positive and negative sample.
    Args:
      dist_mat: pytorch Variable, pair wise distance between samples, shape [N, N]
      labels: pytorch LongTensor, with shape [N]
      return_inds: whether to return the indices. Save time if `False`(?)
    Returns:
      dist_ap: pytorch Variable, distance(anchor, positive); shape [N]
      dist_an: pytorch Variable, distance(anchor, negative); shape [N]
      p_inds: pytorch LongTensor, with shape [N];
        indices of selected hard positive samples; 0 <= p_inds[i] <= N - 1
      n_inds: pytorch LongTensor, with shape [N];
        indices of selected hard negative samples; 0 <= n_inds[i] <= N - 1
    NOTE: Only consider the case in which all labels have same num of samples,
      thus we can cope with all anchors in parallel.
    """

    assert len(dist_mat.size()) == 2
    assert dist_mat.size(0) == dist_mat.size(1)
    N = dist_mat.size(0)

    # shape [N, N]
    is_pos = labels.expand(N, N).eq(labels.expand(N, N).t())
    is_neg = labels.expand(N, N).ne(labels.expand(N, N).t())

    # `dist_ap` means distance(anchor, positive)
    # both `dist_ap` and `relative_p_inds` with shape [N, 1]
    dist_ap, relative_p_inds = torch.max(
        dist_mat[is_pos].contiguous().view(N, -1), 1, keepdim=True)
    # `dist_an` means distance(anchor, negative)
    # both `dist_an` and `relative_n_inds` with shape [N, 1]
    dist_an, relative_n_inds = torch.min(
        dist_mat[is_neg].contiguous().view(N, -1), 1, keepdim=True)
    # shape [N]
    dist_ap = dist_ap.squeeze(1)
    dist_an = dist_an.squeeze(1)

    if return_inds:
        # shape [N, N]
        ind = (labels.new().resize_as_(labels)
               .copy_(torch.arange(0, N).long())
               .unsqueeze(0).expand(N, N))
        # shape [N, 1]
        p_inds = torch.gather(
            ind[is_pos].contiguous().view(N, -1), 1, relative_p_inds.data)
        n_inds = torch.gather(
            ind[is_neg].contiguous().view(N, -1), 1, relative_n_inds.data)
        # shape [N]
        p_inds = p_inds.squeeze(1)
        n_inds = n_inds.squeeze(1)
        return dist_ap, dist_an, p_inds, n_inds

    return dist_ap, dist_an
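
A tiny toy run of hard_example_mining, assuming euclidean_dist from above is in scope (2 identities with 2 images each, so every anchor has the same number of positives):

import torch

feat = torch.randn(4, 512)
labels = torch.tensor([0, 0, 1, 1])

dist_mat = euclidean_dist(feat, feat)   # [4, 4]
dist_ap, dist_an, p_inds, n_inds = hard_example_mining(
    dist_mat, labels, return_inds=True)
print(dist_ap.shape, dist_an.shape)     # torch.Size([4]) torch.Size([4])
print(p_inds, n_inds)                   # index of the hardest positive / negative for each anchor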

Full code:

class TripletLoss(object):
    """Modified from Tong Xiao's open-reid (https://github.com/Cysu/open-reid).
    Related Triplet Loss theory can be found in paper 'In Defense of the Triplet
    Loss for Person Re-Identification'."""

    # TripletLoss = max(0, D_ap - D_an + α); margin is α, and ranking_loss computes the hinge on (D_ap - D_an + α)
    def __init__(self, margin=None):
        self.margin = margin
        if margin is not None:
            # Ranking loss: loss(x1, x2, y), where x1 and x2 are the two inputs to be ranked
            # and y ∈ {1, -1} is the target: y = 1 means x1 should rank above x2, y = -1 the opposite.
            # loss = max(0, -y * (x1 - x2) + margin)
            # The loss is 0 when the pair is ranked correctly by more than the margin, i.e. y * (x1 - x2) >= margin.
            self.ranking_loss = nn.MarginRankingLoss(margin=margin)
        # when margin is None, fall back to SoftMarginLoss
        else:
            self.ranking_loss = nn.SoftMarginLoss()

    def __call__(self, global_feat, labels, normalize_feature=False):
        if normalize_feature:
            global_feat = normalize(global_feat, axis=-1)  # L2-normalize features (helper from the original repo)
        # compute the pairwise Euclidean distance matrix
        dist_mat = euclidean_dist(global_feat, global_feat)
        # hard example mining (TriHard): find the hardest positive and hardest negative for each anchor
        dist_ap, dist_an = hard_example_mining(
            dist_mat, labels)
        y = dist_an.new().resize_as_(dist_an).fill_(1)
        if self.margin is not None:
            # dist_an: anchor-negative distances
            # dist_ap: anchor-positive distances
            # loss = max(0, -y * (dist_an - dist_ap) + margin) = max(0, dist_ap - dist_an + margin)
            # Unlike TripletMarginLoss, MarginRankingLoss does not take the raw anchor/positive/negative
            # embeddings; it takes the already-computed distances d_ap and d_an.
            loss = self.ranking_loss(dist_an, dist_ap, y)
        else:
            loss = self.ranking_loss(dist_an - dist_ap, y)
        return loss, dist_ap, dist_an
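
A minimal end-to-end usage sketch, assuming euclidean_dist, hard_example_mining and TripletLoss above are all in scope (the batch layout of 2 IDs × 4 images mirrors the example at the top of the article; the margin of 0.3 is assumed):

import torch

triplet = TripletLoss(margin=0.3)

feat = torch.randn(8, 512, requires_grad=True)                    # stand-in for the model's features
target = torch.tensor([119, 119, 119, 119, 714, 714, 714, 714])   # 2 IDs x 4 images each

loss, dist_ap, dist_an = triplet(feat, target)
print(loss)            # scalar triplet loss
loss.backward()        # gradients flow back into feat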
