ReID Metric Learning: Triplet Loss Code Walkthrough

This article supplements the earlier theoretical post on ReID loss functions. It studies the triplet loss from the code side, including how TriHard (hard triplet mining) finds the hardest positive and negative samples.

Contents

TripletLoss

Computing the distance matrix

Hard example mining


score, feat = model(img)

Here score has shape [8, 751], where 8 is the batch_size and 751 is the number of identity classes. It contains raw logits (i.e., softmax has not been applied yet):

tensor([[-1.3245e-02, -2.3512e-02, -1.8136e-02,  ...,  3.5739e-05,
         -2.3923e-02,  1.6440e-02],
        [-1.4504e-02, -1.9184e-02, -2.0377e-02,  ...,  4.1178e-03,
         -2.3450e-02,  1.9983e-02],
        [-1.0439e-02, -1.7005e-02, -1.7642e-02,  ...,  2.7252e-03,
         -2.0623e-02,  1.6570e-02],
        ...,
        [-1.6970e-02, -1.6319e-02, -1.7090e-02,  ...,  3.0741e-03,
         -2.4214e-02,  1.7379e-02],
        [-1.0949e-02, -2.1738e-02, -1.7383e-02,  ...,  2.5572e-03,
         -2.6021e-02,  1.7932e-02],
        [-1.7376e-02, -2.4241e-02, -1.1886e-02,  ...,  3.6473e-03,
         -2.8584e-02,  2.1102e-02]], device='cuda:0', grad_fn=<MmBackward>)

feat has shape [8, 512]: 8 is the batch size and 512 is the dimension of the feature embedding the model outputs:

tensor([[1.0856, 1.1822, 1.0917,  ..., 0.8223, 1.0282, 1.0947],
        [1.0087, 0.8157, 0.9607,  ..., 0.9637, 0.9081, 1.1780],
        [0.8724, 0.9109, 0.8726,  ..., 0.8570, 0.9679, 0.8008],
        ...,
        [0.7935, 0.9719, 1.0124,  ..., 0.7361, 1.1128, 0.8532],
        [1.0150, 0.9681, 0.9360,  ..., 1.0709, 0.9801, 1.0442],
        [1.0555, 1.1638, 0.8265,  ..., 1.2366, 1.1049, 1.0189]],
       device='cuda:0', grad_fn=<ViewBackward>)

target has shape [8]; each entry is a person ID:

tensor([119, 119, 119, 119, 714, 714, 714, 714], device='cuda:0')
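
In a typical ReID baseline these three tensors are consumed by two losses: score feeds a cross-entropy ID loss and feat feeds the triplet loss discussed below. The following is only a minimal sketch of that combination (the weighting and the total_loss helper are assumed for illustration; TripletLoss is the class defined at the end of this article):

import torch.nn.functional as F

triplet = TripletLoss(margin=0.3)  # margin value assumed for illustration

def total_loss(score, feat, target):
    id_loss = F.cross_entropy(score, target)   # uses the [8, 751] logits
    tri_loss, _, _ = triplet(feat, target)      # uses the [8, 512] features
    return id_loss + tri_loss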

TripletLoss

Computing the distance matrix

Here we compute Euclidean distances.

x and y are feature matrices, both with shape [batch_size, 512].

m and n are both the batch size, here 8.

xx is the a² term: pow squares every element, and the 1 passed to sum means summing over dimension 1 (the 512-dim feature dimension), so torch.pow(x, 2).sum(1, keepdim=True) has shape [8, 1] (one sum per row). expand(m, n) then broadcasts it to shape [8, 8].

torch.pow(x, 2).sum(1, keepdim=True):

tensor([[539.8054],
        [457.5028],
        [474.0969],
        [557.4810],
        [778.4174],
        [474.9417],
        [503.8409],
        [625.3312]], device='cuda:0', grad_fn=<SumBackward1>)

torch.pow(x, 2).sum(1, keepdim=True).expand(m, n):

tensor([[539.8054, 539.8054, 539.8054, 539.8054, 539.8054, 539.8054, 539.8054, 539.8054],
        [457.5028, 457.5028, 457.5028, 457.5028, 457.5028, 457.5028, 457.5028, 457.5028],
        [474.0969, 474.0969, 474.0969, 474.0969, 474.0969, 474.0969, 474.0969, 474.0969],
        [557.4810, 557.4810, 557.4810, 557.4810, 557.4810, 557.4810, 557.4810, 557.4810],
        [778.4174, 778.4174, 778.4174, 778.4174, 778.4174, 778.4174, 778.4174, 778.4174],
        [474.9417, 474.9417, 474.9417, 474.9417, 474.9417, 474.9417, 474.9417, 474.9417],
        [503.8409, 503.8409, 503.8409, 503.8409, 503.8409, 503.8409, 503.8409, 503.8409],
        [625.3312, 625.3312, 625.3312, 625.3312, 625.3312, 625.3312, 625.3312, 625.3312]],
       device='cuda:0', grad_fn=<ExpandBackward>)

Similarly for yy, except that the result is additionally transposed.

dist = xx + yy then gives a² + b².

Note the difference between addmm_ and addmm: the trailing underscore denotes an in-place operation. (The positional signature addmm_(beta, alpha, mat1, mat2) used below is deprecated in newer PyTorch versions, which prefer dist.addmm_(x, y.t(), beta=1, alpha=-2).)

Finally, dist is the pairwise distance matrix we want.
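
The whole computation is just the term-by-term expansion of the squared Euclidean distance:

$$\|x_i - y_j\|^2 = \|x_i\|^2 + \|y_j\|^2 - 2\,x_i^\top y_j$$

xx supplies the first term, yy the second, and addmm_ subtracts twice the inner-product matrix x @ y.t().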

def euclidean_dist(x, y):
    """
    Args:
      x: pytorch Variable, with shape [m, d]
      y: pytorch Variable, with shape [n, d]
    Returns:
      dist: pytorch Variable, with shape [m, n]
    """
    m, n = x.size(0), y.size(0)
    xx = torch.pow(x, 2).sum(1, keepdim=True).expand(m, n)      # a^2, broadcast to [m, n]
    yy = torch.pow(y, 2).sum(1, keepdim=True).expand(n, m).t()  # b^2, broadcast and transposed to [m, n]
    dist = xx + yy  # a^2 + b^2
    dist.addmm_(1, -2, x, y.t())  # a^2 + b^2 - 2ab  (old positional form of addmm_)
    dist = dist.clamp(min=1e-12).sqrt()  # clamp away zeros for numerical stability of sqrt's gradient
    return dist

The resulting distance matrix, of shape [batch_size, batch_size], is:

tensor([[1.0000e-06, 4.3200e+00, 4.1502e+00, 3.7251e+00, 7.3499e+00, 3.9080e+00,
         3.6081e+00, 4.4757e+00],
        [4.3200e+00, 7.8125e-03, 3.5319e+00, 4.8321e+00, 9.8512e+00, 3.2775e+00,
         3.6503e+00, 6.6219e+00],
        [4.1502e+00, 3.5319e+00, 1.0000e-06, 4.3095e+00, 9.1650e+00, 3.4574e+00,
         3.3446e+00, 5.8928e+00],
        [3.7251e+00, 4.8321e+00, 4.3095e+00, 1.0000e-06, 6.7865e+00, 4.4078e+00,
         3.6200e+00, 4.1992e+00],
        [7.3499e+00, 9.8512e+00, 9.1650e+00, 6.7865e+00, 1.0000e-06, 9.0147e+00,
         8.0675e+00, 5.2237e+00],
        [3.9080e+00, 3.2775e+00, 3.4574e+00, 4.4078e+00, 9.0147e+00, 1.0000e-06,
         3.1999e+00, 5.9490e+00],
        [3.6081e+00, 3.6503e+00, 3.3446e+00, 3.6200e+00, 8.0675e+00, 3.1999e+00,
         1.0000e-06, 5.0834e+00],
        [4.4757e+00, 6.6219e+00, 5.8928e+00, 4.1992e+00, 5.2237e+00, 5.9490e+00,
         5.0834e+00, 1.1049e-02]], device='cuda:0', grad_fn=<SqrtBackward>)
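
As a quick sanity check (not part of the original code), the same matrix can be reproduced with the built-in torch.cdist; the tiny discrepancy on the diagonal comes from the clamp used for numerical stability:

import torch

x = torch.randn(8, 512)
d1 = euclidean_dist(x, x)                  # function defined above
d2 = torch.cdist(x, x, p=2)                # PyTorch built-in pairwise L2 distance
print(torch.allclose(d1, d2, atol=1e-4))   # True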

Hard example mining

dist_mat: the distance matrix.

labels: the ground-truth labels.

return_inds: whether to also return the indices of the mined samples.

In the code, N is the batch size.

labels.expand(N, N) repeats labels to shape [8, 8].

labels.expand(N,N):

tensor([[119, 119, 119, 119, 714, 714, 714, 714],
        [119, 119, 119, 119, 714, 714, 714, 714],
        [119, 119, 119, 119, 714, 714, 714, 714],
        [119, 119, 119, 119, 714, 714, 714, 714],
        [119, 119, 119, 119, 714, 714, 714, 714],
        [119, 119, 119, 119, 714, 714, 714, 714],
        [119, 119, 119, 119, 714, 714, 714, 714],
        [119, 119, 119, 119, 714, 714, 714, 714]], device='cuda:0')

labels.expand(N, N).t() is the transpose of the above:

tensor([[119, 119, 119, 119, 119, 119, 119, 119],
        [119, 119, 119, 119, 119, 119, 119, 119],
        [119, 119, 119, 119, 119, 119, 119, 119],
        [119, 119, 119, 119, 119, 119, 119, 119],
        [714, 714, 714, 714, 714, 714, 714, 714],
        [714, 714, 714, 714, 714, 714, 714, 714],
        [714, 714, 714, 714, 714, 714, 714, 714],
        [714, 714, 714, 714, 714, 714, 714, 714]], device='cuda:0')

labels.expand(N, N).eq(labels.expand(N, N).t()): eq compares element-wise and marks positions where the two tensors agree. True means the same ID (the same person), i.e., a positive pair; False means a different person, i.e., a negative pair.

tensor([[ True,  True,  True,  True, False, False, False, False],
        [ True,  True,  True,  True, False, False, False, False],
        [ True,  True,  True,  True, False, False, False, False],
        [ True,  True,  True,  True, False, False, False, False],
        [False, False, False, False,  True,  True,  True,  True],
        [False, False, False, False,  True,  True,  True,  True],
        [False, False, False, False,  True,  True,  True,  True],
        [False, False, False, False,  True,  True,  True,  True]],
       device='cuda:0')

tensor.ne() is the element-wise "not equal" comparison, the opposite of the above: the same ID gives False, different IDs give True.

tensor([[False, False, False, False,  True,  True,  True,  True],
        [False, False, False, False,  True,  True,  True,  True],
        [False, False, False, False,  True,  True,  True,  True],
        [False, False, False, False,  True,  True,  True,  True],
        [ True,  True,  True,  True, False, False, False, False],
        [ True,  True,  True,  True, False, False, False, False],
        [ True,  True,  True,  True, False, False, False, False],
        [ True,  True,  True,  True, False, False, False, False]],
       device='cuda:0') 

    dist_ap, relative_p_inds = torch.max(
        dist_mat[is_pos].contiguous().view(N, -1), 1, keepdim=True)

dist_mat is the distance matrix computed earlier. Masking it with is_pos keeps only the positive-pair distances; taking the row-wise max then gives, for each anchor, the largest positive distance together with its index. The largest distance is the hardest positive. Note that dist_mat[is_pos] returns a flat 1-D tensor, so view(N, -1) only works because every anchor has the same number of positives (the equal-samples-per-ID assumption stated in the docstring).

dist_ap:

tensor([[4.3200],
        [4.8321],
        [4.3095],
        [4.8321],
        [9.0147],
        [9.0147],
        [8.0675],
        [5.9490]], device='cuda:0', grad_fn=<MaxBackward0>)

Likewise, the row-wise minimum over the negative pairs gives the hardest negative:

    dist_an, relative_n_inds = torch.min(
        dist_mat[is_neg].contiguous().view(N, -1), 1, keepdim=True)

 

dist_an:

tensor([[3.6081],
        [3.2775],
        [3.3446],
        [3.6200],
        [6.7865],
        [3.2775],
        [3.3446],
        [4.1992]], device='cuda:0', grad_fn=<MinBackward0>)

 

y = dist_an.new().resize_as_(dist_an).fill_(1)

dist_an.new() creates an empty tensor with the same dtype and device as dist_an (it holds no data yet).

resize_as_ then gives it the same shape as dist_an, and fill_(1) fills it with ones, so y is simply a vector of ones. (In newer PyTorch the same line can be written as torch.ones_like(dist_an).)

Compute the loss. Three arguments are passed in: dist_an holds the hardest-negative distances, dist_ap holds the hardest-positive distances, and y is the target sign. The triplet loss here is computed with nn.MarginRankingLoss.

loss = self.ranking_loss(dist_an, dist_ap, y)

 loss=tensor(2.6602, device='cuda:0', grad_fn=<MeanBackward0>)
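
With y filled with ones, nn.MarginRankingLoss(dist_an, dist_ap, y) is exactly the familiar triplet hinge mean(max(0, dist_ap - dist_an + margin)). A small sketch to verify this equivalence (the margin of 0.3 and the distance values are assumed for illustration):

import torch
import torch.nn as nn
import torch.nn.functional as F

margin = 0.3
dist_ap = torch.tensor([4.32, 4.83, 9.01])   # hardest-positive distances (made up)
dist_an = torch.tensor([3.61, 3.28, 6.79])   # hardest-negative distances (made up)
y = torch.ones_like(dist_an)

loss1 = nn.MarginRankingLoss(margin=margin)(dist_an, dist_ap, y)
loss2 = F.relu(dist_ap - dist_an + margin).mean()   # max(0, d_ap - d_an + margin)
print(torch.allclose(loss1, loss2))                 # True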

def hard_example_mining(dist_mat, labels, return_inds=False):
    """For each anchor, find the hardest positive and negative sample.
    Args:
      dist_mat: pytorch Variable, pair wise distance between samples, shape [N, N]
      labels: pytorch LongTensor, with shape [N]
      return_inds: whether to return the indices. Save time if `False`(?)
    Returns:
      dist_ap: pytorch Variable, distance(anchor, positive); shape [N]
      dist_an: pytorch Variable, distance(anchor, negative); shape [N]
      p_inds: pytorch LongTensor, with shape [N];
        indices of selected hard positive samples; 0 <= p_inds[i] <= N - 1
      n_inds: pytorch LongTensor, with shape [N];
        indices of selected hard negative samples; 0 <= n_inds[i] <= N - 1
    NOTE: Only consider the case in which all labels have same num of samples,
      thus we can cope with all anchors in parallel.
    """

    assert len(dist_mat.size()) == 2
    assert dist_mat.size(0) == dist_mat.size(1)
    N = dist_mat.size(0)

    # shape [N, N]
    is_pos = labels.expand(N, N).eq(labels.expand(N, N).t())
    is_neg = labels.expand(N, N).ne(labels.expand(N, N).t())

    # `dist_ap` means distance(anchor, positive)
    # both `dist_ap` and `relative_p_inds` with shape [N, 1]
    dist_ap, relative_p_inds = torch.max(
        dist_mat[is_pos].contiguous().view(N, -1), 1, keepdim=True)
    # `dist_an` means distance(anchor, negative)
    # both `dist_an` and `relative_n_inds` with shape [N, 1]
    dist_an, relative_n_inds = torch.min(
        dist_mat[is_neg].contiguous().view(N, -1), 1, keepdim=True)
    # shape [N]
    dist_ap = dist_ap.squeeze(1)
    dist_an = dist_an.squeeze(1)

    if return_inds:
        # shape [N, N]
        ind = (labels.new().resize_as_(labels)
               .copy_(torch.arange(0, N).long())
               .unsqueeze(0).expand(N, N))
        # shape [N, 1]
        p_inds = torch.gather(
            ind[is_pos].contiguous().view(N, -1), 1, relative_p_inds.data)
        n_inds = torch.gather(
            ind[is_neg].contiguous().view(N, -1), 1, relative_n_inds.data)
        # shape [N]
        p_inds = p_inds.squeeze(1)
        n_inds = n_inds.squeeze(1)
        return dist_ap, dist_an, p_inds, n_inds

    return dist_ap, dist_an
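
A tiny toy run of hard_example_mining, assuming euclidean_dist from above is in scope (2 identities with 2 images each, so every anchor has the same number of positives):

import torch

feat = torch.randn(4, 512)
labels = torch.tensor([0, 0, 1, 1])

dist_mat = euclidean_dist(feat, feat)   # [4, 4]
dist_ap, dist_an, p_inds, n_inds = hard_example_mining(
    dist_mat, labels, return_inds=True)
print(dist_ap.shape, dist_an.shape)     # torch.Size([4]) torch.Size([4])
print(p_inds, n_inds)                   # index of the hardest positive / negative for each anchor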

Full code:

class TripletLoss(object):
    """Modified from Tong Xiao's open-reid (https://github.com/Cysu/open-reid).
    Related Triplet Loss theory can be found in paper 'In Defense of the Triplet
    Loss for Person Re-Identification'."""

    # TripletLoss = max(0, D_ap - D_an + α); margin is α, and ranking_loss computes the hinge on (D_ap - D_an + α)
    def __init__(self, margin=None):
        self.margin = margin
        if margin is not None:
            # Ranking loss: loss(x1, x2, y), where x1 and x2 are the two inputs to be ranked
            # and y ∈ {1, -1} is the target: y = 1 means x1 should rank above x2, y = -1 the opposite.
            # loss = max(0, -y * (x1 - x2) + margin)
            # The loss is 0 when the pair is ranked correctly by more than the margin, i.e. y * (x1 - x2) >= margin.
            self.ranking_loss = nn.MarginRankingLoss(margin=margin)
        # when margin is None, fall back to SoftMarginLoss
        else:
            self.ranking_loss = nn.SoftMarginLoss()

    def __call__(self, global_feat, labels, normalize_feature=False):
        if normalize_feature:
            global_feat = normalize(global_feat, axis=-1)  # L2-normalize features (helper from the original repo)
        # compute the pairwise Euclidean distance matrix
        dist_mat = euclidean_dist(global_feat, global_feat)
        # hard example mining (TriHard): find the hardest positive and hardest negative for each anchor
        dist_ap, dist_an = hard_example_mining(
            dist_mat, labels)
        y = dist_an.new().resize_as_(dist_an).fill_(1)
        if self.margin is not None:
            # dist_an: anchor-negative distances
            # dist_ap: anchor-positive distances
            # loss = max(0, -y * (dist_an - dist_ap) + margin) = max(0, dist_ap - dist_an + margin)
            # Unlike TripletMarginLoss, MarginRankingLoss does not take the raw anchor/positive/negative
            # embeddings; it takes the already-computed distances d_ap and d_an.
            loss = self.ranking_loss(dist_an, dist_ap, y)
        else:
            loss = self.ranking_loss(dist_an - dist_ap, y)
        return loss, dist_ap, dist_an
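
A minimal end-to-end usage sketch, assuming euclidean_dist, hard_example_mining and TripletLoss above are all in scope (the batch layout of 2 IDs × 4 images mirrors the example at the top of the article; the margin of 0.3 is assumed):

import torch

triplet = TripletLoss(margin=0.3)

feat = torch.randn(8, 512, requires_grad=True)                    # stand-in for the model's features
target = torch.tensor([119, 119, 119, 119, 714, 714, 714, 714])   # 2 IDs x 4 images each

loss, dist_ap, dist_an = triplet(feat, target)
print(loss)            # scalar triplet loss
loss.backward()        # gradients flow back into feat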
