Understanding the match function in Retinaface

fksfdh

Last modified 2022-02-19 03:16:13


First published 2022-02-18 20:57:50

Article link: https://blog.csdn.net/fksfdh/article/details/123009295


My understanding of the match function:

def match(threshold, truths, priors, variances, labels, landms, loc_t, conf_t, landm_t, idx):
    #----------------------------------------------#
    #   Compute the overlap (IoU) between every
    #   ground-truth box and every prior
    #----------------------------------------------#
    overlaps = jaccard(
        truths,
        point_form(priors)
    )
    #----------------------------------------------#
    #   For each ground-truth box, the best overlap over all priors
    #   best_prior_overlap [truth_box,1]
    #   best_prior_idx [truth_box,1]
    #----------------------------------------------#
    best_prior_overlap, best_prior_idx = overlaps.max(1, keepdim=True)
    best_prior_idx.squeeze_(1)
    best_prior_overlap.squeeze_(1)

    #----------------------------------------------#
    #   For each prior, the best overlap over all ground-truth boxes
    #   best_truth_overlap [1,prior]
    #   best_truth_idx [1,prior]
    #----------------------------------------------#
    best_truth_overlap, best_truth_idx = overlaps.max(0, keepdim=True)
    best_truth_idx.squeeze_(0)
    best_truth_overlap.squeeze_(0)

    #----------------------------------------------#
    #   Make sure every ground-truth box has at least one matching prior
    #----------------------------------------------#
    best_truth_overlap.index_fill_(0, best_prior_idx, 2)
    # Point each ground-truth box's best prior back at that box
    for j in range(best_prior_idx.size(0)):
        best_truth_idx[best_prior_idx[j]] = j

    #----------------------------------------------#
    #   The ground-truth box matched to each prior [num_priors,4]
    #----------------------------------------------#
    matches = truths[best_truth_idx]
    # Shape: [num_priors]; the label of the ground-truth box matched to each anchor
    conf = labels[best_truth_idx]
    matches_landm = landms[best_truth_idx]

    #----------------------------------------------#
    #   If the overlap is below threshold, treat the prior as background
    #----------------------------------------------#
    conf[best_truth_overlap < threshold] = 0
    #----------------------------------------------#
    #   Encode the ground-truth boxes against the priors.
    #   The encoded result is what the network should predict.
    #----------------------------------------------#
    loc = encode(matches, priors, variances)
    landm = encode_landm(matches_landm, priors, variances)

    #----------------------------------------------#
    #   [num_priors, 4]
    #----------------------------------------------#
    loc_t[idx] = loc
    #----------------------------------------------#
    #   [num_priors]
    #----------------------------------------------#
    conf_t[idx] = conf
    #----------------------------------------------#
    #   [num_priors, 10]
    #----------------------------------------------#
    landm_t[idx] = landm
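
The encode call is not shown in this post. As a rough sketch, assuming the usual SSD-style box encoding with priors in (cx, cy, w, h) form and matched boxes in corner form (x1, y1, x2, y2), it could look like the following. The name encode_sketch is hypothetical and is not the repository's function, and the variances (0.1, 0.2) are only commonly used values, not taken from this post.

```python
import torch

def encode_sketch(matched, priors, variances):
    # matched: [num_priors, 4] corner-form boxes (x1, y1, x2, y2)
    # priors:  [num_priors, 4] center-form boxes (cx, cy, w, h)
    # Offset of the box center from the prior center, scaled by the prior size
    g_cxcy = (matched[:, :2] + matched[:, 2:]) / 2 - priors[:, :2]
    g_cxcy = g_cxcy / (variances[0] * priors[:, 2:])
    # Log ratio of the box size to the prior size
    g_wh = (matched[:, 2:] - matched[:, :2]) / priors[:, 2:]
    g_wh = torch.log(g_wh) / variances[1]
    return torch.cat([g_cxcy, g_wh], dim=1)  # [num_priors, 4]

# Assumed toy values: the matched box coincides with the prior
priors = torch.tensor([[0.5, 0.5, 0.2, 0.2]])
matched = torch.tensor([[0.4, 0.4, 0.6, 0.6]])
offsets = encode_sketch(matched, priors, (0.1, 0.2))
print(offsets)
```

When the matched box coincides with the prior, all four offsets come out approximately zero, which is the target the network would be trained to predict for that prior.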

We can simulate this step by step:

import torch


truths=torch.randn(5,4) # [num_objects, 4 box coordinates]
print("truths:",truths)
labels=torch.Tensor([[1], [1],[1], [1],[1]]) # every ground-truth box has class 1
print("labels:",labels)
overlaps = torch.rand(5, 10) # [num_objects, num_priors], IoU values in 0-1
print("overlaps:",overlaps)

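overlaps is the IoU matrix between ground-truth boxes and priors: rows are ground-truth boxes, columns are priors.

truths holds the ground-truth box coordinates (top-left and bottom-right corners). Here they are random values from torch.randn, so they are not valid boxes; only the shapes matter for this simulation.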

truths: tensor([[ 0.6497, -1.5030, 0.5384, -0.2993],
[-0.3744, 0.7603, -0.0568, 1.5666],
[ 0.2875, -1.1328, -1.5941, -0.1617],
[-0.9408, 1.1018, 1.3028, -2.0684],
[-0.2388, -1.0179, 0.6412, 1.0418]])
labels: tensor([[1.],
[1.],
[1.],
[1.],
[1.]])
overlaps: tensor([[0.5321, 0.7111, 0.2009, 0.0500, 0.6987, 0.1410, 0.5055, 0.4217, 0.5109,
0.8002],
[0.1247, 0.9751, 0.6667, 0.6870, 0.2429, 0.4588, 0.3245, 0.7241, 0.5880,
0.6663],
[0.8734, 0.7823, 0.8094, 0.1350, 0.4765, 0.2627, 0.4242, 0.9889, 0.8328,
0.6587],
[0.8570, 0.4075, 0.9889, 0.2479, 0.8518, 0.1139, 0.6657, 0.3636, 0.2662,
0.6904],
[0.5021, 0.8086, 0.7512, 0.6637, 0.2433, 0.9300, 0.8903, 0.3788, 0.4600,
0.7378]])
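
The overlaps matrix above is random. In the real function it comes from jaccard(truths, point_form(priors)), which this post does not show. A minimal sketch of what those two helpers might compute, assuming priors are stored as (cx, cy, w, h) and IoU is the standard intersection over union; the names point_form_sketch and jaccard_sketch are hypothetical, and the boxes are made-up values:

```python
import torch

def point_form_sketch(boxes):
    # (cx, cy, w, h) -> (x1, y1, x2, y2)
    return torch.cat([boxes[:, :2] - boxes[:, 2:] / 2,
                      boxes[:, :2] + boxes[:, 2:] / 2], dim=1)

def jaccard_sketch(box_a, box_b):
    # box_a: [A, 4], box_b: [B, 4], both in corner form -> IoU matrix [A, B]
    lt = torch.max(box_a[:, None, :2], box_b[None, :, :2])  # top-left of intersection
    rb = torch.min(box_a[:, None, 2:], box_b[None, :, 2:])  # bottom-right of intersection
    wh = (rb - lt).clamp(min=0)                             # zero if boxes do not overlap
    inter = wh[..., 0] * wh[..., 1]
    area_a = ((box_a[:, 2] - box_a[:, 0]) * (box_a[:, 3] - box_a[:, 1]))[:, None]
    area_b = ((box_b[:, 2] - box_b[:, 0]) * (box_b[:, 3] - box_b[:, 1]))[None, :]
    return inter / (area_a + area_b - inter)

truths = torch.tensor([[0.0, 0.0, 2.0, 2.0]])     # one ground-truth box
priors = torch.tensor([[1.0, 1.0, 2.0, 2.0],      # same box as the truth
                       [2.0, 1.0, 2.0, 2.0],      # shifted right by 1
                       [5.0, 5.0, 1.0, 1.0]])     # far away
overlaps = jaccard_sketch(truths, point_form_sketch(priors))
print(overlaps)  # shape [1, 3]: IoU of 1, 1/3 and 0
```

The result has one row per ground-truth box and one column per prior, which is the layout the rest of the walkthrough relies on.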

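There are two things to note here:

1. First, each ground-truth box finds the prior that overlaps it most.

2. Then each remaining prior finds the ground-truth box it corresponds to.

If the highest value in a column is below threshold, that prior does not match any ground-truth box well. In practice most priors have an IoU close to 0 with every ground-truth box, so no ground-truth box matches them strongly.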

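Each ground-truth box finds its best-matching prior

Taking the maximum along each row of the IoU matrix gives the prior that matches each ground-truth box best.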

best_prior_overlap, best_prior_idx = overlaps.max(1, keepdim=True)
print(overlaps.max(1, keepdim=True))

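In max, dim=1 takes the maximum along each row (one result per ground-truth box), and dim=0 takes the maximum down each column (one result per prior).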

torch.return_types.max(
values=tensor([[0.8002],
[0.9751],
[0.9889],
[0.9889],
[0.9300]]),
indices=tensor([[9],
[1],
[7],
[2],
[5]]))

We also need the maximum IoU along each column, so that each prior can later find the ground-truth box it corresponds to.

best_truth_overlap, best_truth_idx = overlaps.max(0, keepdim=True)
print(overlaps.max(0, keepdim=True))

torch.return_types.max(
values=tensor([[0.8734, 0.9751, 0.9889, 0.6870, 0.8518, 0.9300, 0.8903, 0.9889, 0.8328,
0.8002]]),
indices=tensor([[2, 1, 3, 1, 3, 4, 4, 2, 2, 0]]))
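
To make the dim argument concrete, here is a tiny example with a hand-written 2 x 3 matrix (the values are made up for illustration):

```python
import torch

iou = torch.tensor([[0.1, 0.7, 0.3],
                    [0.6, 0.2, 0.4]])  # 2 ground-truth boxes x 3 priors

# dim=1 reduces across the columns: one result per row (per ground-truth box)
per_truth_val, per_truth_idx = iou.max(1)
# dim=0 reduces across the rows: one result per column (per prior)
per_prior_val, per_prior_idx = iou.max(0)

print(per_truth_idx)  # tensor([1, 0])
print(per_prior_idx)  # tensor([1, 0, 1])
```

per_truth_idx has one entry per ground-truth box, and per_prior_idx has one entry per prior.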

Remove the unnecessary dimensions.

best_prior_idx.squeeze_(1)  # [num_objects,1]->[num_objects]
best_prior_overlap.squeeze_(1) # [num_objects,1] -> [num_objects]


best_truth_idx.squeeze_(0)  # [1,num_priors] -> [num_priors]
best_truth_overlap.squeeze_(0)

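This ensures that the best prior of each ground-truth box is kept: its overlap is overwritten with 2, so it can never be filtered out by threshold later.

best_truth_overlap.index_fill_(0, best_prior_idx, 2)  # the fill value 2 is arbitrary; anything greater than threshold works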
print("best_truth_overlap:",best_truth_overlap)

best_truth_overlap: tensor([0.8734, 2.0000, 2.0000, 0.6870, 0.8518, 2.0000, 0.8903, 2.0000, 0.8328,2.0000])

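Since the step above guarantees that each GT keeps the prior with the largest IoU, best_truth_idx must also be updated so that those priors point back to their GT.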

for j in range(best_prior_idx.size(0)):
    best_truth_idx[best_prior_idx[j]] = j


print("best_truth_idx:",best_truth_idx)

best_truth_idx: tensor([2, 1, 3, 1, 3, 4, 4, 2, 2, 0])
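
In the output above the loop leaves best_truth_idx unchanged, because each GT's best prior had already picked that GT. A hand-made IoU matrix (values made up for illustration) shows a case where the override does matter: prior 0 is the best prior for GT 1, but on its own prior 0 would pick GT 0.

```python
import torch

# 2 ground-truth boxes x 3 priors
overlaps = torch.tensor([[0.6, 0.9, 0.10],
                         [0.3, 0.2, 0.05]])

best_prior_overlap, best_prior_idx = overlaps.max(1)  # best prior per GT
best_truth_overlap, best_truth_idx = overlaps.max(0)  # best GT per prior
before = best_truth_idx.clone()
print(before)              # tensor([0, 0, 0]): no prior picks GT 1

# Force each GT's best prior to be kept and to point back at that GT
best_truth_overlap.index_fill_(0, best_prior_idx, 2)
for j in range(best_prior_idx.size(0)):
    best_truth_idx[best_prior_idx[j]] = j

print(best_truth_idx)      # tensor([1, 0, 0]): prior 0 now belongs to GT 1
print(best_truth_overlap)  # priors 0 and 1 are set to 2, prior 2 keeps 0.1
```

Without the loop, GT 1 would have no prior at all. With it, prior 0 is assigned to GT 1 even though their IoU is only 0.3, and the fill value 2 keeps that assignment from being thresholded away.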

Next, look up the ground-truth box matched to each prior.

matches = truths[best_truth_idx] # the bbox matched to each PriorBox
print("matches:",matches)

matches: tensor([[ 0.2875, -1.1328, -1.5941, -0.1617],
[-0.3744, 0.7603, -0.0568, 1.5666],
[-0.9408, 1.1018, 1.3028, -2.0684],
[-0.3744, 0.7603, -0.0568, 1.5666],
[-0.9408, 1.1018, 1.3028, -2.0684],
[-0.2388, -1.0179, 0.6412, 1.0418],
[-0.2388, -1.0179, 0.6412, 1.0418],
[ 0.2875, -1.1328, -1.5941, -0.1617],
[ 0.2875, -1.1328, -1.5941, -0.1617],
[ 0.6497, -1.5030, 0.5384, -0.2993]])

Then take the label matched to each anchor.


print("best_truth_overlap:",best_truth_overlap)

conf = labels[best_truth_idx]

print("conf:",conf)

best_truth_overlap: tensor([0.8734, 2.0000, 2.0000, 0.6870, 0.8518, 2.0000, 0.8903, 2.0000, 0.8328,2.0000])

conf: tensor([[1.],
[1.],
[1.],
[1.],
[1.],
[1.],
[1.],
[1.],
[1.],
[1.]])

If the overlap is below threshold, the prior is treated as background.

threshold=0.45
conf[best_truth_overlap < threshold] = 0  # filter out priors whose IoU is too low and mark them as background
print("conf:",conf)

conf: tensor([[1.],
[1.],
[1.],
[1.],
[1.],
[1.],
[1.],
[1.],
[1.],
[1.]])
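
In this run every value in best_truth_overlap is above 0.45, so conf stays all ones. A small sketch with hand-picked values (assumed for illustration, and using 1-D labels as in the shape comment of the original function) shows the thresholding actually producing background entries:

```python
import torch

labels = torch.tensor([1., 1.])               # two faces, both class 1
best_truth_idx = torch.tensor([0, 1, 1, 0])   # matched GT for each of 4 priors
best_truth_overlap = torch.tensor([0.80, 2.00, 0.10, 0.44])
threshold = 0.45

conf = labels[best_truth_idx]                 # every prior starts with its GT's label
conf[best_truth_overlap < threshold] = 0      # weak matches become background
print(conf)  # tensor([1., 1., 0., 0.])
```

The prior whose overlap was forced to 2 keeps its label regardless of its real IoU, while the priors at 0.10 and 0.44 are marked as background.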


References:

https://blog.csdn.net/weixin_41779359/article/details/111414567

https://blog.csdn.net/weixin_44791964/article/details/106872072#_9
