DatawhalTask04：不讲武德-炼丹与品尝

最新推荐文章于 2022-12-21 10:20:55 发布

起名字什么的好难

最新推荐文章于 2022-12-21 10:20:55 发布

阅读量110

点赞数

分类专栏：人工智能 Python 数字图像处理

本文链接：https://blog.csdn.net/u012428169/article/details/111825283

版权

Python 同时被 3 个专栏收录

92 篇文章 3 订阅

订阅专栏

人工智能

37 篇文章 1 订阅

订阅专栏

数字图像处理

19 篇文章 3 订阅

订阅专栏

训练流程：

设置各种超参数
定义数据加载模块 dataloader
定义网络 model
定义损失函数 loss
定义优化器 optimizer
遍历训练数据，预测-计算loss-反向传播
对模型预测进行后处理

测试流程：

读取一张图片
预处理
模型预测
对模型预测进行后处理

后处理：

目标框信息解码
NMS非极大值抑制

######
#此代码只包含单个图像后处理，批处理参考原文
######
# Decode object coordinates from the form we regressed predicted boxes to
        decoded_locs = cxcy_to_xy(                                                                                                                            
            gcxgcy_to_cxcy(predicted_locs[i], self.priors_cxcy))  # (441, 4), these are fractional pt. coordinates

        # Lists to store boxes and scores for this image
        image_boxes = list()
        image_labels = list()
        image_scores = list()

        max_scores, best_label = predicted_scores[i].max(dim=1)  # (441)

        # Check for each class
        for c in range(1, self.n_classes):
            # Keep only predicted boxes and scores where scores for this class are above the minimum score
            class_scores = predicted_scores[i][:, c]  # (441)
            score_above_min_score = class_scores > min_score  # torch.uint8 (byte) tensor, for indexing
            n_above_min_score = score_above_min_score.sum().item()
            if n_above_min_score == 0:
                continue
            class_scores = class_scores[score_above_min_score]  # (n_qualified), n_min_score <= 441
            class_decoded_locs = decoded_locs[score_above_min_score]  # (n_qualified, 4)

            # Sort predicted boxes and scores by scores
            class_scores, sort_ind = class_scores.sort(dim=0, descending=True)  # (n_qualified), (n_min_score)
            class_decoded_locs = class_decoded_locs[sort_ind]  # (n_min_score, 4)

            # Find the overlap between predicted boxes
            overlap = find_jaccard_overlap(class_decoded_locs, class_decoded_locs)  # (n_qualified, n_min_score)

            # Non-Maximum Suppression (NMS)

            # A torch.uint8 (byte) tensor to keep track of which predicted boxes to suppress
            # 1 implies suppress, 0 implies don't suppress
            suppress = torch.zeros((n_above_min_score), dtype=torch.uint8).to(device)  # (n_qualified)

            # Consider each box in order of decreasing scores
            for box in range(class_decoded_locs.size(0)):
                # If this box is already marked for suppression
                if suppress[box] == 1:
                    continue

                # Suppress boxes whose overlaps (with current box) are greater than maximum overlap
                # Find such boxes and update suppress indices
                suppress = torch.max(suppress, (overlap[box] > max_overlap).to(torch.uint8))
                # The max operation retains previously suppressed boxes, like an 'OR' operation

                # Don't suppress this box, even though it has an overlap of 1 with itself
                suppress[box] = 0

            # Store only unsuppressed boxes for this class
            image_boxes.append(class_decoded_locs[1 - suppress])
            image_labels.append(torch.LongTensor((1 - suppress).sum().item() * [c]).to(device))
            image_scores.append(class_scores[1 - suppress])

        # If no object in any class is found, store a placeholder for 'background'
        if len(image_boxes) == 0:
            image_boxes.append(torch.FloatTensor([[0., 0., 1., 1.]]).to(device))
            image_labels.append(torch.LongTensor([0]).to(device))
            image_scores.append(torch.FloatTensor([0.]).to(device))

        # Concatenate into single tensors
        image_boxes = torch.cat(image_boxes, dim=0)  # (n_objects, 4)
        image_labels = torch.cat(image_labels, dim=0)  # (n_objects)
        image_scores = torch.cat(image_scores, dim=0)  # (n_objects)
        n_objects = image_scores.size(0)

        # Keep only the top k objects
        if n_objects > top_k:
            image_scores, sort_ind = image_scores.sort(dim=0, descending=True)
            image_scores = image_scores[:top_k]  # (top_k)
            image_boxes = image_boxes[sort_ind][:top_k]  # (top_k, 4)
            image_labels = image_labels[sort_ind][:top_k]  # (top_k)

NMS非极大值抑制

按照类别分组，依次遍历每个类别。
当前类别按分类置信度排序，并且设置一个最低置信度阈值，低于这个阈值的目标框直接舍弃。
当前概率最高的框作为候选框，其它所有与候选框的IOU高于一个阈值的框认为需要被抑制，从剩余框数组中删除。
然后在剩余的框里寻找概率第二大的框，其它所有与第二大的框的IOU高于设定阈值的框被抑制。
依次类推重复这个过程，直至遍历完所有剩余框，所有没被抑制的框即为最终检测框。

Fast R-CNN中的NMS实现

# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick
# --------------------------------------------------------
import numpy as np
# dets: 检测的 boxes 及对应的 scores；
# thresh: 设定的阈值

def nms(dets,thresh):
    # boxes 位置
    x1 = dets[:,0] 
    y1 = dets[:,1] 
    x2 = dets[:,2]
    y2 = dets[:,3]
    # boxes scores
    scores = dets[:,4]
    areas = (x2-x1+1)*(y2-y1+1)   # 各box的面积
    order = scores.argsort()[::-1]  # 分类置信度排序
    keep = []                        # 记录保留下的 boxes
    while order.size > 0:
        i = order[0]               # score最大的box对应的 index
        keep.append(i)        # 将本轮score最大的box的index保留
        \# 计算剩余 boxes 与当前 box 的重叠程度 IoU
        xx1 = np.maximum(x1[i],x1[order[1:]])
        yy1 = np.maximum(y1[i],y1[order[1:]])
        xx2 = np.minimum(x2[i],x2[order[1:]])
        yy2 = np.minimum(y2[i],y2[order[1:]])
        w = np.maximum(0.0,xx2-xx1+1) # IoU
        h = np.maximum(0.0,yy2-yy1+1)
        inter = w*h
        ovr = inter/(areas[i]+areas[order[1:]]-inter)
        \# 保留 IoU 小于设定阈值的 boxes
        inds = np.where(ovr<=thresh)[0]
        order = order[inds+1]
    return keep

VOC测试集评测指标

一级指标

 1. 真实值是positive，模型认为是positive的数量（True Positive=TP）
 2. 真实值是positive，模型认为是negative的数量（False Negative = FN）
 3. 真实值是negative，模型认为是positive的数量（False Positive = FP）
 4. 真实值是negative，模型认为是negative的数量（True Negative = TN）

混淆矩阵记录了一级指标并且便于计算二级指标。
在这里插入图片描述

二级指标

 1. 准确率（Accuracy）-----针对整个模型
 2. 精确率（Precision）
 3. 灵敏度（Sensitivity）：就是召回率（Recall）
 4. 特异度（Specificity）

在这里插入图片描述

三级指标

 1. F1 Score
		F1 Score = 2PR / P+R
		其中，P代表Precision，R代表Recall（召回率）。F1-Score指标综合了Precision与Recall的产出的结果。F1-Score的取值范围从0到1,1代表模型的输出最好，0代表模型的输出结果最差。		
 2. AP指标即Average Precision 即平均精确度。
 3. mAP即Mean Average Precision即平均AP值，是对多个验证集个体求平均AP值，作为object
    detection中衡量检测精度的指标。

P-R曲线：以precision和recall作为纵、横轴坐标的二维曲线。通过选取不同阈值时对应的精度和召回率画出。
在这里插入图片描述
在目标检测中，每一类都可以根据recall和precision绘制P-R曲线，AP就是该曲线下的面积，mAP就是所有类的AP的平均值（VOC数据集计算方法）。

参考资料：

3.6、训练与测试
 3.6、训练与测试代码

混淆矩阵、准确率、精确率/查准率、召回率/查全率、F1值、ROC曲线的AUC值
 混淆矩阵

NMS(非极大值抑制算法)
【56】目标检测之NMS非极大值抑制
 图像处理中常用的非极大值抑制是什么意思？

起名字什么的好难

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
1
评论
DatawhalTask04：不讲武德-炼丹与品尝

训练流程：设置各种超参数定义数据加载模块 dataloader定义网络 model定义损失函数 loss定义优化器 optimizer遍历训练数据，预测-计算loss-反向传播对模型预测进行后处理测试流程：读取一张图片预处理模型预测对模型预测进行后处理后处理：目标框信息解码NMS非极大值抑制#######此代码只包含单个图像后处理，批处理参考原文####### Decode object coordinates from the form we regres
复制链接

扫一扫