Visualizing heatmaps for YOLOv7 layers

Latest recommended article published on 2024-08-31 22:46:14

Limiiiing


Category: Scripts. Tags: YOLO, computer vision

Article link: https://blog.csdn.net/qq_42591591/article/details/141752868


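Papers often include heatmaps to show at a glance how well a model works. For networks that use attention modules in particular, a heatmap can verify whether the attention mechanism really focuses on the expected important features, which helps assess whether the model is effective and behaves sensibly.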

For example, the heatmaps shown in the paper Centralized Feature Pyramid for Object Detection clearly convey the advantage of the authors' improved model over earlier models.

This post records how to use a Python script to output heatmaps for individual YOLOv7 layers.

Steps:

1️⃣ In the root directory of your local YOLOv7 project, create heatmap.py and copy the following code into it

import warnings
warnings.filterwarnings('ignore')
warnings.simplefilter('ignore')
import torch, yaml, cv2, os, shutil
import torch.nn as nn
import numpy as np
np.random.seed(0)
import matplotlib.pyplot as plt
from tqdm import trange
from PIL import Image
from models.yolo import Model
from utils.torch_utils import intersect_dicts
from utils.datasets import letterbox
from utils.general import xywh2xyxy
from pytorch_grad_cam import GradCAMPlusPlus, GradCAM, XGradCAM
from pytorch_grad_cam.utils.image import show_cam_on_image
from pytorch_grad_cam.activations_and_gradients import ActivationsAndGradients

class yolov7_heatmap:
    def __init__(self, weight, cfg, device, method, layer, backward_type, conf_threshold, ratio):
        device = torch.device(device)
        ckpt = torch.load(weight, map_location=device)  # load the checkpoint onto the target device
        model_names = ckpt['model'].names
        csd = ckpt['model'].float().state_dict()  # checkpoint state_dict as FP32
        model = Model(cfg, ch=3, nc=len(model_names)).to(device)
        csd = intersect_dicts(csd, model.state_dict(), exclude=['anchor'])  # intersect
        model.load_state_dict(csd, strict=False)  # load
        model.eval()
        print(f'Transferred {len(csd)}/{len(model.state_dict())} items')
        
        target_layers = [eval(layer)]
        method = eval(method)

        colors = np.random.uniform(0, 255, size=(len(model_names), 3)).astype(int)  # np.int was removed in NumPy 1.24
        self.__dict__.update(locals())
    
    def post_process(self, result):
        boxes_ = result[0][..., :4]
        logits_ = []
        for data in result[1]:
            bs, n, w, h, _ = data.size()
            logits_.append(data.reshape((bs, n * w * h, _)))
        logits_ = torch.cat(logits_, dim=1)[..., 4:]
        _, indices = torch.sort(logits_[..., 0], descending=True)  # rank by objectness; avoid shadowing the built-in sorted
        logits_ = logits_[0][indices[0]]
        logits_[:, 0] = torch.sigmoid(logits_[:, 0])
        return logits_, xywh2xyxy(boxes_[0][indices[0]]).cpu().detach().numpy()

    def draw_detections(self, box, color, name, img):
        xmin, ymin, xmax, ymax = list(map(int, list(box)))
        cv2.rectangle(img, (xmin, ymin), (xmax, ymax), tuple(int(x) for x in color), 2)
        cv2.putText(img, str(name), (xmin, ymin - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.8, tuple(int(x) for x in color), 2, lineType=cv2.LINE_AA)
        return img

    def __call__(self, img_path, save_path):
        # remove dir if exist
        if os.path.exists(save_path):
            shutil.rmtree(save_path)
        # make dir if not exist
        os.makedirs(save_path, exist_ok=True)

        # img process
        img = cv2.imread(img_path)
        img = letterbox(img)[0]
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = np.float32(img) / 255.0
        tensor = torch.from_numpy(np.transpose(img, axes=[2, 0, 1])).unsqueeze(0).to(self.device)

        # init ActivationsAndGradients
        grads = ActivationsAndGradients(self.model, self.target_layers, reshape_transform=None)

        # get ActivationsAndResult
        result = grads(tensor)
        activations = grads.activations[0].cpu().detach().numpy()

        # postprocess to yolo output
        post_result, post_boxes = self.post_process(result)
        for i in trange(int(post_result.size(0) * self.ratio)):
            if post_result[i][0] < self.conf_threshold:
                break

            self.model.zero_grad()
            if self.backward_type == 'conf':
                post_result[i, 0].backward(retain_graph=True)
            else:
                # get max probability for this prediction
                score = post_result[i, 1:].max()
                score.backward(retain_graph=True)

            # process heatmap
            gradients = grads.gradients[0]
            b, k, u, v = gradients.size()
            weights = self.method.get_cam_weights(self.method, None, None, None, activations, gradients.detach().numpy())
            weights = weights.reshape((b, k, 1, 1))
            saliency_map = np.sum(weights * activations, axis=1)
            saliency_map = np.squeeze(np.maximum(saliency_map, 0))
            saliency_map = cv2.resize(saliency_map, (tensor.size(3), tensor.size(2)))
            saliency_map_min, saliency_map_max = saliency_map.min(), saliency_map.max()
            if (saliency_map_max - saliency_map_min) == 0:
                continue
            saliency_map = (saliency_map - saliency_map_min) / (saliency_map_max - saliency_map_min)

            # add heatmap and box to image
            cam_image = show_cam_on_image(img.copy(), saliency_map, use_rgb=True)
            #cam_image = self.draw_detections(post_boxes[i], self.colors[int(post_result[i, 1:].argmax())], f'{self.model_names[int(post_result[i, 1:].argmax())]} {post_result[i][0]:.2f}', cam_image)
            cam_image = Image.fromarray(cam_image)
            cam_image.save(f'{save_path}/{i}.png')

def get_params():
    params = {
        'weight': 'runs/train/exp/weights/best.pt',  
        'cfg': 'cfg/training/yolov7_test.yaml',
        'device': 'cuda:0',
        'method': 'GradCAM', # GradCAMPlusPlus, GradCAM, XGradCAM
        'layer': 'model.model[-2]',  
        'backward_type': 'class', # class or conf
        'conf_threshold': 0.6, # 0.6
        'ratio': 0.02 # 0.02-0.1
    }
    return params

if __name__ == '__main__':
    model = yolov7_heatmap(**get_params())
    model('inference/heat_image/001.jpg', 'heat_result')
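
The heatmap step inside the loop above can be hard to follow because it calls into pytorch_grad_cam. As a rough illustration only, here is what that step amounts to when method is GradCAM, written in plain NumPy. The function name grad_cam_map and the toy arrays are made up for this sketch, and the cv2.resize to the input size is left out.

```python
import numpy as np

def grad_cam_map(activations, gradients):
    """Grad-CAM saliency map for one image.

    activations, gradients: arrays of shape (1, K, H, W) from the chosen
    layer, the same layout heatmap.py works with.
    """
    # Channel weights: spatial mean of the gradients, shape (1, K)
    weights = gradients.mean(axis=(2, 3))
    # Weighted sum over channels followed by ReLU, shape (H, W)
    cam = np.maximum((weights[:, :, None, None] * activations).sum(axis=1), 0)[0]
    lo, hi = cam.min(), cam.max()
    if hi - lo == 0:
        return None  # flat map; heatmap.py skips such predictions as well
    # Min-max normalisation to [0, 1] so it can be overlaid on the image
    return (cam - lo) / (hi - lo)

# Toy input (hypothetical values): 2 channels on a 2x2 feature map
acts = np.array([[[[1.0, 2.0], [3.0, 4.0]],
                  [[4.0, 3.0], [2.0, 1.0]]]])
grads = np.stack([np.full((2, 2), 1.0), np.full((2, 2), -0.5)])[None]
cam = grad_cam_map(acts, grads)
print(cam)
```

GradCAMPlusPlus and XGradCAM differ only in how the channel weights are computed; the weighted sum, ReLU and normalisation stay the same in the script.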

2️⃣ Adjust the configuration parameters

The main parameters in the file are configured as follows:

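Parameter	Description
weight	Path to the weights file produced by training
cfg	Path to the model config file; it must match the model the weights were trained with
device	Device to run on; set it the same way as the device argument used for training
method	One of GradCAM, GradCAMPlusPlus and XGradCAM; try each, since they give different results
layer	The layer whose heatmap you want, written as an expression; model.model[-2] here means the second-to-last layer. Try other layers and compare the results
backward_type	What to backpropagate from: class uses the highest class score of each prediction, conf uses its confidence
conf_threshold	Confidence threshold, set to 0.6
ratio	Fraction of the top-ranked predictions to process, set to 0.02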
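
To make the backward_type row concrete, here is a minimal sketch of which scalar is chosen for one prediction. It uses plain floats instead of tensors, so it only shows the selection, not the gradient computation. The function name backward_target and the example row are hypothetical.

```python
import math

def backward_target(row, backward_type):
    """Pick the scalar that backpropagation would start from.

    row: [objectness_logit, class_score_0, class_score_1, ...],
    mirroring one row of post_result in heatmap.py.
    """
    if backward_type == 'conf':
        # heatmap.py applies a sigmoid to the objectness value
        return 1.0 / (1.0 + math.exp(-row[0]))
    # 'class': the largest class score of this prediction
    return max(row[1:])

row = [0.0, -1.0, 2.5, 0.3]  # made-up values
print(backward_target(row, 'conf'))
print(backward_target(row, 'class'))
```

In the script the same choice is made on tensors, and .backward() is then called on the selected value.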

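In the screenshot (not reproduced here), the number the arrow points to is the row number, presumably the value to use as the layer index.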
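
If counting rows by hand is error-prone, the indices can also be listed programmatically. The sketch below assumes the usual YOLOv7 config layout, where the model is built from the backbone rows followed by the head rows, one module per row. The toy_cfg dictionary is a hypothetical, heavily shortened stand-in for a real yaml file, and list_layers is a helper invented for this example.

```python
def list_layers(cfg):
    """Return (index, negative_index, module_name) for every config row."""
    rows = cfg['backbone'] + cfg['head']
    # Each row has the form [from, number, module, args]
    return [(i, i - len(rows), row[2]) for i, row in enumerate(rows)]

# Hypothetical, shortened config; a real YOLOv7 yaml has far more rows
toy_cfg = {
    'backbone': [
        [-1, 1, 'Conv', [32, 3, 1]],
        [-1, 1, 'Conv', [64, 3, 2]],
    ],
    'head': [
        [-1, 1, 'SPPCSPC', [512]],
        [-1, 1, 'RepConv', [256, 3, 1]],
        [[3], 1, 'IDetect', ['nc', 'anchors']],
    ],
}

layers = list_layers(toy_cfg)
for index, negative_index, name in layers:
    print(index, negative_index, name)
```

With this toy config, 'model.model[-2]' and 'model.model[3]' would both refer to the RepConv row, which sits just before the detection head.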

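3️⃣ Input data

In model('inference/heat_image/001.jpg', 'heat_result'):

The first argument, 'inference/heat_image/001.jpg', is the path of the source image to draw heatmaps for.

The second argument, 'heat_result', is the path of the folder the finished heatmaps are written to.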
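
Note that each call deletes the output folder if it already exists, so running several images into the same folder keeps only the last result. One possible workaround, sketched here with a hypothetical helper plan_jobs, is to give every image its own subfolder named after the file. The temporary directory and file names only serve as a demonstration.

```python
import tempfile
from pathlib import Path

def plan_jobs(image_dir, out_root, exts=('.jpg', '.jpeg', '.png')):
    """Pair every image in image_dir with its own output folder."""
    jobs = []
    for path in sorted(Path(image_dir).iterdir()):
        if path.suffix.lower() in exts:
            # e.g. heat_image/001.jpg -> heat_result/001
            jobs.append((str(path), str(Path(out_root) / path.stem)))
    return jobs

# Demonstration on an empty temporary folder with made-up file names
with tempfile.TemporaryDirectory() as tmp:
    for name in ('b.jpg', 'a.png', 'notes.txt'):
        (Path(tmp) / name).touch()
    jobs = plan_jobs(tmp, 'heat_result')

job_names = [(Path(img).name, Path(out).name) for img, out in jobs]
print(job_names)
```

Each pair can then be passed to the heatmap object, for example `for img, out in jobs: model(img, out)`. Two images that share a stem but differ in extension would still collide under this naming scheme.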

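4️⃣ Run

At this point the heatmaps have been drawn and written to the specified folder. The progress bar stops before it is full because the remaining targets fall below the conf_threshold setting.

The progress bar length of 151 follows from the ratio parameter set earlier: only the top 0.02 of the predictions, ranked by confidence, are considered for heatmap visualization.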
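
The interplay between ratio and conf_threshold can be reproduced with a few lines of plain Python. This is only a sketch of the loop logic in heatmap.py. The function count_heatmaps and the confidence list are invented, with the list length picked so that the bar length comes out at 151.

```python
def count_heatmaps(confidences, ratio, conf_threshold):
    """Return (progress bar length, number of predictions processed)."""
    ranked = sorted(confidences, reverse=True)
    # trange(int(N * ratio)) in heatmap.py: the bar covers the top fraction
    bar_length = int(len(ranked) * ratio)
    processed = 0
    for conf in ranked[:bar_length]:
        if conf < conf_threshold:
            break  # the loop stops early, so the bar never fills up
        processed += 1
    return bar_length, processed

# Made-up confidences: three strong predictions among 7550 candidates
confidences = [0.9, 0.8, 0.7] + [0.1] * 7547
bar_length, processed = count_heatmaps(confidences, ratio=0.02, conf_threshold=0.6)
print(bar_length, processed)
```

The number of saved images can be lower still, because the script skips predictions whose saliency map is completely flat.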

