In deep learning, feature visualization means displaying, through visual methods, how a model extracts and represents features internally. It helps us understand the model's decision process and which regions it attends to at different layers, and it plays a key role in debugging, optimizing, and interpreting models.
I. Network Structure Visualization
1. Convert the .pt file to ONNX
First convert the model you want to visualize to ONNX using the code below; after running, the .onnx file is generated in the same directory.
from ultralytics import YOLO
# Load the trained model
model = YOLO("yolov8n.pt")
# Export the model to ONNX format
success = model.export(format='onnx')
2. Visualize the network structure with Netron
Netron is an open-source visualization tool for inspecting the structure and weights of neural networks, convolutional networks, recurrent networks, and other machine learning models.
- Broad format support: reads model files from many popular frameworks, including TensorFlow, PyTorch, ONNX, and Caffe.
- Live updates: when the model or its parameters change, Netron immediately refreshes the visualization.
- Rich interaction: zooming, panning, searching, and more.
- Easy to use and deploy: a cross-platform tool that runs on Linux, Windows, and macOS.
Open the Netron website and upload the model file. Netron renders the model graph, which can be exported as a PNG or SVG image; the panel on the right shows each module's parameters, such as kernel size and stride. With these steps we can clearly see the whole network structure and inspect the details of every node.
II. Feature Visualization
1. The visualization pipeline
The pipeline generally includes:
- Image preprocessing: resize, color-space conversion, normalization, etc., so the input meets the model's requirements.
- Forward pass: feed the preprocessed image through the model to obtain predictions and intermediate activations.
- Gradient computation: select a target class and compute the gradient of its score with respect to the activations of a chosen layer.
- Heatmap generation: weight the activations by gradient-derived weights to produce a class activation map (heatmap).
- Overlay: blend the heatmap onto the original image to visualize the regions the model attends to.
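The gradient-weighting and normalization steps above can be sketched in plain NumPy. This is a minimal illustration with random arrays standing in for real activations and gradients (the full pipeline below uses pytorch_grad_cam):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for one batch of layer activations A and gradients dScore/dA,
# both of shape (batch, channels, height, width).
activations = rng.standard_normal((1, 8, 20, 20))
gradients = rng.standard_normal((1, 8, 20, 20))

# Grad-CAM weights: global-average-pool the gradients per channel.
weights = gradients.mean(axis=(2, 3), keepdims=True)  # (1, 8, 1, 1)

# Weighted sum over channels, then ReLU to keep positive evidence only.
cam = np.maximum((weights * activations).sum(axis=1), 0).squeeze()  # (20, 20)

# Normalize to [0, 1] so it can be rendered as a heatmap overlay.
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```

The resulting `cam` array is what gets resized to the input resolution and blended onto the original image.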
2. Hands-on example
We implement heatmap visualization for a YOLOv8 model, highlighting the regions the model attends to during detection.
Dependencies:
- warnings: suppress warning messages
- torch: the PyTorch deep learning framework
- cv2: OpenCV, for image processing
- os and shutil: file and directory operations
- numpy: numerical computing
- matplotlib.pyplot: plotting
- tqdm: progress bars
- PIL.Image: image loading and saving
- ultralytics: the YOLOv8 model library
- pytorch_grad_cam: Grad-CAM heatmap generation
The letterbox function resizes and pads the image to the model's input size while preserving its aspect ratio.
def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32):
    # Resize and pad image while meeting stride-multiple constraints
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)
    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better val mAP)
        r = min(r, 1.0)
    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    if auto:  # minimum rectangle
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios
    dw /= 2  # divide padding into 2 sides
    dh /= 2
    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im, ratio, (dw, dh)
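To see what the padding arithmetic actually produces, here is a hedged sketch that predicts the output size of letterbox without touching pixel data (the helper name `letterbox_shape` is our own, not part of the original code):

```python
import numpy as np

def letterbox_shape(h, w, new_shape=640, stride=32, auto=True):
    """Predict the (height, width) that letterbox() would output,
    mirroring its scale and padding arithmetic."""
    r = min(new_shape / h, new_shape / w)
    new_w, new_h = int(round(w * r)), int(round(h * r))
    dw, dh = new_shape - new_w, new_shape - new_h
    if auto:  # pad only up to the next stride multiple, not to a full square
        dw, dh = int(np.mod(dw, stride)), int(np.mod(dh, stride))
    return new_h + dh, new_w + dw

# A 1080x810 image into a 640x640 target: r = 640/1080 gives a 640x480
# resize; with auto=True the 160 px width padding is reduced mod 32 to 0.
print(letterbox_shape(1080, 810))              # (640, 480)
print(letterbox_shape(1080, 810, auto=False))  # (640, 640)
```

Note that with `auto=True` (the default used later in `__call__`) the output is generally not square: padding only extends each side to the next multiple of the stride.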
The constructor loads the model weights and config file, selects the target layer and heatmap method, and generates random colors for drawing detection boxes.
class yolov8_heatmap:
    def __init__(self, weight, cfg, device, method, layer, backward_type, conf_threshold, ratio):
        device = torch.device(device)
        ckpt = torch.load(weight)
        model_names = ckpt['model'].names
        csd = ckpt['model'].float().state_dict()  # checkpoint state_dict as FP32
        model = Model(cfg, ch=3, nc=len(model_names)).to(device)
        csd = intersect_dicts(csd, model.state_dict(), exclude=['anchor'])  # intersect
        model.load_state_dict(csd, strict=False)  # load
        model.eval()
        print(f'Transferred {len(csd)}/{len(model.state_dict())} items')
        target_layers = [eval(layer)]
        method = eval(method)
        colors = np.random.uniform(0, 255, size=(len(model_names), 3)).astype(np.int32)
        self.__dict__.update(locals())
post_process extracts class probabilities and bounding-box coordinates from the model output, sorts predictions by confidence, and converts the boxes from xywh to xyxy format.
def post_process(self, result):
    logits_ = result[:, 4:]
    boxes_ = result[:, :4]
    sorted, indices = torch.sort(logits_.max(1)[0], descending=True)
    return (torch.transpose(logits_[0], dim0=0, dim1=1)[indices[0]],
            torch.transpose(boxes_[0], dim0=0, dim1=1)[indices[0]],
            xywh2xyxy(torch.transpose(boxes_[0], dim0=0, dim1=1)[indices[0]]).cpu().detach().numpy())
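The xywh-to-xyxy conversion used above can be illustrated with a small NumPy stand-in (the helper name `xywh2xyxy_np` is our own; the real code uses ultralytics' `xywh2xyxy`):

```python
import numpy as np

def xywh2xyxy_np(boxes):
    """Convert (center-x, center-y, width, height) boxes to
    (xmin, ymin, xmax, ymax) corner coordinates."""
    boxes = np.asarray(boxes, dtype=float)
    out = boxes.copy()
    out[..., 0] = boxes[..., 0] - boxes[..., 2] / 2  # xmin
    out[..., 1] = boxes[..., 1] - boxes[..., 3] / 2  # ymin
    out[..., 2] = boxes[..., 0] + boxes[..., 2] / 2  # xmax
    out[..., 3] = boxes[..., 1] + boxes[..., 3] / 2  # ymax
    return out

# A 40x20 box centered at (100, 100) becomes corners (80, 90)-(120, 110).
corners = xywh2xyxy_np([100, 100, 40, 20])
```

The xyxy form is what OpenCV's `cv2.rectangle` expects when drawing detection boxes.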
draw_detections draws the detected bounding boxes and class labels on the image.
def draw_detections(self, box, color, name, img):
    xmin, ymin, xmax, ymax = list(map(int, list(box)))
    cv2.rectangle(img, (xmin, ymin), (xmax, ymax), tuple(int(x) for x in color), 2)
    cv2.putText(img, str(name), (xmin, ymin - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.8,
                tuple(int(x) for x in color), 2, lineType=cv2.LINE_AA)
    return img
Full code:
import warnings
warnings.filterwarnings('ignore')
warnings.simplefilter('ignore')
import torch, cv2, os, shutil
import numpy as np
np.random.seed(0)
import matplotlib.pyplot as plt
from tqdm import trange
from PIL import Image
from ultralytics.nn.tasks import DetectionModel as Model
# note: the `ultralytics.yolo.*` paths below match older ultralytics releases;
# newer versions moved these utilities under `ultralytics.utils`
from ultralytics.yolo.utils.torch_utils import intersect_dicts
from ultralytics.yolo.utils.ops import xywh2xyxy
from pytorch_grad_cam import GradCAMPlusPlus, GradCAM, XGradCAM
from pytorch_grad_cam.utils.image import show_cam_on_image
from pytorch_grad_cam.activations_and_gradients import ActivationsAndGradients
def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32):
    # Resize and pad image while meeting stride-multiple constraints
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)
    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better val mAP)
        r = min(r, 1.0)
    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    if auto:  # minimum rectangle
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios
    dw /= 2  # divide padding into 2 sides
    dh /= 2
    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im, ratio, (dw, dh)
class yolov8_heatmap:
    def __init__(self, weight, cfg, device, method, layer, backward_type, conf_threshold, ratio):
        device = torch.device(device)
        ckpt = torch.load(weight)
        model_names = ckpt['model'].names
        csd = ckpt['model'].float().state_dict()  # checkpoint state_dict as FP32
        model = Model(cfg, ch=3, nc=len(model_names)).to(device)
        csd = intersect_dicts(csd, model.state_dict(), exclude=['anchor'])  # intersect
        model.load_state_dict(csd, strict=False)  # load
        model.eval()
        print(f'Transferred {len(csd)}/{len(model.state_dict())} items')
        target_layers = [eval(layer)]
        method = eval(method)
        colors = np.random.uniform(0, 255, size=(len(model_names), 3)).astype(np.int32)
        self.__dict__.update(locals())

    def post_process(self, result):
        logits_ = result[:, 4:]
        boxes_ = result[:, :4]
        sorted, indices = torch.sort(logits_.max(1)[0], descending=True)
        return (torch.transpose(logits_[0], dim0=0, dim1=1)[indices[0]],
                torch.transpose(boxes_[0], dim0=0, dim1=1)[indices[0]],
                xywh2xyxy(torch.transpose(boxes_[0], dim0=0, dim1=1)[indices[0]]).cpu().detach().numpy())

    def draw_detections(self, box, color, name, img):
        xmin, ymin, xmax, ymax = list(map(int, list(box)))
        cv2.rectangle(img, (xmin, ymin), (xmax, ymax), tuple(int(x) for x in color), 2)
        cv2.putText(img, str(name), (xmin, ymin - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.8,
                    tuple(int(x) for x in color), 2, lineType=cv2.LINE_AA)
        return img

    def __call__(self, img_path, save_path):
        # remove the output dir if it exists, then recreate it
        if os.path.exists(save_path):
            shutil.rmtree(save_path)
        os.makedirs(save_path, exist_ok=True)
        # image preprocessing
        img = cv2.imread(img_path)
        img = letterbox(img)[0]
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = np.float32(img) / 255.0
        tensor = torch.from_numpy(np.transpose(img, axes=[2, 0, 1])).unsqueeze(0).to(self.device)
        # init ActivationsAndGradients
        grads = ActivationsAndGradients(self.model, self.target_layers, reshape_transform=None)
        # forward pass: collect activations and the model result
        result = grads(tensor)
        activations = grads.activations[0].cpu().detach().numpy()
        # postprocess to yolo output
        post_result, pre_post_boxes, post_boxes = self.post_process(result[0])
        for i in trange(int(post_result.size(0) * self.ratio)):
            if float(post_result[i].max()) < self.conf_threshold:
                break
            self.model.zero_grad()
            # backpropagate from the max class score and/or the box coordinates
            if self.backward_type == 'class' or self.backward_type == 'all':
                score = post_result[i].max()
                score.backward(retain_graph=True)
            if self.backward_type == 'box' or self.backward_type == 'all':
                for j in range(4):
                    score = pre_post_boxes[i, j]
                    score.backward(retain_graph=True)
            # process heatmap
            if self.backward_type == 'class':
                gradients = grads.gradients[0]
            elif self.backward_type == 'box':
                gradients = grads.gradients[0] + grads.gradients[1] + grads.gradients[2] + grads.gradients[3]
            else:
                gradients = grads.gradients[0] + grads.gradients[1] + grads.gradients[2] + grads.gradients[3] + \
                            grads.gradients[4]
            b, k, u, v = gradients.size()
            weights = self.method.get_cam_weights(self.method, None, None, None, activations,
                                                  gradients.detach().numpy())
            weights = weights.reshape((b, k, 1, 1))
            saliency_map = np.sum(weights * activations, axis=1)
            saliency_map = np.squeeze(np.maximum(saliency_map, 0))
            saliency_map = cv2.resize(saliency_map, (tensor.size(3), tensor.size(2)))
            saliency_map_min, saliency_map_max = saliency_map.min(), saliency_map.max()
            if (saliency_map_max - saliency_map_min) == 0:
                continue
            saliency_map = (saliency_map - saliency_map_min) / (saliency_map_max - saliency_map_min)
            # overlay the heatmap on the image and save it
            cam_image = show_cam_on_image(img.copy(), saliency_map, use_rgb=True)
            cam_image = Image.fromarray(cam_image)
            cam_image.save(f'{save_path}/{i}.png')
def get_params():
    params = {
        'weight': './yolov8n.pt',  # path to the model weights to visualize
        'cfg': r'F:\Project\ultralyticsnew\ultralytics\models\v8\yolov8.yaml',  # .yaml config matching the weights above
        'device': 'cpu',  # device: '0' means GPU 0; use 'cpu' for CPU
        'method': 'GradCAM',  # GradCAMPlusPlus, GradCAM, XGradCAM
        'layer': 'model.model[18]',  # feature layer to visualize
        'backward_type': 'all',  # class, box, all
        'conf_threshold': 0.65,  # confidence threshold, default 0.65; adjust as needed
        'ratio': 0.02  # fraction of top predictions to keep, default 0.02; adjust as needed
    }
    return params
if __name__ == '__main__':
    model = yolov8_heatmap(**get_params())  # initialize
    # first argument: image path; second: save directory (e.g. 'result').
    # The directory is created if missing; if it already exists, it is cleared first.
    model(r'F:\Project\ultralyticsnew\datasets\bus.jpg', './result/map_18')
Note: after running the code, check the printed `Transferred .../... items` line. If the numerator and denominator are not equal, the cfg file does not match the model weights.
The resulting heatmaps are saved to the output directory, one PNG per prediction.