In deep learning, feature visualization means displaying, through visual methods, how a model extracts and represents features internally. It helps us understand the model's decision process and which regions it attends to at different layers, and it plays a key role in debugging, optimizing, and interpreting models.
I. Network Structure Visualization
1. Convert the .pt file to ONNX
First convert the model you want to visualize to ONNX using the code below; after running, the .onnx file is generated in the same directory.
from ultralytics import YOLO
# Load the trained model
model = YOLO("yolov8n.pt")
# Export the model to ONNX format
success = model.export(format='onnx')
2. Visualize the network structure with Netron
Netron is an open-source visualization tool for inspecting the structure and weights of neural networks, convolutional networks, recurrent networks, and other machine learning models.
- Broad format support: reads model files from many popular frameworks, including TensorFlow, PyTorch, ONNX, and Caffe.
- Live updates: when the model or its parameters change, Netron immediately refreshes the visualization.
- Rich interaction: zooming, panning, searching, and more.
- Easy to use and deploy: a cross-platform tool that runs on Linux, Windows, and macOS.
Open the Netron website and upload the model file. Netron renders the model graph, which can be exported as a PNG or SVG image; the panel on the right shows each module's parameters, such as kernel size and stride. With these steps we can clearly see the whole network structure and inspect the details of every node.
II. Feature Visualization
1. The visualization pipeline
The pipeline generally includes:
- Image preprocessing: resize, color-space conversion, normalization, etc., so the input meets the model's requirements.
- Forward pass: feed the preprocessed image through the model to obtain predictions and intermediate activations.
- Gradient computation: select a target class and compute the gradient of its score with respect to the activations of a chosen layer.
- Heatmap generation: weight the activations by gradient-derived weights to produce a class activation map (heatmap).
- Overlay: blend the heatmap onto the original image to visualize the regions the model attends to.
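The gradient-weighting and normalization steps above can be sketched in plain NumPy. This is a minimal illustration with random arrays standing in for real activations and gradients (the full pipeline below uses pytorch_grad_cam):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for one batch of layer activations A and gradients dScore/dA,
# both of shape (batch, channels, height, width).
activations = rng.standard_normal((1, 8, 20, 20))
gradients = rng.standard_normal((1, 8, 20, 20))

# Grad-CAM weights: global-average-pool the gradients per channel.
weights = gradients.mean(axis=(2, 3), keepdims=True)  # (1, 8, 1, 1)

# Weighted sum over channels, then ReLU to keep positive evidence only.
cam = np.maximum((weights * activations).sum(axis=1), 0).squeeze()  # (20, 20)

# Normalize to [0, 1] so it can be rendered as a heatmap overlay.
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```

The resulting `cam` array is what gets resized to the input resolution and blended onto the original image.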
2. Hands-on example
We implement heatmap visualization for a YOLOv8 model, highlighting the regions the model attends to during detection.
Dependencies:
- warnings: suppress warning messages
- torch: the PyTorch deep learning framework
- cv2: OpenCV, for image processing
- os and shutil: file and directory operations
- numpy: numerical computing
- matplotlib.pyplot: plotting
- tqdm: progress bars
- PIL.Image: image loading and saving
- ultralytics: the YOLOv8 model library
- pytorch_grad_cam: Grad-CAM heatmap generation
The letterbox function resizes and pads the image to the model's input size while preserving its aspect ratio.
def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32):
    # Resize and pad image while meeting stride-multiple constraints
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)
    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better val mAP)
        r = min(r, 1.0)
    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    if auto:  # minimum rectangle
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios
    dw /= 2  # divide padding into 2 sides
    dh /= 2
    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im, ratio, (dw, dh)
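To see what the padding arithmetic actually produces, here is a hedged sketch that predicts the output size of letterbox without touching pixel data (the helper name `letterbox_shape` is our own, not part of the original code):

```python
import numpy as np

def letterbox_shape(h, w, new_shape=640, stride=32, auto=True):
    """Predict the (height, width) that letterbox() would output,
    mirroring its scale and padding arithmetic."""
    r = min(new_shape / h, new_shape / w)
    new_w, new_h = int(round(w * r)), int(round(h * r))
    dw, dh = new_shape - new_w, new_shape - new_h
    if auto:  # pad only up to the next stride multiple, not to a full square
        dw, dh = int(np.mod(dw, stride)), int(np.mod(dh, stride))
    return new_h + dh, new_w + dw

# A 1080x810 image into a 640x640 target: r = 640/1080 gives a 640x480
# resize; with auto=True the 160 px width padding is reduced mod 32 to 0.
print(letterbox_shape(1080, 810))              # (640, 480)
print(letterbox_shape(1080, 810, auto=False))  # (640, 640)
```

Note that with `auto=True` (the default used later in `__call__`) the output is generally not square: padding only extends each side to the next multiple of the stride.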
The constructor loads the model weights and config file, selects the target layer and heatmap method, and generates random colors for drawing detection boxes.
class yolov8_heatmap:
    def __init__(self, weight, cfg, device, method, layer, backward_type, conf_threshold, ratio):
        device = torch.device(device)
        ckpt = torch.load(weight)
        model_names = ckpt['model'].names
        csd = ckpt['model'].float().state_dict()  # checkpoint state_dict as FP32
        model = Model(cfg, ch=3, nc=len(model_names)).to(device)
        csd = intersect_dicts(csd, model.state_dict(), exclude=['anchor'])  # intersect
        model.load_state_dict(csd, strict=False)  # load
        model.eval()
        print(f'Transferred {len(csd)}/{len(model.state_dict())} items')
        target_layers = [eval(layer)]
        method = eval(method)
        colors = np.random.uniform(0, 255, size=(len(model_names), 3)).astype(np.int32)
        self.__dict__.update(locals())
post_process extracts class probabilities and bounding-box coordinates from the model output, sorts predictions by confidence, and converts the boxes from xywh to xyxy format.
def post_process(self, result):
    logits_ = result[:, 4:]
    boxes_ = result[:, :4]
    sorted, indices = torch.sort(logits_.max(1)[0], descending=True)
    return (torch.transpose(logits_[0], dim0=0, dim1=1)[indices[0]],
            torch.transpose(boxes_[0], dim0=0, dim1=1)[indices[0]],
            xywh2xyxy(torch.transpose(boxes_[0], dim0=0, dim1=1)[indices[0]]).cpu().detach().numpy())
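The xywh-to-xyxy conversion used above can be illustrated with a small NumPy stand-in (the helper name `xywh2xyxy_np` is our own; the real code uses ultralytics' `xywh2xyxy`):

```python
import numpy as np

def xywh2xyxy_np(boxes):
    """Convert (center-x, center-y, width, height) boxes to
    (xmin, ymin, xmax, ymax) corner coordinates."""
    boxes = np.asarray(boxes, dtype=float)
    out = boxes.copy()
    out[..., 0] = boxes[..., 0] - boxes[..., 2] / 2  # xmin
    out[..., 1] = boxes[..., 1] - boxes[..., 3] / 2  # ymin
    out[..., 2] = boxes[..., 0] + boxes[..., 2] / 2  # xmax
    out[..., 3] = boxes[..., 1] + boxes[..., 3] / 2  # ymax
    return out

# A 40x20 box centered at (100, 100) becomes corners (80, 90)-(120, 110).
corners = xywh2xyxy_np([100, 100, 40, 20])
```

The xyxy form is what OpenCV's `cv2.rectangle` expects when drawing detection boxes.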
draw_detections draws the detected bounding boxes and class labels on the image.
def draw_detections(self, box, color, name, img):
    xmin, ymin, xmax, ymax = list(map(int, list(box)))
    cv2.rectangle(img, (xmin, ymin), (xmax, ymax), tuple(int(x) for x in color), 2)
    cv2.putText(img, str(name), (xmin, ymin - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.8,
                tuple(int(x) for x in color), 2, lineType=cv2.LINE_AA)
    return img
Full code:
import warnings
warnings.filterwarnings('ignore')
warnings.simplefilter('ignore')
import torch, cv2, os, shutil
import numpy as np
np.random.seed(0)
import matplotlib.pyplot as plt
from tqdm import trange
from PIL import Image
from ultralytics.nn.tasks import DetectionModel as Model
# note: the `ultralytics.yolo.*` paths below match older ultralytics releases;
# newer versions moved these utilities under `ultralytics.utils`
from ultralytics.yolo.utils.torch_utils import intersect_dicts
from ultralytics.yolo.utils.ops import xywh2xyxy
from pytorch_grad_cam import GradCAMPlusPlus, GradCAM, XGradCAM
from pytorch_grad_cam.utils.image import show_cam_on_image
from pytorch_grad_cam.activations_and_gradients import ActivationsAndGradients
def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32):
    # Resize and pad image while meeting stride-multiple constraints
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)
    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better val mAP)
        r = min(r, 1.0)
    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    if auto:  # minimum rectangle
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios
    dw /= 2  # divide padding into 2 sides
    dh /= 2
    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im, ratio, (dw, dh)
class yolov8_heatmap:
    def __init__(self, weight, cfg, device, method, layer, backward_type, conf_threshold, ratio):
        device = torch.device(device)
        ckpt = torch.load(weight)
        model_names = ckpt['model'].names
        csd = ckpt['model'].float().state_dict()  # checkpoint state_dict as FP32
        model = Model(cfg, ch=3, nc=len(model_names)).to(device)
        csd = intersect_dicts(csd, model.state_dict(), exclude=['anchor'])  # intersect
        model.load_state_dict(csd, strict=False)  # load
        model.eval()
        print(f'Transferred {len(csd)}/{len(model.state_dict())} items')
        target_layers = [eval(layer)]
        method = eval(method)
        colors = np.random.uniform(0, 255, size=(len(model_names), 3)).astype(np.int32)
        self.__dict__.update(locals())

    def post_process(self, result):
        logits_ = result[:, 4:]
        boxes_ = result[:, :4]
        sorted, indices = torch.sort(logits_.max(1)[0], descending=True)
        return (torch.transpose(logits_[0], dim0=0, dim1=1)[indices[0]],
                torch.transpose(boxes_[0], dim0=0, dim1=1)[indices[0]],
                xywh2xyxy(torch.transpose(boxes_[0], dim0=0, dim1=1)[indices[0]]).cpu().detach().numpy())

    def draw_detections(self, box, color, name, img):
        xmin, ymin, xmax, ymax = list(map(int, list(box)))
        cv2.rectangle(img, (xmin, ymin), (xmax, ymax), tuple(int(x) for x in color), 2)
        cv2.putText(img, str(name), (xmin, ymin - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.8,
                    tuple(int(x) for x in color), 2, lineType=cv2.LINE_AA)
        return img

    def __call__(self, img_path, save_path):
        # remove the output dir if it exists, then recreate it
        if os.path.exists(save_path):
            shutil.rmtree(save_path)
        os.makedirs(save_path, exist_ok=True)
        # image preprocessing
        img = cv2.imread(img_path)
        img = letterbox(img)[0]
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = np.float32(img) / 255.0
        tensor = torch.from_numpy(np.transpose(img, axes=[2, 0, 1])).unsqueeze(0).to(self.device)
        # init ActivationsAndGradients
        grads = ActivationsAndGradients(self.model, self.target_layers, reshape_transform=None)
        # forward pass: collect activations and the model result
        result = grads(tensor)
        activations = grads.activations[0].cpu().detach().numpy()
        # postprocess to yolo output
        post_result, pre_post_boxes, post_boxes = self.post_process(result[0])
        for i in trange(int(post_result.size(0) * self.ratio)):
            if float(post_result[i].max()) < self.conf_threshold:
                break
            self.model.zero_grad()
            # backpropagate from the max class score and/or the box coordinates
            if self.backward_type == 'class' or self.backward_type == 'all':
                score = post_result[i].max()
                score.backward(retain_graph=True)
            if self.backward_type == 'box' or self.backward_type == 'all':
                for j in range(4):
                    score = pre_post_boxes[i, j]
                    score.backward(retain_graph=True)
            # process heatmap
            if self.backward_type == 'class':
                gradients = grads.gradients[0]
            elif self.backward_type == 'box':
                gradients = grads.gradients[0] + grads.gradients[1] + grads.gradients[2] + grads.gradients[3]
            else:
                gradients = grads.gradients[0] + grads.gradients[1] + grads.gradients[2] + grads.gradients[3] + \
                            grads.gradients[4]
            b, k, u, v = gradients.size()
            weights = self.method.get_cam_weights(self.method, None, None, None, activations,
                                                  gradients.detach().numpy())
            weights = weights.reshape((b, k, 1, 1))
            saliency_map = np.sum(weights * activations, axis=1)
            saliency_map = np.squeeze(np.maximum(saliency_map, 0))
            saliency_map = cv2.resize(saliency_map, (tensor.size(3), tensor.size(2)))
            saliency_map_min, saliency_map_max = saliency_map.min(), saliency_map.max()
            if (saliency_map_max - saliency_map_min) == 0:
                continue
            saliency_map = (saliency_map - saliency_map_min) / (saliency_map_max - saliency_map_min)
            # overlay the heatmap on the image and save it
            cam_image = show_cam_on_image(img.copy(), saliency_map, use_rgb=True)
            cam_image = Image.fromarray(cam_image)
            cam_image.save(f'{save_path}/{i}.png')
def get_params():
    params = {
        'weight': './yolov8n.pt',  # path to the model weights to visualize
        'cfg': r'F:\Project\ultralyticsnew\ultralytics\models\v8\yolov8.yaml',  # .yaml config matching the weights above
        'device': 'cpu',  # device: '0' means GPU 0; use 'cpu' for CPU
        'method': 'GradCAM',  # GradCAMPlusPlus, GradCAM, XGradCAM
        'layer': 'model.model[18]',  # feature layer to visualize
        'backward_type': 'all',  # class, box, all
        'conf_threshold': 0.65,  # confidence threshold, default 0.65; adjust as needed
        'ratio': 0.02  # fraction of top predictions to keep, default 0.02; adjust as needed
    }
    return params
if __name__ == '__main__':
    model = yolov8_heatmap(**get_params())  # initialize
    # first argument: image path; second: save directory (e.g. 'result').
    # The directory is created if missing; if it already exists, it is cleared first.
    model(r'F:\Project\ultralyticsnew\datasets\bus.jpg', './result/map_18')
Note: after running the code, check the printed `Transferred .../... items` line. If the numerator and denominator are not equal, the cfg file does not match the model weights.
The resulting heatmaps are saved to the output directory, one PNG per prediction.