Based on the paper: Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Paper: https://arxiv.org/abs/1610.02391
PyTorch code: https://github.com/jacobgil/pytorch-grad-cam
How Pytorch-Grad-CAM works
Taking image classification as an example:
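Activations: obtained in the forward pass
Gradients: obtained in the backward pass
A: the feature maps extracted from the input image. The deeper the layer, the more abstract its features and the richer their semantic information, so for image classification we take the last feature layer, i.e. features[-1] (model.features[-1] in the code).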
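To make the two terms concrete, here is a minimal sketch of how the activations and gradients of a target layer could be captured in PyTorch with hooks. TinyNet and capture_activations_and_gradients are hypothetical names made up for this illustration, and the toy network only stands in for a real classifier; this is not necessarily how the repository implements it.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical toy classifier, used only to keep the example small."""

    def __init__(self, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(8, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = self.pool(x).flatten(1)
        return self.fc(x)


def capture_activations_and_gradients(model, target_layer, input_tensor, target_category):
    """Return (activations, gradients) of target_layer for one class score."""
    store = {}

    def forward_hook(module, inputs, output):
        # Forward pass: keep the feature maps A of the target layer.
        store["activations"] = output.detach()
        # Backward pass: this tensor hook receives d(score)/d(A).
        output.register_hook(lambda grad: store.__setitem__("gradients", grad.detach()))

    handle = target_layer.register_forward_hook(forward_hook)
    try:
        model.zero_grad()
        logits = model(input_tensor)
        # Backpropagate the score of the chosen class, not a training loss.
        logits[:, target_category].sum().backward()
    finally:
        handle.remove()
    return store["activations"], store["gradients"]


torch.manual_seed(0)
model = TinyNet().eval()
x = torch.randn(1, 3, 16, 16)  # random stand-in for a preprocessed image
acts, grads = capture_activations_and_gradients(model, model.features[-1], x, target_category=2)
print(acts.shape, grads.shape)
```

Both tensors have the same shape [N, K, H, W]: one activation and one gradient value per channel and spatial position of the target layer.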
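After the two fully connected layers, the network outputs the class scores. Backpropagating the score of the target class (Grad-CAM uses the class score here rather than a training loss) gives the gradient with respect to A, drawn as the colored layers in the diagram; it indicates how important each element of A is for that class. Averaging each channel's gradient over its spatial positions yields one weight per channel. The channels of A are then summed with these weights, passed through a ReLU, and color-mapped to produce the heatmap.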
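The calculation described above can be sketched in a few lines of NumPy. The function name grad_cam_map is hypothetical, and the final scaling to [0, 1] is an assumption added for display purposes; the hand-made arrays are toy values, not real network outputs.

```python
import numpy as np


def grad_cam_map(activations, gradients):
    """Compute a Grad-CAM map for one image.

    activations: array [K, H, W], feature maps A of the target layer
    gradients:   array [K, H, W], gradient of the class score w.r.t. A
    """
    # One weight per channel: average the gradient over spatial positions.
    weights = gradients.mean(axis=(1, 2))              # [K]
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)   # [H, W]
    # ReLU keeps only the regions that support the target class.
    cam = np.maximum(cam, 0)
    # Scale to [0, 1] for display (assumption, see lead-in).
    max_val = cam.max()
    if max_val > 0:
        cam = cam / max_val
    return cam


# Toy example: two 2x2 channels with weights 2 and -1.
activations = np.array([[[1.0, 0.0], [0.0, 0.0]],
                        [[0.0, 0.0], [0.0, 1.0]]])
gradients = np.array([np.full((2, 2), 2.0),
                      np.full((2, 2), -1.0)])
cam = grad_cam_map(activations, gradients)
print(cam)
```

In the toy example the second channel gets a negative weight, so the position it activates is zeroed by the ReLU and only the top-left position stays highlighted.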
For more of the theory, see this blog post, which explains it in detail: https://blog.csdn.net/qq_37541097/article/details/123089851
Drawing a heatmap with Grad-CAM
Download the code from the code link at the top of this post, then look at the main file:
import os
import numpy as np
import torch
from PIL import Image
import matplotlib.pyplot as plt
from torchvision import models
from torchvision import transforms
from utils import GradCAM, show_cam_on_image, center_crop_img


def main():
    model = models.mobilenet_v3_large(pretrained=True)
    target_layers = [model.features[-1]]  # list of layers to visualize

    # model = models.vgg16(pretrained=True)
    # target_layers = [model.features]

    # model = models.resnet34(pretrained=True)
    # target_layers = [model.layer4]

    # model = models.regnet_y_800mf(pretrained=True)
    # target_layers = [model.trunk_output]

    # model = models.efficientnet_b0(pretrained=True)
    # target_layers = [model.features]

    data_transform = transforms.Compose([transforms.ToTensor(),
                                         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
    # load image
    img_path = "both.png"
    assert os.path.exists(img_path), "file: '{}' does not exist.".format(img_path)
    img = Image.open(img_path).convert('RGB')
    img = np.array(img, dtype=np.uint8)
    # img = center_crop_img(img, 224)

    # [C, H, W]
    img_tensor = data_transform(img)
    # expand batch dimension
    # [C, H, W] -> [N, C, H, W]
    input_tensor = torch.unsqueeze(img_tensor, dim=0)

    # create the CAM object from the model, the target layers and the device choice;
    # fall back to CPU when no GPU is available
    cam = GradCAM(model=model, target_layers=target_layers, use_cuda=torch.cuda.is_available())
    # target_category = 281  # tabby, tabby cat
    # target_category = 673  # mouse, computer mouse
    # target_category = 657  # missile
    target_category = 254  # pug, pug-dog

    grayscale_cam = cam(input_tensor=input_tensor, target_category=target_category)

    # take the map of the first (and only) image in the batch
    grayscale_cam = grayscale_cam[0, :]
    visualization = show_cam_on_image(img.astype(dtype=np.float32) / 255.,
                                      grayscale_cam,
                                      use_rgb=True)
    plt.imshow(visualization)
    plt.show()


if __name__ == '__main__':
    main()
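The last step of the script blends the grayscale map with the original image. As a rough NumPy-only illustration of that idea, the sketch below colors the map and mixes it with the image. The function overlay_heatmap, the blue-to-red coloring and the alpha parameter are all assumptions made for this example and are not necessarily what show_cam_on_image does; the map is assumed to already have the same height and width as the image.

```python
import numpy as np


def overlay_heatmap(image, cam, alpha=0.5):
    """Blend a heatmap with an image.

    image: float array [H, W, 3] with values in [0, 1] (RGB)
    cam:   float array [H, W] with values in [0, 1]
    alpha: weight of the heatmap in the blend (hypothetical parameter)
    """
    # Simplified colormap: low values are blue, high values are red.
    heat = np.stack([cam, np.zeros_like(cam), 1.0 - cam], axis=-1)
    blended = alpha * heat + (1.0 - alpha) * image
    # Convert back to an 8-bit RGB image for display.
    return np.uint8(255 * np.clip(blended, 0.0, 1.0))


image = np.zeros((2, 2, 3), dtype=np.float32)   # black toy image
cam = np.array([[1.0, 0.0], [0.0, 0.5]])        # toy heatmap
overlay = overlay_heatmap(image, cam)
print(overlay.shape, overlay.dtype)
```

Pixels with a high map value come out reddish and pixels with a low value bluish, which is the usual way to read such an overlay.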
Running the script produces the following heatmap:
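One thing worth mentioning: the imagenet1k_classes.txt file provided by the author lists the 1000 ImageNet classes. To visualize a different class, set target_category in the code to that class's line number minus 1, because line numbers start at 1 while class indices start at 0.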
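Instead of counting lines by hand, the index can be looked up with a small helper. This is a sketch that assumes the file contains one class name per line; find_categories is a hypothetical helper, and the three-line sample list is a toy stand-in for the real file.

```python
def find_categories(class_lines, keyword):
    """Return (target_category, class_name) pairs whose name contains keyword.

    class_lines: list of strings, one class per line, in file order
    """
    keyword = keyword.lower()
    return [(index, line.strip())
            # enumerate is 0-based, so index equals line number minus 1
            for index, line in enumerate(class_lines)
            if keyword in line.lower()]


# Usage with the real file (assumes one class name per line):
# with open("imagenet1k_classes.txt", encoding="utf-8") as f:
#     print(find_categories(f.readlines(), "tabby"))

# Toy three-line list, not the real file.
sample = ["tench, Tinca tinca", "goldfish, Carassius auratus", "tabby, tabby cat"]
matches = find_categories(sample, "tabby")
print(matches)
```

The first element of each returned pair is the value to assign to target_category.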
Setting target_category to 281 (the line number of tabby cat minus 1) gives:
Now let's try the computer mouse (target_category = 673):
Heatmap for the missile class (target_category = 657):