[PyTorch] Visualization of Feature Maps (1): Maximize Filter

bryant_meng

Last modified: 2024-01-09 17:07:23

Category: PyTorch/Keras/Caffe/TensorFlow. Tags: pytorch, artificial intelligence, python

First published: 2023-11-21 10:35:54

Article link: https://blog.csdn.net/bryant_meng/article/details/134526106

References used for this study:

CNN visualization: Convolutional Features
https://github.com/wmn7/ML_Practice/blob/master/2019_05_27/filter_visualizer.ipynb

Introduction to visualization

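Describe two methods for visualizing CNN features in an image classification task.
Answer:

Input occlusion: hide part of the input image and observe which part affects the classification most. For example, feed a series of images, each with a different region hidden, to a trained image classification model. If the third image is classified as a dog with probability 98% while the second image reaches only 65%, then the eyes have the larger influence on the classification.
Activation maximization: construct a synthetic input image that maximizes a target response (gradient ascent).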
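
The input-occlusion idea can be sketched in a few lines. This is a minimal illustration rather than the method of any particular paper: `occlusion_map`, the patch size, the gray fill value, and the tiny randomly initialized `toy_model` are all assumptions made for the example. In practice you would pass a trained classifier and a real image.

```python
import torch
import torch.nn as nn

def occlusion_map(model, image, target_class, patch=8, stride=8, fill=0.5):
    """Slide a gray patch over `image` (C, H, W) and record the predicted
    probability of `target_class` for each patch position."""
    model.eval()
    _, h, w = image.shape
    rows = (h - patch) // stride + 1
    cols = (w - patch) // stride + 1
    heat = torch.zeros(rows, cols)
    with torch.no_grad():
        for r in range(rows):
            for c in range(cols):
                occluded = image.clone()
                y, x = r * stride, c * stride
                occluded[:, y:y + patch, x:x + patch] = fill  # hide one region
                probs = torch.softmax(model(occluded[None]), dim=1)
                heat[r, c] = probs[0, target_class]
    return heat

# Hypothetical stand-in classifier with random weights (assumption for the demo).
torch.manual_seed(0)
toy_model = nn.Sequential(
    nn.Conv2d(3, 4, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(4, 2),
)
image = torch.rand(3, 32, 32)
heat = occlusion_map(toy_model, image, target_class=1)
print(heat.shape)  # one probability per patch position
```

Positions where the probability drops the most are the regions the model relies on most for that class.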

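Summary of CNN visualization techniques (1), feature map visualization:

1. Feature map visualization. There are two families of methods. One maps the feature map of a given layer directly to the 0-255 range and displays it as an image. The other uses a deconvolution network (deconvolution, unpooling) to turn the feature map back into an image.

2. Convolution kernel visualization.

3. Class activation visualization. This is mainly used to determine which regions of an image contribute most to recognizing a given class. A typical output is a heat map: when recognizing a cat, the heat map shows directly how much each region contributes. The main methods in use today are the CAM family (CAM, Grad-CAM, Grad-CAM++).

4. Tools. Open-source tools released by researchers can visualize a chosen layer of a CNN model: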

https://github.com/julrog/nn_vis

https://github.com/Zetane/viewer
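
For the first family listed above (mapping a feature map directly to the 0-255 range), a minimal sketch could look like the following. The helper name `feature_map_to_uint8` and the min-max scaling are assumptions for illustration; other scalings, such as clipping to a percentile range, are equally plausible.

```python
import torch

def feature_map_to_uint8(fmap):
    """Min-max scale a single 2-D feature map to the 0-255 range."""
    fmap = fmap.detach().float()
    lo, hi = fmap.min(), fmap.max()
    scaled = (fmap - lo) / (hi - lo + 1e-8)  # epsilon guards against a constant map
    return (scaled * 255).round().to(torch.uint8)

# Toy feature map standing in for one channel of a layer's output.
fmap = torch.tensor([[-1.0, 0.0], [1.0, 3.0]])
gray = feature_map_to_uint8(fmap)
print(gray.dtype, gray.min().item(), gray.max().item())
```

The resulting tensor can be converted with `.numpy()` and saved or displayed as a grayscale image.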


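Filter activation values

Principle: find an image that maximizes the activation of a given filter in a given layer. That image shows the pattern this filter detects.

A worked example. The procedure:

1. Initialize a 56x56 image.
2. Load a pretrained VGG16 network and keep its parameters fixed.
3. To visualize the k-th filter of the conv at layer 40, define the loss as (-1 * the filter's mean activation). The code below hooks layer 42, the ReLU that follows this conv and its BatchNorm.
4. Run gradient descent on this loss, updating the image rather than the network weights.
5. Upscale the resulting image by a factor of 1.2 to get a new image, then repeat the steps above.

Step 5 is the key one. The initial image is small, only 56x56, because the original author found in practice that starting from a larger image produces higher-frequency patterns, which do not display as well as the result obtained here.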
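
The image sizes implied by steps 1 and 5 can be worked out with a short calculation. The helper `size_schedule` is a name introduced here for illustration. The starting size 56, the factor 1.2, and the 13 upscaling rounds are the values used in the script that follows.

```python
def size_schedule(start=56, factor=1.2, steps=13):
    """Return the image side length before and after each upscaling round."""
    sizes = [start]
    for _ in range(steps):
        sizes.append(int(factor * sizes[-1]))  # truncate, as the script does
    return sizes

sizes = size_schedule()
print(sizes)
# [56, 67, 80, 96, 115, 138, 165, 198, 237, 284, 340, 408, 489, 586]
```

These values match the shapes printed in the script's log, so the final image is 586x586.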

import torch
from torch.autograd import Variable
import torchvision.transforms as transforms
import torchvision.models as models

import numpy as np
import cv2

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# initialize the input image with gray noise, pixel values in [150, 180] / 255
sz = 56
img = np.uint8(np.random.uniform(150, 180, (3, sz, sz))) / 255  # (3, 56, 56)
img = torch.from_numpy(img[None]).float().to(device)  # (1, 3, 56, 56)

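# pretrained model; only the convolutional part (.features) is needed
# note: recent torchvision versions prefer the `weights=` argument over `pretrained=True`
model_vgg16 = models.vgg16_bn(pretrained=True).features.to(device).eval()
for param in model_vgg16.parameters():
    param.requires_grad_(False)  # keep the network weights fixed; only the image is optimized
# downloading /home/xxx/.cache/torch/hub/checkpoints/vgg16_bn-6c64b313.pth, 500M+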
# print(model_vgg16)
# print(len(list(model_vgg16.children())))  # 44
# print(list(model_vgg16.children()))

# get the filter's output of one layer
# use a forward hook to capture the output of an intermediate layer
class SaveFeatures():
    def __init__(self, module):
        self.hook = module.register_forward_hook(self.hook_fn)
    def hook_fn(self, module, input, output):
        self.features = output.clone()
    def close(self):
        self.hook.remove()

layer = 42  # ReLU that follows Conv2d (index 40) and BatchNorm2d (index 41)
activations = SaveFeatures(list(model_vgg16.children())[layer])

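# hyper-parameters for the optimization
lr = 0.1
opt_steps = 25  # optimization steps per scale
filters = 265  # index of the filter in layer 42 whose activation is maximized
upscaling_steps = 13  # number of times the image is upscaled
blur = 3  # kernel size of the box blur applied after each upscaling
upscaling_factor = 1.2  # upscaling ratio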

# ImageNet normalization statistics (per-channel mean and std)
cnn_normalization_mean = torch.tensor([0.485, 0.456, 0.406]).view(-1, 1, 1).to(device)
cnn_normalization_std = torch.tensor([0.229, 0.224, 0.225]).view(-1, 1, 1).to(device)

# gradient descent on the negative activation, i.e. gradient ascent on the activation
for epoch in range(upscaling_steps):  # scale the image up upscaling_steps times
    img = (img - cnn_normalization_mean) / cnn_normalization_std
    img[img > 1] = 1
    img[img < 0] = 0
    print("Image Shape1:", img.shape)
    img_var = img.clone().detach().requires_grad_(True)  # leaf tensor, so the optimizer can update the image
    # create a fresh optimizer at every scale, because the image tensor changes size
    optimizer = torch.optim.Adam([img_var], lr=lr, weight_decay=1e-6)
    for n in range(opt_steps):
        optimizer.zero_grad()
        model_vgg16(img_var)  # forward
        loss = -activations.features[0, filters].mean()  # max the activations
        loss.backward()
        optimizer.step()

    # undo the normalization and clamp back to the [0, 1] pixel range
    print("Loss:", loss.cpu().detach().numpy())
    img = (img_var.detach() * cnn_normalization_std + cnn_normalization_mean).clamp(0, 1)
    img = img.cpu().numpy()[0].transpose(1, 2, 0)  # (H, W, 3) for OpenCV
    sz = int(upscaling_factor * sz)  # calculate new image size
    img = cv2.resize(img, (sz, sz), interpolation=cv2.INTER_CUBIC)  # scale image up
    if blur is not None:
        img = cv2.blur(img, (blur, blur))  # blur image to reduce high frequency patterns
    print("Image Shape2:", img.shape)

    img = torch.from_numpy(img.transpose(2, 0, 1)[None]).to(device)
    print("Image Shape3:", img.shape)
    print(str(epoch), ", Finished")
    print("="*10)

activations.close()  # remove the hook

image = img.cpu().clone()
image = image.squeeze(0)
unloader = transforms.ToPILImage()

image = unloader(image)
image = cv2.cvtColor(np.asarray(image), cv2.COLOR_RGB2BGR)
cv2.imwrite("res1.jpg", image)
torch.cuda.empty_cache()


"""
Image Shape1: torch.Size([1, 3, 56, 56])
Loss: -6.0634975
Image Shape2: (67, 67, 3)
Image Shape3: torch.Size([1, 3, 67, 67])
0 , Finished
==========
Image Shape1: torch.Size([1, 3, 67, 67])
Loss: -7.8898916
Image Shape2: (80, 80, 3)
Image Shape3: torch.Size([1, 3, 80, 80])
1 , Finished
==========
Image Shape1: torch.Size([1, 3, 80, 80])
Loss: -8.730318
Image Shape2: (96, 96, 3)
Image Shape3: torch.Size([1, 3, 96, 96])
2 , Finished
==========
Image Shape1: torch.Size([1, 3, 96, 96])
Loss: -9.697872
Image Shape2: (115, 115, 3)
Image Shape3: torch.Size([1, 3, 115, 115])
3 , Finished
==========
Image Shape1: torch.Size([1, 3, 115, 115])
Loss: -10.190881
Image Shape2: (138, 138, 3)
Image Shape3: torch.Size([1, 3, 138, 138])
4 , Finished
==========
Image Shape1: torch.Size([1, 3, 138, 138])
Loss: -10.315895
Image Shape2: (165, 165, 3)
Image Shape3: torch.Size([1, 3, 165, 165])
5 , Finished
==========
Image Shape1: torch.Size([1, 3, 165, 165])
Loss: -9.73861
Image Shape2: (198, 198, 3)
Image Shape3: torch.Size([1, 3, 198, 198])
6 , Finished
==========
Image Shape1: torch.Size([1, 3, 198, 198])
Loss: -9.503629
Image Shape2: (237, 237, 3)
Image Shape3: torch.Size([1, 3, 237, 237])
7 , Finished
==========
Image Shape1: torch.Size([1, 3, 237, 237])
Loss: -9.488493
Image Shape2: (284, 284, 3)
Image Shape3: torch.Size([1, 3, 284, 284])
8 , Finished
==========
Image Shape1: torch.Size([1, 3, 284, 284])
Loss: -9.100454
Image Shape2: (340, 340, 3)
Image Shape3: torch.Size([1, 3, 340, 340])
9 , Finished
==========
Image Shape1: torch.Size([1, 3, 340, 340])
Loss: -8.699549
Image Shape2: (408, 408, 3)
Image Shape3: torch.Size([1, 3, 408, 408])
10 , Finished
==========
Image Shape1: torch.Size([1, 3, 408, 408])
Loss: -8.90135
Image Shape2: (489, 489, 3)
Image Shape3: torch.Size([1, 3, 489, 489])
11 , Finished
==========
Image Shape1: torch.Size([1, 3, 489, 489])
Loss: -8.838546
Image Shape2: (586, 586, 3)
Image Shape3: torch.Size([1, 3, 586, 586])
12 , Finished
==========

Process finished with exit code 0
"""
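
The two mechanisms the script relies on, a forward hook that captures an intermediate output and gradient steps applied to the image instead of the weights, can be tried in isolation on a small network. This is only a sketch under stated assumptions: `toy_net` is a hypothetical, randomly initialized stand-in for VGG16, the hook is placed on the convolution itself, and `filter_idx`, the image size, and the step count are arbitrary choices.

```python
import torch
import torch.nn as nn

class SaveFeatures:
    """Store the output of `module` on every forward pass."""
    def __init__(self, module):
        self.features = None
        self.hook = module.register_forward_hook(self.hook_fn)
    def hook_fn(self, module, inputs, output):
        self.features = output
    def close(self):
        self.hook.remove()

torch.manual_seed(0)
toy_net = nn.Sequential(nn.Conv2d(3, 4, 3, padding=1), nn.ReLU()).eval()
for p in toy_net.parameters():
    p.requires_grad_(False)  # weights stay fixed; only the image changes

saved = SaveFeatures(toy_net[0])  # capture the conv output
img = torch.rand(1, 3, 16, 16, requires_grad=True)
optimizer = torch.optim.Adam([img], lr=0.1)
filter_idx = 1  # hypothetical filter to maximize

toy_net(img)
before = saved.features[0, filter_idx].mean().item()

for _ in range(25):
    optimizer.zero_grad()
    toy_net(img)  # the forward pass fills saved.features
    loss = -saved.features[0, filter_idx].mean()  # minimizing this maximizes the activation
    loss.backward()
    optimizer.step()

toy_net(img)
after = saved.features[0, filter_idx].mean().item()
saved.close()  # remove the hook
print(before, after)
```

The mean activation of the chosen filter should be higher after the loop than before it, which is the same effect the VGG16 script achieves at each scale.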

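The resulting image

Next, find an image online and test whether this filter's response is indeed the largest for it.

Test image (loaded as ./bird.jpg in the code below)
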

import torch
from torch.autograd import Variable
from PIL import Image, ImageOps
import torchvision.transforms as transforms
import torchvision.models as models

from matplotlib import pyplot as plt

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class SaveFeatures():
    def __init__(self, module):
        self.hook = module.register_forward_hook(self.hook_fn)
    def hook_fn(self, module, input, output):
        self.features = output.clone()
    def close(self):
        self.hook.remove()

size = (224, 224)
picture = Image.open("./bird.jpg").convert("RGB")
picture = ImageOps.fit(picture, size, Image.LANCZOS)  # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter

loader = transforms.ToTensor()
picture = loader(picture).to(device)
print(picture.shape)

cnn_normalization_mean = torch.tensor([0.485, 0.456, 0.406]).view(-1, 1, 1).to(device)
cnn_normalization_std = torch.tensor([0.229, 0.224, 0.225]).view(-1, 1, 1).to(device)

picture = (picture-cnn_normalization_mean) / cnn_normalization_std

model_vgg16 = models.vgg16_bn(pretrained=True).features.to(device).eval()
print(list(model_vgg16.children())[40])  # Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
print(list(model_vgg16.children())[41])  # BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
print(list(model_vgg16.children())[42])  # ReLU(inplace=True)

layer = 42
filters = 265
activations = SaveFeatures(list(model_vgg16.children())[layer])

with torch.no_grad():
    model_vgg16(picture[None])  # add a batch dimension: (1, 3, 224, 224)
activations.close()

print(activations.features.shape)  # torch.Size([1, 512, 14, 14])

# plot the mean activation of every filter
mean_act = [activations.features[0, i].mean().item() for i in range(activations.features.shape[1])]
plt.figure(figsize=(7,5))
act = plt.plot(mean_act, linewidth=2.)
extraticks = [filters]
ax = act[0].axes
ax.set_xlim(0, 500)
plt.axvline(x=filters, color="gray", linestyle="--")
ax.set_xlabel("feature map")
ax.set_ylabel("mean activation")
ax.set_xticks([0, 200, 400] + extraticks)
plt.show()

"""
torch.Size([3, 224, 224])
Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
ReLU(inplace=True)
torch.Size([1, 512, 14, 14])
"""
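
Besides reading the plot, the rank of the target filter can be checked numerically. The sketch below uses a synthetic tensor with the same shape as `activations.features` (1, 512, 14, 14), in which channel 265 has been boosted artificially. Both the helper `rank_of_filter` and the fake data are assumptions for illustration, not results from the bird image.

```python
import torch

def rank_of_filter(features, filter_idx):
    """Return (rank, mean_act): rank 0 means `filter_idx` has the
    largest mean activation among all channels."""
    mean_act = features[0].mean(dim=(1, 2))  # one value per channel
    order = torch.argsort(mean_act, descending=True)
    rank = (order == filter_idx).nonzero().item()
    return rank, mean_act

# Synthetic stand-in for activations.features (assumption for the demo).
torch.manual_seed(0)
features = torch.rand(1, 512, 14, 14)
features[0, 265] += 2.0  # make channel 265 clearly the strongest

rank, mean_act = rank_of_filter(features, 265)
print(rank, torch.topk(mean_act, k=3).indices.tolist())
```

With the real hook output, calling `rank_of_filter(activations.features, filters)` would show whether filter 265 is actually the top response for the test image or merely one of the stronger ones.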