What does a CNN actually learn?

Visualization code

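torchvision.models
PyTorch - torchvision.models model structure definitions
A PyTorch implementation of the AlexNet network
hook
PyTorch study notes (1): inspecting a model's intermediate results
PyTorch: getting layer weights, injecting a hook into a specific layer, and extracting intermediate-layer outputs
Deep learning for beginners: convolutional neural network visualization (1)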

Getting intermediate-layer parameters and outputs in PyTorch, and visualizing them
https://www.zhihu.com/question/68384370/answer/419741762

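The recommended approach is to use hooks: they extract the features or gradients you need without changing the network's forward function. Register them on a module before calling the model, and the features or gradients are captured during the call.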

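inter_feature = {}
inter_gradient = {}

def make_hook(name, flag):
    if flag == 'forward':
        # a forward hook receives the module, its inputs, and its output
        def hook(m, input, output):
            inter_feature[name] = output
        return hook
    elif flag == 'backward':
        # a backward hook receives the gradients w.r.t. the module's inputs and outputs
        def hook(m, grad_input, grad_output):
            inter_gradient[name] = grad_output
        return hook
    else:
        raise ValueError("flag must be 'forward' or 'backward'")

# m is the module (layer) to inspect and name is the key its results are stored under
forward_handle = m.register_forward_hook(make_hook(name, 'forward'))
backward_handle = m.register_backward_hook(make_hook(name, 'backward'))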

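The hooks then fire during the forward and backward passes, and the intermediate values end up in inter_feature and inter_gradient.

output = model(input)  # fills inter_feature with the intermediate features
loss = criterion(output, target)
loss.backward()  # fills inter_gradient with the intermediate gradients

Finally, release the hooks if they are no longer needed by calling remove() on each handle that the register_* calls returned.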
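
The hook workflow above can be tried end to end on a toy network. The following is a minimal sketch, not the original author's code: the two-layer model, the 8x8 random input, and the names save_output and save_grad are made up for illustration. It uses register_full_backward_hook, the non-deprecated counterpart of register_backward_hook, which is safe here because the toy model contains no in-place operations.

```python
import torch
import torch.nn as nn

# Toy stand-in network; the layer sizes are arbitrary and not from the post.
model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(4 * 8 * 8, 2),
)

features, gradients, handles = {}, {}, []

def save_output(name):
    # Forward hook: store the module's output under the given name.
    def hook(module, inputs, output):
        features[name] = output.detach()
    return hook

def save_grad(name):
    # Backward hook: store the gradient w.r.t. the module's output.
    def hook(module, grad_input, grad_output):
        gradients[name] = grad_output[0].detach()
    return hook

# Hook only the layers that have weights; in an nn.Sequential the names are "0", "1", ...
for name, module in model.named_children():
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        handles.append(module.register_forward_hook(save_output(name)))
        handles.append(module.register_full_backward_hook(save_grad(name)))

x = torch.randn(1, 3, 8, 8)
target = torch.tensor([1])
loss = nn.CrossEntropyLoss()(model(x), target)  # forward pass fills `features`
loss.backward()                                 # backward pass fills `gradients`

# Release the hooks once the values have been captured.
for handle in handles:
    handle.remove()

feature_shapes = {k: tuple(v.shape) for k, v in features.items()}
gradient_shapes = {k: tuple(v.shape) for k, v in gradients.items()}
print(feature_shapes, gradient_shapes)
```

The gradient stored for each layer has the same shape as that layer's output, which makes it easy to pair the two dictionaries by key.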

PyTorch's various "special" Tensor operations
How to use permute in PyTorch
Converting between Tensors and various image formats in PyTorch

Translation of section 5.4

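Visualizing what convnets learn
 It is often said that deep learning is a black box: the representations it learns are hard to extract and hard to present in a form humans can understand. Convolutional layers can still be visualized, however, because what they learn are representations of visual concepts.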
 1. Visualizing intermediate convnet outputs (intermediate activations): useful for understanding how successive convnet layers transform their input, and for getting a first idea of the meaning of individual convnet filters.
 2. Visualizing convnet filters: useful for understanding precisely what visual pattern or concept each filter in a convnet is receptive to.
 3. Visualizing heatmaps of class activation in an image: useful for understanding which parts of an image were identified as belonging to a given class, thus allowing you to localize objects in images.

Visualizing intermediate activations

Given a specific input, display the feature maps output by the convolution and pooling layers.

AlexNet(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))    #(1, 64, 55, 55)
    (1): ReLU(inplace)
    (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)  #(1, 64, 27, 27)
    (3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))  #(1, 192, 27, 27)
    (4): ReLU(inplace)
    (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)  #(1, 192, 13, 13)
    (6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) #(1, 384, 13, 13)
    (7): ReLU(inplace)
    (8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))    #(1, 256, 13, 13)
    (9): ReLU(inplace)
    (10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) #(1, 256, 13, 13)
    (11): ReLU(inplace)
    (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (classifier): Sequential(
    (0): Dropout(p=0.5)
    (1): Linear(in_features=9216, out_features=4096, bias=True)
    (2): ReLU(inplace)
    (3): Dropout(p=0.5)
    (4): Linear(in_features=4096, out_features=4096, bias=True)
    (5): ReLU(inplace)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
  )
)
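
The shape comments in the printout can be checked by hand with the standard output-size formula for convolution and pooling layers, floor((size + 2*padding - kernel) / stride) + 1. The sketch below assumes a 224x224 input, dilation 1, and ceil_mode=False, as in the printout; the helper name out_size and the layer labels are made up for illustration.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial size after a Conv2d or MaxPool2d with dilation=1 and ceil_mode=False."""
    return (size + 2 * padding - kernel) // stride + 1

# (label, kernel, stride, padding) for every size-changing layer in `features`,
# copied from the AlexNet printout above. ReLU layers keep the size unchanged.
layers = [
    ("conv0", 11, 4, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 5, 1, 2),
    ("pool5", 3, 2, 0),
    ("conv6", 3, 1, 1),
    ("conv8", 3, 1, 1),
    ("conv10", 3, 1, 1),
    ("pool12", 3, 2, 0),
]

size = 224  # assumed input height and width
sizes = {}
for label, kernel, stride, padding in layers:
    size = out_size(size, kernel, stride, padding)
    sizes[label] = size

print(sizes)
# The last feature map has 256 channels, so the flattened vector fed to the
# classifier has 256 * size * size entries.
flattened = 256 * sizes["pool12"] ** 2
print(flattened)
```

The flattened length matches the in_features=9216 of the first Linear layer in the classifier.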

Code

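# -*- coding: utf-8 -*-
from PIL import Image
import numpy as np
import torch
from torchvision import models
import matplotlib.pyplot as plt

# Load an image
img = Image.open("cat.png").convert('RGB')
img = img.resize((224, 224))

# Convert the image to the network's input format: a float tensor of shape (1, 3, 224, 224)
img = np.array(img)
img = np.expand_dims(img, axis=0)
img = torch.from_numpy(img)
img = img.permute(0, 3, 1, 2)
img = img.float() / 255.0
# torchvision's pretrained models expect inputs normalized with the ImageNet mean and std
mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
img = (img - mean) / std

# Load the pretrained network
model = models.alexnet(pretrained=True)
model.eval()  # switch dropout off for inference
# print(model)

# Register a hook that stores the output of the chosen layer
layer_activation = None

def hook(module, inputdata, output):
    global layer_activation
    layer_activation = output.detach()

handle = model.features[12].register_forward_hook(hook)
with torch.no_grad():
    y = model(img)
print(layer_activation.shape)

size = layer_activation.shape[-1]   # height and width of each feature map
number = layer_activation.shape[1]  # number of channels
rows = 8
cols = number // rows
display_grid = np.zeros((size * rows, size * cols))

# Tile the channels into a rows x cols grid, one feature map per cell
for i in range(rows):
    for j in range(cols):
        channel = layer_activation[0, i * cols + j, :, :]
        display_grid[i * size:(i + 1) * size, j * size:(j + 1) * size] = channel.numpy()
plt.figure(figsize=(display_grid.shape[1] / size, display_grid.shape[0] / size))
plt.title("(12): MaxPool2d")
plt.grid(False)
plt.imshow(display_grid, aspect='auto', cmap='viridis')
plt.show()

# Remove the hook once it is no longer needed
handle.remove()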

The first layer acts as a collection of various edge detectors. At that stage, the activations retain almost all of the information present in the initial picture.

As you go higher, the activations become increasingly abstract and less visually interpretable. They begin to encode higher-level concepts such as “cat ear” and “cat eye.” Higher representations carry increasingly less information about the visual contents of the image, and increasingly more information related to the class of the image.

The sparsity of the activations increases with the depth of the layer: in the first layer, all filters are activated by the input image; but in the following layers, more and more filters are blank. This means the pattern encoded by the filter isn’t found in the input image.
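
One way to put a number on the sparsity claim is to count, for a captured activation, how many channels are blank, meaning they never rise above zero. The following is a minimal sketch under that definition; the helper name blank_fraction, the eps threshold, and the hand-made 4-channel tensor are assumptions for illustration, not measurements from the AlexNet run above.

```python
import torch

def blank_fraction(activation, eps=1e-6):
    """Fraction of channels in a (1, C, H, W) activation whose maximum is <= eps."""
    per_channel_max = activation[0].flatten(start_dim=1).max(dim=1).values
    return (per_channel_max <= eps).float().mean().item()

# Hand-made activation with 4 channels: channels 0 and 2 respond, 1 and 3 stay blank.
act = torch.zeros(1, 4, 3, 3)
act[0, 0, 1, 1] = 2.0
act[0, 2] = 0.5

blank = blank_fraction(act)
print(blank)
```

Calling such a helper on the tensor stored by a forward hook, once per layer, would give one sparsity value per depth that can be compared across layers.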
We have just evidenced an important universal characteristic of the representations learned by deep neural networks: the features extracted by a layer become increasingly abstract with the depth of the layer. The activations of higher layers carry less and less information about the specific input being seen, and more and more information about the target (in this case, the class of the image: cat or dog). A deep neural network effectively acts as an information distillation pipeline, with raw data going in (in this case, RGB pictures) and being repeatedly transformed so that irrelevant information is filtered out (for example, the specific visual appearance of the image), and useful information is magnified and refined (for example, the class of the image).

This is analogous to the way humans and animals perceive the world: after observing a scene for a few seconds, a human can remember which abstract objects were present in it (bicycle, tree) but can’t remember the specific appearance of these objects. In fact, if you tried to draw a generic bicycle from memory, chances are you couldn’t get it even remotely right, even though you’ve seen thousands of bicycles in your lifetime (see, for example, figure 5.28). Try it right now: this effect is absolutely real. Your brain has learned to completely abstract its visual input, transforming it into high-level visual concepts while filtering out irrelevant visual details, and this makes it tremendously difficult to remember how things around you look.

5.4.2 Visualizing convnet filters

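Getting the weights or features of an intermediate layer in PyTorch
PyTorch study notes (5): saving and restoring models and inspecting parameters
Extracting a specific layer from a model in PyTorch
PyTorch: getting layer weights, injecting a hook into a specific layer, and extracting intermediate-layer outputs
CNN: visualizing convolution kernels
Understanding convolutional neural networks through visualization in PyTorch
Artificial intelligence seems like a mystery, so what is really inside? A peek inside convolutional neural networks
Freely loading part of a model's parameters and freezing them in PyTorch
Why should I trust you, my CNN model? (Part 1: CAM and Grad-CAM)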
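
As a starting point for the weight-reading approach several of the links above describe, a layer's kernels can be pulled out of the module and tiled into one image. This is a minimal sketch of that simple weight-inspection approach only: the function name tile_filters and the 8-column layout are made up for illustration, and the layer is a freshly initialized Conv2d with the same shape as AlexNet's features[0], so its kernels are random rather than learned.

```python
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)
# Stand-in for model.features[0]: same shape as AlexNet's first conv layer, random weights.
conv = nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2)

# Weight tensor layout is (out_channels, in_channels, height, width).
weights = conv.weight.detach().cpu().numpy()

def tile_filters(weights, cols=8):
    """Rescale each kernel to [0, 1] and tile all kernels into one (H, W, C) image."""
    n, c, h, w = weights.shape
    rows = (n + cols - 1) // cols
    grid = np.zeros((rows * h, cols * w, c), dtype=np.float32)
    for idx in range(n):
        kernel = weights[idx].transpose(1, 2, 0)            # (h, w, c) for display
        lo, hi = kernel.min(), kernel.max()
        kernel = (kernel - lo) / (hi - lo + 1e-8)           # per-kernel min-max scaling
        r, col = divmod(idx, cols)
        grid[r * h:(r + 1) * h, col * w:(col + 1) * w] = kernel
    return grid

grid = tile_filters(weights)
print(grid.shape)
```

With three input channels the result can be passed straight to plt.imshow(grid) as an RGB image; for deeper layers, which have more than three input channels, one channel at a time would have to be shown instead.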

