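1. Pooling Layer
1.1 Pooling layer concept
Pooling operation: "collects" and "summarizes" a signal, much as a pool collects water, hence the name pooling layer.
- "Collect": many values are reduced to fewer
- "Summarize": by taking the maximum or the average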
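To make "collect" and "summarize" concrete, here is a minimal sketch on a hand-made 4*4 tensor. The tensor values are illustrative and not part of the original example. Each 2*2 block is reduced to a single number, either its maximum or its mean.

```python
import torch
import torch.nn.functional as F

# Illustrative 1x1x4x4 input laid out as (batch, channel, height, width)
x = torch.tensor([[[[1., 2., 5., 6.],
                    [3., 4., 7., 8.],
                    [9., 10., 13., 14.],
                    [11., 12., 15., 16.]]]])

# "Collect": 4x4 -> 2x2. "Summarize": max or mean of each 2x2 block.
max_out = F.max_pool2d(x, kernel_size=2, stride=2)
avg_out = F.avg_pool2d(x, kernel_size=2, stride=2)

print(max_out.squeeze())  # rows: [4, 8] and [12, 16]
print(avg_out.squeeze())  # rows: [2.5, 6.5] and [10.5, 14.5]
```

Both results have a quarter of the input's elements; only the summary statistic differs.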
1.2 nn.MaxPool2d
nn.MaxPool2d(kernel_size,
             stride=None,
             padding=0,
             dilation=1,
             return_indices=False,
             ceil_mode=False)
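Function: applies max pooling to a 2D signal (an image)
Main parameters:
- kernel_size: size of the pooling window
- stride: step of the window; defaults to kernel_size when left as None
- padding: number of elements padded on each side
- dilation: spacing between elements within the pooling window
- ceil_mode: round the output size up (ceil) instead of down (floor)
- return_indices: also return the indices of the max values, used for max-unpooling upsampling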
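How these parameters interact is easiest to see through the output size. The sketch below uses a hypothetical helper, `pool_out_size`, which is not part of PyTorch. It applies the per-dimension size formula given in the PyTorch documentation and compares it with what `nn.MaxPool2d` actually returns for a 5*5 input, a size chosen so that floor and ceil rounding differ.

```python
import math
import torch
import torch.nn as nn

def pool_out_size(size, kernel_size, stride, padding=0, dilation=1, ceil_mode=False):
    """Hypothetical helper: output length along one dimension.

    It ignores PyTorch's extra rule that, in ceil mode, a window starting
    entirely inside the right padding is dropped. That rule does not apply
    to the cases below.
    """
    raw = (size + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1
    return math.ceil(raw) if ceil_mode else math.floor(raw)

x = torch.rand(1, 1, 5, 5)  # 5 is not divisible by the stride of 2

floor_out = nn.MaxPool2d(2, stride=2)(x)                   # last row/column dropped
ceil_out = nn.MaxPool2d(2, stride=2, ceil_mode=True)(x)    # partial window kept
dilated_out = nn.MaxPool2d(2, stride=1, dilation=2)(x)     # window spans 3 pixels

print(floor_out.shape, ceil_out.shape, dilated_out.shape)
print(pool_out_size(5, 2, 2), pool_out_size(5, 2, 2, ceil_mode=True))
```

With ceil_mode=True the incomplete window at the border is still pooled, so no input pixels are discarded.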
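Max-unpooling upsampling:
Early autoencoders and image segmentation tasks both involved image upsampling, and max unpooling was a common way to do it at the time.
As shown in the figure above, the image is first downsampled by pooling and then upsampled, which raises a question: where should each pixel value of the 2*2 image be placed after upsampling?
return_indices records the position that each value came from during max-pooling downsampling, so upsampling can put each value back at its recorded position.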
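The round trip described above can be sketched with `nn.MaxUnpool2d`, which consumes the indices produced by `nn.MaxPool2d(..., return_indices=True)`. The 4*4 input here is illustrative and stands in for the image in the figure.

```python
import torch
import torch.nn as nn

# Illustrative input: values 0..15 arranged as a 1x1x4x4 tensor
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

pool = nn.MaxPool2d(2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(2, stride=2)

# indices holds, for each output value, its flattened position in the 4x4 plane
pooled, indices = pool(x)

# Each max value goes back to its recorded position; every other position is 0
restored = unpool(pooled, indices)

print(pooled.squeeze())
print(indices.squeeze())
print(restored.squeeze())
```

Max unpooling restores positions but not the discarded values, so it is only a partial inverse of max pooling.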
Code:
# -*- coding: utf-8 -*-
import os
import torch
import random
import numpy as np
import torchvision
import torch.nn as nn
from torchvision import transforms
from matplotlib import pyplot as plt
from PIL import Image
from tools.common_tools import transform_invert, set_seed
set_seed(1)  # set the random seed
# ================================= load img ==================================
path_img = os.path.join(os.path.dirname(os.path.abspath(__file__)), "lena.png")
img = Image.open(path_img).convert('RGB') # 0~255
# convert to tensor
img_transform = transforms.Compose([transforms.ToTensor()])
img_tensor = img_transform(img)
img_tensor.unsqueeze_(dim=0) # C*H*W to B*C*H*W
# ================================= create pooling layer ==================================
# ================ maxpool
flag = 1
# flag = 0
if flag:
    # a 2x2 window with stride 2 halves both the height and the width
    maxpool_layer = nn.MaxPool2d((2, 2), stride=(2, 2))
    img_pool = maxpool_layer(img_tensor)

# ================================= visualization ==================================
print("Size before pooling: {}\nSize after pooling: {}".format(img_tensor.shape, img_pool.shape))
img_pool = transform_invert(img_pool[0, 0:3, ...], img_transform)
img_raw = transform_invert(img_tensor.squeeze(), img_transform)
plt.subplot(122).imshow(img_pool)
plt.subplot(121).imshow(img_raw)
plt.show()
Output:
1.3 nn.AvgPool2d
nn.AvgPool2d(kernel_size,
stride=None,
padding=0,
ceil_mode=False,
count_include_pad=True,
divisor_override=None)
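Function: applies average pooling to a 2D signal (an image)
Main parameters:
- kernel_size: size of the pooling window
- stride: step of the window
- padding: number of elements padded on each side
- ceil_mode: round the output size up
- count_include_pad: whether padded values are included in the average
- divisor_override: divisor to use in place of the number of elements in the window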
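The last two parameters are the least obvious, so here is a minimal sketch of their effect on all-ones inputs. The inputs are illustrative and chosen so that the expected averages can be checked by hand.

```python
import torch
import torch.nn as nn

ones_small = torch.ones(1, 1, 2, 2)
ones_large = torch.ones(1, 1, 4, 4)

# With padding=1, every 2x2 window covers one real pixel and three padded zeros.
# count_include_pad=True divides by 4, giving 0.25.
with_pad = nn.AvgPool2d(2, stride=2, padding=1, count_include_pad=True)(ones_small)
# count_include_pad=False divides by the 1 real pixel, giving 1.0.
without_pad = nn.AvgPool2d(2, stride=2, padding=1, count_include_pad=False)(ones_small)

# divisor_override=3 divides each window sum (4) by 3 instead of by 4.
overridden = nn.AvgPool2d(2, stride=2, divisor_override=3)(ones_large)

print(with_pad.squeeze())
print(without_pad.squeeze())
print(overridden.squeeze())
```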
# -*- coding: utf-8 -*-
import os
import torch
import random
import numpy as np
import torchvision
import torch.nn as nn
from torchvision import transforms
from matplotlib import pyplot as plt
from PIL import Image
from tools.common_tools import transform_invert, set_seed
set_seed(1)  # set the random seed
# ================================= load img ==================================
path_img = os.path.join(os.path.dirname(os.path.abspath(__file__)), "lena.png")
img = Image.open(path_img).<