1. Convolution
import torch

# Input and output channel counts
in_channels, out_channels = 5, 10
# Spatial size of the input image
width, height = 100, 100
# Kernel size: 3 means a 3*3 square kernel; each kernel has the same
# number of channels as the input, i.e. 5
kernel_size = 3
# Number of images fed in per batch
batch_size = 1
input = torch.randn(batch_size,
                    in_channels,
                    width,
                    height)
# out_channels determines the number of kernels: here, 10 kernels of size 3*3*5
conv_layer = torch.nn.Conv2d(in_channels,
                             out_channels,
                             kernel_size=kernel_size)
output = conv_layer(input)
print(input.shape)
print(output.shape)
print(conv_layer.weight.shape)
torch.Size([1, 5, 100, 100])
torch.Size([1, 10, 98, 98])
torch.Size([10, 5, 3, 3])
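The 98*98 output above follows from the standard convolution size formula: floor((H + 2*padding - kernel_size) / stride) + 1. A minimal pure-Python check (the helper name conv_out_size is my own, not a PyTorch API):

```python
def conv_out_size(size, kernel_size, padding=0, stride=1):
    # Standard Conv2d output-size formula (dilation ignored):
    # floor((size + 2*padding - kernel_size) / stride) + 1
    return (size + 2 * padding - kernel_size) // stride + 1

# 100x100 input, 3x3 kernel, no padding, stride 1 -> 98x98
print(conv_out_size(100, 3))  # 98
```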
Sometimes we want the convolved output to have the same spatial size as the input image. This is controlled by the padding argument, which defaults to 0.
conv_layer_with_padding = torch.nn.Conv2d(in_channels,
                                          out_channels,
                                          padding=1,
                                          kernel_size=kernel_size)
output_with_padding = conv_layer_with_padding(input)
print(output_with_padding.shape)
torch.Size([1, 10, 100, 100])
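Why padding=1 preserves the size: with stride 1, the output size is H + 2*padding - kernel_size + 1, so for an odd kernel, padding = (kernel_size - 1) // 2 keeps the size unchanged. A quick arithmetic sketch:

```python
# With stride 1, output size = H + 2*padding - kernel_size + 1.
# For an odd kernel, padding = (kernel_size - 1) // 2 leaves H unchanged:
H, kernel_size = 100, 3
padding = (kernel_size - 1) // 2        # 1 for a 3x3 kernel
out = H + 2 * padding - kernel_size + 1
print(padding, out)  # 1 100
```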
Other times we want to shrink the feature map further to reduce computation. This is where the stride argument comes in: it sets how far the kernel moves at each step, and defaults to 1.
conv_layer_with_stride = torch.nn.Conv2d(in_channels,
                                         out_channels,
                                         stride=2,
                                         kernel_size=kernel_size)
output_with_stride = conv_layer_with_stride(input)
print(output_with_stride.shape)
torch.Size([1, 10, 49, 49])
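The 49*49 output comes from the same formula with floor division: the last partial window that would hang off the edge is simply discarded. A one-line check:

```python
# General Conv2d size formula: floor((H + 2*p - k) / s) + 1
H, k, p, s = 100, 3, 0, 2
out = (H + 2 * p - k) // s + 1  # floor division drops the last partial window
print(out)  # 49
```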
2. Downsampling
Downsampling is not fundamentally different from convolution; the difference lies in the purpose. Downsampling aims to reduce the spatial dimensions of the data further.
The most common downsampling operation is max pooling (MaxPool).
input = [3, 4, 6, 5,
         2, 4, 6, 8,
         1, 6, 7, 8,
         9, 7, 4, 6]
input = torch.Tensor(input).view(1, 1, 4, 4)
# Note: with kernel_size=2, stride also defaults to 2
# (MaxPool2d's stride defaults to its kernel_size)
maxpooling_layer = torch.nn.MaxPool2d(kernel_size=2)
output = maxpooling_layer(input)
print(output)
print(output)
tensor([[[[4., 8.],
[9., 8.]]]])
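The pooled values can be checked by hand: each output element is the maximum over one non-overlapping 2*2 window. A minimal pure-Python sketch of the same operation (the helper name max_pool_2x2 is my own):

```python
def max_pool_2x2(grid):
    # Max over non-overlapping 2x2 windows (kernel_size=2, stride=2)
    n = len(grid)
    return [[max(grid[i][j], grid[i][j + 1],
                 grid[i + 1][j], grid[i + 1][j + 1])
             for j in range(0, n, 2)]
            for i in range(0, n, 2)]

grid = [[3, 4, 6, 5],
        [2, 4, 6, 8],
        [1, 6, 7, 8],
        [9, 7, 4, 6]]
print(max_pool_2x2(grid))  # [[4, 8], [9, 8]]
```

This matches the tensor printed by MaxPool2d above.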