NNDL Homework 5: Convolution

蒂洛洛

Published 2022-10-17 11:27:18

Tags: python, programming languages

Article link: https://blog.csdn.net/abc84986565/article/details/127358317

Homework 1

Programming implementation

1. Apply the kernels $\begin{pmatrix} 1 & -1 \end{pmatrix}$ and $\begin{pmatrix} 1\\ -1\\ \end{pmatrix}$ to image 1 and output the feature maps

import numpy as np
import torch
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render Chinese labels (only needed if labels contain Chinese)
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly

w1 = np.array([1, -1], dtype='float32').reshape([1, 1, 1, 2])
w2 = np.array([1, -1], dtype='float32').T.reshape([1, 1, 2, 1])
print(w2)
w1 = torch.Tensor(w1)
w2 = torch.Tensor(w2)
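conv1 = torch.nn.Conv2d(1, 1, [1, 2], bias=False)  # no bias, so the output is the pure filter response
conv1.weight = torch.nn.Parameter(w1)

conv2 = torch.nn.Conv2d(1, 1, [2, 1], bias=False)  # no bias, for the same reason
conv2.weight = torch.nn.Parameter(w2)
# Create the image: left half white (255), right half black (0)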
img = np.ones([7, 6], dtype='float32')
img[:, 3:] = 0.
img[:, :3] = 255.
x = img.reshape([1, 1, 7, 6])
x = torch.Tensor(x)

y1 = conv1(x).detach().numpy()
y2 = conv2(x).detach().numpy()
plt.subplot(131).set_title('Image 1')
plt.imshow(img, cmap='gray')
plt.subplot(132).set_title('Image 1, kernel (1,-1)')
plt.imshow(y1.squeeze(), cmap='gray')
plt.subplot(133).set_title('Image 1, kernel (1,-1)^T')
plt.imshow(y2.squeeze(), cmap='gray')
plt.show()
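As a numerical cross-check of what the plots for image 1 should show, here is a minimal sketch that applies the same two kernels with `torch.nn.functional.conv2d` and no bias. The names `w_row`, `w_col`, `y_row`, and `y_col` are introduced only for this illustration. The horizontal kernel should respond only at the vertical edge, and the vertical kernel should give zero everywhere because the image does not change from row to row.

```python
import torch
import torch.nn.functional as F

# Image 1: 7 rows x 6 columns, left three columns 255, right three columns 0
img = torch.zeros(1, 1, 7, 6)
img[..., :3] = 255.0

w_row = torch.tensor([1.0, -1.0]).reshape(1, 1, 1, 2)  # kernel (1, -1)
w_col = torch.tensor([1.0, -1.0]).reshape(1, 1, 2, 1)  # kernel (1, -1)^T

y_row = F.conv2d(img, w_row)  # compares each pixel with its right neighbour
y_col = F.conv2d(img, w_col)  # compares each pixel with the pixel below it

print(tuple(y_row.shape), y_row[0, 0, 0].tolist())
print(tuple(y_col.shape), float(y_col.abs().max()))
```

Each row of `y_row` is non-zero only at the column where 255 meets 0, which is the bright vertical line in the middle plot.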

2. Apply the kernels $\begin{pmatrix} 1 & -1 \end{pmatrix}$ and $\begin{pmatrix} 1\\ -1\\ \end{pmatrix}$ to image 2 and output the feature maps

import numpy as np
import torch
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render Chinese labels (only needed if labels contain Chinese)
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly

w1 = np.array([1, -1], dtype='float32').reshape([1, 1, 1, 2])
w2 = np.array([1, -1], dtype='float32').T.reshape([1, 1, 2, 1])
print(w2)
w1 = torch.Tensor(w1)
w2 = torch.Tensor(w2)
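conv1 = torch.nn.Conv2d(1, 1, [1, 2], bias=False)  # no bias, so the output is the pure filter response
conv1.weight = torch.nn.Parameter(w1)

conv2 = torch.nn.Conv2d(1, 1, [2, 1], bias=False)  # no bias, for the same reason
conv2.weight = torch.nn.Parameter(w2)
# Create the image: a 2x2 checkerboard of 4x4 blocks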
img = np.ones([8, 8], dtype='float32')
img[:4, :4] = 0.
img[:4, 4:] = 255.
img[4:, :4] = 255.
img[4:, 4:] = 0.

x = img.reshape([1, 1, 8, 8])
x = torch.Tensor(x)

y1 = conv1(x).detach().numpy()
y2 = conv2(x).detach().numpy()
plt.subplot(131).set_title('Image 2')
plt.imshow(img, cmap='gray')
plt.subplot(132).set_title('Image 2, kernel (1,-1)')
plt.imshow(y1.squeeze(), cmap='gray')
plt.subplot(133).set_title('Image 2, kernel (1,-1)^T')
plt.imshow(y2.squeeze(), cmap='gray')
plt.show()

3. Apply the kernels $\begin{pmatrix} 1 & -1 \end{pmatrix}$, $\begin{pmatrix} 1\\ -1\\ \end{pmatrix}$, and $\begin{pmatrix} 1 &-1 \\ -1&1 \end{pmatrix}$ to image 3 and output the feature maps

import numpy as np
import torch
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render Chinese labels (only needed if labels contain Chinese)
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly

w1 = np.array([1, -1], dtype='float32').reshape([1, 1, 1, 2])
w2 = np.array([1, -1], dtype='float32').T.reshape([1, 1, 2, 1])
w3 = np.array([[1, -1, -1, 1]], dtype='float32').reshape([1, 1, 2, 2])

print(w3)
w1 = torch.Tensor(w1)
w2 = torch.Tensor(w2)
w3 = torch.Tensor(w3)

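conv1 = torch.nn.Conv2d(1, 1, [1, 2], bias=False)  # no bias, so the output is the pure filter response
conv1.weight = torch.nn.Parameter(w1)

conv2 = torch.nn.Conv2d(1, 1, [2, 1], bias=False)
conv2.weight = torch.nn.Parameter(w2)

conv3 = torch.nn.Conv2d(1, 1, [2, 2], bias=False)
conv3.weight = torch.nn.Parameter(w3)
# Create the image: an X made of the two diagonals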
img = np.ones([9, 9], dtype='float32')
for i in range(7):
    img[i + 1, i + 1] = 255.
    img[i + 1, 7 - i] = 255.

x = img.reshape([1, 1, 9, 9])
x = torch.Tensor(x)

y1 = conv1(x).detach().numpy()
y2 = conv2(x).detach().numpy()
y3 = conv3(x).detach().numpy()
plt.subplot(221).set_title('Image 3')
plt.imshow(img, cmap='gray')
plt.subplot(222).set_title('Image 3, kernel (1,-1)')
plt.imshow(y1.squeeze(), cmap='gray')
plt.subplot(223).set_title('Image 3, kernel (1,-1)^T')
plt.imshow(y2.squeeze(), cmap='gray')
plt.subplot(224).set_title('Image 3, kernel [[1 -1],[-1 1]]')
plt.imshow(y3.squeeze(), cmap='gray')
plt.show()
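The 2×2 kernel used for image 3 is the outer product of the two one-dimensional kernels, so applying (1, -1) and then (1, -1)^T should give the same feature map as applying the 2×2 kernel once. The sketch below checks this on a random input; the tensor names are chosen only for this illustration, and `torch.nn.functional.conv2d` is used without bias.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.rand(1, 1, 9, 9)  # random stand-in for a 9x9 image

w_row = torch.tensor([1.0, -1.0]).reshape(1, 1, 1, 2)
w_col = torch.tensor([1.0, -1.0]).reshape(1, 1, 2, 1)
w_2x2 = torch.tensor([[1.0, -1.0], [-1.0, 1.0]]).reshape(1, 1, 2, 2)

# Horizontal difference first, then vertical difference of that result
two_step = F.conv2d(F.conv2d(x, w_row), w_col)
# The 2x2 kernel in a single pass
one_step = F.conv2d(x, w_2x2)

same = torch.allclose(two_step, one_step, atol=1e-5)
print(tuple(two_step.shape), tuple(one_step.shape), same)
```

This is one way to read the fourth plot: the 2×2 kernel responds where the image changes in both the horizontal and the vertical direction at once, which is the case along the diagonals.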

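Homework 2

I. Concepts

Describe in your own words: convolution, convolution kernel, feature map, feature selection, stride, padding, receptive field.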

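Convolution:

Convolution is a mathematical operation that produces a third function from two functions f and g. In essence it is a special integral transform, $(f*g)(t)=\int f(\tau)\,g(t-\tau)\,d\tau$: g is flipped and shifted, multiplied pointwise with f, and the product is integrated over the region where the two overlap.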
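The "flip" in this definition is what separates convolution from cross-correlation. The sketch below contrasts the two in one dimension with NumPy; the arrays `f` and `g` are arbitrary illustrative values. Note that PyTorch's `Conv2d`, used in the code of this post, computes cross-correlation, that is, it does not flip the kernel.

```python
import numpy as np

f = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([1.0, -1.0])

# True convolution: g is flipped before sliding over f
conv = np.convolve(f, g, mode="valid")
# Cross-correlation: g slides over f without flipping
corr = np.correlate(f, g, mode="valid")
# Convolving with the reversed kernel reproduces the cross-correlation
conv_flipped = np.convolve(f, g[::-1], mode="valid")

print(conv)          # [1. 1. 1.]
print(corr)          # [-1. -1. -1.]
print(conv_flipped)  # [-1. -1. -1.]
```

For a symmetric kernel the two operations agree; for an antisymmetric one such as (1, -1) they differ only in sign.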

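Convolution kernel:

A convolution kernel is also called a filter. In image processing, given an input image, each pixel of the output image is a weighted sum of the pixels in a small region of the input image (a weighted average when the weights sum to 1). The weights are defined by a function, and that function is called the convolution kernel.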

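Feature map:

Feature map: in every convolutional layer the data is three-dimensional. You can think of it as many two-dimensional images stacked together, each of which is called a feature map. In the input layer, a grayscale image has a single feature map, while a color image usually has 3 (red, green, blue). Between layers there are several convolution kernels; each kernel is convolved with all feature maps of the previous layer and produces one feature map of the next layer.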
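To make the shapes concrete, here is a minimal sketch with a color input and an arbitrarily chosen number of kernels (8). The sizes 32×32 and 8 are assumptions for illustration only.

```python
import torch
from torch import nn

# 3 input feature maps (RGB), 8 kernels, hence 8 output feature maps
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)

x = torch.rand(1, 3, 32, 32)  # one RGB image of 32x32 pixels
y = conv(x)

# Each of the 8 kernels has one 3x3 slice per input feature map
print(tuple(conv.weight.shape))  # (8, 3, 3, 3)
print(tuple(y.shape))            # (1, 8, 30, 30)
```

The second dimension of the output is the number of feature maps, and it equals the number of kernels rather than the number of input channels.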

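Feature selection:

The purpose of feature selection: in a real project we may have a large number of features available. Some carry rich information, some carry overlapping information, and some are irrelevant. If all of them are used for training without any filtering, we often run into the curse of dimensionality, and model accuracy may even drop. If only the key features are used to build the model, the running time of the learning algorithm can be greatly reduced and the model becomes easier to interpret. In a convolutional network, the pooling layer plays this role: it keeps the strongest responses and reduces the number of features.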
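In a convolutional network, pooling is commonly described as a form of feature selection. As a small sketch of that idea (the 4×4 input is made up for illustration), max pooling keeps only the largest value in each 2×2 window and discards the rest:

```python
import torch
from torch import nn

# A 4x4 feature map holding the values 0..15
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

pool = nn.MaxPool2d(kernel_size=2)  # 2x2 windows, stride defaults to the window size
y = pool(x)

print(x[0, 0])
print(y[0, 0])  # tensor([[ 5.,  7.], [13., 15.]])
```

Sixteen values go in and four come out, so the number of features passed to the next layer drops by a factor of four.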

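Stride:

The stride is the number of pixels the kernel moves between two consecutive positions as it slides over the input. With a stride of 1 the kernel visits every position; with a larger stride it skips positions, so the output feature map becomes smaller and less computation is needed, but fine detail can be lost if the stride is too large.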

Padding:

Padding means adding values around the border of the matrix to enlarge it. The fill value is usually 0.
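For a convolution, stride and padding together determine the output size. With input size n, kernel size k, stride s, and padding p on each side, the output size is floor((n + 2p - k) / s) + 1. The sketch below compares this formula with what `nn.Conv2d` actually produces; the helper `conv_out_size` and the 28×28 input are assumptions made for this illustration.

```python
import torch
from torch import nn

def conv_out_size(n, k, stride=1, padding=0):
    """Output length along one axis for input n, kernel k, given stride and padding."""
    return (n + 2 * padding - k) // stride + 1

x = torch.rand(1, 1, 28, 28)
sizes = []
for stride, padding in [(1, 0), (1, 1), (2, 1)]:
    conv = nn.Conv2d(1, 1, kernel_size=3, stride=stride, padding=padding)
    predicted = conv_out_size(28, 3, stride, padding)
    actual = conv(x).shape[-1]
    sizes.append((predicted, actual))
    print(f"stride={stride}, padding={padding}: formula {predicted}, Conv2d {actual}")
```

With a 3×3 kernel, padding=1 and stride=1 keep the 28×28 size, and stride=2 roughly halves it.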

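Receptive field:

The receptive field is the region of the input that a neuron in the network "sees". In a convolutional neural network, the value of an element of a feature map is determined by a certain region of the input image, and that region is the element's receptive field.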
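The receptive field grows as layers are stacked. A minimal sketch of the usual recurrence, assuming plain convolution layers without dilation: each layer with kernel size k enlarges the receptive field by (k - 1) times the product of the strides of all earlier layers. The helper name `receptive_field` is hypothetical.

```python
def receptive_field(layers):
    """Receptive field on the input of one output element.

    layers: list of (kernel_size, stride) pairs, ordered from input to output.
    """
    field = 1  # a single input pixel sees itself
    jump = 1   # distance on the input between neighbouring elements of the current layer
    for kernel_size, stride in layers:
        field += (kernel_size - 1) * jump
        jump *= stride
    return field

print(receptive_field([(3, 1)]))          # 3
print(receptive_field([(3, 1), (3, 1)]))  # 5
print(receptive_field([(3, 1)] * 3))      # 7
print(receptive_field([(3, 2), (3, 1)]))  # 7
```

Two stacked 3×3 layers see the same 5×5 region as a single 5×5 kernel, and a stride of 2 in the first layer makes the second layer's contribution twice as large.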

II. Exploring the effects of different kernels

sharpen

outline

blur

bottom sobel

left sobel

right sobel

top sobel

emboss

III. Programming implementation

1. Implement edge detection, sharpening, and blurring on a grayscale image

import numpy as np
import torch
from torch import nn
from torch.autograd import Variable
from PIL import Image
import matplotlib.pyplot as plt

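plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render Chinese labels (only needed if labels contain Chinese)
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
# Load the image and convert it to grayscale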
file_path = 'hui.jpg.png'
im = Image.open(file_path).convert('L')
im = np.array(im, dtype='float32')

plt.subplot(331).set_title('original')
plt.imshow(im.astype('uint8'), cmap='gray')

im = torch.from_numpy(im.reshape((1, 1, im.shape[0], im.shape[1])))
conv1 = nn.Conv2d(1, 1, 3, bias=False)  # 3x3 convolution without bias, one per kernel
conv2 = nn.Conv2d(1, 1, 3, bias=False)
conv3 = nn.Conv2d(1, 1, 3, bias=False)
conv4 = nn.Conv2d(1, 1, 3, bias=False)
conv5 = nn.Conv2d(1, 1, 3, bias=False)
conv6 = nn.Conv2d(1, 1, 3, bias=False)
conv7 = nn.Conv2d(1, 1, 3, bias=False)
conv8 = nn.Conv2d(1, 1, 3, bias=False)

bottom_sobel = np.array([[-1, -2, -1],
                         [0, 0, 0],
                         [1, 2, 1]], dtype='float32').reshape((1, 1, 3, 3))
conv1.weight.data = torch.from_numpy(bottom_sobel)
left_sobel = np.array([[1, 0, -1],
                       [2, 0, -2],
                       [1, 0, -1]], dtype='float32').reshape((1, 1, 3, 3))
conv2.weight.data = torch.from_numpy(left_sobel)
right_sobel = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype='float32').reshape((1, 1, 3, 3))
conv3.weight.data = torch.from_numpy(right_sobel)

# Top Sobel is the bottom Sobel flipped vertically: positive weights in the top row
top_sobel = np.array([[1, 2, 1],
                      [0, 0, 0],
                      [-1, -2, -1]], dtype='float32').reshape((1, 1, 3, 3))
conv4.weight.data = torch.from_numpy(top_sobel)

sharpen = np.array([[0, -1, 0],
                    [-1, 5, -1],
                    [0, -1, 0]], dtype='float32').reshape((1, 1, 3, 3))
conv5.weight.data = torch.from_numpy(sharpen)
blur = np.array([[0.0625, 0.125, 0.0625],
                 [0.125, 0.25, 0.125],
                 [0.0625, 0.125, 0.0625]], dtype='float32').reshape((1, 1, 3, 3))
conv6.weight.data = torch.from_numpy(blur)
emboss = np.array([[-2, -1, 0],
                   [-1, 1, 1],
                   [0, 1, 2]], dtype='float32').reshape((1, 1, 3, 3))
conv7.weight.data = torch.from_numpy(emboss)
outline = np.array([[-1, -1, -1],
                    [-1, 8, -1],
                    [-1, -1, -1]], dtype='float32').reshape((1, 1, 3, 3))
conv8.weight.data = torch.from_numpy(outline)

# Inference only: no_grad() skips gradient tracking, so the Variable wrapper is not needed
with torch.no_grad():
    y1 = conv1(im).squeeze().numpy()
    y2 = conv2(im).squeeze().numpy()
    y3 = conv3(im).squeeze().numpy()
    y4 = conv4(im).squeeze().numpy()
    y5 = conv5(im).squeeze().numpy()
    y6 = conv6(im).squeeze().numpy()
    y7 = conv7(im).squeeze().numpy()
    y8 = conv8(im).squeeze().numpy()

# Visualize the results
plt.subplot(332).set_title('bottom_sobel')
plt.imshow(y1, cmap='gray')
plt.subplot(333).set_title('left_sobel')
plt.imshow(y2, cmap='gray')
plt.subplot(334).set_title('right_sobel')
plt.imshow(y3, cmap='gray')
plt.subplot(335).set_title('top_sobel')
plt.imshow(y4, cmap='gray')
plt.subplot(336).set_title('sharpen')
plt.imshow(y5, cmap='gray')
plt.subplot(337).set_title('blur')
plt.imshow(y6, cmap='gray')
plt.subplot(338).set_title('emboss')
plt.imshow(y7, cmap='gray')
plt.subplot(339).set_title('outline')
plt.imshow(y8, cmap='gray')
plt.show()
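One quick way to anticipate what each kernel does is to look at the sum of its weights. A sum of 0 means flat regions map to 0, so only edges survive; a sum of 1 means flat regions keep their brightness. The sketch below computes the sums for several of the kernels above using NumPy only; the dictionary `kernels` is introduced just for this illustration.

```python
import numpy as np

kernels = {
    "bottom_sobel": np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype="float32"),
    "sharpen": np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype="float32"),
    "blur": np.array([[0.0625, 0.125, 0.0625],
                      [0.125, 0.25, 0.125],
                      [0.0625, 0.125, 0.0625]], dtype="float32"),
    "emboss": np.array([[-2, -1, 0], [-1, 1, 1], [0, 1, 2]], dtype="float32"),
    "outline": np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype="float32"),
}

# Sum of weights: 0 -> edge detector, 1 -> brightness-preserving filter
sums = {name: float(k.sum()) for name, k in kernels.items()}
for name, total in sums.items():
    print(f"{name:>12}: sum = {total:g}")
```

The Sobel and outline kernels sum to 0, while sharpen, blur, and emboss sum to 1, which matches the mostly dark edge maps versus the image-like outputs in the plots.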

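2. Adjust the convolution parameters, test, and summarize.

stride=1 means a step of 1: the kernel window moves one pixel at a time to form the next patch. padding pads the border: padding=1 adds one row and one column on every side, and padding=2 adds two;
groups enables grouped convolution, which reduces the amount of computation and the number of parameters;
dilation inserts gaps between the kernel elements: if the original kernel is 3×3 and dilation=2, the kernel covers a 5×5 region (with dilation=3 it would cover 7×7);
bias is the bias term.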
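The effect of each parameter can be checked from the output shape and the number of weights. The sketch below is a minimal experiment under assumed sizes (a random 4-channel 32×32 input and 8 output channels, chosen only so that groups=4 is valid); it is not the grayscale image used above.

```python
import torch
from torch import nn

x = torch.rand(1, 4, 32, 32)  # random input: 4 channels, 32x32

configs = {
    "default": dict(),
    "stride=2": dict(stride=2),
    "stride=10": dict(stride=10),
    "padding=1": dict(padding=1),
    "dilation=2": dict(dilation=2),
    "groups=4": dict(groups=4),
}

results = {}
for name, kwargs in configs.items():
    conv = nn.Conv2d(4, 8, kernel_size=3, bias=False, **kwargs)
    out_hw = tuple(conv(x).shape[-2:])  # spatial size of the output
    n_weights = conv.weight.numel()     # number of kernel weights
    results[name] = (out_hw, n_weights)
    print(f"{name:>10}: output {out_hw}, weights {n_weights}")
```

In this setup stride and dilation shrink the output, padding=1 keeps it at 32×32, and groups=4 leaves the output size unchanged while cutting the weights from 288 to 72.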

With stride=10 the result is shown in the figure below: