NNDL Assignment 6: Convolution

এ琳

Last modified: 2023-12-10 16:10:37

Tags: deep learning, artificial intelligence

First published: 2023-11-04 23:06:07

Article link: https://blog.csdn.net/m0_67043426/article/details/134190226

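I. Concepts

Describe in your own words: convolution, convolution kernel, feature map, feature selection, stride, padding, and receptive field.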

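Convolution:

Let $f(x)$ and $g(x)$ be two continuous-time signals defined on an infinite interval. The integral

$$\int_{-\infty}^{\infty} f(\tau)\, g(x-\tau)\, d\tau$$

is defined as the convolution of $f(x)$ and $g(x)$, written $f(x)*g(x)$.

For images, convolution can be understood as taking a weighted sum of the image pixels under a kernel, repeated at every position of the kernel.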
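For sampled signals the integral in the definition becomes a sum, $(f*g)[n]=\sum_m f[m]\,g[n-m]$. The following is a minimal sketch of that sum in plain Python; the function name `conv1d_full` and the sample values are illustrative assumptions, not part of the assignment.

```python
def conv1d_full(f, g):
    """Discrete convolution: y[n] = sum over m of f[m] * g[n - m]."""
    y = [0.0] * (len(f) + len(g) - 1)
    for m, f_m in enumerate(f):
        for k, g_k in enumerate(g):
            # f[m] * g[k] contributes to output index n = m + k
            y[m + k] += f_m * g_k
    return y


signal = [1.0, 2.0, 3.0]
kernel = [0.0, 1.0, 0.5]
print(conv1d_full(signal, kernel))  # [0.0, 1.0, 2.5, 4.0, 1.5]
```

Swapping the two arguments gives the same result, which reflects the fact that convolution is commutative.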

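Convolution and cross-correlation

Computing a convolution requires flipping the kernel, that is, rotating it by 180°.

Cross-correlation (convolution without the flip) is a function that measures how similar two sequences are. It is usually implemented as a sliding-window dot product.

The only difference between cross-correlation and convolution is whether the kernel is flipped. Note that torch.nn.Conv2d, used later in this post, computes cross-correlation: it applies the kernel without flipping it.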
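The relationship between the two operations can be sketched in a few lines of plain Python. The helper names `cross_correlate2d`, `rotate180`, and `convolve2d` are hypothetical, and the sketch only covers the region where the kernel fits entirely inside the image.

```python
def cross_correlate2d(image, kernel):
    """Slide the kernel over the image without flipping it (no padding, stride 1)."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(k_h) for b in range(k_w))
             for j in range(out_w)]
            for i in range(out_h)]


def rotate180(kernel):
    """Flip the kernel top-to-bottom and left-to-right (a 180-degree rotation)."""
    return [row[::-1] for row in kernel[::-1]]


def convolve2d(image, kernel):
    """Convolution is cross-correlation with the rotated kernel."""
    return cross_correlate2d(image, rotate180(kernel))


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]

print(cross_correlate2d(image, kernel))  # [[-4, -4], [-4, -4]]
print(convolve2d(image, kernel))         # [[4, 4], [4, 4]]
```

For a kernel that looks the same after a 180° rotation, the two operations give identical results, which is one reason the distinction rarely matters for learned kernels.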

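Convolution kernel:

A convolution kernel, also called a filter, is the set of weights used in the convolution operation. During convolution the kernel slides over the input image, takes a weighted sum of the pixels it covers at each position, and outputs a new feature map.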

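Feature map:

The feature map is the result of the convolution: the weighted sums of the image under the kernel, collected over all kernel positions.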

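Feature selection:

Feature selection means learning which features best represent the data. During training, a convolutional neural network automatically learns which features matter most for the task, strengthens and emphasizes them, and ignores irrelevant ones. Through feature selection, the network can extract and use the information in the input effectively, which makes efficient and accurate image processing possible.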

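Stride:

The stride is the step size with which the kernel slides over the image, that is, the distance the kernel moves each time. For example, with a stride of 2 in both height and width, the kernel moves 2 pixels at a time from left to right; when it reaches the edge of the image, it returns to the left side and moves down by 2 pixels.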
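To make the sliding order concrete, here is a small sketch that lists the top-left corner of every kernel position. The function `window_positions` and the 4×4 image size are assumptions chosen for illustration.

```python
def window_positions(height, width, k_h, k_w, stride_h, stride_w):
    """Top-left corners visited by a k_h x k_w kernel, left to right, then top to bottom."""
    return [(top, left)
            for top in range(0, height - k_h + 1, stride_h)
            for left in range(0, width - k_w + 1, stride_w)]


# A 2x2 kernel on a 4x4 image with stride 2 in both directions
print(window_positions(4, 4, 2, 2, 2, 2))       # [(0, 0), (0, 2), (2, 0), (2, 2)]
# The same kernel with stride 1 visits more positions
print(len(window_positions(4, 4, 2, 2, 1, 1)))  # 9
```

A larger stride visits fewer positions, so the resulting feature map is smaller.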

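Padding:

Padding means adding elements (usually zeros) on both sides of the input along its height and width.

Padding the original matrix reduces the loss of information at the edges.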

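Depending on the output length, convolutions fall into three categories (M is the input length, K the kernel length, S the stride, and P the number of zeros padded at each end):

Narrow convolution: S = 1, no zero padding (P = 0), output length M - K + 1.

Wide convolution: S = 1, zero padding P = K - 1 at each end, output length M + K - 1.

Equal-width convolution: S = 1, zero padding P = (K - 1)/2 at each end (K odd), output length M.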
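All three cases follow from the general output-length formula $\lfloor (M + 2P - K)/S \rfloor + 1$. The sketch below checks them for M = 7 and K = 3; the function name `conv_output_length` and these sample sizes are illustrative assumptions.

```python
def conv_output_length(m, k, padding=0, stride=1):
    """Output length for input length m, kernel length k, padding per side, and stride."""
    return (m + 2 * padding - k) // stride + 1


M, K = 7, 3
print(conv_output_length(M, K, padding=0))             # narrow: M - K + 1 = 5
print(conv_output_length(M, K, padding=K - 1))         # wide:   M + K - 1 = 9
print(conv_output_length(M, K, padding=(K - 1) // 2))  # equal-width: M = 7
```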

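Two commonly used padding modes:

(1) Valid padding: no padding is applied. Only the original image is used, and the kernel is not allowed to extend past the image boundary.

(2) Same padding: the image is padded so that the kernel may extend past the original boundary, and the output has the same size as the input.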
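The difference between the two modes can be seen from the shapes alone. The sketch below zero-pads a 4×4 image for a 3×3 kernel with stride 1; `zero_pad2d` and `output_shape` are hypothetical helpers written for this illustration.

```python
def zero_pad2d(image, p):
    """Surround a 2D list with p rows and p columns of zeros on every side."""
    width = len(image[0]) + 2 * p
    border = [[0] * width for _ in range(2 * p)]
    middle = [[0] * p + list(row) + [0] * p for row in image]
    return border[:p] + middle + border[p:]


def output_shape(image, k):
    """Shape after sliding a k x k kernel with stride 1 inside the given array."""
    return (len(image) - k + 1, len(image[0]) - k + 1)


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
k = 3

print(output_shape(image, k))                            # valid padding: (2, 2)
print(output_shape(zero_pad2d(image, (k - 1) // 2), k))  # same padding:  (4, 4)
```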

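Receptive field:

The receptive field is the size of the region of the input image that a single pixel of a CNN layer's output feature map maps back to.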
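The receptive field of stacked layers can be computed with a commonly used recurrence: each layer enlarges the field by (kernel size - 1) times the product of the strides of the layers before it. The sketch below assumes square kernels and ignores padding; the function name and the layer lists are illustrative.

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride) pairs, ordered from input to output."""
    rf = 1    # a single input pixel sees only itself
    jump = 1  # distance in input pixels between neighboring outputs of the current layer
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


print(receptive_field([(3, 1), (3, 1)]))          # 5: two 3x3 layers see a 5x5 region
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # 7
print(receptive_field([(3, 2), (3, 1)]))          # 7: stride widens the later layer's reach
```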

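II. Exploring the Effects of Different Convolution Kernels

Reference: 卷积神经网络工作原理的直观理解 (An Intuitive Understanding of How Convolutional Neural Networks Work), superdont's blog, CSDN

1. Apply the kernels $\begin{pmatrix} 1 & -1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$ to Image 1 and output the feature maps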

# -*- coding: utf-8 -*-
import numpy as np
import torch
import matplotlib.pyplot as plt

# Kernels in Conv2d weight layout: (out_channels, in_channels, height, width)
w1 = np.array([1, -1], dtype='float32').reshape([1, 1, 1, 2])  # row kernel (1 -1)
w2 = np.array([1, -1], dtype='float32').reshape([1, 1, 2, 1])  # column kernel (1 -1)^T
print(w2)
w1 = torch.from_numpy(w1)
w2 = torch.from_numpy(w2)
# bias=False: otherwise Conv2d adds a randomly initialized bias to every output value
conv1 = torch.nn.Conv2d(1, 1, (1, 2), bias=False)
conv1.weight = torch.nn.Parameter(w1)

conv2 = torch.nn.Conv2d(1, 1, (2, 1), bias=False)
conv2.weight = torch.nn.Parameter(w2)
# Create the image: left half white (255), right half black (0)
img = np.ones([7, 6], dtype='float32')
img[:, 3:] = 0.
img[:, :3] = 255.
x = img.reshape([1, 1, 7, 6])
x = torch.from_numpy(x)

y1 = conv1(x).detach().numpy()
y2 = conv2(x).detach().numpy()
plt.subplot(131).set_title('Image 1')
plt.imshow(img, cmap='gray')
plt.subplot(132).set_title('Kernel (1,-1)')
plt.imshow(y1.squeeze(), cmap='gray')
plt.subplot(133).set_title('Kernel (1,-1)^T')
plt.imshow(y2.squeeze(), cmap='gray')
plt.show()
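What the two feature maps should contain can be worked out by hand, because every row of Image 1 is identical and every column is constant. The sketch below reproduces that arithmetic in plain Python, assuming no bias term; the helper `diff` is hypothetical and stands for applying the kernel (1, -1) along one row or one column.

```python
def diff(values):
    """Apply the kernel (1, -1) without flipping: out[j] = values[j] - values[j + 1]."""
    return [values[j] - values[j + 1] for j in range(len(values) - 1)]


row = [255, 255, 255, 0, 0, 0]  # any row of Image 1
column = [255] * 7              # any column in the left half of Image 1

print(diff(row))     # [0, 0, 255, 0, 0]: responds only at the vertical edge
print(diff(column))  # [0, 0, 0, 0, 0, 0]: no change along the column, so no response
```

The row kernel therefore highlights the vertical boundary between the white and black halves, while the column kernel produces a uniform feature map for this image.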

2. Apply the kernels $\begin{pmatrix} 1 & -1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$ to Image 2 and output the feature maps

# -*- coding: utf-8 -*-
import numpy as np
import torch
import matplotlib.pyplot as plt

# Kernels in Conv2d weight layout: (out_channels, in_channels, height, width)
w1 = np.array([1, -1], dtype='float32').reshape([1, 1, 1, 2])  # row kernel (1 -1)
w2 = np.array([1, -1], dtype='float32').reshape([1, 1, 2, 1])  # column kernel (1 -1)^T
print(w2)
w1 = torch.from_numpy(w1)
w2 = torch.from_numpy(w2)
# bias=False: otherwise Conv2d adds a randomly initialized bias to every output value
conv1 = torch.nn.Conv2d(1, 1, (1, 2), bias=False)
conv1.weight = torch.nn.Parameter(w1)

conv2 = torch.nn.Conv2d(1, 1, (2, 1), bias=False)
conv2.weight = torch.nn.Parameter(w2)
# Create the image: a 2x2 checkerboard of 4x4 blocks
img = np.ones([8, 8], dtype='float32')
img[:4, :4] = 0.
img[:4, 4:] = 255.
img[4:, :4] = 255.
img[4:, 4:] = 0.

x = img.reshape([1, 1, 8, 8])
x = torch.from_numpy(x)

y1 = conv1(x).detach().numpy()
y2 = conv2(x).detach().numpy()
plt.subplot(131).set_title('Image 2')
plt.imshow(img, cmap='gray')
plt.subplot(132).set_title('Kernel (1,-1)')
plt.imshow(y1.squeeze(), cmap='gray')
plt.subplot(133).set_title('Kernel (1,-1)^T')
plt.imshow(y2.squeeze(), cmap='gray')
plt.show()
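For Image 2 the sign of the response carries information: it depends on whether the edge goes from dark to bright or from bright to dark. The following plain-Python sketch rebuilds the checkerboard and computes both feature maps directly, assuming no bias term; the variable names are illustrative.

```python
size, half = 8, 4
# Image 2: black top-left and bottom-right quadrants, white elsewhere
image = [[0 if (i < half) == (j < half) else 255 for j in range(size)]
         for i in range(size)]

# Kernel (1, -1): difference between horizontally adjacent pixels
feature_h = [[row[j] - row[j + 1] for j in range(size - 1)] for row in image]
# Kernel (1, -1)^T: difference between vertically adjacent pixels
feature_v = [[image[i][j] - image[i + 1][j] for j in range(size)]
             for i in range(size - 1)]

print(feature_h[0])  # [0, 0, 0, -255, 0, 0, 0]: dark-to-bright edge in the top half
print(feature_h[4])  # [0, 0, 0, 255, 0, 0, 0]: bright-to-dark edge in the bottom half
print(feature_v[3])  # [-255, -255, -255, -255, 255, 255, 255, 255]
```

Because `imshow` by default scales the gray colormap to the data range, the negative responses appear dark, the positive ones bright, and the zero background mid-gray.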

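3. Apply the kernels $\begin{pmatrix} 1 & -1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$, and $\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$ to Image 3 and output the feature maps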

# -*- coding: utf-8 -*-
import numpy as np
import torch
import matplotlib.pyplot as plt

# Kernels in Conv2d weight layout: (out_channels, in_channels, height, width)
w1 = np.array([1, -1], dtype='float32').reshape([1, 1, 1, 2])  # row kernel (1 -1)
w2 = np.array([1, -1], dtype='float32').reshape([1, 1, 2, 1])  # column kernel (1 -1)^T
w3 = np.array([[1, -1, -1, 1]], dtype='float32').reshape([1, 1, 2, 2])  # [[1 -1], [-1 1]]

print(w3)
w1 = torch.from_numpy(w1)
w2 = torch.from_numpy(w2)
w3 = torch.from_numpy(w3)

# bias=False: otherwise Conv2d adds a randomly initialized bias to every output value
conv1 = torch.nn.Conv2d(1, 1, (1, 2), bias=False)
conv1.weight = torch.nn.Parameter(w1)

conv2 = torch.nn.Conv2d(1, 1, (2, 1), bias=False)
conv2.weight = torch.nn.Parameter(w2)

conv3 = torch.nn.Conv2d(1, 1, (2, 2), bias=False)
conv3.weight = torch.nn.Parameter(w3)
# Create the image: two bright diagonals forming an X on a dark background
img = np.ones([9, 9], dtype='float32')
for i in range(7):
    img[i + 1, i + 1] = 255.
    img[i + 1, 7 - i] = 255.

x = img.reshape([1, 1, 9, 9])
x = torch.from_numpy(x)

y1 = conv1(x).detach().numpy()
y2 = conv2(x).detach().numpy()
y3 = conv3(x).detach().numpy()
plt.subplot(221).set_title('Image 3')
plt.imshow(img, cmap='gray')
plt.subplot(222).set_title('Kernel (1,-1)')
plt.imshow(y1.squeeze(), cmap='gray')
plt.subplot(223).set_title('Kernel (1,-1)^T')
plt.imshow(y2.squeeze(), cmap='gray')
plt.subplot(224).set_title('Kernel [[1 -1],[-1 1]]')
plt.imshow(y3.squeeze(), cmap='gray')
plt.show()
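The 2×2 kernel is the outer product of the column kernel and the row kernel, so applying it once should give the same result as applying the row kernel and then the column kernel. The sketch below checks this on Image 3 in plain Python, assuming no bias term; `cross_correlate2d` is a hypothetical helper that mirrors what Conv2d does without padding.

```python
def cross_correlate2d(image, kernel):
    """Slide the kernel over the image without flipping it (no padding, stride 1)."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(k_h) for b in range(k_w))
             for j in range(out_w)]
            for i in range(out_h)]


# Image 3: two bright diagonals on a background of ones
image = [[1] * 9 for _ in range(9)]
for i in range(7):
    image[i + 1][i + 1] = 255
    image[i + 1][7 - i] = 255

row_kernel = [[1, -1]]
col_kernel = [[1], [-1]]
# Outer product of the column and row kernels: [[1, -1], [-1, 1]]
full_kernel = [[c[0] * r for r in row_kernel[0]] for c in col_kernel]

two_steps = cross_correlate2d(cross_correlate2d(image, row_kernel), col_kernel)
one_step = cross_correlate2d(image, full_kernel)
print(full_kernel)            # [[1, -1], [-1, 1]]
print(two_steps == one_step)  # True
```

This is why the third kernel responds to diagonal structure: it takes a horizontal difference of a vertical difference.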

4. Edge detection, sharpening, and blurring of a grayscale image

Edge detection

[-1, -1, -1]
[-1,  8, -1]
[-1, -1, -1]

# -*- coding: utf-8 -*-
import numpy as np
import torch
from torch import nn
from PIL import Image
import matplotlib.pyplot as plt

file_path = 'deer.jpg'
im = Image.open(file_path).convert('L')  # load the picture as a grayscale image
im = np.array(im, dtype='float32')  # convert it to a matrix
print(im.shape[0], im.shape[1])

plt.imshow(im.astype('uint8'), cmap='gray')  # show the original image
plt.title('Original')
plt.show()

im = torch.from_numpy(im.reshape((1, 1, im.shape[0], im.shape[1])))
conv1 = nn.Conv2d(1, 1, 3, bias=False, padding=1)  # 3x3 convolution that keeps the image size

# Edge detection kernel (8-neighbor Laplacian style, not a Sobel operator)
kernel = np.array([[-1, -1, -1],
                   [-1, 8, -1],
                   [-1, -1, -1]], dtype='float32')
kernel = kernel.reshape((1, 1, 3, 3))  # (out_channels, in_channels, height, width)
conv1.weight.data = torch.from_numpy(kernel)  # assign the kernel weights

with torch.no_grad():  # no gradients are needed for plain filtering
    edge1 = conv1(im)  # apply the kernel to the image
    edge1 = torch.clamp(edge1, 0, 255)  # clip to the valid gray range [0, 255]
x = edge1.squeeze().numpy()

print(x.shape)  # output size

plt.imshow(x, cmap='gray')
plt.title('Edge detection')
plt.show()
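The weights of the edge detection kernel sum to zero, so uniform regions produce no response and only intensity changes survive. The sketch below evaluates the kernel on three hand-made 3×3 patches; the patches and the helpers `response` and `clamp` are assumptions for illustration.

```python
EDGE_KERNEL = [[-1, -1, -1],
               [-1, 8, -1],
               [-1, -1, -1]]


def response(patch, kernel):
    """Weighted sum of a 3x3 patch under a 3x3 kernel (one output pixel)."""
    return sum(p * k
               for p_row, k_row in zip(patch, kernel)
               for p, k in zip(p_row, k_row))


def clamp(value, low=0, high=255):
    """Clip a value to the displayable gray range."""
    return max(low, min(high, value))


flat = [[100, 100, 100] for _ in range(3)]      # uniform region
bright_side = [[0, 255, 255] for _ in range(3)]  # center pixel on the bright side of an edge
dark_side = [[0, 0, 255] for _ in range(3)]      # center pixel on the dark side of an edge

print(response(flat, EDGE_KERNEL))                # 0
print(response(bright_side, EDGE_KERNEL))         # 765
print(clamp(response(bright_side, EDGE_KERNEL)))  # 255
print(clamp(response(dark_side, EDGE_KERNEL)))    # 0, because the raw response is -765
```

The raw responses fall well outside [0, 255], which is why the listing above clips the result before displaying it.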

Sharpening

[0, -1, 0]
[-1, 5, -1]
[0, -1, 0]

# -*- coding: utf-8 -*-
import numpy as np
import torch
from torch import nn
from PIL import Image
import matplotlib.pyplot as plt

file_path = 'deer.jpg'
im = Image.open(file_path).convert('L')  # load the picture as a grayscale image
im = np.array(im, dtype='float32')  # convert it to a matrix
print(im.shape[0], im.shape[1])

plt.imshow(im.astype('uint8'), cmap='gray')  # show the original image
plt.title('Original')
plt.show()

im = torch.from_numpy(im.reshape((1, 1, im.shape[0], im.shape[1])))
conv1 = nn.Conv2d(1, 1, 3, bias=False, padding=1)  # 3x3 convolution that keeps the image size

# Sharpening kernel
kernel = np.array([[0, -1, 0],
                   [-1, 5, -1],
                   [0, -1, 0]], dtype='float32')
kernel = kernel.reshape((1, 1, 3, 3))  # (out_channels, in_channels, height, width)
conv1.weight.data = torch.from_numpy(kernel)  # assign the kernel weights

with torch.no_grad():  # no gradients are needed for plain filtering
    edge1 = conv1(im)  # apply the kernel to the image
    edge1 = torch.clamp(edge1, 0, 255)  # clip to the valid gray range [0, 255]
x = edge1.squeeze().numpy()

print(x.shape)  # output size

plt.imshow(x, cmap='gray')
plt.title('Sharpening')
plt.show()
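One way to read the sharpening kernel is as the identity kernel plus a 4-neighbor Laplacian-style edge kernel: it keeps the original pixel and adds the local edge response on top. The sketch below checks this decomposition; the kernel names, test patches, and the `response` helper are illustrative assumptions.

```python
SHARPEN = [[0, -1, 0],
           [-1, 5, -1],
           [0, -1, 0]]
IDENTITY = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]
LAPLACIAN = [[0, -1, 0],
             [-1, 4, -1],
             [0, -1, 0]]


def response(patch, kernel):
    """Weighted sum of a 3x3 patch under a 3x3 kernel (one output pixel)."""
    return sum(p * k
               for p_row, k_row in zip(patch, kernel)
               for p, k in zip(p_row, k_row))


combined = [[a + b for a, b in zip(id_row, lap_row)]
            for id_row, lap_row in zip(IDENTITY, LAPLACIAN)]

flat = [[100, 100, 100] for _ in range(3)]  # uniform region
edge = [[0, 255, 255] for _ in range(3)]    # center pixel on the bright side of an edge

print(combined == SHARPEN)      # True
print(response(flat, SHARPEN))  # 100: uniform regions are left unchanged
print(response(edge, SHARPEN))  # 510: the edge is exaggerated beyond 255
```

Because the weights sum to 1, overall brightness is preserved, while values next to edges overshoot and need to be clipped before display.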

Blurring

[0.0625, 0.125, 0.0625]
[ 0.125,  0.25,  0.125]
[0.0625, 0.125, 0.0625]

# -*- coding: utf-8 -*-
import numpy as np
import torch
from torch import nn
from PIL import Image
import matplotlib.pyplot as plt

file_path = 'deer.jpg'
im = Image.open(file_path).convert('L')  # load the picture as a grayscale image
im = np.array(im, dtype='float32')  # convert it to a matrix
print(im.shape[0], im.shape[1])

plt.imshow(im.astype('uint8'), cmap='gray')  # show the original image
plt.title('Original')
plt.show()

im = torch.from_numpy(im.reshape((1, 1, im.shape[0], im.shape[1])))
conv1 = nn.Conv2d(1, 1, 3, bias=False, padding=1)  # 3x3 convolution that keeps the image size

# Blur kernel (3x3 Gaussian-style weights that sum to 1)
kernel = np.array([[0.0625, 0.125, 0.0625],
                   [0.125, 0.25, 0.125],
                   [0.0625, 0.125, 0.0625]], dtype='float32')
kernel = kernel.reshape((1, 1, 3, 3))  # (out_channels, in_channels, height, width)
conv1.weight.data = torch.from_numpy(kernel)  # assign the kernel weights

with torch.no_grad():  # no gradients are needed for plain filtering
    edge1 = conv1(im)  # apply the kernel to the image
    edge1 = torch.clamp(edge1, 0, 255)  # clip to the valid gray range [0, 255]
x = edge1.squeeze().numpy()

print(x.shape)  # output size

plt.imshow(x, cmap='gray')
plt.title('Blurring')
plt.show()
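The blur kernel is a weighted average: its weights are non-negative and sum to 1, and it can be written as the outer product of the 1D weights (0.25, 0.5, 0.25) with themselves. The sketch below verifies both properties and smooths a 1D step edge; the names `weights_1d` and `smooth` are illustrative assumptions.

```python
weights_1d = [0.25, 0.5, 0.25]

# Outer product of the 1D weights with themselves gives the 3x3 blur kernel
blur = [[a * b for b in weights_1d] for a in weights_1d]
print(blur[0])                         # [0.0625, 0.125, 0.0625]
print(blur[1])                         # [0.125, 0.25, 0.125]
print(sum(sum(row) for row in blur))   # 1.0


def smooth(row):
    """Weighted average of each pixel with its two neighbors (no padding)."""
    return [sum(w * row[j + t] for t, w in enumerate(weights_1d))
            for j in range(len(row) - 2)]


print(smooth([0, 0, 255, 255]))  # [63.75, 191.25]: the hard step becomes a ramp
```

Since the output is always an average of values already in [0, 255], the clipping step in the listing above has no effect for this kernel.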