NNDL Assignment 5: Convolution

Perfect(*^ω^*)

Tags: python, deep learning, numpy

Article link: https://blog.csdn.net/weixin_51668257/article/details/127341550

Assignment 1

Programming implementation:

import numpy as np
import torch
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render Chinese labels
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly with that font

# Kernel (1, -1) as a 1x2 row and as a 2x1 column, shaped (out_channels, in_channels, H, W)
w1 = np.array([1, -1], dtype='float32').reshape([1, 1, 1, 2])
w2 = np.array([1, -1], dtype='float32').reshape([1, 1, 2, 1])
print(w2)
w1 = torch.Tensor(w1)
w2 = torch.Tensor(w2)
# bias=False: otherwise Conv2d adds a randomly initialized bias to every output value
conv1 = torch.nn.Conv2d(1, 1, [1, 2], bias=False)
conv1.weight = torch.nn.Parameter(w1)

conv2 = torch.nn.Conv2d(1, 1, [2, 1], bias=False)
conv2.weight = torch.nn.Parameter(w2)

# Create the image: left three columns white (255), right three columns black (0)
img = np.ones([7, 6], dtype='float32')
img[:, 3:] = 0.
img[:, :3] = 255.
x = img.reshape([1, 1, 7, 6])
x = torch.Tensor(x)

y1 = conv1(x).detach().numpy()
y2 = conv2(x).detach().numpy()
plt.subplot(131).set_title('Figure 1')
plt.imshow(img, cmap='gray')
plt.subplot(132).set_title('Figure 1, kernel (1,-1)')
plt.imshow(y1.squeeze(), cmap='gray')
plt.subplot(133).set_title('Figure 1, kernel (1,-1)^T')
plt.imshow(y2.squeeze(), cmap='gray')
plt.show()
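
As a sanity check on the result above, here is a minimal NumPy sketch of what the two kernels compute on the Figure 1 image. It relies on the fact that Conv2d performs cross-correlation (the kernel is not flipped), so the kernel (1, -1) reduces to a difference between neighboring pixels. The variable names `horizontal_diff` and `vertical_diff` are illustrative and do not appear in the original code.

```python
import numpy as np

# Same 7x6 test image as above: left three columns 255, right three columns 0
img = np.zeros((7, 6), dtype=np.float32)
img[:, :3] = 255.0

# Cross-correlation with the 1x2 kernel (1, -1): x[i, j] - x[i, j + 1]
horizontal_diff = img[:, :-1] - img[:, 1:]

# Cross-correlation with the 2x1 kernel (1, -1)^T: x[i, j] - x[i + 1, j]
vertical_diff = img[:-1, :] - img[1:, :]

print(horizontal_diff.shape, vertical_diff.shape)
print(horizontal_diff[0])    # nonzero only where white meets black
print(vertical_diff.any())   # no change along the vertical direction
```

The row kernel responds only at the vertical boundary between the white and black halves, while the column kernel produces an all-zero map because every column of this image is constant.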

import numpy as np
import torch
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render Chinese labels
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly with that font

# Kernel (1, -1) as a 1x2 row and as a 2x1 column, shaped (out_channels, in_channels, H, W)
w1 = np.array([1, -1], dtype='float32').reshape([1, 1, 1, 2])
w2 = np.array([1, -1], dtype='float32').reshape([1, 1, 2, 1])
print(w2)
w1 = torch.Tensor(w1)
w2 = torch.Tensor(w2)
# bias=False: otherwise Conv2d adds a randomly initialized bias to every output value
conv1 = torch.nn.Conv2d(1, 1, [1, 2], bias=False)
conv1.weight = torch.nn.Parameter(w1)

conv2 = torch.nn.Conv2d(1, 1, [2, 1], bias=False)
conv2.weight = torch.nn.Parameter(w2)

# Create the image: a 2x2 checkerboard of 4x4 blocks (black/white on top, white/black below)
img = np.ones([8, 8], dtype='float32')
img[:4, :4] = 0.
img[:4, 4:] = 255.
img[4:, :4] = 255.
img[4:, 4:] = 0.

x = img.reshape([1, 1, 8, 8])
x = torch.Tensor(x)

y1 = conv1(x).detach().numpy()
y2 = conv2(x).detach().numpy()
plt.subplot(131).set_title('Figure 2')
plt.imshow(img, cmap='gray')
plt.subplot(132).set_title('Figure 2, kernel (1,-1)')
plt.imshow(y1.squeeze(), cmap='gray')
plt.subplot(133).set_title('Figure 2, kernel (1,-1)^T')
plt.imshow(y2.squeeze(), cmap='gray')
plt.show()

import numpy as np
import torch
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render Chinese labels
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly with that font

# Kernels shaped (out_channels, in_channels, H, W):
# w1 = (1, -1), w2 = (1, -1)^T, w3 = [[1, -1], [-1, 1]]
w1 = np.array([1, -1], dtype='float32').reshape([1, 1, 1, 2])
w2 = np.array([1, -1], dtype='float32').reshape([1, 1, 2, 1])
w3 = np.array([[1, -1, -1, 1]], dtype='float32').reshape([1, 1, 2, 2])

print(w3)
w1 = torch.Tensor(w1)
w2 = torch.Tensor(w2)
w3 = torch.Tensor(w3)

# bias=False: otherwise Conv2d adds a randomly initialized bias to every output value
conv1 = torch.nn.Conv2d(1, 1, [1, 2], bias=False)
conv1.weight = torch.nn.Parameter(w1)

conv2 = torch.nn.Conv2d(1, 1, [2, 1], bias=False)
conv2.weight = torch.nn.Parameter(w2)

conv3 = torch.nn.Conv2d(1, 1, [2, 2], bias=False)
conv3.weight = torch.nn.Parameter(w3)

# Create the image: a 9x9 background of ones with two white (255) diagonals forming an X
img = np.ones([9, 9], dtype='float32')
for i in range(7):
    img[i + 1, i + 1] = 255.
    img[i + 1, 7 - i] = 255.

x = img.reshape([1, 1, 9, 9])
x = torch.Tensor(x)

y1 = conv1(x).detach().numpy()
y2 = conv2(x).detach().numpy()
y3 = conv3(x).detach().numpy()
plt.subplot(221).set_title('Figure 3')
plt.imshow(img, cmap='gray')
plt.subplot(222).set_title('Figure 3, kernel (1,-1)')
plt.imshow(y1.squeeze(), cmap='gray')
plt.subplot(223).set_title('Figure 3, kernel (1,-1)^T')
plt.imshow(y2.squeeze(), cmap='gray')
plt.subplot(224).set_title('Figure 3, kernel [[1 -1],[-1 1]]')
plt.imshow(y3.squeeze(), cmap='gray')
plt.show()
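
One observation that may help interpret the third result: the 2x2 kernel [[1, -1], [-1, 1]] is the outer product of (1, -1)^T and (1, -1), so applying it once should match applying the row kernel and then the column kernel. The sketch below checks this with a naive NumPy cross-correlation. The helper `correlate2d_valid` and the random test image are illustrative assumptions, not part of the original assignment code.

```python
import numpy as np

def correlate2d_valid(x, k):
    """Naive 'valid' cross-correlation (what Conv2d computes when bias is off)."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow), dtype=np.float64)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

row = np.array([[1.0, -1.0]])   # kernel (1, -1)
col = row.T                     # kernel (1, -1)^T
k22 = col @ row                 # outer product: [[1, -1], [-1, 1]]

# Hypothetical 9x9 test image with integer gray levels
rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(9, 9)).astype(np.float64)

two_step = correlate2d_valid(correlate2d_valid(x, row), col)
one_step = correlate2d_valid(x, k22)

print(k22)
print(np.allclose(two_step, one_step))
```

Under these assumptions the 2x2 kernel responds only where the image changes in both the horizontal and the vertical direction, which is consistent with it highlighting diagonal structure in Figure 3.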

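Assignment 2

I. Concepts

Convolution: the process of extracting features by sliding a window over the input. Intuitively, it is like moving a magnifying glass across the image, taking a snapshot at every step, and stitching the snapshots into a new image.
Convolution kernel: a small weight matrix that computes a weighted sum over a local region. It corresponds to local perception: when we look at an object we neither inspect every pixel individually nor take in the whole at once, but first recognize local parts. Kernel sizes are usually odd × odd (for example 3×3 or 5×5), so that the kernel has a center pixel.
Feature map: the result matrix obtained by applying a convolution to the pixel matrix of the original image.
Feature selection: convolution extracts features, and which features are extracted depends on the chosen kernel, much like choosing which part to magnify with the magnifying glass.
Stride: the number of pixels the kernel moves horizontally and vertically between two successive applications.
Padding: pixels on the border can never sit at the center of the kernel, and the kernel cannot extend beyond the border, so without padding the border of the input is "trimmed" and the output is smaller than the input. This is usually undesirable, since we often want the output to have the same size as the input. The remedy is to pad the border of the input image with zeros.
Receptive field: the region of the original image that a single point of the final feature map depends on.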
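
To make the receptive field definition concrete, here is a small sketch that computes it along one axis for a stack of convolution layers. It assumes the standard recurrence in which each layer widens the field by (kernel size - 1) times the product of the strides of the layers before it. The function name `receptive_field` and the example layer stacks are hypothetical.

```python
def receptive_field(layers):
    """Receptive field (along one axis) of a stack of convolution layers.

    layers: list of (kernel_size, stride) tuples, first layer first.
    """
    rf = 1     # a single input pixel sees itself
    jump = 1   # distance in input pixels between adjacent outputs of the current layer
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf

print(receptive_field([(3, 1)]))          # one 3x3 layer -> 3
print(receptive_field([(3, 1), (3, 1)]))  # two stacked 3x3 layers -> 5
print(receptive_field([(3, 2), (3, 1)]))  # first layer has stride 2 -> 7
```

Stacking layers or increasing the stride of an early layer both enlarge the region of the original image that one output point depends on.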

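II. Exploring the effects of different kernels

1. Sharpen:

2. Edge detection:

3. Blur:

4. Bottom outline detection:

5. Left outline detection:

6. Right outline detection:

7. Top outline detection:

8. Emboss: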
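
The kernel images for this list did not survive in the text, so the sketch below lists commonly used textbook 3x3 kernels for the eight effects. These values are an assumption and may differ from the ones the original figures showed. Only the edge-detection kernel is confirmed by the code later in this post. The dictionary name `KERNELS` is illustrative.

```python
import numpy as np

# Commonly used 3x3 kernels (assumed values, not taken from the original figures)
KERNELS = {
    "sharpen":      np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=np.float32),
    "edge":         np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=np.float32),
    "blur":         np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=np.float32) / 16,
    "bottom_sobel": np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=np.float32),
    "left_sobel":   np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], dtype=np.float32),
    "right_sobel":  np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32),
    "top_sobel":    np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=np.float32),
    "emboss":       np.array([[-2, -1, 0], [-1, 1, 1], [0, 1, 2]], dtype=np.float32),
}

# The sum of the weights tells how a kernel treats a flat (constant) region:
# sum 0 -> flat regions map to 0 (edge detectors); sum 1 -> brightness is preserved.
for name, kernel in KERNELS.items():
    print(f"{name:13s} sum = {float(kernel.sum()):.1f}")
```

Any of these arrays can be reshaped to (1, 1, 3, 3) and assigned to a Conv2d weight in the same way as the edge kernel in the next section.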

III. Programming implementation

1. Implement edge detection, sharpening, and blurring of a grayscale image. (Required)

Edge detection:

# Edge detection
import numpy as np
import torch
from torch import nn
from PIL import Image
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render Chinese labels
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly with that font
# https://blog.csdn.net/weixin_40123108/article/details/83510592
file_path = 'D:\\copy\\6.jpg'
im = Image.open(file_path).convert('L')  # read the image and convert it to grayscale
im = np.array(im, dtype='float32')  # convert it to a matrix
print(im.shape[0], im.shape[1])
plt.imshow(im.astype('uint8'), cmap='gray')  # show the image
plt.title('Original image')
plt.show()

im = torch.from_numpy(im.reshape((1, 1, im.shape[0], im.shape[1])))
# 3x3 convolution; padding=5 makes the output 8 pixels larger than the input along each axis
# (padding=1 would keep the size unchanged)
conv1 = nn.Conv2d(1, 1, 3, bias=False, padding=5)

# Outline-detection kernel (Laplacian-style, despite the variable name)
sobel_kernel = np.array([[-1, -1, -1],
                         [-1, 8, -1],
                         [-1, -1, -1]], dtype='float32')
print(sobel_kernel.shape)
sobel_kernel = sobel_kernel.reshape((1, 1, 3, 3))  # (out_channels, in_channels, H, W)
conv1.weight.data = torch.from_numpy(sobel_kernel)  # assign the kernel weights

edge1 = conv1(im)  # apply to the image (Variable is not needed in current PyTorch)

x = edge1.detach().squeeze().numpy()
print(x.shape)  # output size

plt.imshow(x, cmap='gray')
plt.show()

Sharpening:

# Sharpening (edge operators)
# encoding: utf-8
# By: Eastmount CSDN 2021-07-19
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Read the image (OpenCV loads it as BGR) and keep an RGB copy for display
img = cv2.imread('D:\\copy\\6.jpg')
lenna_img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Convert to grayscale
grayImage = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Gaussian filtering to suppress noise
gaussianBlur = cv2.GaussianBlur(grayImage, (3,3), 0)

# Thresholding to a binary image
ret, binary = cv2.threshold(gaussianBlur, 127, 255, cv2.THRESH_BINARY)

# Roberts operator
kernelx = np.array([[-1,0],[0,1]], dtype=int)
kernely = np.array([[0,-1],[1,0]], dtype=int)
x = cv2.filter2D(binary, cv2.CV_16S, kernelx)
y = cv2.filter2D(binary, cv2.CV_16S, kernely)
absX = cv2.convertScaleAbs(x)
absY = cv2.convertScaleAbs(y)
Roberts = cv2.addWeighted(absX, 0.5, absY, 0.5, 0)

# Prewitt operator
kernelx = np.array([[1,1,1],[0,0,0],[-1,-1,-1]], dtype=int)
kernely = np.array([[-1,0,1],[-1,0,1],[-1,0,1]], dtype=int)
x = cv2.filter2D(binary, cv2.CV_16S, kernelx)
y = cv2.filter2D(binary, cv2.CV_16S, kernely)
absX = cv2.convertScaleAbs(x)
absY = cv2.convertScaleAbs(y)
Prewitt = cv2.addWeighted(absX,0.5,absY,0.5,0)

# Sobel operator
x = cv2.Sobel(binary, cv2.CV_16S, 1, 0)
y = cv2.Sobel(binary, cv2.CV_16S, 0, 1)
absX = cv2.convertScaleAbs(x)
absY = cv2.convertScaleAbs(y)
Sobel = cv2.addWeighted(absX, 0.5, absY, 0.5, 0)

# Laplacian operator
dst = cv2.Laplacian(binary, cv2.CV_16S, ksize = 3)
Laplacian = cv2.convertScaleAbs(dst)

# Show the results
titles = ['Source Image', 'Binary Image', 'Roberts Image',
          'Prewitt Image','Sobel Image', 'Laplacian Image']
images = [lenna_img, binary, Roberts, Prewitt, Sobel, Laplacian]
for i in np.arange(6):
    plt.subplot(2, 3, i + 1)
    plt.imshow(images[i], 'gray')
    plt.title(titles[i])
    plt.xticks([])
    plt.yticks([])
plt.show()

Blur:
(1) Mean blur:

# Image blurring
# Mean blur (box blur)

import cv2
import numpy as np
import matplotlib.pyplot as plt

if __name__ == "__main__":
    image = cv2.imread('D:\\copy\\6.jpg')

    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Mean blur
    # (30, 1) is the kernel size as (width, height): each pixel is averaged
    # over a 30x1 horizontal window
    dst1 = cv2.blur(image, (30, 1))

    # Median blur, commonly used to remove salt-and-pepper noise
    dst2 = cv2.medianBlur(image, 15)

    # Custom kernel for blurring (a sharpening kernel could be defined the same way)
    kernel = np.ones([5, 5], np.float32) / 25
    dst3 = cv2.filter2D(image, -1, kernel=kernel)

    # cmap='gray' is needed: the images are single-channel, and the default
    # colormap would render them in false color
    plt.subplot(2, 2, 1)
    plt.imshow(image, cmap='gray')
    plt.axis('off')
    plt.title('Original')

    plt.subplot(2, 2, 2)
    plt.imshow(dst1, cmap='gray')
    plt.axis('off')
    plt.title('Box blur')

    plt.subplot(2, 2, 3)
    plt.imshow(dst2, cmap='gray')
    plt.axis('off')
    plt.title('median blur')

    plt.subplot(2, 2, 4)
    plt.imshow(dst3, cmap='gray')
    plt.axis('off')
    plt.title('defined blur')

    plt.show()

(2) Gaussian blur:

# Image blurring
# Gaussian blur
# Compare hand-written Gaussian noise followed by a Gaussian blur
# with OpenCV's Gaussian blur applied to the original image

import cv2
import numpy as np
import matplotlib.pyplot as plt


def clamp(pv):
    # Limit a pixel value to the valid range [0, 255]
    if pv > 255:
        return 255
    if pv < 0:
        return 0
    else:
        return pv


def gaussian_noise(image):        # add Gaussian noise
    image = image.copy()          # work on a copy so the caller's image is not modified
    h, w, c = image.shape
    for row in range(h):
        for col in range(w):
            s = np.random.normal(0, 20, 3)
            b = image[row, col, 0]   # blue
            g = image[row, col, 1]   # green
            r = image[row, col, 2]   # red
            image[row, col, 0] = clamp(b + s[0])
            image[row, col, 1] = clamp(g + s[1])
            image[row, col, 2] = clamp(r + s[2])
    dst = cv2.GaussianBlur(image, (15, 15), 0)  # Gaussian blur of the noisy image
    return dst, image


if __name__ == "__main__":
    src = cv2.imread('D:\\copy\\6.jpg')
    # OpenCV stores images as BGR; convert to RGB before showing them with matplotlib
    plt.subplot(2, 2, 1)
    plt.imshow(cv2.cvtColor(src, cv2.COLOR_BGR2RGB))
    plt.axis('off')
    plt.title('Original')

    output, noise = gaussian_noise(src)
    cvdst = cv2.GaussianBlur(src, (15, 15), 0)   # Gaussian blur of the original image

    plt.subplot(2, 2, 2)
    plt.imshow(cv2.cvtColor(noise, cv2.COLOR_BGR2RGB))
    plt.axis('off')
    plt.title('Gaussian Noise')

    plt.subplot(2, 2, 3)
    plt.imshow(cv2.cvtColor(output, cv2.COLOR_BGR2RGB))
    plt.axis('off')
    plt.title('Gaussian Blur')

    plt.subplot(2, 2, 4)
    plt.imshow(cv2.cvtColor(cvdst, cv2.COLOR_BGR2RGB))
    plt.axis('off')
    plt.title('OpenCV blur of original')

    plt.show()

(3) Motion blur:

# Image blurring
# Motion blur
# Motion blur is the blur caused by relative motion between the camera and the object

import numpy as np
import cv2
import matplotlib.pyplot as plt


def motion_blur(image, degree=12, angle=45):

    image = np.array(image)
    # Build a motion-blur kernel for an arbitrary angle; the larger degree is, the stronger the blur
    M = cv2.getRotationMatrix2D((degree / 2, degree / 2), angle, 1)
    motion_blur_kernel = np.diag(np.ones(degree))
    motion_blur_kernel = cv2.warpAffine(motion_blur_kernel, M, (degree, degree))
    motion_blur_kernel = motion_blur_kernel / degree
    blurred = cv2.filter2D(image, -1, motion_blur_kernel)
    # Stretch to the full 0..255 range and convert to uint8
    cv2.normalize(blurred, blurred, 0, 255, cv2.NORM_MINMAX)
    blurred = np.array(blurred, dtype=np.uint8)
    return blurred


if __name__ == "__main__":

    img = cv2.imread('D:\\copy\\6.jpg')
    dst = motion_blur(img)

    # OpenCV stores images as BGR; convert to RGB before showing them with matplotlib
    plt.subplot(1, 2, 1)
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.axis('off')
    plt.title('Original')

    plt.subplot(1, 2, 2)
    plt.imshow(cv2.cvtColor(dst, cv2.COLOR_BGR2RGB))
    plt.axis('off')
    plt.title('Motion blur')

    plt.show()

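2. Adjust the kernel parameters, test, and summarize. (Required)

Stride set to 2:

Stride set to 3:

Result: as the stride grows, the output has fewer pixels (each side shrinks by roughly a factor of the stride), and the extracted edges become increasingly blurred. Choosing a suitable stride is therefore important.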
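
The shrinking output can be quantified with the usual output-size formula for a sliding window, floor((n + 2p - k) / s) + 1 along each axis. The sketch below evaluates it for a 3x3 kernel without padding. The helper name `conv_output_size` and the 448-pixel side length are illustrative assumptions, not measurements from the original experiment.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output length along one axis for a Conv2d-style sliding window."""
    return (n + 2 * padding - kernel) // stride + 1

# Hypothetical 448-pixel image side, 3x3 kernel, no padding
for stride in (1, 2, 3):
    side = conv_output_size(448, kernel=3, stride=stride)
    print(f"stride={stride}: output side = {side}")
# stride=1: output side = 446
# stride=2: output side = 223
# stride=3: output side = 149
```

With stride 3 the edge map has roughly one ninth as many pixels as with stride 1, which matches the loss of detail described above.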

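Padding set to 3:

Padding set to 5:

Padding fills the border with extra "fake" pixels (usually zeros), so that when the kernel scans the input it can extend onto these pseudo-pixels beyond the edge. With stride 1 and padding (k-1)/2 for a k×k kernel, the output has the same size as the input; larger padding, such as 3 or 5 for a 3×3 kernel, makes the output larger than the input. Padding therefore lets the pixels at the border contribute fully instead of being wasted.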
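
A small sketch of zero padding with NumPy, assuming a 3x3 kernel and stride 1. The tiny 3x3 input and the name `output_sides` are hypothetical and chosen only to keep the printed arrays readable.

```python
import numpy as np

x = np.arange(1, 10, dtype=np.float32).reshape(3, 3)  # toy 3x3 "image"
kernel_size = 3

output_sides = {}
for p in (1, 3, 5):
    # Surround the image with p rows and columns of zeros on every side
    padded = np.pad(x, p, mode="constant", constant_values=0)
    # A 3x3 kernel with stride 1 fits (padded side - 3 + 1) times along each axis
    output_sides[p] = padded.shape[0] - kernel_size + 1
    print(f"padding={p}: padded shape {padded.shape}, output side {output_sides[p]}")

print(np.pad(x, 1, mode="constant", constant_values=0))
```

For this 3x3 input, padding 1 keeps the output at 3x3, while padding 3 and 5 grow it to 7x7 and 11x11; the extra rows and columns come purely from the zero border.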

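3. Use images of different sizes, test, and summarize. (Required)

Original image:

Edge detection:

Sharpening:

Blur:
(1) Mean blur:

(2) Gaussian blur:

(3) Motion blur:

4. Explore more kernel types. (Optional)

Several kinds of kernels have already been applied in the figures above.

5. Try edge detection on a color image. (Optional)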

import numpy as np
import torch
from torch import nn
from PIL import Image
import matplotlib.pyplot as plt

# Note: convert('L') turns the color image into grayscale before edge detection
im = Image.open(r'D:\copy\8.jpg').convert('L')
im = np.array(im, dtype='float32')  # convert it to a matrix

# print(im.shape[0], im.shape[1])     448*448
# Show the image
plt.imshow(im.astype('uint8'), cmap='gray')
plt.show()
im = torch.from_numpy(im.reshape((1, 1, im.shape[0], im.shape[1])))
# 3x3 convolution; stride=1 and padding=1 keep the output the same size as the input
conv1 = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, bias=False, stride=1, padding=1)

# Outline-detection kernel (Laplacian-style, despite the variable name)
sobel_kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype='float32')
sobel_kernel = sobel_kernel.reshape((1, 1, 3, 3))  # (out_channels, in_channels, H, W)
conv1.weight.data = torch.from_numpy(sobel_kernel)  # assign the kernel weights

edge1 = conv1(im)  # apply to the image (Variable is not needed in current PyTorch)
edge1 = edge1.detach().squeeze().numpy()  # convert the output back to an image array
plt.imshow(edge1, cmap='gray')

plt.show()
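
The code above converts the picture to grayscale first. One possible way to keep the color information, sketched here under stated assumptions, is to apply the same kernel to each channel separately and then combine the per-channel responses. The helper `filter_per_channel`, the synthetic test image, and the choice of taking the per-pixel maximum over channels are all illustrative and not the original author's method.

```python
import numpy as np

def filter_per_channel(img, kernel):
    """Apply a 3x3 kernel to every channel of an HxWxC image (zero padding, stride 1)."""
    h, w, _ = img.shape
    padded = np.pad(img.astype(np.float32), ((1, 1), (1, 1), (0, 0)))
    out = np.zeros(img.shape, dtype=np.float32)
    # Accumulate the nine shifted copies of the image, each scaled by one kernel weight
    for di in range(3):
        for dj in range(3):
            out += kernel[di, dj] * padded[di:di + h, dj:dj + w, :]
    return out

kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=np.float32)

# Synthetic 8x8 RGB image: black on the left, pure red on the right
img = np.zeros((8, 8, 3), dtype=np.uint8)
img[:, 4:, 0] = 255

edges = filter_per_channel(img, kernel)
# Combine channels: strongest absolute response per pixel, clipped to the displayable range
edge_map = np.clip(np.abs(edges).max(axis=2), 0, 255).astype(np.uint8)

print(edges[3, 2:6, 0])   # response of the red channel across the boundary
print(edge_map[3])
```

For a real photo, `img` could be loaded with `np.array(Image.open(path).convert('RGB'))` and `edge_map` shown with `plt.imshow(edge_map, cmap='gray')`. Because of the zero padding, the image border also produces a response wherever a channel is nonzero.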

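Summary

This assignment applied convolution in practice and gave me a deeper understanding of convolution and its related concepts, as well as of how to implement edge detection, sharpening, and blurring of images. The kernel parameters, mainly stride and padding, are adjusted through the arguments of Conv2d, and changing them changes the resulting image.
Why convolution can extract features: a digital image is represented in the computer as a matrix, and every value in that matrix can be regarded as a feature point; the matrix formed by a whole image is a feature map. From the network's point of view, every input value is a feature, and features of different dimensions are feature vectors of different dimensions. So convolution and CNNs do not so much "extract" features as process or transform them; convolution is simply a way of handling image features that is well suited to producing the desired result.
References:
Concept 01: Why can convolution and CNNs extract features? What are features and convolution?
Why convolution on images can extract features
Convolution in deep learning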