NNDL Experiment 6: Convolutional Neural Networks (2) Basic Operators

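Contents

5.2 Basic Operators of Convolutional Neural Networks

5.2.1 Convolution Operator

5.2.1.1 Multi-Channel Convolution

5.2.1.2 Multi-Channel Convolutional Layer Operator

5.2.1.3 Parameter Count and Computation Cost of the Convolution Operator

5.2.2 Pooling Layer Operator

Optional Exercise: Implementing the Convolution Demo in PyTorch

Summary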


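5.2 Basic Operators of Convolutional Neural Networks

Convolutional neural networks are currently the most widely used model architecture in computer vision. As shown in Figure 5.8, M convolutional layers and b pooling layers are applied in combination to the input image, and K fully connected layers are usually appended at the end of the network.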

91832d34267e44bead69eb4f6347a856.png

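Figure 5.8: Classic structure of a convolutional neural network

As the figure shows, a convolutional network is assembled from several basic operators. We first implement its two basic operators: the convolutional layer operator and the pooling layer operator.

5.2.1 Convolution Operator

A convolutional layer is a neural network layer implemented with the convolution operation.

To extract different kinds of features, several convolution kernels are usually applied together.

5.2.1.1 Multi-Channel Convolution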

e9748e8cf6934731974686a931730535.png   

9acfe334df444eee8b0ef2c08334df93.png

Figure 5.9: Convolution with multiple input channels

8185a8120b4e4a128f5ebb4383d42edb.png

0794e02d59d6494cb7e43f86989488e1.png

Figure 5.10: Convolution with multiple output channels
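As a hedged restatement of what the two figures on multi-channel convolution depict, the operation can be written as below. The notation is an assumption borrowed from the docstring of the code in Section 5.2.1.2: the input X has D channels of size M×N, there are P groups of D kernels of size U×V, and each output channel has one bias. The symbol ⊗ stands for 2D cross-correlation, and M′×N′ is the spatial size of the output.

```latex
\mathbf{Z}^{p} = \sum_{d=1}^{D} \mathbf{W}^{p,d} \otimes \mathbf{X}^{d} + b^{p},
\qquad p = 1, \dots, P,
\qquad \mathbf{X} \in \mathbb{R}^{D \times M \times N},\;
\mathbf{W} \in \mathbb{R}^{P \times D \times U \times V},\;
\mathbf{Z} \in \mathbb{R}^{P \times M' \times N'}
```

With P = 1 this reduces to the multi-input-channel case: the D per-channel results are summed into a single feature map.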

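5.2.1.2 Multi-Channel Convolutional Layer Operator

1. Implement a multi-channel convolutional layer in code

2. Implement the same layer with PyTorch's torch.nn.Conv2d()

3. Compare the custom operator with the framework operator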

import torch
import torch.nn as nn


class Conv2D(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0,
                 weight_attr=None, bias_attr=None):
        super(Conv2D, self).__init__()
        # Create the kernels, shape [P, D, U, V]; they default to all ones so that
        # the result can be checked against nn.Conv2d with the same weights
        if weight_attr is None:
            weight_attr = torch.ones([out_channels, in_channels, kernel_size, kernel_size])
        self.weight = torch.nn.Parameter(torch.as_tensor(weight_attr, dtype=torch.float32))
        # Create the biases, shape [P, 1]; they default to zeros
        if bias_attr is None:
            bias_attr = torch.zeros([out_channels, 1])
        self.bias = torch.nn.Parameter(torch.as_tensor(bias_attr, dtype=torch.float32))
        self.stride = stride
        self.padding = padding
        # Number of input channels
        self.in_channels = in_channels
        # Number of output channels
        self.out_channels = out_channels
 
    # Basic 2D convolution (cross-correlation) on a single input channel
    def single_forward(self, X, weight):
        # X: [B, M, N]; weight: [U, V]
        # Zero padding
        new_X = torch.zeros([X.shape[0], X.shape[1]+2*self.padding, X.shape[2]+2*self.padding])
        new_X[:, self.padding:X.shape[1]+self.padding, self.padding:X.shape[2]+self.padding] = X
        u, v = weight.shape
        output_h = (new_X.shape[1] - u) // self.stride + 1
        output_w = (new_X.shape[2] - v) // self.stride + 1
        output = torch.zeros([X.shape[0], output_h, output_w])
        for i in range(0, output.shape[1]):
            for j in range(0, output.shape[2]):
                # Multiply the window by the kernel elementwise, then sum over both spatial dims
                output[:, i, j] = torch.sum(
                    new_X[:, self.stride*i:self.stride*i+u, self.stride*j:self.stride*j+v]*weight,
                    dim=[1, 2])
        return output
 
    def forward(self, inputs):
        """
        Inputs:
            - inputs: input tensor, shape=[B, D, M, N]
            - weights: P groups of 2D kernels, shape=[P, D, U, V]
            - bias: P biases, shape=[P, 1]
        """
        feature_maps = []
        # Run one multi-input-channel convolution per output channel
        for w, b in zip(self.weight, self.bias): # P pairs (w, b); each pair yields one feature map Zp
            multi_outs = []
            # Convolve each input channel with its own kernel
            for i in range(self.in_channels):
                single = self.single_forward(inputs[:,i,:,:], w[i])
                multi_outs.append(single)
            # Sum the per-channel results and add the bias
            feature_map = torch.sum(torch.stack(multi_outs), 0) + b # Zp
            feature_maps.append(feature_map)
        # Stack all Zp along the channel dimension
        out = torch.stack(feature_maps, 1)
        return out
 
inputs = torch.tensor([[[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]],
               [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]]])
conv2d = Conv2D(in_channels=2, out_channels=3, kernel_size=2)
print("inputs shape:",inputs.shape)
outputs = conv2d(inputs)
print("Conv2D outputs shape:",outputs.shape)

# Compare with the torch API, using the same all-ones weights and zero bias
weight_attr = torch.ones([3,2,2,2])
bias_attr = torch.zeros([3])  # nn.Conv2d stores its bias with shape [out_channels]
conv2d_torch = nn.Conv2d(in_channels=2, out_channels=3, kernel_size=2,bias=True)
conv2d_torch.weight = torch.nn.Parameter(weight_attr)
# Without this line nn.Conv2d keeps its randomly initialized bias and the outputs differ
conv2d_torch.bias = torch.nn.Parameter(bias_attr)
outputs_torch = conv2d_torch(inputs)
# Result of the custom operator
print('Conv2D outputs:', outputs)
# Result of the torch API
print('nn.Conv2d outputs:', outputs_torch)

986989e4c6694913a1981ccddcc2065a.png

5.2.1.3 Parameter Count and Computation Cost of the Convolution Operator

Parameter count

64add3d3e8dc4ad89db9e06017a6d28c.png

Computation cost

882035184e864dc0b200706fb488a567.png
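The two formulas in this subsection appear only as images. As a rough sketch, assuming the usual counts for a convolutional layer with P output channels, D input channels, U×V kernels and an M′×N′ output map (P·D·U·V weights plus P biases, and M′·N′·P·D·U·V multiply-adds), the numbers for the Conv2D example in Section 5.2.1.2 can be worked out as follows. The function names are hypothetical helpers, not part of PyTorch.

```python
def conv2d_param_count(p, d, u, v, bias=True):
    """Learnable parameters: P*D*U*V kernel weights plus one bias per output channel."""
    return p * d * u * v + (p if bias else 0)


def conv2d_mult_adds(p, d, u, v, out_m, out_n):
    """Multiply-add operations: each of the P*M'*N' outputs needs D*U*V of them."""
    return out_m * out_n * p * d * u * v


# Conv2D(in_channels=2, out_channels=3, kernel_size=2) on a 3x3 input,
# stride 1 and no padding, so the output map is 2x2.
params = conv2d_param_count(p=3, d=2, u=2, v=2)
mult_adds = conv2d_mult_adds(p=3, d=2, u=2, v=2, out_m=2, out_n=2)
print(params)     # 27
print(mult_adds)  # 96
```

The bias additions are left out of the multiply-add count here; whether to include them is a convention and is treated as an assumption.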

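5.2.2 Pooling Layer Operator

A pooling layer performs feature selection: it reduces the number of features and therefore the number of parameters in the layers that follow.

Because the feature map is smaller after pooling, a fully connected layer placed after it needs fewer neurons, which saves storage and improves computational efficiency.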
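The saving can be made concrete with a small arithmetic sketch. The sizes below (16 channels, an 8×8 feature map, 10 fully connected units, 2×2 pooling with stride 2) are made-up illustration values, and both helper functions are hypothetical.

```python
def pooled_size(size, window=2, stride=2):
    """Spatial size along one dimension after pooling without padding."""
    return (size - window) // stride + 1


def fc_weight_count(channels, height, width, fc_units):
    """Weights of a fully connected layer fed by the flattened feature map."""
    return channels * height * width * fc_units


before = fc_weight_count(16, 8, 8, 10)
after = fc_weight_count(16, pooled_size(8), pooled_size(8), 10)
print(before)  # 10240
print(after)   # 2560
```

Halving each spatial dimension cuts the number of weights in the following fully connected layer to a quarter.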

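Two pooling methods are in common use: average pooling and max pooling.

  • Average pooling: the input feature map is divided into regions of size 2×2, and the mean activation of the neurons in each region represents that region;
  • Max pooling: the largest activation among the neurons in each sub-region of the input feature map represents that region.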

a0e36e2f3f624e9583c6c95de45f006e.png
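The two definitions can be checked by hand on a small matrix. The following is a minimal dependency-free sketch, assuming square windows and no padding; the `pool2d` function is a hypothetical helper written for this illustration, and the 4×4 matrix is the one used in the test code of this section.

```python
def pool2d(matrix, size=2, stride=2, mode="max"):
    """Pool a 2D list of numbers with a size x size window (no padding)."""
    rows = (len(matrix) - size) // stride + 1
    cols = (len(matrix[0]) - size) // stride + 1
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # Collect every value that falls inside the current window
            window = [matrix[stride * i + a][stride * j + b]
                      for a in range(size) for b in range(size)]
            if mode == "max":
                row.append(max(window))
            elif mode == "avg":
                row.append(sum(window) / len(window))
            else:
                raise ValueError("mode must be 'max' or 'avg'")
        out.append(row)
    return out


x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(pool2d(x, mode="max"))  # [[6, 8], [14, 16]]
print(pool2d(x, mode="avg"))  # [[3.5, 5.5], [11.5, 13.5]]
```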

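Parameter count and computation cost of the pooling layer

A pooling layer has no parameters, so its parameter count is 0. Max pooling involves no multiply-add operations, so its computation cost is 0; in average pooling, each point of the output feature map corresponds to one averaging operation.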

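1. Implement a simple pooling layer in code.

2. Implement the same layers with torch.nn.MaxPool2d() and torch.nn.AvgPool2d()

3. Compare the custom operator with the framework operators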

import torch
import torch.nn as nn


class Pool2D(nn.Module):
    def __init__(self, size=(2, 2), mode='max', stride=1):
        super(Pool2D, self).__init__()
        # Pooling mode: 'max' or 'avg'
        self.mode = mode
        self.h, self.w = size
        self.stride = stride

    def forward(self, x):
        # x: [B, C, H, W]
        output_h = (x.shape[2] - self.h) // self.stride + 1
        output_w = (x.shape[3] - self.w) // self.stride + 1
        output = torch.zeros([x.shape[0], x.shape[1], output_h, output_w])
        # Pooling
        for i in range(output.shape[2]):
            for j in range(output.shape[3]):
                window = x[:, :, self.stride * i:self.stride * i + self.h,
                           self.stride * j:self.stride * j + self.w]
                # Max pooling: largest value in the window, per sample and per channel
                if self.mode == 'max':
                    output[:, :, i, j] = torch.amax(window, dim=(2, 3))
                # Average pooling: mean over the whole window, per sample and per channel
                elif self.mode == 'avg':
                    output[:, :, i, j] = torch.mean(window, dim=(2, 3))

        return output
 
 
# 1. Run the simple pooling layer
inputs = torch.tensor([[[[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.], [13., 14., 15., 16.]]]])
pool2d = Pool2D(stride=2)
outputs = pool2d(inputs)
print("input: {}, \noutput: {}".format(inputs.shape, outputs.shape))
# 2. The custom operator is implemented above; the comparisons follow.
# 3. Compare max pooling with the torch API
maxpool2d_torch = nn.MaxPool2d(kernel_size=(2, 2), stride=2)
outputs_torch = maxpool2d_torch(inputs)
# Result of the custom operator
print('Maxpool2D outputs:', outputs)
# Result of the torch API
print('nn.MaxPool2d outputs:', outputs_torch)

# 3. Compare average pooling with the torch API
avgpool2d_torch = nn.AvgPool2d(kernel_size=(2, 2), stride=2)
outputs_torch = avgpool2d_torch(inputs)
pool2d = Pool2D(mode='avg', stride=2)
outputs = pool2d(inputs)
# Result of the custom operator
print('Avgpool2D outputs:', outputs)
# Result of the torch API
print('nn.AvgPool2d outputs:', outputs_torch)

3e3522b00ba6481bb56132eca368e8a0.png

Optional Exercise: Implementing the Convolution Demo in PyTorch

CS231n Convolutional Neural Networks for Visual Recognition

1. Translate the following passage into Chinese

8435bdb71a414369949a4ab86cf07973.jpeg

Convolution Demo. Below is a running demo of a CONV layer. Since 3D volumes are hard to visualize, all the volumes (the input volume (in blue), the weight volumes (in red), the output volume (in green)) are visualized with each depth slice stacked in rows. The input volume is of size W1=5, H1=5, D1=3, and the CONV layer parameters are K=2, F=3, S=2, P=1. That is, we have two filters of size 3×3, and they are applied with a stride of 2. Therefore, the output volume size has spatial size (5-3+2)/2+1=3. Moreover, notice that a padding of P=1 is applied to the input volume, making the outer border of the input volume zero. The visualization below iterates over the output activations (green), and shows that each element is computed by elementwise multiplying the highlighted input (blue) with the filter (red), summing it up, and then offsetting the result by the bias.

卷积演示。下面是一个卷积层(CONV)的运行演示。由于三维数据体难以可视化,所有数据体(输入数据体(蓝色)、权重数据体(红色)、输出数据体(绿色))都按深度切片逐行堆叠的方式显示。输入数据体的大小为W1=5,H1=5,D1=3,卷积层参数为K=2,F=3,S=2,P=1。也就是说,有两个大小为3×3的滤波器,步长为2。因此,输出数据体的空间大小为(5-3+2)/2+1=3。此外,请注意输入数据体使用了P=1的填充,使其外边界为零。下面的可视化遍历输出激活值(绿色),并显示每个元素都是通过将高亮的输入(蓝色)与滤波器(红色)逐元素相乘、求和,再加上偏置得到的。
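The output-size arithmetic quoted in the passage, (5-3+2)/2+1=3, follows the general rule (W - F + 2P)/S + 1. A minimal sketch is given below; `conv_output_size` is a hypothetical helper, and raising an error when the stride does not divide evenly is an assumption made for this illustration (PyTorch's own layers round down instead).

```python
def conv_output_size(w, f, s, p):
    """Spatial output size for input width w, filter size f, stride s, padding p."""
    span = w - f + 2 * p
    if span % s != 0:
        # Assumption of this sketch: treat a stride that does not fit as an error
        raise ValueError("stride does not fit the padded input evenly")
    return span // s + 1


# The demo's setting: W1=5, F=3, S=2, P=1
print(conv_output_size(5, 3, 2, 1))  # 3
```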

2. Implement the figure below in code

4c2e240630784d128f037b805ea4225a.gif

Using the filters shown in the demo, the code is as follows:
import torch
import torch.nn as nn


class Conv2D(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0,
                 weight_attr=None, bias_attr=None):
        super(Conv2D, self).__init__()
        # Kernels and biases are supplied by the caller: shapes [P, D, U, V] and [P, 1]
        self.weight = torch.nn.Parameter(torch.as_tensor(weight_attr, dtype=torch.float32))
        self.bias = torch.nn.Parameter(torch.as_tensor(bias_attr, dtype=torch.float32))
        self.stride = stride
        self.padding = padding
        # Number of input channels
        self.in_channels = in_channels
        # Number of output channels
        self.out_channels = out_channels

    # Basic 2D convolution (cross-correlation) on a single input channel
    def single_forward(self, X, weight):
        # Zero padding
        new_X = torch.zeros([X.shape[0], X.shape[1]+2*self.padding, X.shape[2]+2*self.padding])
        new_X[:, self.padding:X.shape[1]+self.padding, self.padding:X.shape[2]+self.padding] = X
        u, v = weight.shape
        output_h = (new_X.shape[1] - u) // self.stride + 1
        output_w = (new_X.shape[2] - v) // self.stride + 1
        output = torch.zeros([X.shape[0], output_h, output_w])
        for i in range(0, output.shape[1]):
            for j in range(0, output.shape[2]):
                # Multiply the window by the kernel elementwise, then sum over both spatial dims
                output[:, i, j] = torch.sum(
                    new_X[:, self.stride*i:self.stride*i+u, self.stride*j:self.stride*j+v]*weight,
                    dim=[1, 2])
        return output

    def forward(self, inputs):
        """
        Inputs:
            - inputs: input tensor, shape=[B, D, M, N]
            - weights: P groups of 2D kernels, shape=[P, D, U, V]
            - bias: P biases, shape=[P, 1]
        """
        feature_maps = []
        # Run one multi-input-channel convolution per output channel
        for w, b in zip(self.weight, self.bias): # P pairs (w, b); each pair yields one feature map Zp
            multi_outs = []
            # Convolve each input channel with its own kernel
            for i in range(self.in_channels):
                single = self.single_forward(inputs[:,i,:,:], w[i])
                multi_outs.append(single)
            # Sum the per-channel results and add the bias
            feature_map = torch.sum(torch.stack(multi_outs), dim=0) + b # Zp
            feature_maps.append(feature_map)
        # Stack all Zp along the channel dimension
        out = torch.stack(feature_maps, 1)
        return out
# Filter W0: shape [1, 3, 3, 3] (one output channel, three input channels)
weight_attr1 = torch.tensor([[[-1,1,0],[0,1,0],[0,1,1]],[[-1,-1,0],[0,0,0],[0,-1,0]],[[0,0,-1],[0,1,0],[1,-1,-1]]],dtype=torch.float32)
weight_attr1 = weight_attr1.reshape([1,3,3,3])
bias_attr1 = torch.ones([1,1])  # one bias per filter; the bias of W0 is 1
print("Filter W0:\n",weight_attr1)
# Input volume, shape [1, 3, 5, 5]; it is shared by both filters
Input_Volume = torch.tensor([[[0,1,1,0,2],[2,2,2,2,1],[1,0,0,2,0],[0,1,1,0,0],[1,2,0,0,2]]
                ,[[1,0,2,2,0],[0,0,0,2,0],[1,2,1,2,1],[1,0,0,0,0],[1,2,1,1,1]],
               [[2,1,2,0,0],[1,0,0,1,0],[0,2,1,0,1],[0,1,2,2,2],[2,1,0,0,1]]],dtype=torch.float32)
Input_Volume = Input_Volume.reshape([1,3,5,5])
conv2d_1 = Conv2D(in_channels=3, out_channels=1, kernel_size=3, stride=2,padding=1,weight_attr=weight_attr1 , bias_attr=bias_attr1)
output1 = conv2d_1(Input_Volume)
print("Output of filter W0:\n",output1)

# Filter W1, applied to the same input volume
weight_attr2 = torch.tensor([[[1,1,-1],[-1,-1,1],[0,-1,1]],[[0,1,0],[-1,0,-1],[-1,1,0]],[[-1,0,0],[-1,0,1],[-1,0,0]]],dtype=torch.float32)
weight_attr2 = weight_attr2.reshape([1,3,3,3])
bias_attr2 = torch.zeros([1,1])  # the bias of W1 is 0
print("Filter W1:\n",weight_attr2)
conv2d_2 = Conv2D(in_channels=3, out_channels=1, kernel_size=3, stride=2,padding=1,weight_attr=weight_attr2 , bias_attr=bias_attr2)
output2 = conv2d_2(Input_Volume)
print("Output of filter W1:\n",output2)

51f53c86195641b79d883c1a6e4ffb83.png

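Summary

This experiment covered the implementation of the multi-channel convolution operator and the pooling layer operator. I learned how the parameter count and computation cost of the pooling operator are calculated, how the framework operators are used and how they differ from the custom ones. I implemented multi-channel convolution and practiced extracting different features with different kernels. The experiment gave me a better understanding of multi-channel convolution and pooling.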

