Deep Learning Notes 12: Convolutional Neural Networks

Latest recommended article published on 2022-02-24 15:13:17

瓦力人工智能

Category: Keras deep learning notes. Tags: convolutional neural network, keras, deep learning, model, cats and dogs

Article link: https://blog.csdn.net/fioletfly/article/details/100706207

Convolutional Neural Networks

Building a convolutional neural network in Keras

The network is built mainly from two Keras layers:

Conv2D
- filters: Integer, the dimensionality of the output space
  (i.e. the number of output filters in the convolution).
- kernel_size: An integer or tuple/list of 2 integers, specifying the
  height and width of the 2D convolution window.
  Can be a single integer to specify the same value for
  all spatial dimensions.
- strides: An integer or tuple/list of 2 integers,
  specifying the strides of the convolution
  along the height and width.
  Can be a single integer to specify the same value for
  all spatial dimensions.
  Specifying any stride value != 1 is incompatible with specifying
  any dilation_rate value != 1.
- padding: one of "valid" or "same" (case-insensitive).
  Note that "same" is slightly inconsistent across backends with
  strides != 1, as described
- data_format: A string,
  one of "channels_last" or "channels_first".
  The ordering of the dimensions in the inputs.
  "channels_last" corresponds to inputs with shape
  (batch, height, width, channels) while "channels_first"
  corresponds to inputs with shape
  (batch, channels, height, width).
  It defaults to the image_data_format value found in your
  Keras config file at ~/.keras/keras.json.
  If you never set it, then it will be "channels_last".
- dilation_rate: an integer or tuple/list of 2 integers, specifying
  the dilation rate to use for dilated convolution.
  Can be a single integer to specify the same value for
  all spatial dimensions.
  Currently, specifying any dilation_rate value != 1 is
  incompatible with specifying any stride value != 1.
- activation: Activation function to use.
  If you don't specify anything, no activation is applied
  (i.e. "linear" activation: a(x) = x).
- use_bias: Boolean, whether the layer uses a bias vector.
- kernel_initializer: Initializer for the kernel weights matrix
- bias_initializer: Initializer for the bias vector
- kernel_regularizer: Regularizer function applied to
  the kernel weights matrix
- bias_regularizer: Regularizer function applied to the bias vector
- activity_regularizer: Regularizer function applied to
  the output of the layer (its "activation").
- kernel_constraint: Constraint function applied to the kernel matrix
- bias_constraint: Constraint function applied to the bias vector
MaxPooling2D
- pool_size: integer or tuple of 2 integers,
  factors by which to downscale (vertical, horizontal).
  (2, 2) will halve the input in both spatial dimension.
  If only one integer is specified, the same window length
  will be used for both dimensions.
- strides: Integer, tuple of 2 integers, or None.
  Strides values.
  If None, it will default to pool_size.
- padding: One of "valid" or "same" (case-insensitive).
- data_format: A string,
  one of channels_last (default) or channels_first.
  The ordering of the dimensions in the inputs.
  channels_last corresponds to inputs with shape
  (batch, height, width, channels) while channels_first
  corresponds to inputs with shape
  (batch, channels, height, width).
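
How kernel_size, strides, padding and dilation_rate combine into an output size can be sketched with a little arithmetic. The helper names below (conv_output_length, pool_output_length) are hypothetical and not part of the Keras API; they follow the usual "valid" and "same" rules along one spatial axis.

```python
import math

def conv_output_length(n, kernel_size, stride=1, padding="valid", dilation=1):
    """Output length along one spatial axis (hypothetical helper, not a Keras function)."""
    # A dilated kernel covers (kernel_size - 1) * dilation + 1 input positions.
    effective_kernel = (kernel_size - 1) * dilation + 1
    padding = padding.lower()  # the padding string is case-insensitive
    if padding == "same":
        return math.ceil(n / stride)
    if padding == "valid":
        return (n - effective_kernel) // stride + 1
    raise ValueError("padding must be 'valid' or 'same'")

def pool_output_length(n, pool_size, stride=None, padding="valid"):
    """Pooling follows the same rule; strides=None falls back to pool_size."""
    if stride is None:
        stride = pool_size
    return conv_output_length(n, pool_size, stride, padding)

# Trace a 28-pixel axis through three 3x3 convolutions interleaved with two 2x2 poolings.
size = 28
trace = [size]
for kind, k in [("conv", 3), ("pool", 2), ("conv", 3), ("pool", 2), ("conv", 3)]:
    size = conv_output_length(size, k) if kind == "conv" else pool_output_length(size, k)
    trace.append(size)
print(trace)  # [28, 26, 13, 11, 5, 3]
```

With padding="same" and a stride of 1 the length is unchanged, which is why "same" is the usual choice when the spatial size must be preserved.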
The model built here consists of:
conv2d_1 (Conv2D)
max_pooling2d_1 (MaxPooling2D)
conv2d_2 (Conv2D)
max_pooling2d_2 (MaxPooling2D)
conv2d_3 (Conv2D)
flatten_1 (Flatten)
dense_1 (Dense)
dense_2 (Dense)

After training, the training accuracy is 0.9938 and the test accuracy is 0.9929.

from keras import layers
from keras import models

model = models.Sequential()
# 32 filters with a 3x3 kernel; the input has shape (image_height, image_width, image_channels)
model.add(layers.Conv2D(32,(3,3),activation='relu',input_shape=(28,28,1)))
# Pooling layer: keeps the maximum of each 2x2 window; every Conv2D and pooling layer outputs a tensor of shape (height, width, channels)
model.add(layers.MaxPool2D((2,2)))
model.add(layers.Conv2D(64,(3,3),activation='relu'))
model.add(layers.MaxPool2D((2,2)))
model.add(layers.Conv2D(64,(3,3),activation='relu'))
#-------- The layers above form the convolutional part --------#
# Flatten the (3, 3, 64) output into a vector of shape (576,)
model.add(layers.Flatten())
# Densely connected layer in preparation for classification
model.add(layers.Dense(64,activation='relu'))
# Classify into 10 classes with a softmax output
model.add(layers.Dense(10,activation='softmax'))

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_5 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 3, 3, 64)          36928     
_________________________________________________________________
flatten_1 (Flatten)          (None, 576)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)                36928     
_________________________________________________________________
dense_2 (Dense)              (None, 10)                650       
=================================================================
Total params: 93,322
Trainable params: 93,322
Non-trainable params: 0
_________________________________________________________________
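
The Param # column in the summary can be checked by hand. A minimal sketch, assuming the default use_bias=True; the helper names conv2d_params and dense_params are made up for this illustration:

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    # Each filter has kernel_h * kernel_w * in_channels weights, plus one bias.
    return (kernel_h * kernel_w * in_channels + (1 if use_bias else 0)) * filters

def dense_params(in_units, out_units, use_bias=True):
    # Each output unit has one weight per input unit, plus one bias.
    return (in_units + (1 if use_bias else 0)) * out_units

counts = [
    conv2d_params(3, 3, 1, 32),    # first Conv2D
    conv2d_params(3, 3, 32, 64),   # second Conv2D
    conv2d_params(3, 3, 64, 64),   # third Conv2D
    dense_params(3 * 3 * 64, 64),  # Dense after Flatten (576 inputs)
    dense_params(64, 10),          # softmax classifier
]
print(counts)       # [320, 18496, 36928, 36928, 650]
print(sum(counts))  # 93322
```

The pooling and Flatten layers contribute nothing because they have no trainable weights.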

# Prepare the data and train the model
from keras.datasets import mnist
from keras.utils import to_categorical

(train_images,train_labels),(test_images,test_labels) = mnist.load_data()

train_images = train_images.reshape((60000,28,28,1))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000,28,28,1))
test_images = test_images.astype('float32')/255

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

model.compile(optimizer='rmsprop',
             loss = 'categorical_crossentropy',
             metrics=['accuracy'])
model.fit(train_images,train_labels,epochs=5,batch_size=64)

Instructions for updating:
Use tf.cast instead.
Epoch 1/5
60000/60000 [==============================] - 10s 169us/step - loss: 0.1833 - acc: 0.9419
Epoch 2/5
60000/60000 [==============================] - 5s 84us/step - loss: 0.0477 - acc: 0.9851
Epoch 3/5
60000/60000 [==============================] - 5s 82us/step - loss: 0.0326 - acc: 0.9897
Epoch 4/5
60000/60000 [==============================] - 5s 83us/step - loss: 0.0247 - acc: 0.9925
Epoch 5/5
60000/60000 [==============================] - 5s 83us/step - loss: 0.0204 - acc: 0.9938





<keras.callbacks.History at 0x2b772bfdc88>

# Evaluate on the test set
test_loss,test_acc = model.evaluate(test_images,test_labels)
print(test_acc)

10000/10000 [==============================] - 1s 59us/step
0.9929
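
The preprocessing before training does two things: it scales the 8-bit pixel values into [0, 1], and it turns each integer label into a one-hot vector so that it matches the 10-way softmax output. A plain-Python sketch of both steps; one_hot and scale_pixels are hypothetical stand-ins for illustration, not the Keras implementation:

```python
def one_hot(labels, num_classes=None):
    """Hypothetical stand-in for the result of to_categorical, using plain lists."""
    if num_classes is None:
        num_classes = max(labels) + 1
    return [[1.0 if i == label else 0.0 for i in range(num_classes)] for label in labels]

def scale_pixels(pixels):
    # Map 8-bit intensities 0..255 to floats in [0, 1].
    return [p / 255 for p in pixels]

print(one_hot([3, 0], num_classes=10)[0])
# [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(scale_pixels([0, 51, 255]))  # [0.0, 0.2, 1.0]
```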

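The convolution operation

The fundamental difference between densely connected layers and convolution layers

A Dense layer learns global patterns in its input feature space (for an MNIST digit, patterns involving all pixels).
A convolution layer learns local patterns (patterns found in small 2D windows of the input).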
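
One way to make the difference concrete is to count weights. The sketch below assumes a 28x28x1 input and a 26x26x32 output, and compares a hypothetical Dense layer that connects every input value to every output value with a convolution layer of 32 filters of size 3x3:

```python
in_h, in_w, in_c = 28, 28, 1
out_h, out_w, filters = 26, 26, 32

# Dense: every output value is connected to every input value (global pattern).
dense_weights = (in_h * in_w * in_c + 1) * (out_h * out_w * filters)

# Conv2D: each filter only sees a 3x3 patch and is reused at every position (local pattern).
conv_weights = (3 * 3 * in_c + 1) * filters

print(dense_weights)  # 16981120
print(conv_weights)   # 320
```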

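Properties of convolutional neural networks

Translation invariance
- After a convolutional network learns a pattern in the lower-right corner of an image, it can recognize that pattern anywhere, for example in the upper-left corner.
- The visual world is fundamentally translation invariant.
- As a result, fewer training samples are needed to learn representations that generalize.
Spatial hierarchies of patterns
- The first convolution layer learns small local patterns.
- The second convolution layer learns larger patterns built from the features of the first layer.
- This lets a convolutional network efficiently learn increasingly complex and abstract visual concepts.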
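
The first property can be illustrated with a toy example: the same filter gives the same peak response wherever the pattern sits, and only the location of the peak moves. This is a plain-Python sketch with made-up helper names (correlate2d_valid, place_pattern, argmax2d), not a trained network:

```python
def correlate2d_valid(image, kernel):
    """Slide kernel over image (no padding) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def place_pattern(size, pattern, top, left):
    """Return a size x size image of zeros with pattern pasted at (top, left)."""
    image = [[0] * size for _ in range(size)]
    for i, row in enumerate(pattern):
        for j, value in enumerate(row):
            image[top + i][left + j] = value
    return image

def argmax2d(grid):
    """Return ((row, col), value) of the first maximum in a 2D grid."""
    best = max(max(row) for row in grid)
    for r, row in enumerate(grid):
        if best in row:
            return (r, row.index(best)), best

cross = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]  # the "learned" pattern, reused as the filter

top_left = argmax2d(correlate2d_valid(place_pattern(8, cross, 0, 0), cross))
bottom_right = argmax2d(correlate2d_valid(place_pattern(8, cross, 5, 5), cross))
print(top_left)      # ((0, 0), 5)
print(bottom_right)  # ((5, 5), 5)
```

Because the filter weights are shared across positions, nothing has to be relearned when the pattern moves.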

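Feature maps

Feature map: a 3D tensor with two spatial axes (height and width) and a depth axis (also called the channels axis).

For an input image the depth is usually 3, one per RGB color channel; for a grayscale image it is 1, the gray level.

**Output feature map:** the convolution operation extracts patches from its input feature map and applies the same transformation to all of these patches.

The output feature map is still a 3D tensor.
It has a width and a height, and its depth can be arbitrary.
The output depth is a parameter of the layer; each depth channel stands for a filter.

The whole extraction process is shown in the figure below:

[Figure: the spatial hierarchy of visual modules]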

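Applied to the MNIST example: the first convolution layer takes a feature map of shape (28, 28, 1) and outputs a feature map of shape (26, 26, 32). The 32 means that 32 filters are computed over the input; each of the 32 output channels is a 26x26 grid of values, also called a response map.
Response map: the result of applying one filter across the input feature map, showing how strongly that filter's pattern responds at each location.
As shown in the figure:

[Figure: response map]

So a convolution is defined by two key parameters:

Size of the patches extracted from the input: typically 3x3 or 5x5.
Depth of the output feature map: the number of filters, for example 32 and 64 in the model above.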

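How the convolution operation works

A small 3×3 or 5×5 window slides over the 3D input feature map. At every position where the window fits, the patch under it is transformed by a tensor product with the same learned weights (the convolution kernel) into a 1D vector whose length is the output depth. These vectors are then spatially reassembled into the 3D output feature map.

[Figure: how the convolution operation works]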
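
The sliding-window procedure can be written out directly. This is a deliberately slow plain-Python sketch using nested lists, assuming stride 1 and no padding; conv2d_valid is a hypothetical name, and like deep learning frameworks it does not flip the kernel:

```python
def conv2d_valid(feature_map, filters, biases):
    """feature_map: [H][W][C]; filters: [F][kh][kw][C]; returns [H-kh+1][W-kw+1][F]."""
    height, width = len(feature_map), len(feature_map[0])
    kh, kw = len(filters[0]), len(filters[0][0])
    output = []
    for r in range(height - kh + 1):
        row = []
        for c in range(width - kw + 1):
            # Apply every filter to the kh x kw x C patch under the window.
            pixel = []
            for kernel, bias in zip(filters, biases):
                total = bias
                for i in range(kh):
                    for j in range(kw):
                        for ch, weight in enumerate(kernel[i][j]):
                            total += feature_map[r + i][c + j][ch] * weight
                pixel.append(total)
            row.append(pixel)  # one output vector of length F per window position
        output.append(row)
    return output

# 5x5 single-channel input whose value equals its column index; two 3x3 filters.
image = [[[float(c)] for c in range(5)] for _ in range(5)]
box = [[[1.0] for _ in range(3)] for _ in range(3)]   # sums the patch
edge = [[[-1.0], [0.0], [1.0]] for _ in range(3)]     # horizontal gradient
out = conv2d_valid(image, [box, edge], biases=[0.0, 0.0])
print(len(out), len(out[0]), len(out[0][0]))  # 3 3 2
print(out[0][0])                              # [9.0, 6.0]
```

The output depth (2 here) equals the number of filters, and each channel of the result is the response map of one filter.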

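Border padding

Consider a 5×5 feature map (25 tiles in total). Only 9 of those tiles can be the center of a 3×3 window, and these 9 tiles form a 3×3 grid. In other words, the output is smaller than the input after the convolution. Padding the border keeps the size from shrinking.

[Figure: border effects]

**Padding:** add an appropriate number of rows and columns on each side of the input feature map so that every input tile can be the center of a convolution window.

[Figure: border padding]

In Keras, this is controlled by the padding argument of Conv2D:

'valid' means no padding; this is the default.
'same' means pad so that the output has the same width and height as the input.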
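
The 5×5 example can be checked by counting window positions before and after zero padding. A small sketch, assuming a 3×3 window and stride 1; zero_pad and window_positions are hypothetical helpers:

```python
def zero_pad(grid, pad):
    """Surround a 2D grid with `pad` rows and columns of zeros on every side."""
    width = len(grid[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    padded += [[0] * pad + list(row) + [0] * pad for row in grid]
    padded += [[0] * width for _ in range(pad)]
    return padded

def window_positions(size, kernel):
    # Number of places a kernel x kernel window fits inside a size x size grid.
    return (size - kernel + 1) ** 2

grid = [[1] * 5 for _ in range(5)]
print(window_positions(len(grid), 3))    # 9: without padding, 5x5 shrinks to 3x3
padded = zero_pad(grid, 1)               # one ring of zeros gives a 7x7 grid
print(window_positions(len(padded), 3))  # 25: with padding, the output stays 5x5
```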

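Convolution strides

The distance between two successive windows is a parameter of the convolution called its stride; it defaults to 1.
A stride of 2 means the width and height of the feature map are downsampled by a factor of 2 (in addition to any change caused by border effects).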
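
The effect of the stride is easy to see by listing where the windows start along one axis. A minimal sketch without padding; window_starts is a hypothetical helper:

```python
def window_starts(size, kernel, stride):
    """Top-left offsets of the windows along one axis, without padding."""
    return list(range(0, size - kernel + 1, stride))

print(window_starts(5, 3, 1))         # [0, 1, 2]: 3 output positions
print(window_starts(5, 3, 2))         # [0, 2]: 2 output positions
print(len(window_starts(28, 3, 2)))   # 13, half of the 26 positions produced by stride 1
```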

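The pooling operation

The role of max pooling is to downsample feature maps.

Max pooling extracts windows from the input feature map and outputs the maximum value of each channel. See the diagram below:

[Figure: max pooling]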
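
As a rough sketch of the operation, assuming no padding and a stride equal to the pool size when none is given, max pooling over a nested-list feature map could look like this (max_pool2d is a hypothetical name):

```python
def max_pool2d(feature_map, pool=2, stride=None):
    """feature_map: [H][W][C]; returns the per-channel maximum of each window."""
    stride = pool if stride is None else stride
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    return [
        [
            [
                max(feature_map[r + i][c + j][ch] for i in range(pool) for j in range(pool))
                for ch in range(channels)
            ]
            for c in range(0, width - pool + 1, stride)
        ]
        for r in range(0, height - pool + 1, stride)
    ]

# 4x4 input with two channels: channel 0 counts up, channel 1 counts down.
fmap = [[[4 * r + c, 15 - (4 * r + c)] for c in range(4)] for r in range(4)]
print(max_pool2d(fmap))
# [[[5, 15], [7, 13]], [[13, 7], [15, 5]]]
```

The 4x4 input becomes 2x2 while the number of channels is unchanged, since each channel is pooled independently.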

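What pooling layers contribute to the model:

They help the network learn a spatial hierarchy of features, because successive layers see increasingly large windows of the original input.
They reduce the number of feature-map values to process, and with it the number of parameters in the layers that follow.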
