Part 1: Convolutional Neural Networks
This week's assignment implements a convolutional layer (CONV) and a pooling layer (POOL) in numpy, including forward propagation and an optional backward propagation.
Notation
- Superscript [l] denotes the l-th layer of the network
- Superscript (i) denotes the i-th example
- Superscript [i] denotes the i-th mini-batch
- Subscript i denotes the i-th entry of a vector
- n_H, n_W and n_C denote the height, width and number of channels of a given layer
- n_H_prev, n_W_prev and n_C_prev denote the height, width and number of channels of the previous layer
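To make the notation concrete, a minimal sketch (the array name A_prev and all sizes here are illustrative assumptions, not part of the assignment data):
import numpy as np

# Illustrative sizes only: 10 examples of height 5, width 7, 4 channels
m, n_H_prev, n_W_prev, n_C_prev = 10, 5, 7, 4
A_prev = np.random.randn(m, n_H_prev, n_W_prev, n_C_prev)  # activations of the previous layer

print(A_prev.shape)     # (10, 5, 7, 4)
print(A_prev[0].shape)  # one example (i): (n_H_prev, n_W_prev, n_C_prev) = (5, 7, 4)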
1 Imports
import numpy as np # scientific computing
import h5py # reading the data files
import matplotlib.pyplot as plt # plotting
%matplotlib inline
plt.rcParams['figure.figsize'] = (5.0, 4.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'
%load_ext autoreload
%autoreload 2
np.random.seed(1) # make the random numbers reproducible
2 Assignment Outline
- Convolutional layer
- Zero Padding
- Convolve window
- Convolution forward
- Convolution backward (optional)
- Pooling layer
- Pooling forward
- Create mask
- Distribute value
- Pooling backward (optional)
In this assignment we implement everything in numpy (which means implementing backpropagation ourselves); later assignments can use tensorflow instead.
3 Convolutional Layer
Implement a convolutional layer as described below: it transforms an input volume into an output volume of a different size.
3.1 Zero-Padding (padding with zeros)
Benefits of padding
- Without padding, the image shrinks with every convolution; with padding the output size can be controlled freely, e.g. the SAME mode keeps the height and width unchanged (see the sketch after this list).
- Padding preserves information at the image borders; without it, edge pixels contribute to very few outputs, so some information is lost.
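A minimal sketch of the SAME-mode remark above (the filter size f and input shape are assumptions; this is not part of the graded code): with stride 1, padding each side by (f - 1) / 2 keeps the height and width unchanged.
import numpy as np

f = 3                       # assumed odd filter size, stride = 1
pad = (f - 1) // 2          # SAME padding for stride 1
x = np.random.randn(4, 8, 8, 2)   # (m, n_H, n_W, n_C), sizes made up for illustration
x_pad = np.pad(x, ((0,0), (pad,pad), (pad,pad), (0,0)), 'constant')

# output height after a convolution with this padding: n_H - f + 2*pad + 1 = n_H
print(x_pad.shape)                    # (4, 10, 10, 2)
print(x.shape[1] - f + 2 * pad + 1)   # 8, same as the input height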
Exercise
Pad the images with zeros. The code below pads an array a of shape (5,5,5,5,5): pad = 1 on the 2nd dimension, pad = 3 on the 4th dimension, and pad = 0 on the remaining dimensions.
a = np.pad(a, ((0,0), (1,1), (0,0), (3,3), (0,0)), 'constant', constant_values = (..,..))
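A minimal sketch checking the example above; since we are zero-padding, the constant_values left elided in the snippet are taken to be 0 here (an assumption for illustration):
import numpy as np

a = np.random.randn(5, 5, 5, 5, 5)
a_pad = np.pad(a, ((0,0), (1,1), (0,0), (3,3), (0,0)), 'constant', constant_values=(0, 0))

print(a_pad.shape)   # (5, 7, 5, 11, 5): dim 2 grows by 2*1, dim 4 grows by 2*3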
Code
# GRADED FUNCTION: zero_pad
def zero_pad(X, pad):
    """
    Pad with zeros all images of the dataset X. The padding is applied to the height and width of an image,
    as illustrated in Figure 1.
    Argument:
    X -- python numpy array of shape (m, n_H, n_W, n_C) representing a batch of m images
    pad -- integer, amount of padding around each image on vertical and horizontal dimensions
    Returns:
    X_pad -- padded image of shape (m, n_H + 2*pad, n_W + 2*pad, n_C)
    """
    ### START CODE HERE ### (≈ 1 line)
    X_pad = np.pad(X, ((0,0), (pad,pad), (pad,pad), (0,0)), 'constant')
    ### END CODE HERE ###
    return X_pad
##########################################
np.random.seed(1)
x = np.random.randn(4, 3, 3, 2)
x_pad = zero_pad(x, 2)
print ("x.shape =", x.shape)
print ("x_pad.shape =", x_pad.shape)
print ("x[1,1] =", x[1,1])
print ("x_pad[1,1] =", x_pad[1,1])
fig, axarr = plt.subplots(1, 2)
axarr[0].set_title('x')
axarr[0].imshow(x[0,:,:,0])
axarr[1].set_title('x_pad')
axarr[1].imshow(x_pad[0,:,:,0])
# x.shape = (4, 3, 3, 2)
# x_pad.shape = (4, 7, 7, 2)
# x[1,1] = [[ 0.90085595 -0.68372786]
# [-0.12289023 -0.93576943]
# [-0.26788808 0.53035547]]
# x_pad[1,1] = [[ 0. 0.]
# [ 0. 0.]
# [ 0. 0.]
# [ 0. 0.]
# [ 0. 0.]
# [ 0. 0.]
# [ 0. 0.]]
3.2 Single Step of Convolution
Slide a convolution filter over the input to produce the output.
- Filter computation: multiply the filter element-wise with the corresponding slice of the input and sum the results to get one output element; this is the WX part.
- The actual output: A = sigmoid(WX + b), i.e. add the bias to the convolution result and then apply the sigmoid nonlinearity (a tiny worked example follows below).
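A tiny worked example of one such step (a minimal sketch, separate from the graded function below; the 2x2x1 slice, filter and bias values are made up):
import numpy as np

a_slice = np.array([[[1.], [2.]],
                    [[3.], [4.]]])   # shape (f, f, n_C_prev) = (2, 2, 1)
W = np.array([[[1.], [0.]],
              [[0.], [1.]]])         # filter of the same shape
b = 0.5                              # scalar bias, assumed for illustration

Z = np.sum(a_slice * W) + b          # element-wise product, sum over the volume, add bias
print(Z)                             # 1*1 + 2*0 + 3*0 + 4*1 + 0.5 = 5.5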
# GRADED FUNCTION: conv_single_step
def conv_single_step(a_slice_prev, W, b):
    """
    Apply one filter defined by parameters W on a single slice (a_slice_prev) of the output activation
    of the previous layer.
    Arguments:
    a_slice_prev -- slice of input data of shape (f, f, n_C_prev)
    W -- Weight parameters contained in a window - matrix of shape (f, f, n_C_prev)
    b -- Bias parameters contained in a window - matrix of shape (1, 1, 1)
    Returns:
    Z -- a scalar value, result of convolving the sliding window (W, b) on a slice x of the input data
    """
    ### START CODE HERE ### (≈ 2 lines of code)
    # Element-wise product between a_slice and W. Do not add the bias yet.
    s = a_slice_prev * W
    # Sum over all entries of the volume s.
    Z = np.sum(s)
    # Add bias b to Z. Cast b to a float() so that Z results in a scalar value.
    Z = Z + float(b)
    ### END CODE HERE ###
    return Z
########################################
np.random.seed(1)
a_slice_prev = np.random.randn(4, 4, 3)
W = np.random.randn(4, 4, 3)
b = np.random.randn(1, 1, 1)
Z = conv_single_step(a_slice_prev, W, b)
print("Z =", Z)
# Z = -6.99908945068
3.3 Convolutional Neural Network - Forward Pass
Apply multiple filters to the input volume. Each filter produces one 2-D output map, and the 2-D maps from all filters are stacked along the channel dimension to form the output volume (see the small sketch below).
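A minimal sketch of that stacking step (sizes are assumptions; the per-filter maps are random placeholders, not real convolutions):
import numpy as np

n_H, n_W, n_filters = 4, 4, 3
maps = [np.random.randn(n_H, n_W) for _ in range(n_filters)]   # one 2-D map per filter

# Stack the per-filter maps along a new last axis to form the output volume
Z = np.stack(maps, axis=-1)
print(Z.shape)   # (4, 4, 3): the output n_C equals the number of filters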
Hints
- Selecting a slice of the image:
a_slice_prev = a_prev[0:2,0:2,:]
Before taking a slice, first define its boundaries (vert_start, vert_end, horiz_start, horiz_end), as in the sketch that follows.
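A minimal sketch of how those boundaries can be computed from the output position (h, w), the stride and the filter size f (names follow the hint above; shapes and values are assumptions for illustration):
import numpy as np

f, stride = 3, 2
a_prev_pad = np.random.randn(9, 9, 4)   # one padded example: (n_H_prev + 2*pad, n_W_prev + 2*pad, n_C_prev)

h, w = 1, 2                  # position in the output volume
vert_start = h * stride      # top edge of the slice
vert_end = vert_start + f    # bottom edge
horiz_start = w * stride     # left edge
horiz_end = horiz_start + f  # right edge

a_slice_prev = a_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :]
print(a_slice_prev.shape)    # (3, 3, 4) -- one (f, f, n_C_prev) slice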
Output volume size
n_H = floor((n_H_prev - f + 2 * pad) / stride) + 1
n_W = floor((n_W_prev - f + 2 * pad) / stride) + 1
n_C = number of filters used in the convolution
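A minimal sketch applying these formulas (the numbers are assumptions chosen only to illustrate the arithmetic):
n_H_prev, n_W_prev = 7, 7
f, pad, stride = 3, 1, 2

n_H = int((n_H_prev - f + 2 * pad) / stride) + 1   # floor via int() for non-negative values
n_W = int((n_W_prev - f + 2 * pad) / stride) + 1

print(n_H, n_W)   # 4 4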