Andrew Ng's Convolutional Neural Networks course: Week 1 programming assignment
1. Zero padding
Code:
# GRADED FUNCTION: zero_pad
def zero_pad(X, pad):
    """
    Pad with zeros all images of the dataset X. The padding is applied to the height and width of an image,
    as illustrated in Figure 1.

    Argument:
    X -- python numpy array of shape (m, n_H, n_W, n_C) representing a batch of m images
    pad -- integer, amount of padding around each image on vertical and horizontal dimensions

    Returns:
    X_pad -- padded image of shape (m, n_H + 2 * pad, n_W + 2 * pad, n_C)
    """
    #(≈ 1 line)
    # X_pad = None
    # YOUR CODE STARTS HERE
    # -- the code to fill in goes here --
    # YOUR CODE ENDS HERE
    return X_pad
np.random.seed(1)
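x = np.random.randn(4, 3, 3, 2)  # x has shape (4, 3, 3, 2): a batch of 4 images, each 3x3 with 2 channels
x_pad = zero_pad(x, 3)  # call our zero_pad() function to zero-pad the batch
print ("x.shape =\n", x.shape)  # print the shape of x
print ("x_pad.shape =\n", x_pad.shape)  # print the shape of x_pad
print ("x[1,1] =\n", x[1, 1])  # row 1 of image 1 in x, a 3x2 array (width x channels)
print ("x_pad[1,1] =\n", x_pad[1, 1])  # row 1 of image 1 in x_pad (all zeros, since this row lies in the padding)
# continue only if x_pad is an np.ndarray
assert type(x_pad) == np.ndarray, "Output must be a np array"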
assert x_pad.shape == (4, 9, 9, 2), f"Wrong shape: {x_pad.shape} != (4, 9, 9, 2)"
print(x_pad[0, 0:2,:, 0])
assert np.allclose(x_pad[0, 0:2,:, 0], [[0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0]], 1e-15), "Rows are not padded with zeros"
assert np.allclose(x_pad[0, :, 7:9, 1].transpose(), [[0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0]], 1e-15), "Columns are not padded with zeros"
assert np.allclose(x_pad[:, 3:6, 3:6, :], x, 1e-15), "Internal values are different"
fig, axarr = plt.subplots(1, 2)
axarr[0].set_title('x')
axarr[0].imshow(x[0, :, :, 0])
axarr[1].set_title('x_pad')
axarr[1].imshow(x_pad[0, :, :, 0])
zero_pad_test(zero_pad)
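Analysis:
Code block 2 (the test cell):
- It first creates an array x of shape (4, 3, 3, 2), i.e. a batch of 4 images, each 3x3 with 2 channels.
- x_pad is the zero-padded result; this is the part we have to fill in, explained below.
- It prints the relevant information and checks the result with assert. The checks use np.allclose(), which tests whether two arrays are element-wise equal within a tolerance (by default rtol=1e-05 and atol=1e-08).
- Finally it uses matplotlib plotting functions to display the images.
Code block 1 (the function):
- This is the body of the zero-padding function.
- From code block 2 we know the first argument is x, the (4, 3, 3, 2) array, and the second argument is 3, which the docstring tells us is the padding amount.
- From the docstring we know X represents 4 images of size 3x3 with 2 channels.
- We use np.pad() to do the zero padding. Its signature is: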
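The tolerance behaviour of np.allclose() is easy to misread, so here is a small sketch of it. The values are made up for illustration. The check it performs is |a - b| <= atol + rtol * |b|, and a third positional argument (such as the 1e-15 in the test cell) sets rtol, while atol keeps its default of 1e-08.

```python
import numpy as np

# Default tolerances: rtol=1e-05, atol=1e-08.
close_default = np.allclose([1.0], [1.0 + 1e-6])   # difference is within rtol * |b|
far_apart = np.allclose([1.0], [1.001])            # difference is far outside the tolerance

# Third positional argument is rtol; atol is still 1e-08,
# so a value of 1e-9 still counts as "equal to zero".
tiny_vs_zero = np.allclose([1e-9], [0.0], 1e-15)

print(close_default, far_apart, tiny_vs_zero)  # True False True
```

This is why the zero-padding assertions pass even though rtol is set extremely small: when comparing against zeros, only atol matters.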
pad(array, pad_width, mode, **kwargs)
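Parameters:
array: the image data we want to pad
pad_width: the number of values to pad at the edges of each axis, given as one (before, after) pair per axis
mode: the padding mode; here 'constant', with constant_values=(0, 0) supplying the fill values
A small example shows roughly how pad_width works: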
import numpy as np
x=[[1,1],
[1,1]]
print(np.array(x))
x_d=np.pad(x,((2,1),(1,2)),'constant',constant_values=(2,-2))
print('----------')
print(x_d)
Output:
[[1 1]
 [1 1]]
----------
[[ 2  2  2 -2 -2]
 [ 2  2  2 -2 -2]
 [ 2  1  1 -2 -2]
 [ 2  1  1 -2 -2]
 [ 2 -2 -2 -2 -2]]
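- In the assignment we need to surround each 3x3 image with pad (here 3) rings of zeros:
X_pad = np.pad(X,                # input batch: 4 images of shape 3x3x2
               ((0, 0),          # sample axis: no padding, still 4 samples
                (pad, pad),      # height: pad rows on top, pad rows on the bottom
                (pad, pad),      # width: pad columns on the left, pad columns on the right
                (0, 0)),         # channel axis: no padding
               'constant',       # mode: fill with a constant value
               constant_values=(0, 0))  # the (before, after) fill values, both 0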
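Putting the pieces together, the following is a minimal self-contained sketch of the finished function, run on the same input as the test cell. It is one way to fill in the blank, not necessarily the reference solution; passing a single 0 for constant_values is my simplification of the (0, 0) pair used above.

```python
import numpy as np

def zero_pad(X, pad):
    """Zero-pad the height and width axes of a batch X of shape (m, n_H, n_W, n_C)."""
    return np.pad(
        X,
        ((0, 0), (pad, pad), (pad, pad), (0, 0)),  # pad only height and width
        mode="constant",
        constant_values=0,
    )

np.random.seed(1)
x = np.random.randn(4, 3, 3, 2)
x_pad = zero_pad(x, 3)
print(x_pad.shape)  # (4, 9, 9, 2)
```

The original values sit untouched in the centre of the padded array, at x_pad[:, 3:6, 3:6, :].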
2. Single Step of Convolution
Code block 1:
# GRADED FUNCTION: conv_single_step
def conv_single_step(a_slice_prev, W, b):
    """
    Apply one filter defined by parameters W on a single slice (a_slice_prev) of the output activation
    of the previous layer.

    Arguments:
    a_slice_prev -- slice of input data of shape (f, f, n_C_prev)
    W -- weight parameters of shape (f, f, n_C_prev); this is the filter
    b -- bias parameter of shape (1, 1, 1)

    Returns:
    Z -- a scalar value: the slice and the filter have the same size, so the convolution yields a single number
    """
    #(≈ 3 lines of code)
    # Element-wise product between a_slice_prev and W. Do not add the bias yet.
    # s = None
    # Sum over all entries of the volume s.
    # Z = None
    # Add bias b to Z. Cast b to a float() so that Z results in a scalar value.
    # Z = None
    # YOUR CODE STARTS HERE
    # YOUR CODE ENDS HERE
    return Z
Code block 2:
np.random.seed(1)
a_slice_prev = np.random.randn(4, 4, 3)
W = np.random.randn(4, 4, 3)
b = np.random.randn(1, 1, 1)
Z = conv_single_step(a_slice_prev, W, b)
print("Z =", Z)
conv_single_step_test(conv_single_step)
assert (type(Z) == np.float64 or type(Z) == np.float32), "You must cast the output to float"
assert np.isclose(Z, -6.999089450680221), "Wrong value"
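Analysis:
- a_slice_prev is the input slice, a 4x4x3 array; W is also 4x4x3, and b is 1x1x1.
- In code block 1 we convolve the 4x4x3 slice with the 4x4x3 filter and then add the bias b.
- numpy.multiply() multiplies the two arrays element by element, numpy.sum() then adds all the products together, and the bias b has to be converted to a float before it is added.
First, let's look at what numpy.multiply() and numpy.sum() produce: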
import numpy as np
np.random.seed(1)
input = np.random.randn(2, 2, 2)
W = np.random.randn(2, 2, 2)
b = np.random.randn(1, 1, 1)
print('input:\n',input)
print('W:\n',W)
conv1= np.multiply(input,W)
print('conv1:\n',conv1)
conv2=np.sum(conv1)
print('conv2:\n',conv2)
output=conv2+float(b)
print('output:\n',output)
This convolves a 2x2x2 input with a 2x2x2 filter and then adds the bias.
Result:
input:
[[[ 1.62434536 -0.61175641]
[-0.52817175 -1.07296862]]
[[ 0.86540763 -2.3015387 ]
[ 1.74481176 -0.7612069 ]]]
W:
[[[ 0.3190391 -0.24937038]
[ 1.46210794 -2.06014071]]
[[-0.3224172 -0.38405435]
[ 1.13376944 -1.09989127]]]
conv1:
[[[ 0.51822968 0.15255393]
[-0.77224411 2.21046634]]
[[-0.27902231 0.88391596]
[ 1.97821426 0.83724482]]]
conv2:
5.529358565096128
output:
5.356930357545693
As shown, multiply() only multiplies corresponding elements; sum() then adds all the numbers in the array into a single value.
So the code to fill in for this exercise is:
s=np.multiply(a_slice_prev,W)
Z=np.sum(s)+ float(b)
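For reference, the two lines above can be dropped into a complete function and checked against the expected value from the assignment's own assertion. This is a minimal sketch. The one deviation from the answer above is b.item() in place of float(b), my own substitution: as far as I know, NumPy 1.25 and later emit a deprecation warning when float() is applied to an array with more than zero dimensions.

```python
import numpy as np

def conv_single_step(a_slice_prev, W, b):
    """Apply one filter W (f, f, n_C_prev) and bias b (1, 1, 1) to one input slice."""
    s = np.multiply(a_slice_prev, W)  # element-wise product, same shape as the inputs
    Z = np.sum(s) + b.item()          # sum all products, then add the scalar bias
    return Z

np.random.seed(1)
a_slice_prev = np.random.randn(4, 4, 3)
W = np.random.randn(4, 4, 3)
b = np.random.randn(1, 1, 1)

Z = conv_single_step(a_slice_prev, W, b)
print("Z =", Z)
```

Because np.sum() returns a np.float64, the result keeps that type after the Python float is added, which is what the assignment's type check expects.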
3. conv_forward
np.random.seed(1)
A_prev = np.random.randn(2, 5, 7, 4)
W = np.random.randn(3, 3, 4, 8)
b = np.random.randn(1, 1, 1, 8)
hparameters = {"pad" : 1,
"stride": 2}
Z, cache_conv = conv_forward(A_prev, W, b, hparameters)
print("Z's mean =\n", np.mean(Z))
print("Z[0,2,1] =\n", Z[0, 2, 1])
print("cache_conv[0][1][2][3] =\n", cache_conv[0][1][2][3])
conv_forward_test(conv_forward)
# GRADED FUNCTION: conv_forward
def conv_forward(A_prev, W, b, hparameters):
    """
    Implements the forward propagation for a convolution function

    Arguments:
    A_prev -- output activations of the previous layer, numpy array of shape (m, n_H_prev, n_W_prev, n_C_prev)
    W -- weights, numpy array of shape (f, f, n_C_prev, n_C)
    b -- biases, numpy array of shape (1, 1, 1, n_C)
    hparameters -- python dictionary containing "stride" and "pad"

    Returns:
    Z -- conv output, numpy array of shape (m, n_H, n_W, n_C)
    cache -- cache of values needed for the conv_backward() function
    """
    # Retrieve dimensions from A_prev's shape (≈1 line)
    # (m, n_H_prev, n_W_prev, n_C_prev) = None
    # Retrieve dimensions from W's shape (≈1 line)
    # (f, f, n_C_prev, n_C) = None
    # Retrieve information from "hparameters" (≈2 lines)
    # stride = None
    # pad = None
    # Compute the height and width of the CONV output volume.
    # Hint: use int() to apply the 'floor' operation. (≈2 lines)
    # n_H = None
    # n_W = None
    # Initialize the output volume Z with zeros. (≈1 line)
    # Z = None
    # Create A_prev_pad by padding A_prev
    # A_prev_pad = None
    # for i in range(None):               # loop over the batch of training examples
        # a_prev_pad = None               # select the ith training example's padded activation
        # for h in range(None):           # loop over the vertical axis of the output volume
            # Find the vertical start and end of the current "slice" (≈2 lines)
            # vert_start = None
            # vert_end = None
            # for w in range(None):       # loop over the horizontal axis of the output volume
                # Find the horizontal start and end of the current "slice" (≈2 lines)
                # horiz_start = None
                # horiz_end = None
                # for c in range(None):   # loop over the channels (= number of filters) of the output volume
                    # Use the corners to define the (3D) slice of a_prev_pad (See Hint above the cell). (≈1 line)
                    # a_slice_prev = None
                    # Convolve the (3D) slice with the correct filter W and bias b, to get back one output neuron. (≈3 line)
                    # weights = None
                    # biases = None
                    # Z[i, h, w, c] = None
    # YOUR CODE STARTS HERE
    # YOUR CODE ENDS HERE
    # Save information in "cache" for the backprop
    cache = (A_prev, W, b, hparameters)
    return Z, cache
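Analysis:
- First look at the test cell: A_prev is the input batch, W holds the filters, b the biases, and hparameters contains the two hyperparameters. All of them are passed to conv_forward() to run the forward pass.
- The hint # (m, n_H_prev, n_W_prev, n_C_prev) = None refers to the dimensions of the previous layer's output, i.e. the first argument A_prev; A_prev.shape gives all four values.
- Likewise, (f, f, n_C_prev, n_C) = None takes the dimensions of W, and # stride = None and # pad = None are read from the hparameters dictionary.
- Next we compute n_H and n_W. The hint says to use int() to round down. My first attempt used numpy.floor(), which causes an error later because it returns a float, and a float cannot be used as an array dimension or a range() bound.
- Then we initialize the output Z with numpy.zeros(), which creates an array of the given shape filled with 0. After that we create A_prev_pad by applying zero_pad() to A_prev; it still has four dimensions, (m, n_H_prev + 2 * pad, n_W_prev + 2 * pad, n_C_prev).
- Then we loop over the examples, starting with m. The hint says to select the ith padded example, i.e. A_prev_pad[i]. Next we loop in the vertical direction, whose range is n_H, and then horizontally in the same way over n_W. For vert_start and vert_end the assignment gives a hint: the vertical start is 0 for the first window and advances by the stride on each iteration, so vert_start is h * stride, and the end position is the start plus the filter size f.
- Inside the loops over n_H, n_W, and n_C we extract the values covered by the filter window, a_slice_prev (as hinted above), take the W and b that belong to the current output channel, and finally call the single-step convolution function conv_single_step().
So the answer to this exercise is: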
def conv_forward(A_prev, W, b, hparameters):
    # Retrieve dimensions from A_prev's shape (≈1 line)
    # (m, n_H_prev, n_W_prev, n_C_prev) = None
    (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape
    # Retrieve dimensions from W's shape (≈1 line)
    # (f, f, n_C_prev, n_C) = None
    (f, f, n_C_prev, n_C) = W.shape
    # Retrieve information from "hparameters" (≈2 lines)
    # stride = None
    # pad = None
    stride = hparameters['stride']
    pad = hparameters['pad']
    # Compute the dimensions of the CONV output volume using the formula given above.
    # Hint: use int() to apply the 'floor' operation. (≈2 lines)
    # n_H = None
    # n_W = None
    n_H = int((n_H_prev - f + 2 * pad) / stride) + 1
    n_W = int((n_W_prev - f + 2 * pad) / stride) + 1
    # Initialize the output volume Z with zeros. (≈1 line)
    # Z = None
    Z = np.zeros((m, n_H, n_W, n_C))
    # Create A_prev_pad by padding A_prev
    # A_prev_pad = None
    A_prev_pad = zero_pad(A_prev, pad)
    # YOUR CODE STARTS HERE
    for i in range(m):                      # loop over the training examples
        a_prev_pad = A_prev_pad[i]          # padded activation of the ith example
        for h in range(n_H):                # vertical axis of the output
            vert_start = h * stride
            vert_end = vert_start + f
            for w in range(n_W):            # horizontal axis of the output
                horiz_start = w * stride
                horiz_end = horiz_start + f
                for c in range(n_C):        # one output channel per filter
                    a_slice_prev = a_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :]
                    weights = W[:, :, :, c]
                    biases = b[:, :, :, c]
                    Z[i, h, w, c] = conv_single_step(a_slice_prev, weights, biases)
    # YOUR CODE ENDS HERE
    # Save information in "cache" for the backprop
    cache = (A_prev, W, b, hparameters)
    return Z, cache
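The output-size arithmetic is the step most likely to go wrong, so here is a small sketch of it for the shapes used in the test cell (A_prev of shape (2, 5, 7, 4), f = 3, pad = 1, stride = 2). The helper name conv_output_size is made up for this illustration. It uses integer floor division, which for these non-negative values gives the same result as int(x / stride) without ever producing a float.

```python
def conv_output_size(n_prev, f, pad, stride):
    """Output length along one axis: floor((n_prev - f + 2 * pad) / stride) + 1."""
    return (n_prev - f + 2 * pad) // stride + 1

f, pad, stride = 3, 1, 2
n_H = conv_output_size(5, f, pad, stride)
n_W = conv_output_size(7, f, pad, stride)
print(n_H, n_W)  # 3 4

# The last window must still fit inside the padded input.
last_vert_end = (n_H - 1) * stride + f
last_horiz_end = (n_W - 1) * stride + f
print(last_vert_end <= 5 + 2 * pad, last_horiz_end <= 7 + 2 * pad)  # True True
```

With 8 filters and a batch of 2, Z therefore has shape (2, 3, 4, 8).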
4. Pooling Layer
Code block 1:
# Case 1: stride of 1
np.random.seed(1)
A_prev = np.random.randn(2, 5, 5, 3)
hparameters = {"stride" : 1, "f": 3}
A, cache = pool_forward(A_prev, hparameters, mode = "max")
print("mode = max")
print("A.shape = " + str(A.shape))
print("A[1, 1] =\n", A[1, 1])
print()
A, cache = pool_forward(A_prev, hparameters, mode = "average")
print("mode = average")
print("A.shape = " + str(A.shape))
print("A[1, 1] =\n", A[1, 1])
pool_forward_test(pool_forward)
Code block 2:
def pool_forward(A_prev, hparameters, mode = "max"):
    """
    Implements the forward pass of the pooling layer

    Arguments:
    A_prev -- Input data, numpy array of shape (m, n_H_prev, n_W_prev, n_C_prev)
    hparameters -- python dictionary containing "f" and "stride"
    mode -- the pooling mode you would like to use, defined as a string ("max" or "average")

    Returns:
    A -- output of the pool layer, a numpy array of shape (m, n_H, n_W, n_C)
    cache -- cache used in the backward pass of the pooling layer, contains the input and hparameters
    """
    # Retrieve dimensions from the input shape
    (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape
    # Retrieve hyperparameters from "hparameters"
    f = hparameters["f"]
    stride = hparameters["stride"]
    # Define the dimensions of the output
    n_H = int(1 + (n_H_prev - f) / stride)
    n_W = int(1 + (n_W_prev - f) / stride)
    n_C = n_C_prev
    # Initialize output matrix A
    A = np.zeros((m, n_H, n_W, n_C))
    # for i in range(None):                 # loop over the training examples
        # for h in range(None):             # loop on the vertical axis of the output volume
            # Find the vertical start and end of the current "slice" (≈2 lines)
            # vert_start = None
            # vert_end = None
            # for w in range(None):         # loop on the horizontal axis of the output volume
                # Find the horizontal start and end of the current "slice" (≈2 lines)
                # horiz_start = None
                # horiz_end = None
                # for c in range (None):    # loop over the channels of the output volume
                    # Use the corners to define the current slice on the ith training example of A_prev, channel c. (≈1 line)
                    # a_prev_slice = None
                    # Compute the pooling operation on the slice.
                    # Use an if statement to differentiate the modes.
                    # Use np.max and np.mean.
                    # if mode == "max":
                        # A[i, h, w, c] = None
                    # elif mode == "average":
                        # A[i, h, w, c] = None
    # YOUR CODE STARTS HERE
    # YOUR CODE ENDS HERE
    # Store the input and hparameters in "cache" for pool_backward()
    cache = (A_prev, hparameters)
    # Making sure your output shape is correct
    #assert(A.shape == (m, n_H, n_W, n_C))
    return A, cache
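Analysis:
- The inputs are A_prev and hparameters, the hyperparameters holding the stride and the window size f.
- In the pooling layer we first read the basic parameters and use the formula to get n_H and n_W; n_C stays unchanged. We initialize an output array of zeros, then loop over the m examples, n_H, n_W, and n_C. In each window, numpy.max() gives the max-pooling result and numpy.mean() gives the average-pooling result.
Answer: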
def pool_forward(A_prev, hparameters, mode = "max"):
    (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape
    f = hparameters["f"]
    stride = hparameters["stride"]
    n_H = int(1 + (n_H_prev - f) / stride)
    n_W = int(1 + (n_W_prev - f) / stride)
    n_C = n_C_prev
    # Initialize output matrix A
    A = np.zeros((m, n_H, n_W, n_C))
    for i in range(m):
        for h in range(n_H):
            for w in range(n_W):
                for c in range (n_C):
                    vert_start = h * stride
                    vert_end = vert_start + f
                    horiz_start = w * stride
                    horiz_end = horiz_start + f
                    a_prev_slice = A_prev[i, vert_start:vert_end, horiz_start:horiz_end, c]
                    if mode == "max":
                        A[i, h, w, c] = np.max(a_prev_slice)
                    elif mode == "average":
                        A[i, h, w, c] = np.mean(a_prev_slice)
    cache = (A_prev, hparameters)
    assert(A.shape == (m, n_H, n_W, n_C))
    return A, cache
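To see what the two modes compute, it helps to run the same windowing logic on one small 2D array where the results can be checked by hand. The sketch below is a single-image, single-channel simplification; the helper name pool2d and the 4x4 input are invented for this illustration.

```python
import numpy as np

def pool2d(a, f, stride, mode="max"):
    """Pool a 2D array with an f x f window; mode is "max" or "average"."""
    n_H = (a.shape[0] - f) // stride + 1
    n_W = (a.shape[1] - f) // stride + 1
    out = np.zeros((n_H, n_W))
    reduce_fn = np.max if mode == "max" else np.mean
    for h in range(n_H):
        for w in range(n_W):
            window = a[h * stride:h * stride + f, w * stride:w * stride + f]
            out[h, w] = reduce_fn(window)
    return out

a = np.arange(16, dtype=float).reshape(4, 4)  # rows 0..3, 4..7, 8..11, 12..15
max_out = pool2d(a, f=2, stride=2, mode="max")
avg_out = pool2d(a, f=2, stride=2, mode="average")
print(max_out)  # [[ 5.  7.] [13. 15.]]
print(avg_out)  # [[ 2.5  4.5] [10.5 12.5]]
```

The top-left window holds 0, 1, 4, 5, so its maximum is 5 and its mean is 2.5, matching the first entry of each output.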
5. Backpropagation in Convolutional Neural Networks (OPTIONAL / UNGRADED)
def conv_backward(dZ, cache):
    """
    Implement the backward propagation for a convolution function

    Arguments:
    dZ -- gradient of the cost with respect to the output of the conv layer (Z), numpy array of shape (m, n_H, n_W, n_C)
    cache -- cache of values needed for the conv_backward(), output of conv_forward()

    Returns:
    dA_prev -- gradient of the cost with respect to the input of the conv layer (A_prev),
               numpy array of shape (m, n_H_prev, n_W_prev, n_C_prev)
    dW -- gradient of the cost with respect to the weights of the conv layer (W)
          numpy array of shape (f, f, n_C_prev, n_C)
    db -- gradient of the cost with respect to the biases of the conv layer (b)
          numpy array of shape (1, 1, 1, n_C)
    """
    # Retrieve information from "cache"
    (A_prev, W, b, hparameters) = cache
    # Retrieve dimensions from A_prev's shape
    (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape
    # Retrieve dimensions from W's shape
    (f, f, n_C_prev, n_C) = W.shape
    # Retrieve information from "hparameters"
    stride = hparameters['stride']
    pad = hparameters['pad']
    # Retrieve dimensions from dZ's shape
    (m, n_H, n_W, n_C) = dZ.shape
    # Initialize dA_prev, dW, db with the correct shapes
    dA_prev = np.zeros((m, n_H_prev, n_W_prev, n_C_prev))
    dW = np.zeros((f, f, n_C_prev, n_C))
    db = np.zeros((1, 1, 1, n_C))
    # Pad A_prev and dA_prev
    A_prev_pad = zero_pad(A_prev, pad)
    dA_prev_pad = zero_pad(dA_prev, pad)
    for i in range(m):                       # loop over the training examples
        # select ith training example from A_prev_pad and dA_prev_pad
        a_prev_pad = A_prev_pad[i]
        da_prev_pad = dA_prev_pad[i]
        for h in range(n_H):                 # loop over vertical axis of the output volume
            for w in range(n_W):             # loop over horizontal axis of the output volume
                for c in range(n_C):         # loop over the channels of the output volume
                    # Find the corners of the current "slice"
                    vert_start = h * stride
                    vert_end = vert_start + f
                    horiz_start = w * stride
                    horiz_end = horiz_start + f
                    # Use the corners to define the slice from a_prev_pad
                    a_slice = a_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :]
                    # Update gradients for the window and the filter's parameters using the code formulas given above
                    da_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :] += W[:, :, :, c] * dZ[i, h, w, c]
                    dW[:, :, :, c] += a_slice * dZ[i, h, w, c]
                    db[:, :, :, c] += dZ[i, h, w, c]
        # Set the ith training example's dA_prev to the unpadded da_prev_pad (Hint: use X[pad:-pad, pad:-pad, :])
        dA_prev[i, :, :, :] = da_prev_pad[pad:-pad, pad:-pad, :]
    # YOUR CODE STARTS HERE
    # YOUR CODE ENDS HERE
    # Making sure your output shape is correct
    assert(dA_prev.shape == (m, n_H_prev, n_W_prev, n_C_prev))
    return dA_prev, dW, db
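Analysis:
- The assignment gives a lot of hints for this exercise. Uncommenting the template lines it provides and filling them in as hinted completes most of the function.
- The remaining parts are filled in by following the formulas explained earlier in the assignment.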
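One detail of the unpadding hint X[pad:-pad, pad:-pad, :] deserves a caution: when pad is 0, the slice 0:-0 is the same as 0:0 and selects nothing. The test in the assignment uses a non-zero pad, so it never hits this case. The sketch below shows the behaviour and one possible guard; the helper name unpad is invented for this illustration and is not part of the assignment.

```python
import numpy as np

def unpad(x_pad, pad):
    """Remove `pad` rows and columns from each side of an (H, W, C) array."""
    if pad == 0:
        return x_pad  # nothing to strip; x_pad[0:-0] would be empty
    return x_pad[pad:-pad, pad:-pad, :]

x_pad = np.ones((5, 5, 2))
print(x_pad[0:-0, 0:-0, :].shape)  # (0, 0, 2): the naive slice is empty
print(unpad(x_pad, 0).shape)       # (5, 5, 2)
print(unpad(x_pad, 1).shape)       # (3, 3, 2)
```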
6. create_mask_from_window
def create_mask_from_window(x):
    # (≈1 line)
    mask = (x == np.max(x))
    # YOUR CODE STARTS HERE
    # YOUR CODE ENDS HERE
    return mask
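Analysis:
- The key to this exercise is the hint in the assignment. The goal is to build a mask that marks the largest value in a matrix: the position of the maximum is True and every other position is False. The returned mask has the same shape as the argument x.
- The assignment suggests two tools. The first is numpy.max(), which computes the maximum of an array. The second is the expression A = (X == x). Since I was not familiar with it, I tried it out in PyCharm:
import numpy as np
# first create a 3x3 matrix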
x=np.array([[1,2,3],
[4,5,6],
[7,8,9]])
print(x)
print('----------')
y=(x==5)
print(y)
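Result:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
----------
[[False False False]
 [False  True False]
 [False False False]]
So A = (X == x) roughly means: for each entry of the matrix X, the corresponding entry of A is True if the entry equals x, and False otherwise.
- So for this exercise, the mask is obtained by comparing the matrix against its own maximum value in exactly this way.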
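One consequence of building the mask by comparison is worth seeing directly: if the maximum occurs more than once in the window, every occurrence is marked True. The sketch below uses two small made-up windows to show both cases. With random float inputs, as in the assignment's tests, ties are very unlikely.

```python
import numpy as np

def create_mask_from_window(x):
    """Boolean mask, same shape as x, that is True wherever x equals its maximum."""
    return x == np.max(x)

unique_max = np.array([[1, 3],
                       [4, 2]])
tied_max = np.array([[4, 3],
                     [4, 2]])

mask_unique = create_mask_from_window(unique_max)
mask_tied = create_mask_from_window(tied_max)
print(mask_unique)  # True only at position [1][0]
print(mask_tied)    # True at both [0][0] and [1][0]
```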
7. distribute_value
def distribute_value(dz, shape):
    # Retrieve dimensions from shape (≈1 line)
    (n_H, n_W) = shape
    # Compute the value to distribute on the matrix (≈1 line)
    average = dz / (n_H * n_W)
    # Create a matrix where every entry is the "average" value (≈1 line)
    a = np.ones(shape) * average
    return a
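Analysis:
- First, the assignment's explanation: in max pooling, each output value comes only from the maximum of its input window, but in average pooling every value in the input window contributes to the output. So average pooling needs its own helper for the backward pass.
- The function takes dz and shape: the value dz is to be spread evenly over a matrix of the given shape. So we first work out the number of entries in that matrix and divide dz by it, which gives average.
- The second step is to create a matrix of that shape in which every entry is average. I was stuck here for a long time because I assumed NumPy had a function that builds such a matrix directly from average and shape. I also did not understand the numpy.ones() hint at first; in the end it turned out that building the matrix with numpy.ones() and multiplying it by average is all that is needed.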
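NumPy does in fact provide a function that builds a constant-filled array in one call: np.full(shape, fill_value). The sketch below rewrites the helper with it as an alternative to the np.ones() approach from the hint; the input values are chosen only for illustration.

```python
import numpy as np

def distribute_value(dz, shape):
    """Spread the scalar dz evenly over a matrix of the given (n_H, n_W) shape."""
    n_H, n_W = shape
    return np.full(shape, dz / (n_H * n_W))  # every entry is the same share of dz

a = distribute_value(2, (2, 2))
print(a)        # [[0.5 0.5] [0.5 0.5]]
print(a.sum())  # 2.0
```

Both versions give the same matrix, and its entries always sum back to dz.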
8. pool_backward
np.random.seed(1)
# the input A_prev has shape (5, 5, 3, 2)
A_prev = np.random.randn(5, 5, 3, 2)
# stride = 1, f = 2
hparameters = {"stride" : 1, "f": 2}
# the pooling forward pass returns the output array (tensor) A; cache holds the input A_prev and hparameters
A, cache = pool_forward(A_prev, hparameters)
print(A.shape)  # prints (5, 4, 2, 2)
print(cache[0].shape)  # the first item in cache is the input array, shape (5, 5, 3, 2)
dA = np.random.randn(5, 4, 2, 2)  # dA has shape (5, 4, 2, 2), matching the forward-pass output
# the backward pass takes dA of shape (5, 4, 2, 2) plus the cached input and hyperparameters; mode is max
dA_prev1 = pool_backward(dA, cache, mode = "max")
print("mode = max")
print('mean of dA = ', np.mean(dA))
print('dA_prev1[1,1] = ', dA_prev1[1, 1])
print()
dA_prev2 = pool_backward(dA, cache, mode = "average")
print("mode = average")
print('mean of dA = ', np.mean(dA))
print('dA_prev2[1,1] = ', dA_prev2[1, 1])
assert type(dA_prev1) == np.ndarray, "Wrong type"
assert dA_prev1.shape == (5, 5, 3, 2), f"Wrong shape {dA_prev1.shape} != (5, 5, 3, 2)"
assert np.allclose(dA_prev1[1, 1], [[0, 0],
[ 5.05844394, -1.68282702],
[ 0, 0]]), "Wrong values for mode max"
assert np.allclose(dA_prev2[1, 1], [[0.08485462, 0.2787552],
[1.26461098, -0.25749373],
[1.17975636, -0.53624893]]), "Wrong values for mode average"
print("\033[92m All tests passed.")
Code block 2:
def pool_backward(dA, cache, mode = "max"):
    # dA has the same shape as the forward-pass output; cache holds the forward-pass input and hyperparameters
    # Retrieve information from cache (≈1 line)
    (A_prev, hparameters) = cache  # A_prev is the original input, shape (5, 5, 3, 2)
    # Retrieve hyperparameters from "hparameters" (≈2 lines)
    stride = hparameters['stride']  # stride and pooling window size
    f = hparameters['f']
    # Retrieve dimensions from A_prev's shape and dA's shape (≈2 lines)
    (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape  # input: number of examples, height, width, channels
    (m, n_H, n_W, n_C) = dA.shape  # the same four dimensions for the output
    # Initialize dA_prev with zeros (≈1 line)
    # dA_prev has shape (5, 5, 3, 2) and starts out all zeros
    dA_prev = np.zeros((m, n_H_prev, n_W_prev, n_C_prev))
    for i in range(m):                       # loop over the training examples
        # select training example from A_prev (≈1 line)
        a_prev = A_prev[i]  # work on the (i+1)th input example
        for h in range(n_H):                 # loop on the vertical axis
            for w in range(n_W):             # loop on the horizontal axis
                for c in range(n_C):         # loop over the channels (depth)
                    # Find the corners of the current "slice" (≈4 lines)
                    # these give the position of one window
                    vert_start = h * stride
                    vert_end = vert_start + f
                    horiz_start = w * stride
                    horiz_end = horiz_start + f
                    # Compute the backward propagation in both modes.
                    if mode == "max":
                        # Use the corners and "c" to define the current slice from a_prev (≈1 line)
                        # a_prev_slice is the matrix of input values inside one window
                        a_prev_slice = a_prev[vert_start:vert_end, horiz_start:horiz_end, c]
                        # Create the mask from a_prev_slice (≈1 line)
                        # create_mask_from_window() marks where the window's maximum is; the mask has the shape of a_prev_slice
                        mask = create_mask_from_window(a_prev_slice)
                        # Set dA_prev to be dA_prev + (the mask multiplied by the correct entry of dA) (≈1 line)
                        # add dA[i, h, w, c] to dA_prev at the position of the window's maximum; other positions get 0
                        dA_prev[i, vert_start: vert_end, horiz_start: horiz_end, c] += np.multiply(mask, dA[i, h, w, c])
                    elif mode == "average":
                        # Get the value da from dA (≈1 line)
                        da = dA[i, h, w, c]
                        # Define the shape of the filter as fxf (≈1 line)
                        shape = (f, f)
                        # Distribute it to get the correct slice of dA_prev. i.e. Add the distributed value of da. (≈1 line)
                        dA_prev[i, vert_start: vert_end, horiz_start: horiz_end, c] += distribute_value(da, shape)
    # YOUR CODE STARTS HERE
    # YOUR CODE ENDS HERE
    # Making sure your output shape is correct
    assert(dA_prev.shape == A_prev.shape)
    return dA_prev
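Analysis:
- The comments in the code explain each step in detail. The one thing I was unsure about was np.multiply(mask, dA[i,h,w,c]), so I checked it in PyCharm:
import numpy as np
a = np.array([[0, 0],
              [0, 4]])
# convert a to a boolean array in which only position [1][1] is True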
a=np.array(a,dtype=bool)
print(a)
b=np.multiply(a,7)
print(b)
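Result:
[[False False]
 [False  True]]
[[0 0]
 [0 7]]
In other words, the upstream gradient value is written into dA_prev at the position of the window's maximum, and dA_prev has the same shape as the original input.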
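A quick sanity check for both modes is that the backward pass only redistributes gradient and never creates or destroys it: the entries of dA_prev should sum to the same total as dA. The sketch below checks this on a single 2D channel. The function name pool_backward_2d and the random test shapes are invented for this illustration, and the max-mode check assumes no ties inside a window, which holds for random floats in practice.

```python
import numpy as np

def pool_backward_2d(dA, a_prev, f, stride, mode="max"):
    """Backward pass of pooling for one 2D channel; returns dA_prev shaped like a_prev."""
    dA_prev = np.zeros_like(a_prev, dtype=float)
    n_H, n_W = dA.shape
    for h in range(n_H):
        for w in range(n_W):
            vs, hs = h * stride, w * stride
            window = a_prev[vs:vs + f, hs:hs + f]
            if mode == "max":
                mask = window == np.max(window)  # True at the window's maximum
                dA_prev[vs:vs + f, hs:hs + f] += mask * dA[h, w]
            else:
                # average mode: every position gets an equal share of the gradient
                dA_prev[vs:vs + f, hs:hs + f] += np.full((f, f), dA[h, w] / (f * f))
    return dA_prev

rng = np.random.default_rng(0)
a_prev = rng.standard_normal((4, 3))
dA = rng.standard_normal((3, 2))  # forward output shape for f=2, stride=1

d_max = pool_backward_2d(dA, a_prev, f=2, stride=1, mode="max")
d_avg = pool_backward_2d(dA, a_prev, f=2, stride=1, mode="average")
print(np.isclose(d_max.sum(), dA.sum()), np.isclose(d_avg.sum(), dA.sum()))
```

Overlapping windows (stride smaller than f) are the reason the updates use += instead of plain assignment: one input position can receive gradient from several output positions.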