Andrew Ng deeplearning.ai, Lesson 4 Week 1
Review: points I had missed about convolutional networks
1. Each max-pooling operation acts on a single channel, so the number of channels is unchanged after max pooling (see the small sketch after point 2).
2. Convolutional layers have biases: one bias per filter.
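To illustrate point 1, here is a minimal sketch of my own (not the course's pool_forward): 2×2 max pooling with stride 2, applied one channel at a time.
import numpy as np

def maxpool_2x2(A):
    """Max-pool a (n_H, n_W, n_C) volume with a 2x2 window and stride 2, one channel at a time."""
    n_H, n_W, n_C = A.shape
    out = np.zeros((n_H // 2, n_W // 2, n_C))
    for h in range(n_H // 2):
        for w in range(n_W // 2):
            for c in range(n_C):   # each max is taken within a single channel
                out[h, w, c] = np.max(A[2 * h:2 * h + 2, 2 * w:2 * w + 2, c])
    return out

A = np.random.randn(4, 4, 3)
print(maxpool_2x2(A).shape)   # (2, 2, 3): the channel count is unchanged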
Sample code for point 2 (note the bias b):
import numpy as np

def conv_single_step(a_slice_prev, W, b):
    """
    Apply one filter defined by parameters W on a single slice (a_slice_prev) of the output activation
    of the previous layer.

    Arguments:
    a_slice_prev -- slice of input data of shape (f, f, n_C_prev)
    W -- Weight parameters contained in a window - matrix of shape (f, f, n_C_prev)
    b -- Bias parameters contained in a window - matrix of shape (1, 1, 1)

    Returns:
    Z -- a scalar value, result of convolving the sliding window (W, b) on a slice x of the input data
    """

    ### START CODE HERE ### (≈ 2 lines of code)
    # Element-wise product between a_slice_prev and W. Do not add the bias yet.
    s = a_slice_prev * W
    # Sum over all entries of the volume s.
    Z = np.sum(np.reshape(s, (s.size,)))
    # Add bias b to Z. Index into b and cast to float so that Z is a scalar value.
    Z = Z + float(b[0, 0, 0])
    ### END CODE HERE ###

    return Z
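A quick usage check for conv_single_step (my own sketch with made-up shapes):
import numpy as np

np.random.seed(1)
a_slice_prev = np.random.randn(4, 4, 3)   # one (f, f, n_C_prev) window of the input
W = np.random.randn(4, 4, 3)              # one filter
b = np.random.randn(1, 1, 1)              # that filter's single bias

Z = conv_single_step(a_slice_prev, W, b)
print(Z)    # a single scalar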
3. A quiz question I got wrong (it also confirms that convolutional layers have biases)
Question 3
Suppose your input is a 300×300 color (RGB) image, and you use a convolutional layer with 100 filters, each of size 5×5. How many parameters does this hidden layer have (including the bias parameters)?
【 】 2501
【 】 2600
【 】 7500
【★】 7600
Note: see video 1.7 (One Layer of a Convolutional Network), around 05:10. First, the number of parameters has nothing to do with the size of the input image; no matter how many pixels the image has, the parameter count does not change. In this question it depends only on the filters. Here is the calculation: a single filter slice is 5×5, and since the input is an RGB image the number of channels is n_C = 3, so one complete filter has 5 × 5 × n_C = 5 × 5 × 3 weights. Each complete filter has exactly one bias parameter b, so each complete filter has 5 × 5 × 3 + 1 = 76 parameters. This question uses 100 filters, so the hidden layer contains 76 × 100 = 7600 parameters.
Source: https://blog.csdn.net/u013733326/article/details/80046299
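To double-check the arithmetic, here is a small sketch of my own: it applies the formula (f × f × n_C_prev + 1) × n_C and, assuming TensorFlow 2.x is available (not part of this exercise), cross-checks the count against a tf.keras Conv2D layer.
# Parameter count by the formula
f, n_C_prev, n_C = 5, 3, 100
n_params = (f * f * n_C_prev + 1) * n_C   # +1 is the per-filter bias
print(n_params)                           # 7600

# Optional cross-check, assuming a TensorFlow 2.x environment
import tensorflow as tf
layer = tf.keras.layers.Conv2D(filters=n_C, kernel_size=f)
layer.build((None, 300, 300, n_C_prev))   # the 300x300 input size does not affect the count
print(layer.count_params())               # 7600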
Python matrix operations
NumPy: summing a multi-dimensional array
Summing all the elements of a multi-dimensional array:
The approach used here is to first reshape the array into a 1-D vector and then sum it (np.sum would also sum every entry of a multi-dimensional array directly, so the reshape is optional).
#Element-wise product between a_slice and W. Do not add the bias yet.
s = a_slice_prev * W
# Sum over all entries of the volume s.
Z = np.sum(np.reshape(s,(s.size,)))
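A tiny check of my own (not from the notebook) that reshape-then-sum and calling np.sum directly on the volume give the same result:
import numpy as np

s = np.random.randn(3, 3, 4)                      # e.g. an (f, f, n_C_prev) volume
total_reshape = np.sum(np.reshape(s, (s.size,)))  # flatten, then sum
total_direct = np.sum(s)                          # np.sum already sums every entry
print(np.isclose(total_reshape, total_direct))    # True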
TensorFlow: transpose & mean
logits = tf.transpose(Z3)   # tf.transpose: transpose the tensor
labels = tf.transpose(Y)    # tf.transpose: transpose the tensor
# Calculate accuracy on the test set
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
# tf.cast converts the data type; here the comparison result is cast from bool/int to float so it can be averaged
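For context, a minimal sketch of how these lines fit together. This is my own illustration: it assumes a TensorFlow 2.x eager environment rather than the course's TF1 sessions, dummy values for Z3 and Y, and that correct_prediction comes from comparing the argmax of logits and labels (the notebook's approach).
import numpy as np
import tensorflow as tf

m, n_y = 8, 6
Z3 = tf.constant(np.random.randn(n_y, m), dtype=tf.float32)            # dummy logits, shape (n_y, m)
Y = tf.transpose(tf.one_hot(np.random.randint(0, n_y, size=m), n_y))   # dummy one-hot labels, shape (n_y, m)

logits = tf.transpose(Z3)   # (m, n_y): examples along the first axis
labels = tf.transpose(Y)    # (m, n_y)

# Assumed: a prediction is correct when the argmax of the logits matches the argmax of the label
correct_prediction = tf.equal(tf.argmax(logits, axis=1), tf.argmax(labels, axis=1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))   # bool -> float, then average
print(float(accuracy))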
Forward propagation
I ran into quite a few problems here, so let me walk through them.
def conv_forward(A_prev, W, b, hparameters):
    """
    Implements the forward propagation for a convolution function

    Arguments:
    A_prev -- output activations of the previous layer, numpy array of shape (m, n_H_prev, n_W_prev, n_C_prev)
    W -- Weights, numpy array of shape (f, f, n_C_prev, n_C)
    b -- Biases, numpy array of shape (1, 1, 1, n_C)
    hparameters -- python dictionary containing "stride" and "pad"

    Returns:
    Z -- conv output, numpy array of shape (m, n_H, n_W, n_C)
    cache -- cache of values needed for the conv_backward() function
    """

    ### START CODE HERE ###
    # Retrieve dimensions from A_prev's shape (≈1 line)
    (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape

    # Retrieve dimensions from W's shape (≈1 line)
    (f, f, n_C_prev, n_C) = W.shape

    # Retrieve information from "hparameters" (≈2 lines)
    stride = hparameters["stride"]
    pad = hparameters["pad"]

    # Compute the dimensions of the CONV output volume using the formula given above. Hint: use int() to floor. (≈2 lines)
    n_H = int((n_H_prev - f + 2 * pad) / stride + 1)   # output size formula
    n_W = int((n_W_prev - f + 2 * pad) / stride + 1)
    #print(m, n_H, n_W, n_C)

    # Initialize the output volume Z with zeros. (≈1 line)
    Z = np.zeros((m, n_H, n_W, n_C))

    # Create A_prev_pad by padding A_prev (zero_pad is the notebook's helper, defined earlier)
    A_prev_pad = zero_pad(A_prev, pad)

    for i in range(m):                      # loop over the batch of training examples
        # Select ith training example's padded activation
        a_prev_pad = A_prev_pad[i, :, :, :]
        #print(a_prev_pad.shape)
        for h in range(n_H):                # loop over vertical axis of the output volume
            for w in range(n_W):            # loop over horizontal axis of the output volume
                for c in range(n_C):        # loop over channels (= #filters) of the output volume
                    # Find the corners of the current "slice" (≈4 lines)
                    vert_start = h * stride
                    vert_end = h * stride + f
                    horiz_start = w * stride
                    horiz_end = w * stride + f
                    #print(vert_start, vert_end, horiz_start, horiz_end)
                    # Use the corners to define the (3D) slice of a_prev_pad (See Hint above the cell). (≈1 line)
                    a_slice_prev = a_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :]
                    #print(a_slice_prev.shape, W[:, :, :, c].shape)
                    # Convolve the (3D) slice with the correct filter W and bias b, to get back one output neuron. (≈1 line)
                    Z[i, h, w, c] = np.sum(np.reshape(a_slice_prev * W[:, :, :, c], (a_slice_prev.size,))) + float(b[0, 0, 0, c])
    ### END CODE HERE ###

    # Making sure your output shape is correct
    assert(Z.shape == (m, n_H, n_W, n_C))

    # Save information in "cache" for the backprop
    cache = (A_prev, W, b, hparameters)

    return Z, cache
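A quick way to exercise conv_forward, in the style of the notebook's test cell. This is my own sketch; zero_pad is assumed to be the notebook's padding helper and is reproduced here so the snippet is self-contained.
import numpy as np

# Assumed helper (matches the notebook's zero_pad): pad height and width with zeros
def zero_pad(X, pad):
    return np.pad(X, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode="constant", constant_values=0)

np.random.seed(1)
A_prev = np.random.randn(10, 4, 4, 3)    # m=10 examples, 4x4 feature maps, 3 channels
W = np.random.randn(2, 2, 3, 8)          # 8 filters of size 2x2x3
b = np.random.randn(1, 1, 1, 8)          # one bias per filter
hparameters = {"pad": 2, "stride": 2}

Z, cache = conv_forward(A_prev, W, b, hparameters)
print(Z.shape)    # (10, 4, 4, 8): n_H = n_W = (4 - 2 + 2*2)/2 + 1 = 4
print(Z.mean())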
n_C_prev vs. n_C
(f, f, n_C_prev, n_C) = W.shape
This is worth spelling out, since it helped my understanding. The roles of n_C_prev and n_C are distinct:
n_C_prev: the number of channels output by the previous layer (i.e., the channel count of the feature map being processed)
n_C: the number of output channels (i.e., the number of filters)
- Each filter is applied by multiplying an (f, f, n_C_prev) filter element-wise with an (f, f, n_C_prev) region of the feature map and summing the products.
- There are n_C filters in total (see the shape check below).
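A small shape check of my own, illustrating the relation between W, a single filter, and a single input slice:
import numpy as np

f, n_C_prev, n_C = 3, 4, 8
W = np.random.randn(f, f, n_C_prev, n_C)        # all n_C filters stacked along the last axis
a_slice_prev = np.random.randn(f, f, n_C_prev)  # one window of the (padded) input feature map

c = 0                                           # pick one filter
print(W[:, :, :, c].shape)                      # (3, 3, 4): a single filter spans all n_C_prev channels
print((a_slice_prev * W[:, :, :, c]).shape)     # (3, 3, 4): element-wise product over the whole volume
print(np.sum(a_slice_prev * W[:, :, :, c]))     # one scalar -> one output value for channel c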
Computing Z
a_slice_prev = a_prev_pad[vert_start : vert_end , horiz_start : horiz_end , :]
Z[i, h, w, c] = np.sum(np.reshape(a_slice_prev * W[:, :, :, c], (a_slice_prev.size,))) + float(b[0, 0, 0, c])
- Do not forget the bias b!! Add the bias after the sum.
- Be clear about what h, w, and c index: the last index c says which filter is being applied, but inside each filter's loop iteration you still multiply-and-accumulate over a whole (f, f, n_C_prev) region. This is exactly where the difference between n_C_prev and n_C shows up: every filter takes in all channels of the feature map, multiplying and summing across them.
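To make the last point concrete, a tiny numeric illustration of my own (made-up shapes and bias value): summing channel by channel gives the same scalar as summing the whole volume at once, which shows that each filter folds all n_C_prev input channels into a single output value.
import numpy as np

f, n_C_prev = 2, 3
a_slice_prev = np.random.randn(f, f, n_C_prev)
w_c = np.random.randn(f, f, n_C_prev)          # the c-th filter, W[:, :, :, c]
b_c = 0.5                                      # the c-th bias, float(b[0, 0, 0, c])

per_channel = [np.sum(a_slice_prev[:, :, k] * w_c[:, :, k]) for k in range(n_C_prev)]
z_value = sum(per_channel) + b_c
print(np.isclose(z_value, np.sum(a_slice_prev * w_c) + b_c))   # True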