Andrew Ng deeplearning.ai, Lesson 4 Week 1
Review: points I had missed about convolutional networks
1. Each max-pooling operation acts on a single channel, so the number of channels is unchanged after max pooling (see the small sketch after point 2).
2. Convolutional layers have biases: one bias per filter.
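To illustrate point 1, here is a minimal sketch of my own (not the course's pool_forward): 2×2 max pooling with stride 2, applied one channel at a time.
import numpy as np

def maxpool_2x2(A):
    """Max-pool a (n_H, n_W, n_C) volume with a 2x2 window and stride 2, one channel at a time."""
    n_H, n_W, n_C = A.shape
    out = np.zeros((n_H // 2, n_W // 2, n_C))
    for h in range(n_H // 2):
        for w in range(n_W // 2):
            for c in range(n_C):   # each max is taken within a single channel
                out[h, w, c] = np.max(A[2 * h:2 * h + 2, 2 * w:2 * w + 2, c])
    return out

A = np.random.randn(4, 4, 3)
print(maxpool_2x2(A).shape)   # (2, 2, 3): the channel count is unchanged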
Sample code for point 2 (note the bias b):
import numpy as np

def conv_single_step(a_slice_prev, W, b):
    """
    Apply one filter defined by parameters W on a single slice (a_slice_prev) of the output activation
    of the previous layer.

    Arguments:
    a_slice_prev -- slice of input data of shape (f, f, n_C_prev)
    W -- Weight parameters contained in a window - matrix of shape (f, f, n_C_prev)
    b -- Bias parameters contained in a window - matrix of shape (1, 1, 1)

    Returns:
    Z -- a scalar value, result of convolving the sliding window (W, b) on a slice x of the input data
    """

    ### START CODE HERE ### (≈ 2 lines of code)
    # Element-wise product between a_slice_prev and W. Do not add the bias yet.
    s = a_slice_prev * W
    # Sum over all entries of the volume s.
    Z = np.sum(np.reshape(s, (s.size,)))
    # Add bias b to Z. Index into b and cast to float so that Z is a scalar value.
    Z = Z + float(b[0, 0, 0])
    ### END CODE HERE ###

    return Z
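A quick usage check for conv_single_step (my own sketch with made-up shapes):
import numpy as np

np.random.seed(1)
a_slice_prev = np.random.randn(4, 4, 3)   # one (f, f, n_C_prev) window of the input
W = np.random.randn(4, 4, 3)              # one filter
b = np.random.randn(1, 1, 1)              # that filter's single bias

Z = conv_single_step(a_slice_prev, W, b)
print(Z)    # a single scalar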
3. A quiz question I got wrong (it also confirms that convolutional layers have biases)
Question 3
Suppose your input is a 300×300 color (RGB) image, and you use a convolutional layer with 100 filters, each of size 5×5. How many parameters does this hidden layer have (including the bias parameters)?
【 】 2501
【 】 2600
【 】 7500
【★】 7600
Note: see video 1.7 (One Layer of a Convolutional Network), around 05:10. First, the number of parameters has nothing to do with the size of the input image; no matter how many pixels the image has, the parameter count does not change. In this question it depends only on the filters. Here is the calculation: a single filter slice is 5×5, and since the input is an RGB image the number of channels is n_C = 3, so one complete filter has 5 × 5 × n_C = 5 × 5 × 3 weights. Each complete filter has exactly one bias parameter b, so each complete filter has 5 × 5 × 3 + 1 = 76 parameters. This question uses 100 filters, so the hidden layer contains 76 × 100 = 7600 parameters.
Source: https://blog.csdn.net/u013733326/article/details/80046299
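To double-check the arithmetic, here is a small sketch of my own: it applies the formula (f × f × n_C_prev + 1) × n_C and, assuming TensorFlow 2.x is available (not part of this exercise), cross-checks the count against a tf.keras Conv2D layer.
# Parameter count by the formula
f, n_C_prev, n_C = 5, 3, 100
n_params = (f * f * n_C_prev + 1) * n_C   # +1 is the per-filter bias
print(n_params)                           # 7600

# Optional cross-check, assuming a TensorFlow 2.x environment
import tensorflow as tf
layer = tf.keras.layers.Conv2D(filters=n_C, kernel_size=f)
layer.build((None, 300, 300, n_C_prev))   # the 300x300 input size does not affect the count
print(layer.count_params())               # 7600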
Python matrix operations
NumPy: summing a multi-dimensional array
Summing all the elements of a multi-dimensional array:
The approach used here is to first reshape the array into a 1-D vector and then sum it (np.sum would also sum every entry of a multi-dimensional array directly, so the reshape is optional).
#Element-wise product between a_slice and W. Do not add the bias yet.
s = a_slice_prev * W
# Sum over all entries of the volume s.
Z = np.sum(np.reshape(s,(s.size,)))
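A tiny check of my own (not from the notebook) that reshape-then-sum and calling np.sum directly on the volume give the same result:
import numpy as np

s = np.random.randn(3, 3, 4)                      # e.g. an (f, f, n_C_prev) volume
total_reshape = np.sum(np.reshape(s, (s.size,)))  # flatten, then sum
total_direct = np.sum(s)                          # np.sum already sums every entry
print(np.isclose(total_reshape, total_direct))    # True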
TensorFlow: transpose & mean
logits = tf.transpose(Z3)   # tf.transpose: transpose the tensor
labels = tf.transpose(Y)    # tf.transpose: transpose the tensor
# Calculate accuracy on the test set
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
# tf.cast converts the data type; here the comparison result is cast from bool/int to float so it can be averaged
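For context, a minimal sketch of how these lines fit together. This is my own illustration: it assumes a TensorFlow 2.x eager environment rather than the course's TF1 sessions, dummy values for Z3 and Y, and that correct_prediction comes from comparing the argmax of logits and labels (the notebook's approach).
import numpy as np
import tensorflow as tf

m, n_y = 8, 6
Z3 = tf.constant(np.random.randn(n_y, m), dtype=tf.float32)            # dummy logits, shape (n_y, m)
Y = tf.transpose(tf.one_hot(np.random.randint(0, n_y, size=m), n_y))   # dummy one-hot labels, shape (n_y, m)

logits = tf.transpose(Z3)   # (m, n_y): examples along the first axis
labels = tf.transpose(Y)    # (m, n_y)

# Assumed: a prediction is correct when the argmax of the logits matches the argmax of the label
correct_prediction = tf.equal(tf.argmax(logits, axis=1), tf.argmax(labels, axis=1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))   # bool -> float, then average
print(float(accuracy))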
Forward propagation
I ran into quite a few problems here, so let me walk through them.
def conv_forward(A_prev, W, b, hparameters):
    """
    Implements the forward propagation for a convolution function

    Arguments:
    A_prev -- output activations of the previous layer, numpy array of shape (m, n_H_prev, n_W_prev, n_C_prev)
    W -- Weights, numpy array of shape (f, f, n_C_prev, n_C)
    b -- Biases, numpy array of shape (1, 1, 1, n_C)
    hparameters -- python dictionary containing "stride" and "pad"

    Returns:
    Z -- conv output, numpy array of shape (m, n_H, n_W, n_C)
    cache -- cache of values needed for the conv_backward() function
    """

    ### START CODE HERE ###
    # Retrieve dimensions from A_prev's shape (≈1 line)
    (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape

    # Retrieve dimensions from W's shape (≈1 line)
    (f, f, n_C_prev, n_C) = W.shape

    # Retrieve information from "hparameters" (≈2 lines)
    stride = hparameters["stride"]
    pad = hparameters["pad"]

    # Compute the dimensions of the CONV output volume using the formula given above. Hint: use int() to floor. (≈2 lines)
    n_H = int((n_H_prev - f + 2 * pad) / stride + 1)   # output size formula
    n_W = int((n_W_prev - f + 2 * pad) / stride + 1)
    #print(m, n_H, n_W, n_C)

    # Initialize the output volume Z with zeros. (≈1 line)
    Z = np.zeros((m, n_H, n_W, n_C))

    # Create A_prev_pad by padding A_prev (zero_pad is the notebook's helper, defined earlier)
    A_prev_pad = zero_pad(A_prev, pad)

    for i in range(m):                      # loop over the batch of training examples
        # Select ith training example's padded activation
        a_prev_pad = A_prev_pad[i, :, :, :]
        #print(a_prev_pad.shape)
        for h in range(n_H):                # loop over vertical axis of the output volume
            for w in range(n_W):            # loop over horizontal axis of the output volume
                for c in range(n_C):        # loop over channels (= #filters) of the output volume
                    # Find the corners of the current "slice" (≈4 lines)
                    vert_start = h * stride
                    vert_end = h * stride + f
                    horiz_start = w * stride
                    horiz_end = w * stride + f
                    #print(vert_start, vert_end, horiz_start, horiz_end)
                    # Use the corners to define the (3D) slice of a_prev_pad (See Hint above the cell). (≈1 line)
                    a_slice_prev = a_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :]
                    #print(a_slice_prev.shape, W[:, :, :, c].shape)
                    # Convolve the (3D) slice with the correct filter W and bias b, to get back one output neuron. (≈1 line)
                    Z[i, h, w, c] = np.sum(np.reshape(a_slice_prev * W[:, :, :, c], (a_slice_prev.size,))) + float(b[0, 0, 0, c])
    ### END CODE HERE ###

    # Making sure your output shape is correct
    assert(Z.shape == (m, n_H, n_W, n_C))

    # Save information in "cache" for the backprop
    cache = (A_prev, W, b, hparameters)

    return Z, cache
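A quick way to exercise conv_forward, in the style of the notebook's test cell. This is my own sketch; zero_pad is assumed to be the notebook's padding helper and is reproduced here so the snippet is self-contained.
import numpy as np

# Assumed helper (matches the notebook's zero_pad): pad height and width with zeros
def zero_pad(X, pad):
    return np.pad(X, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode="constant", constant_values=0)

np.random.seed(1)
A_prev = np.random.randn(10, 4, 4, 3)    # m=10 examples, 4x4 feature maps, 3 channels
W = np.random.randn(2, 2, 3, 8)          # 8 filters of size 2x2x3
b = np.random.randn(1, 1, 1, 8)          # one bias per filter
hparameters = {"pad": 2, "stride": 2}

Z, cache = conv_forward(A_prev, W, b, hparameters)
print(Z.shape)    # (10, 4, 4, 8): n_H = n_W = (4 - 2 + 2*2)/2 + 1 = 4
print(Z.mean())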
n_C_prev vs. n_C
(f, f, n_C_prev, n_C) = W.shape
This is worth spelling out, since it helped my understanding. The roles of n_C_prev and n_C are distinct:
n_C_prev: the number of channels output by the previous layer (i.e., the channel count of the feature map being processed)
n_C: the number of output channels (i.e., the number of filters)
- Each filter is applied by multiplying an (f, f, n_C_prev) filter element-wise with an (f, f, n_C_prev) region of the feature map and summing the products.
- There are n_C filters in total (see the shape check below).
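A small shape check of my own, illustrating the relation between W, a single filter, and a single input slice:
import numpy as np

f, n_C_prev, n_C = 3, 4, 8
W = np.random.randn(f, f, n_C_prev, n_C)        # all n_C filters stacked along the last axis
a_slice_prev = np.random.randn(f, f, n_C_prev)  # one window of the (padded) input feature map

c = 0                                           # pick one filter
print(W[:, :, :, c].shape)                      # (3, 3, 4): a single filter spans all n_C_prev channels
print((a_slice_prev * W[:, :, :, c]).shape)     # (3, 3, 4): element-wise product over the whole volume
print(np.sum(a_slice_prev * W[:, :, :, c]))     # one scalar -> one output value for channel c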
Computing Z
a_slice_prev = a_prev_pad[vert_start : vert_end , horiz_start : horiz_end , :]
Z[i, h, w, c] = np.sum(np.reshape(a_slice_prev * W[:, :, :, c], (a_slice_prev.size,))) + float(b[0, 0, 0, c])
- Do not forget the bias b!! Add the bias after the sum.
- Be clear about what h, w, and c index: the last index c says which filter is being applied, but inside each filter's loop iteration you still multiply-and-accumulate over a whole (f, f, n_C_prev) region. This is exactly where the difference between n_C_prev and n_C shows up: every filter takes in all channels of the feature map, multiplying and summing across them.
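To make the last point concrete, a tiny numeric illustration of my own (made-up shapes and bias value): summing channel by channel gives the same scalar as summing the whole volume at once, which shows that each filter folds all n_C_prev input channels into a single output value.
import numpy as np

f, n_C_prev = 2, 3
a_slice_prev = np.random.randn(f, f, n_C_prev)
w_c = np.random.randn(f, f, n_C_prev)          # the c-th filter, W[:, :, :, c]
b_c = 0.5                                      # the c-th bias, float(b[0, 0, 0, c])

per_channel = [np.sum(a_slice_prev[:, :, k] * w_c[:, :, k]) for k in range(n_C_prev)]
z_value = sum(per_channel) + b_c
print(np.isclose(z_value, np.sum(a_slice_prev * w_c) + b_c))   # True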