Andrew Ng deeplearning Lesson 4 Week 1: Implementing a Convolutional Network in NumPy

Review: points about convolutional networks I had missed

1. Each max-pooling operation acts on a single channel, so the channel count is unchanged after max pooling (see the sketch after this list).
2. Convolutional layers have biases: one bias per filter, not one per layer.
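
A minimal max-pooling forward pass illustrating point 1 (my own sketch, not the assignment's code; assumes a square f×f window):

import numpy as np

def max_pool_forward(A_prev, f, stride):
    # A_prev: (m, n_H_prev, n_W_prev, n_C); the channel axis passes through untouched
    m, n_H_prev, n_W_prev, n_C = A_prev.shape
    n_H = (n_H_prev - f) // stride + 1
    n_W = (n_W_prev - f) // stride + 1
    A = np.zeros((m, n_H, n_W, n_C))
    for i in range(m):
        for h in range(n_H):
            for w in range(n_W):
                for c in range(n_C):   # each max is taken inside a single channel
                    A[i, h, w, c] = np.max(A_prev[i,
                                                  h*stride : h*stride + f,
                                                  w*stride : w*stride + f,
                                                  c])
    return A

print(max_pool_forward(np.random.randn(2, 4, 4, 3), f=2, stride=2).shape)
# (2, 2, 2, 3) -- 3 channels in, 3 channels out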
Point 2 is visible in the conv_single_step code:


import numpy as np

def conv_single_step(a_slice_prev, W, b):
    """
    Apply one filter defined by parameters W on a single slice (a_slice_prev) of the output activation 
    of the previous layer.
    
    Arguments:
    a_slice_prev -- slice of input data of shape (f, f, n_C_prev)
    W -- Weight parameters contained in a window - matrix of shape (f, f, n_C_prev)
    b -- Bias parameters contained in a window - matrix of shape (1, 1, 1)
    
    Returns:
    Z -- a scalar value, result of convolving the sliding window (W, b) on a slice x of the input data
    """

    ### START CODE HERE ### (≈ 2 lines of code)
    # Element-wise product between a_slice and W. Do not add the bias yet.
    s = a_slice_prev * W
    # Sum over all entries of the volume s.
    Z = np.sum(np.reshape(s, (s.size,)))
    # Add bias b to Z. Cast b to a float() so that Z results in a scalar value.
    Z = Z + float(b)
    ### END CODE HERE ###

    return Z
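
A quick sanity check (random values; shapes chosen to match a 4×4×3 slice):

np.random.seed(1)
a_slice_prev = np.random.randn(4, 4, 3)
W = np.random.randn(4, 4, 3)
b = np.random.randn(1, 1, 1)
print(conv_single_step(a_slice_prev, W, b))  # a single float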

3. A quiz question I got wrong (it also confirms that convolutional layers have biases)
Question 3
Suppose your input is a 300×300 color (RGB) image and you use a convolutional layer with 100 filters, each 5×5. How many parameters does this hidden layer have (including bias parameters)?

【 】 2501

【 】 2600

【 】 7500

【★】 7600

Note: video [1.7 One Layer of a Convolutional Network], at 05:10. First, the parameter count has nothing to do with the input image size; however large the image is, the number of parameters stays the same. In this problem it depends only on the filters. The computation: one filter slice is 5×5, and since the input is an RGB image the channel count is n_c = 3, so a complete filter has shape 5×5×n_c = 5×5×3. Each complete filter has exactly one bias parameter b, so each filter carries 5×5×3 + 1 = 76 parameters. With 100 filters, the hidden layer has 76×100 = 7600 parameters.

Source: https://blog.csdn.net/u013733326/article/details/80046299
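
The same arithmetic as a quick check:

params_per_filter = 5 * 5 * 3 + 1    # 5x5 weights per input channel, 3 channels, plus 1 bias
print(params_per_filter * 100)       # 7600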

Python matrix operations

NumPy: summing a multi-dimensional array

Summing all elements of a multi-dimensional array: the trick is to flatten the array to 1-D first, then sum.

    # Element-wise product between a_slice and W. Do not add the bias yet.
    s = a_slice_prev * W
    # Sum over all entries of the volume s.
    Z = np.sum(np.reshape(s, (s.size,)))
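
The reshape is actually optional: np.sum already reduces over all axes by default, so the following are equivalent:

import numpy as np

s = np.random.randn(3, 3, 4)
z1 = np.sum(np.reshape(s, (s.size,)))   # flatten first, then sum
z2 = np.sum(s)                          # np.sum reduces over all axes by default
z3 = s.sum()                            # same thing as a method
assert np.isclose(z1, z2) and np.isclose(z1, z3)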

TensorFlow: transpose & mean

logits = tf.transpose(Z3)  # tf.transpose transposes the tensor
labels = tf.transpose(Y)

# Calculate accuracy on the test set
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
# tf.cast converts the dtype; here the boolean correct_prediction is cast to float
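
correct_prediction itself is not shown above; in the notebook it compares predicted and true class indices. A self-contained sketch under that assumption (TF1-style API; under TF2 you would go through tf.compat.v1):

import numpy as np
import tensorflow as tf   # TF1-style API

Z3 = tf.constant(np.random.randn(6, 4).astype(np.float32))   # (n_classes, m) layout as in the course
Y = tf.constant(np.eye(6, 4).astype(np.float32))              # one-hot labels, same layout

logits = tf.transpose(Z3)                                     # -> (m, n_classes)
labels = tf.transpose(Y)
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

with tf.Session() as sess:
    print(sess.run(accuracy))                                 # fraction of correct predictions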

Forward propagation

I ran into quite a bit of trouble here, so let me walk through it.
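
conv_forward below relies on zero_pad, a helper written earlier in the notebook. A minimal version consistent with how it is used here:

import numpy as np

def zero_pad(X, pad):
    # Pad only the height and width axes of a batch shaped (m, n_H, n_W, n_C) with zeros.
    return np.pad(X, ((0, 0), (pad, pad), (pad, pad), (0, 0)),
                  mode='constant', constant_values=(0, 0))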

def conv_forward(A_prev, W, b, hparameters):
    """
    Implements the forward propagation for a convolution function
    
    Arguments:
    A_prev -- output activations of the previous layer, numpy array of shape (m, n_H_prev, n_W_prev, n_C_prev)
    W -- Weights, numpy array of shape (f, f, n_C_prev, n_C)
    b -- Biases, numpy array of shape (1, 1, 1, n_C)
    hparameters -- python dictionary containing "stride" and "pad"
        
    Returns:
    Z -- conv output, numpy array of shape (m, n_H, n_W, n_C)
    cache -- cache of values needed for the conv_backward() function
    """
    
    ### START CODE HERE ###
    # Retrieve dimensions from A_prev's shape (≈1 line)  
    (m, n_H_prev, n_W_prev, n_C_prev) = A_prev.shape
    
    # Retrieve dimensions from W's shape (≈1 line)
    (f, f, n_C_prev, n_C) = W.shape
    
    # Retrieve information from "hparameters" (≈2 lines)
    stride = hparameters["stride"]
    pad = hparameters["pad"]
    
    # Compute the dimensions of the CONV output volume using the formula given above. Hint: use int() to floor. (≈2 lines)
    n_H = int((n_H_prev - f + 2*pad) / stride) + 1   # output-size formula from the lecture
    n_W = int((n_W_prev - f + 2*pad) / stride) + 1
    # Initialize the output volume Z with zeros. (≈1 line)
    Z = np.zeros((m,n_H,n_W,n_C))
    
    # Create A_prev_pad by padding A_prev
    A_prev_pad = zero_pad(A_prev, pad)

    for i in range(m):                               # loop over the batch of training examples
        # Select the ith training example's padded activation
        a_prev_pad = A_prev_pad[i]
        for h in range(n_H):                           # loop over vertical axis of the output volume
            for w in range(n_W):                       # loop over horizontal axis of the output volume
                for c in range(n_C):                   # loop over channels (= #filters) of the output volume
                    
                    # Find the corners of the current "slice" (≈4 lines)
                    vert_start = h * stride
                    vert_end = h * stride + f
                    horiz_start = w * stride
                    horiz_end = w * stride + f
                    
                    # Use the corners to define the (3D) slice of a_prev_pad (See Hint above the cell). (≈1 line)
                    a_slice_prev = a_prev_pad[vert_start : vert_end , horiz_start : horiz_end , :]
                    # Convolve the (3D) slice with the correct filter W and bias b, to get back one output neuron. (≈1 line)
                    s = a_slice_prev * W[:, :, :, c]
                    Z[i, h, w, c] = np.sum(np.reshape(s, (s.size,))) + float(b[:, :, :, c])
                                        
    ### END CODE HERE ###
    
    # Making sure your output shape is correct
    assert(Z.shape == (m, n_H, n_W, n_C))
    
    # Save information in "cache" for the backprop
    cache = (A_prev, W, b, hparameters)
    
    return Z, cache
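
A quick shape check on random inputs:

np.random.seed(1)
A_prev = np.random.randn(10, 4, 4, 3)
W = np.random.randn(2, 2, 3, 8)
b = np.random.randn(1, 1, 1, 8)
Z, cache = conv_forward(A_prev, W, b, {"pad": 2, "stride": 2})
print(Z.shape)   # (10, 4, 4, 8): n_H = n_W = (4 - 2 + 2*2)//2 + 1 = 4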

n_C_prev and n_C

(f, f, n_C_prev, n_C) = W.shape
A note here to sharpen understanding: distinguish the roles of n_C_prev and n_C.
n_C_prev is the number of channels output by the previous layer (i.e. the channel count of the feature map being convolved).
n_C is the number of output channels (i.e. the number of filters).

  • Each filter's computation multiplies an (f, f, n_C_prev) filter element-wise against an (f, f, n_C_prev) region of the feature map and sums all the products.
  • There are n_C filters in total; see the shape check below.
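
A shape check that makes the distinction concrete (random weights, f=3, n_C_prev=4, n_C=8):

import numpy as np

W = np.random.randn(3, 3, 4, 8)   # (f, f, n_C_prev, n_C)
print(W[:, :, :, 0].shape)        # (3, 3, 4): one filter spans all input channels
print(W.shape[-1])                # 8 filters -> 8 output channels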

Computing Z

a_slice_prev = a_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :]
s = a_slice_prev * W[:, :, :, c]
Z[i, h, w, c] = np.sum(np.reshape(s, (s.size,))) + float(b[:, :, :, c])
  • Don't forget the bias b!! Add it after the sum.
  • Keep h, w, c straight: the final index c says which filter you are on, but every filter's loop iteration still multiplies-and-sums over a full (f, f, n_C_prev) region. This is exactly where n_C_prev and n_C differ: each filter takes in every channel of the input feature map.
