CS231n Course Assignment Two (5): Convolutional Neural Networks (0830)

This post walks through the basic operations of convolutional neural networks, including the naive forward and backward passes for convolution, max-pooling, the fast layer implementations, the conv-ReLU-pool sandwich layers, and the forward and backward passes of spatial batch normalization and group normalization. Through these exercises, it discusses why these operations matter in convolutional networks and how to use them to train and test models.

Assignment Two (5): Convolutional Neural Networks

1. Prerequisites

Unlike the other parts of the assignment, the fast layers require a Cython extension to be built.
Installing Cython inside a virtual environment differs from the usual procedure; the simplest way is to run activate cs231n in the Anaconda Prompt and then pip install cython.
Next, run the command python setup.py build_ext --inplace.
If this fails with the error Unable to find vcvarsall.bat,
install Visual Studio 2015 and rerun the build once the installation finishes.
Note: a successful build generates three new files in the cs231n directory.
If the build fails, delete those three files and then rebuild.
With that, the fast layers can be used.
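
As a quick sanity check after the build (a minimal sketch, assuming the standard assignment2 layout where cs231n/fast_layers.py wraps the compiled extension and exposes conv_forward_fast and conv_backward_fast), you can try importing the fast layers from the assignment root:

# Quick check, assuming the standard CS231n assignment2 directory layout:
# if the Cython extension built correctly, the fast layers can be imported
# and used in place of the naive versions implemented below.
from cs231n.fast_layers import conv_forward_fast, conv_backward_fast

print('fast_layers imported successfully')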

2. Propagation functions in layers.py

2.1 Convolution: Naive forward pass
def conv_forward_naive(x, w, b, conv_param):
    out = None
    stride,pad = conv_param['stride'],conv_param['pad']
    N,C,H,W = x.shape
    F,_,HH,WW = w.shape
    
    # Compute the size of the output
    H_out = int(1+(H+2*pad-HH)/stride)
    W_out = int(1+(W+2*pad-WW)/stride)
    out = np.zeros((N,F,H_out,W_out))    
    
    # Zero-pad x; ((0,),(0,),(pad,),(pad,)) means the first two dimensions get no padding, while the last two get pad zeros on each side
    x_pad = np.pad(x,((0,),(0,),(pad,),(pad,)),mode='constant',constant_values=0)
    
    for k in range(N):
        for i in range(H_out):
            for j in range(W_out):
                # Slice out the window of the k-th input that the filters overlap; its shape is (C, HH, WW)
                mask = x_pad[k,:,i*stride:i*stride+HH,j*stride:j*stride+WW]
                # Multiply it by all F filters at once; w has shape (F, C, HH, WW), so the broadcast product also has shape (F, C, HH, WW),
                # then sum over the channel and spatial axes, leaving shape (F,)
                out[k,:,i,j] = np.sum(mask*w[:,:,:,:],axis=(1,2,3))    
    # None is equivalent to numpy.newaxis
    out += b[None,:,None,None]
    
    cache = (x, w, b, conv_param)
    return out, cache
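
The bias is added through broadcasting: indexing with None inserts length-1 axes, so b[None, :, None, None] has shape (1, F, 1, 1) and broadcasts against out, whose shape is (N, F, H_out, W_out). A quick shape check:

import numpy as np

b = np.linspace(-0.1, 0.2, num=3)       # F = 3 biases, as in the test below
print(b.shape)                          # (3,)
print(b[None, :, None, None].shape)     # (1, 3, 1, 1), broadcasts over (N, 3, H_out, W_out)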

Code analysis:

The input contains N data points; each has C channels, and each channel has height H and width W.
There are F filters in total; each filter has C channels, with height HH and width WW.
Input:
- x: input data of shape (N, C, H, W)
- w: filter weights of shape (F, C, HH, WW)
- b: biases of shape (F,), i.e. one bias per filter
- conv_param:
  - 'stride': step size in both the horizontal and vertical directions
  - 'pad': number of zeros padded around the border; the same amount is used on all four sides
Returns:
- out: the convolved data of shape (N, F, H', W'), where H' and W' are computed as follows (a worked example follows this list)
  H' = 1 + (H + 2 * pad - HH) / stride
  W' = 1 + (W + 2 * pad - WW) / stride
- cache: (x, w, b, conv_param)
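
For example, with the test case below (H = W = 4, HH = WW = 4, pad = 1, stride = 2):

H' = 1 + (4 + 2 * 1 - 4) / 2 = 2
W' = 1 + (4 + 2 * 1 - 4) / 2 = 2

so out has shape (2, 3, 2, 2), which matches correct_out in the test.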

Test_2.1 conv_forward_naive

x_shape = (2, 3, 4, 4)
w_shape = (3, 3, 4, 4)
x = np.linspace(-0.1, 0.5, num=np.prod(x_shape)).reshape(x_shape)
w = np.linspace(-0.2, 0.3, num=np.prod(w_shape)).reshape(w_shape)
b = np.linspace(-0.1, 0.2, num=3)

conv_param = {'stride': 2, 'pad': 1}
out, _ = conv_forward_naive(x, w, b, conv_param)
correct_out = np.array([[[[-0.08759809, -0.10987781],
                           [-0.18387192, -0.2109216 ]],
                          [[ 0.21027089,  0.21661097],
                           [ 0.22847626,  0.23004637]],
                          [[ 0.50813986,  0.54309974],
                           [ 0.64082444,  0.67101435]]],
                         [[[-0.98053589, -1.03143541],
                           [-1.19128892, -1.24695841]],
                          [[ 0.69108355,  0.66880383],
                           [ 0.59480972,  0.56776003]],
                          [[ 2.36270298,  2.36904306],
                           [ 2.38090835,  2.38247847]]]])

# Compare your output to ours; difference should be around e-8
print('Testing conv_forward_naive')
print('difference: ', rel_error(out, correct_out))

Output:

Testing conv_forward_naive
difference:  2.2121476417505994e-08

2.2 Convolution: Naive backward pass
def conv_backward_naive(dout, cache):
    dx, dw, db = None, None, None
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    x,w,b,conv_param = cache
    N,C,H,W = x.shape
    F,_,HH,WW = w.shape
    stride,pad = conv_param['stride'],conv_param['pad']
    
    H_out = int(1+(H+2*pad-HH)/stride)
    W_out = int(1+(W+2*pad-WW)/stride)
    x_pad = np.pad(x,((0,),(0,),(pad,),(pad,)),mode='constant',constant_values=0)    
    
    dw = np.zeros_like(w)
    dx = np.zeros_like(x)
    dx_pad = np.zeros_like(x_pad)    
    
    for n in range(N):
        for i in range(H_out):
            for j in range(W_out):
                # Every filter f contributes dout[n,f,i,j] * w[f] to the gradient of this input window
                dx_pad[n,:,i*stride:i*stride+HH,j*stride:j*stride+WW] += np.sum((dout[n,:,i,j])[:,None,None,None]*w,axis=0)
                # The same window of x_pad contributes dout[n,f,i,j] * window to every filter's gradient
                mask = x_pad[n,:,i*stride:i*stride+HH,j*stride:j*stride+WW]
                dw += mask*(dout[n,:,i,j])[:,None,None,None]
    # Strip the padding to recover dx
    dx = dx_pad[:,:,pad:-pad,pad:-pad]
    
    # The derivative of the convolution output w.r.t. b is all ones, and db must have
    # the same shape as b, namely (F,), so db is dout summed over every axis except the filter axis
    db = np.sum(dout,axis=(0,2,3))
    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    return dx, dw, db
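
To summarize the backward pass: each output element is out[n, f, i, j] = sum(window * w[f]) + b[f], where window is the (C, HH, WW) slice of x_pad covered by the filter at position (i, j). Differentiating this product-sum gives, for every (n, f, i, j):

dx_pad[window] += dout[n, f, i, j] * w[f]
dw[f]          += dout[n, f, i, j] * window
db[f]          += dout[n, f, i, j]

which is exactly what the statements inside the loops compute; db is accumulated in a single np.sum over axes (0, 2, 3), and the padding is stripped from dx_pad to obtain dx.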

Test_2.2 conv_backward_naive

np.random.seed(231)
x = np.random.randn(4, 3, 5, 5)
w = np.random.randn(2, 3, 3, 3)
b = np.random.randn(2,)
dout = np.random.randn(4, 2, 5, 5)
conv_param = {'stride': 1, 'pad': 1}

dx_num = eval_numerical_gradient_array(lambda x: conv_forward_naive(x, w, b, conv_param)[0], x, dout)
dw_num = eval_numerical_gradient_array(lambda w: conv_forward_naive(x, w, b, conv_param)[0], w, dout)
db_num = eval_numerical_gradient_array(lambda b: conv_forward_naive(x, w, b, conv_param)[0], b, dout)

out, cache = conv_forward_naive(x, w, b, conv_param)
dx, dw, db = conv_backward_naive(dout, cache)

# Your errors should be around e-8 or less.
print('Testing conv_backward_naive function')
print('dx error: ', rel_error(dx, dx_num))
print('dw error: ', rel_error(dw, dw_num))
print('db error: ', rel_error(db, db_num))

Output:

Testing conv_backward_naive function
dx error:  1.159803161159293e-08
dw error:  2.2471264748452487e-10
db error:  3.37264006649648e-11

2.3 Max-Pooling: Naive forward
def max_pool_forward_naive(x, pool_param):
    out = None
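
A minimal sketch of one way to fill in the body, assuming the pool_param dictionary uses the keys 'pool_height', 'pool_width', and 'stride' (the convention of the CS231n skeleton) and that the cache is (x, pool_param); it follows the same loop structure as conv_forward_naive above:

import numpy as np

def max_pool_forward_naive(x, pool_param):
    # Sketch under the assumptions stated above; pooling uses no padding
    pool_height = pool_param['pool_height']
    pool_width = pool_param['pool_width']
    stride = pool_param['stride']
    N, C, H, W = x.shape

    # Output spatial dimensions
    H_out = int(1 + (H - pool_height) / stride)
    W_out = int(1 + (W - pool_width) / stride)
    out = np.zeros((N, C, H_out, W_out))

    for n in range(N):
        for i in range(H_out):
            for j in range(W_out):
                # Pooling window for all channels at once, shape (C, pool_height, pool_width);
                # keep the maximum over the spatial axes
                window = x[n, :, i*stride:i*stride+pool_height, j*stride:j*stride+pool_width]
                out[n, :, i, j] = np.max(window, axis=(1, 2))

    cache = (x, pool_param)
    return out, cache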