Assignment Two (Part 5): Convolutional Neural Networks
1. Prerequisites
Unlike the other tasks, the fast layers require a Cython extension to be built.
Installing Cython in a virtual environment differs from the usual approach. The simplest way is to run activate cs231n in the Anaconda Prompt, then pip install cython.
Next, run the command python setup.py build_ext --inplace.
If this reports the error Unable to find vcvarsall.bat,
install Visual Studio 2015 and, once that succeeds, run the build again.
Note: a successful build generates three files in the cs231n directory.
If the build fails, delete these three files and then rerun the command to rebuild.
After that, fast_layers can be used.
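The steps above can be condensed into the following command sequence (a sketch assuming Anaconda on Windows with a cs231n environment already created; the Visual Studio 2015 build tools are only needed if vcvarsall.bat is reported missing):

```shell
# Activate the environment in the Anaconda Prompt
activate cs231n

# Install Cython via pip (the usual install route may not work in the venv)
pip install cython

# From the directory containing setup.py, build the extension in place
python setup.py build_ext --inplace

# If the build fails, delete the three generated files and rerun the
# command above before trying again.
```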
2. Propagation functions in layers.py
2.1 Convolution: Naive forward pass
def conv_forward_naive(x, w, b, conv_param):
    out = None
    stride, pad = conv_param['stride'], conv_param['pad']
    N, C, H, W = x.shape
    F, _, HH, WW = w.shape
    # Compute the output size
    H_out = int(1 + (H + 2 * pad - HH) / stride)
    W_out = int(1 + (W + 2 * pad - WW) / stride)
    out = np.zeros((N, F, H_out, W_out))
    # Zero-pad x: ((0,), (0,), (pad,), (pad,)) means the first two dimensions
    # get no padding, while the last two get pad zeros before and after
    x_pad = np.pad(x, ((0,), (0,), (pad,), (pad,)), mode='constant', constant_values=0)
    for k in range(N):
        for i in range(H_out):
            for j in range(W_out):
                # Slice the receptive field of the k-th input that the
                # filters multiply; its shape is (C, HH, WW)
                mask = x_pad[k, :, i*stride:i*stride+HH, j*stride:j*stride+WW]
                # Multiply by all F filters of shape (F, C, HH, WW); the
                # broadcast product has shape (F, C, HH, WW), and summing
                # over the last three axes leaves shape (F,)
                out[k, :, i, j] = np.sum(mask * w, axis=(1, 2, 3))
    # None is equivalent to numpy.newaxis here
    out += b[None, :, None, None]
    cache = (x, w, b, conv_param)
    return out, cache
Code analysis:
The input consists of N data points, each with C channels; each channel has height H and width W.
There are F filters in total, each with C channels and spatial size HH x WW.
Input:
- x: input data of shape (N, C, H, W)
- w: filter weights of shape (F, C, HH, WW)
- b: biases of shape (F,), i.e. one bias per filter
- conv_param: a dictionary with the keys:
  - 'stride': the step size in both the horizontal and vertical directions
  - 'pad': the number of zeros padded along each border; the same amount is added on all four sides
Returns:
- out: the convolved data, of shape (N, F, H', W'), where H' and W' are given by
  H' = 1 + (H + 2 * pad - HH) / stride
  W' = 1 + (W + 2 * pad - WW) / stride
- cache: (x, w, b, conv_param)
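The output-size formula and the broadcasting step can be sanity-checked in isolation. A small sketch with made-up shapes (not the assignment's test data):

```python
import numpy as np

# 1) Output size from H' = 1 + (H + 2*pad - HH) / stride
H, pad, HH, stride = 4, 1, 4, 2
H_out = 1 + (H + 2 * pad - HH) // stride
print(H_out)  # 2

# 2) The broadcasting step: mask is (C, HH, WW), w is (F, C, HH, WW);
#    mask * w broadcasts to (F, C, HH, WW), and summing over the last
#    three axes yields one scalar per filter, i.e. shape (F,)
C, F, WW = 3, 5, 4
mask = np.ones((C, HH, WW))
w = np.ones((F, C, HH, WW))
result = np.sum(mask * w, axis=(1, 2, 3))
print(result.shape)  # (5,)
print(result[0])     # C*HH*WW = 48.0
```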
Test_2.1 conv_forward_naive
x_shape = (2, 3, 4, 4)
w_shape = (3, 3, 4, 4)
x = np.linspace(-0.1, 0.5, num=np.prod(x_shape)).reshape(x_shape)
w = np.linspace(-0.2, 0.3, num=np.prod(w_shape)).reshape(w_shape)
b = np.linspace(-0.1, 0.2, num=3)
conv_param = {'stride': 2, 'pad': 1}
out, _ = conv_forward_naive(x, w, b, conv_param)
correct_out = np.array([[[[-0.08759809, -0.10987781],
[-0.18387192, -0.2109216 ]],
[[ 0.21027089, 0.21661097],
[ 0.22847626, 0.23004637]],
[[ 0.50813986, 0.54309974],
[ 0.64082444, 0.67101435]]],
[[[-0.98053589, -1.03143541],
[-1.19128892, -1.24695841]],
[[ 0.69108355, 0.66880383],
[ 0.59480972, 0.56776003]],
[[ 2.36270298, 2.36904306],
[ 2.38090835, 2.38247847]]]])
# Compare your output to ours; difference should be around e-8
print('Testing conv_forward_naive')
print('difference: ', rel_error(out, correct_out))
Output:
Testing conv_forward_naive
difference: 2.2121476417505994e-08
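Beyond the rel_error check, the loop logic can also be verified by hand on a tiny single-channel example (hypothetical data, not part of the assignment): each output entry is the elementwise product of a window with the filter, summed.

```python
import numpy as np

# One 3x3 single-channel input and one 2x2 filter, stride 1, no padding
x = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])
w = np.array([[1., 0.],
              [0., 1.]])  # picks top-left + bottom-right of each window

out = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        out[i, j] = np.sum(x[i:i+2, j:j+2] * w)

print(out)  # [[ 6.  8.] [12. 14.]]  (e.g. 1 + 5 = 6 for the first window)
```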
2.2 Convolution: Naive backward pass
def conv_backward_naive(dout, cache):
    dx, dw, db = None, None, None
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    x, w, b, conv_param = cache
    N, C, H, W = x.shape
    F, _, HH, WW = w.shape
    stride, pad = conv_param['stride'], conv_param['pad']
    H_out = int(1 + (H + 2 * pad - HH) / stride)
    W_out = int(1 + (W + 2 * pad - WW) / stride)
    x_pad = np.pad(x, ((0,), (0,), (pad,), (pad,)), mode='constant', constant_values=0)
    dw = np.zeros_like(w)
    dx = np.zeros_like(x)
    dx_pad = np.zeros_like(x_pad)
    for n in range(N):
        for i in range(H_out):
            for j in range(W_out):
                # Each filter f contributes dout[n, f, i, j] * w[f] to this
                # receptive field's gradient; sum the contributions over f
                dx_pad[n, :, i*stride:i*stride+HH, j*stride:j*stride+WW] += \
                    np.sum((dout[n, :, i, j])[:, None, None, None] * w, axis=0)
                mask = x_pad[n, :, i*stride:i*stride+HH, j*stride:j*stride+WW]
                dw += mask * (dout[n, :, i, j])[:, None, None, None]
    # dx: strip the padding off dx_pad
    dx = dx_pad[:, :, pad:-pad, pad:-pad]
    # The layer output's derivative w.r.t. b is np.ones_like(dout); multiplying
    # that by dout and reducing to the shape of b, which is (F,), gives:
    db = np.sum(dout, axis=(0, 2, 3))
    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    return dx, dw, db
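One caveat worth noting (an observation, not raised by the assignment's test cases, which all use pad = 1): the slice dx_pad[:, :, pad:-pad, pad:-pad] silently fails when pad == 0, because a[0:-0] is an empty slice in Python. Slicing with explicit end indices works for any pad:

```python
import numpy as np

H, W, pad = 5, 5, 0
dx_pad = np.ones((2, 3, H + 2 * pad, W + 2 * pad))

# With pad == 0, pad:-pad is 0:-0, which selects nothing
broken = dx_pad[:, :, pad:-pad, pad:-pad]
print(broken.shape)  # (2, 3, 0, 0): all data lost

# Explicit end indices are safe for any pad >= 0
dx = dx_pad[:, :, pad:H + pad, pad:W + pad]
print(dx.shape)      # (2, 3, 5, 5)
```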
Test_2.2 conv_backward_naive
np.random.seed(231)
x = np.random.randn(4, 3, 5, 5)
w = np.random.randn(2, 3, 3, 3)
b = np.random.randn(2,)
dout = np.random.randn(4, 2, 5, 5)
conv_param = {'stride': 1, 'pad': 1}
dx_num = eval_numerical_gradient_array(lambda x: conv_forward_naive(x, w, b, conv_param)[0], x, dout)
dw_num = eval_numerical_gradient_array(lambda w: conv_forward_naive(x, w, b, conv_param)[0], w, dout)
db_num = eval_numerical_gradient_array(lambda b: conv_forward_naive(x, w, b, conv_param)[0], b, dout)
out, cache = conv_forward_naive(x, w, b, conv_param)
dx, dw, db = conv_backward_naive(dout, cache)
# Your errors should be around e-8 or less.
print('Testing conv_backward_naive function')
print('dx error: ', rel_error(dx, dx_num))
print('dw error: ', rel_error(dw, dw_num))
print('db error: ', rel_error(db, db_num))
Output:
Testing conv_backward_naive function
dx error: 1.159803161159293e-08
dw error: 2.2471264748452487e-10
db error: 3.37264006649648e-11
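The db line can also be justified directly: bias b[f] is added at every output position of filter f, so its gradient accumulates dout over the batch and both spatial axes. A hypothetical explicit-loop version agrees with the vectorized sum:

```python
import numpy as np

np.random.seed(0)
N, F, H_out, W_out = 4, 2, 5, 5
dout = np.random.randn(N, F, H_out, W_out)

# Accumulate dout for each filter, one output position at a time
db_loop = np.zeros(F)
for n in range(N):
    for i in range(H_out):
        for j in range(W_out):
            db_loop += dout[n, :, i, j]

# The vectorized form used in conv_backward_naive
db_vec = np.sum(dout, axis=(0, 2, 3))
print(np.allclose(db_loop, db_vec))  # True
```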
2.3 Max-Pooling: Naive forward
def max_pool_forward_naive(x, pool_param):
out =