Assignment 2 | FullyConnectedNets
Different implementations of the gradient-check functions
So far, three gradient-check functions have been used: grad_check_sparse, eval_numerical_gradient, and eval_numerical_gradient_array.
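The three helpers differ mainly in what f returns and what they do with the estimate. As a rough sketch of grad_check_sparse (the exact cs231n version only prints; this sketch also returns the relative errors for convenience), it samples a few random coordinates of x and compares the centered-difference estimate against a precomputed analytic gradient:

```python
import numpy as np

def grad_check_sparse(f, x, analytic_grad, num_checks=10, h=1e-5):
    # Sketch: sample random coordinates of x and compare the
    # centered-difference estimate with the analytic gradient.
    rel_errors = []
    for _ in range(num_checks):
        ix = tuple(np.random.randint(m) for m in x.shape)
        oldval = x[ix]
        x[ix] = oldval + h
        fxph = f(x)            # scalar f(x + h)
        x[ix] = oldval - h
        fxmh = f(x)            # scalar f(x - h)
        x[ix] = oldval         # restore the entry
        grad_numerical = (fxph - fxmh) / (2 * h)
        grad_analytic = analytic_grad[ix]
        rel_error = abs(grad_numerical - grad_analytic) / (
            abs(grad_numerical) + abs(grad_analytic) + 1e-12)
        rel_errors.append(rel_error)
        print('numerical: %f analytic: %f, relative error: %e'
              % (grad_numerical, grad_analytic, rel_error))
    return rel_errors
```

Because only a handful of entries are checked, this is much cheaper than a full numerical gradient and is typically used for large weight matrices.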
eval_numerical_gradient
def eval_numerical_gradient(f, x, h=0.00001):
f = lambda W: net.loss(X, y, reg=0.05)[0]
param_grad_num = eval_numerical_gradient(f, net.params[param_name])
Here, f wraps net.loss, and its return value is a single number (the loss).
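Since f returns a scalar here, each centered difference directly yields one entry of the gradient, with no need for .copy() or an upstream gradient. A sketch of the body, consistent with the signature above (the exact cs231n implementation may differ in details):

```python
import numpy as np

def eval_numerical_gradient(f, x, h=0.00001):
    # f(x) returns a single number, so each centered difference
    # is itself one entry of the gradient.
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        ix = it.multi_index
        oldval = x[ix]
        x[ix] = oldval + h
        fxph = f(x)            # scalar f(x + h)
        x[ix] = oldval - h
        fxmh = f(x)            # scalar f(x - h)
        x[ix] = oldval         # restore
        grad[ix] = (fxph - fxmh) / (2 * h)
        it.iternext()
    return grad
```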
eval_numerical_gradient_array
def eval_numerical_gradient_array(f, x, df, h=1e-5):
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        ix = it.multi_index
        oldval = x[ix]
        x[ix] = oldval + h
        fxph = f(x).copy()
        x[ix] = oldval - h
        fxmh = f(x).copy()
        x[ix] = oldval
        grad[ix] = np.sum((fxph - fxmh) * df) / (2 * h)
        it.iternext()
    return grad
dx_num = eval_numerical_gradient_array(lambda z: affine_forward(x, w, b)[0], x, dout)
Here, f is affine_forward, and its return value out is a matrix. Because x is modified in place between evaluations, the function value must be copied after each call:
fxph = f(x).copy()
Moreover, each pass through the loop differentiates with respect to just one entry x[ix]: the centered difference (fxph - fxmh) / (2 * h) approximates the derivative of every element of out with respect to x[ix], and multiplying elementwise by the upstream gradient df and summing (the chain rule) collapses it into the single number grad[ix].
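To see the chain rule at work end to end, here is a self-contained check. Since affine_forward is not defined in this snippet, a toy stand-in f(x) = x @ w is used (w and dout are made-up test data), whose analytic gradient with respect to x is dout @ w.T; the helper is repeated so the snippet runs standalone:

```python
import numpy as np

def eval_numerical_gradient_array(f, x, df, h=1e-5):
    # Same helper as above, repeated so this snippet is self-contained.
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        ix = it.multi_index
        oldval = x[ix]
        x[ix] = oldval + h
        fxph = f(x).copy()
        x[ix] = oldval - h
        fxmh = f(x).copy()
        x[ix] = oldval
        grad[ix] = np.sum((fxph - fxmh) * df) / (2 * h)
        it.iternext()
    return grad

# Toy stand-in for affine_forward: f(x) = x @ w.
np.random.seed(1)
x = np.random.randn(3, 4)
w = np.random.randn(4, 2)
dout = np.random.randn(3, 2)

dx_num = eval_numerical_gradient_array(lambda z: z.dot(w), x, dout)
dx_analytic = dout.dot(w.T)     # analytic gradient of x @ w w.r.t. x
print(np.max(np.abs(dx_num - dx_analytic)))  # close to zero
```

The numerical and analytic gradients agree to high precision, which is exactly the check dx_num is used for in the assignment.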