Two-Layer Neural Network Classifier – Neural Network (2 Layers)
A fully connected neural network with a single hidden layer.
1. Theory
1.1 Fully Connected Neural Networks
For the basic theory, see the separate article on fully connected neural networks.
1.2 Loss Function
(To be completed; leave a comment if you would like to see it.)
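For reference until that part is written, here is what the code in Section 2.1 below actually computes (a summary, not a full derivation). The network produces scores $s = \mathrm{ReLU}(X W_1 + b_1)\, W_2 + b_2$, and the loss is the batch-averaged softmax cross-entropy plus L2 regularization on both weight matrices:

$$L = -\frac{1}{N} \sum_{i=1}^{N} \log \frac{e^{s_{i, y_i}}}{\sum_{c} e^{s_{i, c}}} + \mathrm{reg}\left(\lVert W_1 \rVert^2 + \lVert W_2 \rVert^2\right)$$

Its gradient with respect to the scores, which is the starting point of the backward pass, is $\partial L / \partial s_{i,c} = \left(p_{i,c} - \mathbf{1}[c = y_i]\right)/N$, where $p_{i,c}$ is the softmax probability of class $c$ for sample $i$.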
2. Implementation
2.1 Loss Function and Gradients
def loss(self, X, y=None, reg=0.0):
    """
    Compute the loss and gradients for a two layer fully connected neural
    network.

    Inputs:
    - X: Input data of shape (N, D). Each X[i] is a training sample.
    - y: Vector of training labels. y[i] is the label for X[i], and each y[i] is
      an integer in the range 0 <= y[i] < C. This parameter is optional; if it
      is not passed then we only return scores, and if it is passed then we
      instead return the loss and gradients.
    - reg: Regularization strength.

    Returns:
    If y is None, return a matrix scores of shape (N, C) where scores[i, c] is
    the score for class c on input X[i].

    If y is not None, instead return a tuple of:
    - loss: Loss (data loss and regularization loss) for this batch of training
      samples.
    - grads: Dictionary mapping parameter names to gradients of those parameters
      with respect to the loss function; has the same keys as self.params.
    """
    # Unpack variables from the params dictionary
    W1, b1 = self.params['W1'], self.params['b1']
    W2, b2 = self.params['W2'], self.params['b2']
    N, D = X.shape

    # Compute the forward pass
    scores = None
    #############################################################################
    # TODO: Perform the forward pass, computing the class scores for the input. #
    # Store the result in the scores variable, which should be an array of      #
    # shape (N, C).                                                              #
    #############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    h_output = np.maximum(0, X.dot(W1) + b1)  # hidden-layer output (N, H), ReLU activation
    scores = h_output.dot(W2) + b2            # second-layer linear scores (N, C), softmax is applied below
    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    # If the targets are not given then jump out, we're done
    if y is None:
        return scores

    # Compute the loss
    loss = None
    #############################################################################
    # TODO: Finish the forward pass, and compute the loss. This should include  #
    # both the data loss and L2 regularization for W1 and W2. Store the result  #
    # in the variable loss, which should be a scalar. Use the Softmax           #
    # classifier loss.                                                          #
    #############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    # Softmax loss: shift the scores for numerical stability, then cross-entropy
    shift_scores = scores - np.max(scores, axis=1).reshape((-1, 1))
    softmax_output = np.exp(shift_scores) / np.sum(np.exp(shift_scores), axis=1).reshape(-1, 1)
    loss = -np.sum(np.log(softmax_output[range(N), list(y)]))
    loss /= N
    loss += reg * (np.sum(W1 * W1) + np.sum(W2 * W2))
    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    # Backward pass: compute gradients
    grads = {}
    #############################################################################
    # TODO: Compute the backward pass, computing the derivatives of the weights #
    # and biases. Store the results in the grads dictionary. For example,       #
    # grads['W1'] should store the gradient on W1, and be a matrix of same size #
    #############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    # Gradient of the loss w.r.t. the scores: softmax probabilities minus the one-hot labels
    dscores = softmax_output.copy()
    dscores[range(N), list(y)] -= 1
    dscores /= N
    # Second layer: scores = h_output.dot(W2) + b2
    grads['W2'] = h_output.T.dot(dscores) + 2 * reg * W2
    grads['b2'] = np.sum(dscores, axis=0)
    # Backpropagate into the hidden layer, gating by the ReLU mask
    dh = dscores.dot(W2.T)
    dh_ReLu = (h_output > 0) * dh
    # First layer: h_output = relu(X.dot(W1) + b1)
    grads['W1'] = X.T.dot(dh_ReLu) + 2 * reg * W1
    grads['b1'] = np.sum(dh_ReLu, axis=0)
    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    return loss, grads
Code analysis:
The key branch is whether y is None: if no labels are given, the function returns only the class scores; otherwise it returns the loss together with the gradient of every parameter.
In essence the function chains the two fully connected layers with a final softmax layer and then backpropagates through them.
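A minimal usage sketch follows, assuming a TwoLayerNet class (not shown in this article) that stores W1, b1, W2, b2 in self.params and exposes the loss method above; the constructor arguments are illustrative. It shows the two call modes and a centered-difference check of one gradient entry:

import numpy as np

# Illustrative setup; the TwoLayerNet constructor is assumed, only loss() is from this article
net = TwoLayerNet(input_size=4, hidden_size=10, output_size=3)

X = np.random.randn(5, 4)                 # 5 samples, D = 4
y = np.random.randint(3, size=5)          # integer labels in [0, 3)

scores = net.loss(X)                      # y is None -> only the (N, C) scores
loss, grads = net.loss(X, y, reg=0.05)    # y given   -> loss and gradient dict

# Numerically check one entry of grads['b2'] with a centered difference
h = 1e-5
net.params['b2'][0] += h
loss_plus, _ = net.loss(X, y, reg=0.05)
net.params['b2'][0] -= 2 * h
loss_minus, _ = net.loss(X, y, reg=0.05)
net.params['b2'][0] += h                  # restore the original value
num_grad = (loss_plus - loss_minus) / (2 * h)
print(abs(num_grad - grads['b2'][0]))     # should be tiny if the backward pass is correct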
2.2 Training Process
Training the network mainly amounts to the following steps (a bare-bones sketch of the loop is given after the list):
1. Take in the data and set the mini-batch and iteration parameters.
2. Feed each mini-batch through the network to obtain the loss and the gradients.
3. Update the network parameters W1, W2, b1 and b2 with those gradients.
4. Decay the learning rate exponentially.
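Before looking at the actual train method, here is a minimal sketch of those four steps using plain SGD; net is assumed to be a TwoLayerNet as in Section 2.1, X and y a full training set, and the hyperparameter names and values are illustrative assumptions:

import numpy as np

# Minimal training-loop sketch (plain SGD); net, X, y and all hyperparameters are assumed
num_iters, batch_size = 1000, 200
learning_rate, learning_rate_decay, reg = 1e-3, 0.95, 5e-6
iterations_per_epoch = max(X.shape[0] // batch_size, 1)

for it in range(num_iters):
    # 1. sample a mini-batch
    idx = np.random.choice(X.shape[0], batch_size)
    X_batch, y_batch = X[idx], y[idx]

    # 2. forward/backward pass: loss and gradients
    loss, grads = net.loss(X_batch, y_batch, reg=reg)

    # 3. SGD update of every parameter
    for p in ('W1', 'b1', 'W2', 'b2'):
        net.params[p] -= learning_rate * grads[p]

    # 4. exponential learning-rate decay, once per epoch
    if (it + 1) % iterations_per_epoch == 0:
        learning_rate *= learning_rate_decay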
def train(self, X, y, X_val, y_val,
learning_rate