Assignment Introduction
- Assignment homepage: Assignment #1
- Official starter code: assignment #1 code
A Quick Review of the Basics
Neural Networks are nonlinear classifiers. Unlike the SVM classifier and the Softmax classifier, their intermediate layers introduce nonlinearity through activation functions. A fairly simple network model, with just one hidden layer, is shown below:
A typical neural network involves two stages: forward propagation and backpropagation. Taking the figure above as an example:
- The forward pass takes the input $X$ and maps it to the hidden layer $H$:
  $H = f(W_1X + b_1)$
  You can think of each hidden layer as behaving much like the earlier linear classifiers, except that a nonlinear activation function $f(\cdot)$ is applied.
- The hidden layer $H$ is then mapped to the prediction score layer $S$ (for the architecture used in this assignment, no activation follows the second layer):
  $S = W_2H + b_2$
- Here $S$ can be viewed as the scores of the earlier linear classifiers, and an SVM or Softmax loss is applied on top for classification.
- The backpropagation stage uses the computational graph and the chain rule to compute the gradients of the loss with respect to the parameters, which SGD then uses to update the parameters; see the sketch after this list.
- In the figure above, green marks the forward pass and red marks the backward pass.
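To make the forward and backward passes concrete, here is a minimal NumPy sketch of one training step for this architecture with a Softmax loss. It is an illustration, not the assignment's reference code; the sizes (N=5, D=4, H=10, C=3) are made-up toy values.

```python
import numpy as np

np.random.seed(0)
N, D, H, C = 5, 4, 10, 3                    # toy sizes, assumed for illustration
X = np.random.randn(N, D)
y = np.random.randint(C, size=N)
W1, b1 = 1e-2 * np.random.randn(D, H), np.zeros(H)
W2, b2 = 1e-2 * np.random.randn(H, C), np.zeros(C)

# Forward pass: H = f(W1 X + b1) with f = ReLU, then S = W2 H + b2
h = np.maximum(0, X.dot(W1) + b1)           # hidden layer, shape (N, H)
scores = h.dot(W2) + b2                     # class scores, shape (N, C)

# Softmax loss (row max subtracted for numerical stability)
shifted = scores - scores.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
loss = -np.log(probs[range(N), y]).mean()

# Backward pass via the chain rule
dscores = probs.copy()
dscores[range(N), y] -= 1
dscores /= N                                # dL/dS
dW2 = h.T.dot(dscores)                      # dL/dW2
db2 = dscores.sum(axis=0)                   # dL/db2
dh = dscores.dot(W2.T)                      # dL/dH
dh[h <= 0] = 0                              # gradient blocked where ReLU was inactive
dW1 = X.T.dot(dh)                           # dL/dW1
db1 = dh.sum(axis=0)                        # dL/db1

# One SGD step
lr = 1e-1
for param, grad in [(W1, dW1), (b1, db1), (W2, dW2), (b2, db2)]:
    param -= lr * grad
```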
1. Download the Dataset
Follow the same steps as for the earlier KNN classifier; a loading sketch is shown below.
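Assuming the CIFAR-10 archive has already been downloaded with the script shipped in the starter code, loading it looks roughly like this. The directory path below is the assignment's default layout and is an assumption here:

```python
from cs231n.data_utils import load_CIFAR10

# Assumed default location of the extracted CIFAR-10 batches
cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'
X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)
print(X_train.shape, X_test.shape)  # (50000, 32, 32, 3) (10000, 32, 32, 3)
```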
2. Neural Network Classifier
- Open the notebook two_layer_net.ipynb with Jupyter Notebook.
Let's walk through the workflow (install any missing Python packages manually along the way).
- First, we need to implement our classifier in cs231n/classifiers/neural_net.py.
- Our classifier is just like the two-layer network in the figure above, with the following architecture (a quick shape check follows the list):
- Input -> Fully Connected Layer -> ReLU -> Fully Connected Layer -> Softmax
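To see how array shapes flow through this pipeline, here is a tiny check with made-up sizes: two flattened CIFAR-10 images (D = 32·32·3 = 3072), a hidden size of 50, and C = 10 classes. These numbers are assumptions for illustration only.

```python
import numpy as np

N, D, H, C = 2, 3072, 50, 10             # assumed sizes
X = np.random.randn(N, D)
W1, b1 = 1e-4 * np.random.randn(D, H), np.zeros(H)
W2, b2 = 1e-4 * np.random.randn(H, C), np.zeros(C)

hidden = np.maximum(0, X.dot(W1) + b1)   # FC + ReLU -> shape (2, 50)
scores = hidden.dot(W2) + b2             # FC        -> shape (2, 10)
print(hidden.shape, scores.shape)        # (2, 50) (2, 10)
```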
2.1 Implementing the Neural Network Classifier
- We implement the weight initialization, forward pass, backward pass, and prediction of this simple classifier in the TwoLayerNet class in neural_net.py.
import numpy as np

class TwoLayerNet(object):
    """
    A two-layer fully-connected neural network.
    The net has an input dimension of D, a hidden layer dimension of H, and
    performs classification over C classes.
    We train the network with a softmax loss function and L2 regularization on
    the weight matrices. The network uses a ReLU nonlinearity after the first
    fully connected layer.
    The outputs of the second fully-connected layer are the scores for each class.
    """

    def __init__(self, input_size, hidden_size, output_size, std=1e-4):
        """
        Initialize the model. Weights are initialized to small random values and
        biases are initialized to zero. Weights and biases are stored in the
        variable self.params, which is a dictionary with the following keys:
        W1: First layer weights; has shape (D, H)
        b1: First layer biases; has shape (H,)
        W2: Second layer weights; has shape (H, C)
        b2: Second layer biases; has shape (C,)
        Inputs:
        - input_size: The dimension D of the input data.
        - hidden_size: The number of neurons H in the hidden layer.
        - output_size: The number of classes C.
        """
        self.params = {}
        self.params['W1'] = std * np.random.randn(input_size, hidden_size)
        self.params['b1'] = np.zeros(hidden_size)
        self.params['W2'] = std * np.random.randn(hidden_size, output_size)
        self.params['b2'] = np.zeros(output_size)
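    # Usage sketch (hypothetical toy sizes, mirroring the notebook's toy model):
    #   net = TwoLayerNet(input_size=4, hidden_size=10, output_size=3)
    #   for k, p in sorted(net.params.items()): print(k, p.shape)
    #   -> W1 (4, 10), W2 (10, 3), b1 (10,), b2 (3,)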
    def loss(self, X, y=None, reg=0.0):
        """
        Compute the loss and gradients for a two layer fully connected neural
        network.
        Inputs:
        - X: Input data of shape (N, D). Each X[i] is a training sample.
        - y: Vector of training labels. y[i] is the label for X[i], and each y[i] is
          an integer in the range 0 <= y[i] < C. This parameter is optional; if it
          is not passed then we only return scores, and if it is passed then we
          instead return the loss and gradients.
        - reg: Regularization strength.
        Returns:
        If y is None, return a matrix scores of shape (N, C) where scores[i, c] is
        the score for class c on input X[i].
        If y is not None, instead return a tuple of:
        - loss: Loss (data loss and regularization loss) for this batch of training
          samples.
        - grads: Dictionary mapping parameter names to gradients of those parameters
          with respect to the loss function; has the same keys as self.params.
        """
        # Unpack variables from the params dictionary
        W1, b1 = self.params['W1'], self.params['b1']
        W2, b2 = self.params['W2'], self.params['b2']
        N, D = X.shape

        # Compute the forward pass
        scores = None
        h1 = np.dot(X, W1) + b1
        h2 = np.maximum(0, h1)  # ReLU activation
        scores = np.dot(h2, W2) + b2

        # If the targets are not given then jump out, we're done
        if y is None:
            return scores

        # Compute the loss
        loss = None
        # Use the Softmax classifier loss.
        # Subtract the per-row max from the scores so the exponentials
        # cannot overflow (this does not change the softmax probabilities).
        scores -= np.max(scores, axis=1, keepdims=True)
        softmax_output = np.exp(scores) / np.sum(np.exp(scores), axis=1, keepdims=True)
        # Only the probability assigned to the correct class enters the loss
        loss = np.sum(-np.log(softmax_output[range(N), y]))
        loss /= N  # average loss over the batch
        # Add the regularization term.
        # Note: I put a factor of 0.5 in front of the regularization term,
        # so the loss may differ slightly from versions without that factor.
        loss += 0.5 * reg * (np.sum(W1 * W1) + np.sum(W2 * W2))