CS231n Course Assignment One (5): Two-Layer Neural Network Classifier (0816)

Two-Layer Neural Network Classifier – Neural Network (2 layers)

A fully connected neural network with a single hidden layer.

1. Theory

1.1 Fully Connected Neural Networks

For the basic theory, see the separate article – Fully Connected Neural Networks.

1.2 Loss Function

(To be completed; leave a comment if you would like to see it.)
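For reference while the full write-up is pending, the loss implemented in the code below is the softmax (cross-entropy) loss with L2 regularization on both weight matrices:

$$
s = \max(0,\ XW_1 + b_1)\,W_2 + b_2, \qquad
L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s_{i,y_i}}}{\sum_j e^{s_{i,j}}}
 + \lambda\left(\lVert W_1\rVert_2^2 + \lVert W_2\rVert_2^2\right)
$$

where N is the batch size and \lambda is the reg argument. Note there is no 1/2 factor in front of the regularization term, which is why its gradient below is 2\,\lambda W.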

2. Implementation

2.1 Loss Function and Gradients
    def loss(self, X, y=None, reg=0.0):
        """
        Compute the loss and gradients for a two layer fully connected neural
        network.

        Inputs:
        - X: Input data of shape (N, D). Each X[i] is a training sample.
        - y: Vector of training labels. y[i] is the label for X[i], and each y[i] is
          an integer in the range 0 <= y[i] < C. This parameter is optional; if it
          is not passed then we only return scores, and if it is passed then we
          instead return the loss and gradients.
        - reg: Regularization strength.

        Returns:
        If y is None, return a matrix scores of shape (N, C) where scores[i, c] is
        the score for class c on input X[i].

        If y is not None, instead return a tuple of:
        - loss: Loss (data loss and regularization loss) for this batch of training
          samples.
        - grads: Dictionary mapping parameter names to gradients of those parameters
          with respect to the loss function; has the same keys as self.params.
        """
        # Unpack variables from the params dictionary
        W1, b1 = self.params['W1'], self.params['b1']
        W2, b2 = self.params['W2'], self.params['b2']
        N, D = X.shape

        # Compute the forward pass
        scores = None
        #############################################################################
        # TODO: Perform the forward pass, computing the class scores for the input. #
        # Store the result in the scores variable, which should be an array of      #
        # shape (N, C).                                                             #
        #############################################################################
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

        h_output = np.maximum(0, X.dot(W1) + b1)  # hidden-layer output (N, H), ReLU activation
        scores = h_output.dot(W2) + b2            # second-layer linear scores (N, C); softmax is applied below

        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

        # If the targets are not given then jump out, we're done
        if y is None:
            return scores

        # Compute the loss
        loss = None
        #############################################################################
        # TODO: Finish the forward pass, and compute the loss. This should include  #
        # both the data loss and L2 regularization for W1 and W2. Store the result  #
        # in the variable loss, which should be a scalar. Use the Softmax           #
        # classifier loss.                                                          #
        #############################################################################
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        
        # Softmax layer: shift the scores for numerical stability, then compute the cross-entropy loss
        shift_scores = scores - np.max(scores, axis=1).reshape((-1, 1))
        softmax_output = np.exp(shift_scores) / np.sum(np.exp(shift_scores), axis=1).reshape(-1, 1)
        loss = -np.sum(np.log(softmax_output[range(N), list(y)]))
        loss /= N
        loss += reg * (np.sum(W1 * W1) + np.sum(W2 * W2))

        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

        # Backward pass: compute gradients
        grads = {}
        #############################################################################
        # TODO: Compute the backward pass, computing the derivatives of the weights #
        # and biases. Store the results in the grads dictionary. For example,       #
        # grads['W1'] should store the gradient on W1, and be a matrix of same size #
        #############################################################################
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        # Gradient of the loss with respect to the scores (softmax/cross-entropy backward pass)
        dscores = softmax_output.copy()
        dscores[range(N), list(y)] -= 1
        dscores /= N
        # Second-layer gradients, including the L2 regularization term
        grads['W2'] = h_output.T.dot(dscores) + 2 * reg * W2
        grads['b2'] = np.sum(dscores, axis=0)

        # Backpropagate into the hidden layer and through the ReLU
        dh = dscores.dot(W2.T)
        dh_relu = (h_output > 0) * dh
        grads['W1'] = X.T.dot(dh_relu) + 2 * reg * W1
        grads['b1'] = np.sum(dh_relu, axis=0)

        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

        return loss, grads

Code analysis:

The key point is whether y is None: if it is, the function only returns the class scores; otherwise it returns the loss and the gradients.
This function computes the loss and the gradients of all the parameters; it simply chains the two layers (with a ReLU in between) and the final softmax loss together.
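A quick way to sanity-check both passes is to compare the analytic gradients returned by loss against numerical gradients on a tiny network. The sketch below is illustrative rather than the assignment's own test code: the TwoLayerNet constructor arguments follow the assignment's __init__(input_size, hidden_size, output_size, std), the import path is assumed, and rel_error / numerical_grad are helper names introduced here.

import numpy as np
from cs231n.classifiers.neural_net import TwoLayerNet  # import path assumed from the assignment layout

def rel_error(x, y):
    # maximum relative error between two arrays
    return np.max(np.abs(x - y) / np.maximum(1e-8, np.abs(x) + np.abs(y)))

def numerical_grad(f, w, h=1e-5):
    # centered finite differences on each element of w
    grad = np.zeros_like(w)
    it = np.nditer(w, flags=['multi_index'])
    while not it.finished:
        ix = it.multi_index
        old = w[ix]
        w[ix] = old + h
        fp = f()
        w[ix] = old - h
        fm = f()
        w[ix] = old
        grad[ix] = (fp - fm) / (2 * h)
        it.iternext()
    return grad

np.random.seed(0)
net = TwoLayerNet(input_size=4, hidden_size=10, output_size=3, std=1e-1)
X = np.random.randn(5, 4)
y = np.array([0, 1, 2, 2, 1])

loss, grads = net.loss(X, y, reg=0.05)
for name in ['W1', 'b1', 'W2', 'b2']:
    num = numerical_grad(lambda: net.loss(X, y, reg=0.05)[0], net.params[name])
    print(name, 'max relative error:', rel_error(grads[name], num))

If the backward pass is correct, the relative errors should be very small (on the order of 1e-8 or less).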
2.2 Training Procedure

Training the network mainly involves the following steps (a sketch of the full loop follows the code excerpt below):

1. Receive the data and set up mini-batching and the iteration hyperparameters
2. Feed batches through the network to obtain the loss and the gradients
3. Update the network parameters W1, W2, b1, b2
4. Apply exponential decay to the learning rate
    def train(self, X, y, X_val, y_val,
              learning_rate
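
The original post is cut off at this point. As a rough sketch (not the author's code) of how such a train method completes the four steps above — the default hyperparameter values, the vanilla SGD update, and the use of the class's predict method are all assumptions:

    def train(self, X, y, X_val, y_val,
              learning_rate=1e-3, learning_rate_decay=0.95,
              reg=5e-6, num_iters=100, batch_size=200, verbose=False):
        num_train = X.shape[0]
        iterations_per_epoch = max(num_train // batch_size, 1)

        loss_history, train_acc_history, val_acc_history = [], [], []

        for it in range(num_iters):
            # 1. Sample a random mini-batch of training data
            batch_idx = np.random.choice(num_train, batch_size, replace=True)
            X_batch, y_batch = X[batch_idx], y[batch_idx]

            # 2. Forward/backward pass: loss and gradients for this batch
            loss, grads = self.loss(X_batch, y=y_batch, reg=reg)
            loss_history.append(loss)

            # 3. Vanilla SGD update of W1, b1, W2, b2
            for p in ('W1', 'b1', 'W2', 'b2'):
                self.params[p] -= learning_rate * grads[p]

            if verbose and it % 100 == 0:
                print('iteration %d / %d: loss %f' % (it, num_iters, loss))

            # 4. Once per epoch: record accuracies and decay the learning rate
            if it % iterations_per_epoch == 0:
                train_acc_history.append((self.predict(X_batch) == y_batch).mean())
                val_acc_history.append((self.predict(X_val) == y_val).mean())
                learning_rate *= learning_rate_decay

        return {
            'loss_history': loss_history,
            'train_acc_history': train_acc_history,
            'val_acc_history': val_acc_history,
        }

Decaying the learning rate once per epoch (rather than every iteration) keeps the step size roughly constant within an epoch while still shrinking it over the course of training.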