A small test to check that the functions we wrote are correct.
First, the initialization:
import numpy as np

input_size = 4
hidden_size = 10
num_classes = 3
num_inputs = 5

def init_toy_model():
    # Fixed seed so the toy weights are reproducible.
    np.random.seed(0)
    return TwoLayerNet(input_size, hidden_size, num_classes, std=1e-1)  # TwoLayerNet is defined below

def init_toy_data():
    # Fixed seed so the toy inputs are reproducible.
    np.random.seed(1)
    X = 10 * np.random.randn(num_inputs, input_size)
    y = np.array([0, 1, 2, 2, 1])
    return X, y

net = init_toy_model()
X, y = init_toy_data()
print(X.shape, y.shape)
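Since both seeds are fixed, this snippet is fully reproducible; the final print should output the line below, matching the shapes discussed in the next section:

(5, 4) (5,)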
Initialization
class TwoLayerNet(object):
    def __init__(self, input_size, hidden_size, output_size, std=1e-4):
        """
        Initialize the model. Weights are initialized to small random values and
        biases are initialized to zero. Weights and biases are stored in the
        variable self.params, which is a dictionary with the following keys:

        W1: First layer weights; has shape (D, H)
        b1: First layer biases; has shape (H,)
        W2: Second layer weights; has shape (H, C)
        b2: Second layer biases; has shape (C,)

        Inputs:
        - input_size: The dimension D of the input data.
        - hidden_size: The number of neurons H in the hidden layer.
        - output_size: The number of classes C.
        """
        self.params = {}
        self.params['W1'] = std * np.random.randn(input_size, hidden_size)
        self.params['b1'] = np.zeros(hidden_size)
        self.params['W2'] = std * np.random.randn(hidden_size, output_size)
        self.params['b2'] = np.zeros(output_size)
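To confirm what __init__ produces, here is a quick sketch that prints the shape of every parameter, reusing the toy net created above:

for name, param in sorted(net.params.items()):
    print(name, param.shape)  # prints W1 (4, 10), W2 (10, 3), b1 (10,), b2 (3,)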
About W1's dimensions: each of the input features gets its own weight for every hidden neuron, so multiplying by W1 turns each sample into hidden_size scores. These scores then go through the activation function, with b1 acting as a bias, something like a threshold for that comparison (my own take), so some scores end up not contributing. The activated scores are then multiplied by W2, whose output dimension is output_size; in other words, the network outputs one score per class, and these are the final class scores. That is roughly what initializing these four parameters means.
The shapes involved:
X.shape = (5, 4)
y.shape = (5,)
net.params['W1'].shape = (4, 10)
net.params['b1'].shape = (10,)
net.params['W2'].shape = (10, 3)
net.params['b2'].shape = (3,)
Once you know these dimension relationships, it is clear that the product is X*W rather than W*X; work this out from the actual shapes when you write the code instead of memorizing it, as the sketch below verifies.
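As a sanity check, here is a minimal forward-pass sketch on the toy data, tracing the shapes through both layers (ReLU is assumed as the activation; this section has not named the nonlinearity yet):

hidden = np.maximum(0, X.dot(net.params['W1']) + net.params['b1'])  # (5, 4) x (4, 10) -> (5, 10)
scores = hidden.dot(net.params['W2']) + net.params['b2']            # (5, 10) x (10, 3) -> (5, 3)
print(scores.shape)  # (5, 3): one score per class for each of the 5 toy samples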
Computing the loss and gradients
def loss(self, X, y=None, reg=0.0):
    """
    Compute the loss and gradients for a two-layer fully connected neural network.