Stanford CS231n Course Notes: assignment2 FullyConnectedNets

Contents

  • Assignment goals
  • Layer implementations
  • Optimization method implementations
  • Notes on the assignment questions
  • References

I. Assignment goals

In the earlier Two-layer neural network assignment, the loss function and backpropagation were both implemented inside one function. That design is not modular and does not scale to more complex network architectures. The goal of this assignment is therefore to break each piece of functionality into its own module, so that complex networks can be assembled much more easily.

II. Layer implementations

1.affine_layer(layers.py)

1.1 affine_forward

Inputs:
    - x: A numpy array containing input data, of shape (N, d_1, ..., d_k)
    - w: A numpy array of weights, of shape (D, M)
    - b: A numpy array of biases, of shape (M,)

Returns a tuple of:
    - out: output, of shape (N, M)
    - cache: (x, w, b)

import numpy as np

def affine_forward(x, w, b):
    # Flatten each example to a row of length D = d_1 * ... * d_k, apply the
    # affine transform, and let b broadcast across the N rows.
    out = np.dot(x.reshape((x.shape[0], -1)), w) + b
    cache = (x, w, b)
    return out, cache

The forward pass is straightforward: out = x * w + b, with shapes (N, M) = (N, D) * (D, M) + (M,). The input is first reshaped from (N, d_1, ..., d_k) to (N, D), and adding b relies on NumPy broadcasting across the N rows.
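As a quick shape check, here is a small standalone sketch that calls the affine_forward defined above on a multi-dimensional input; the toy dimensions are made up for illustration:

import numpy as np

# N = 2 examples, each of shape (4, 5, 6), so D = 4*5*6 = 120; M = 3 output units.
x = np.random.randn(2, 4, 5, 6)
w = np.random.randn(120, 3)
b = np.random.randn(3)

out, cache = affine_forward(x, w, b)
print(out.shape)  # (2, 3): x is flattened to (2, 120), then (2,120) @ (120,3) + (3,)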

1.2 affine_backward

Inputs:
    - dout: Upstream derivative, of shape (N, M)
    - cache: Tuple of:
      - x: Input data, of shape (N, d_1, ... d_k)
      - w: Weights, of shape (D, M)
      - b: Biases, of shape (M,)

Returns a tuple of:
    - dx: Gradient with respect to x, of shape (N, d1, ..., d_k)
    - dw: Gradient with respect to w, of shape (D, M)
    - db: Gradient with respect to b, of shape (M,)

def affine_backward(dout, cache):
    x, w, b = cache
    # Chain rule for out = x_flat.dot(w) + b.
    dw = np.dot(x.reshape((x.shape[0], -1)).T, dout)  # (D, M)
    db = dout.sum(axis=0)                             # (M,)
    dx = np.dot(dout, w.T)                            # (N, D)
    dx = dx.reshape(x.shape)                          # back to (N, d_1, ..., d_k)
    return dx, dw, db

In the backward pass, the main thing to keep track of is the shapes.

dw = x.T * dout        (D, M) = (D, N) * (N, M)
db = sum of dout over the N axis (axis=0)        (M,)
dx = dout * w.T        (N, D) = (N, M) * (M, D), then reshaped back to (N, d_1, ..., d_k)
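These gradients can be verified with a centered-difference gradient check against the two functions above. This is a standalone sketch; the helper num_grad below is an ad-hoc illustration, not part of the assignment's gradient-check utilities:

import numpy as np

def num_grad(f, a, dout, h=1e-5):
    # Ad-hoc centered-difference gradient of sum(f(a) * dout) with respect to a.
    grad = np.zeros_like(a)
    it = np.nditer(a, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        old = a[idx]
        a[idx] = old + h
        pos = np.sum(f(a) * dout)
        a[idx] = old - h
        neg = np.sum(f(a) * dout)
        a[idx] = old
        grad[idx] = (pos - neg) / (2 * h)
        it.iternext()
    return grad

x = np.random.randn(3, 4, 5)       # D = 20
w = np.random.randn(20, 6)
b = np.random.randn(6)
dout = np.random.randn(3, 6)

_, cache = affine_forward(x, w, b)
dx, dw, db = affine_backward(dout, cache)

dx_num = num_grad(lambda v: affine_forward(v, w, b)[0], x, dout)
dw_num = num_grad(lambda v: affine_forward(x, v, b)[0], w, dout)
db_num = num_grad(lambda v: affine_forward(x, w, v)[0], b, dout)

# All three differences should be very small (close to numerical precision).
print(np.max(np.abs(dx - dx_num)), np.max(np.abs(dw - dw_num)), np.max(np.abs(db - db_num)))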

2.ReLU layer(layers.py)

2.1 relu_forward 

Input:
    - x: Inputs, of any shape

Returns a tuple of:
    - out: Output, of the same shape as x
    - cache: x

def relu_forward(x):
    # Elementwise ReLU: keep positive entries, zero out everything else.
    out = x * (x > 0)
    cache = x
    return out, cache

(x > 0) is a boolean mask; multiplying x by it keeps the entries that are greater than 0 and sets everything else to 0.
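A tiny standalone example (made-up numbers) shows that this mask form gives the same result as np.maximum(0, x):

import numpy as np

x = np.array([[-2.0, 0.0, 3.0],
              [ 1.5, -0.5, 2.0]])
out, cache = relu_forward(x)
print(out)                                  # negative entries are zeroed
print(np.allclose(out, np.maximum(0, x)))   # True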

2.2 relu_backward

Input:
    - dout: Upstream derivatives, of any shape
    - cache: Input x, of same shape as dout

Returns:
    - dx: Gradient with respect to x

def relu_backward(dout, cache):
    x = cache
    # The gradient flows only through positions where the forward input was positive.
    dx = dout * (x > 0)
    return dx

Likewise, only the positions where x > 0 pass the upstream gradient through; everywhere else the gradient is 0.
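Continuing the same kind of toy input (again just a sketch), the backward pass simply masks the upstream gradient; note that an entry equal to exactly 0 also gets zero gradient:

import numpy as np

x = np.array([[-2.0, 0.0, 3.0],
              [ 1.5, -0.5, 2.0]])
_, cache = relu_forward(x)
dout = np.ones_like(x)
dx = relu_backward(dout, cache)
print(dx)   # 1.0 where x > 0, 0.0 elsewhere (including at exactly 0)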

3.Loss layers: Softmax and SVM(layers.py)

def svm_loss(x, y):
    """
    Computes the loss and gradient for multiclass SVM classification.

    Inputs:
    - x: Input data, of shape (N, C) where x[i, j] is the score for the jth
      class for the ith input.
    - y: Vector of labels, of shape (N,) where y[i] is the label for x[i] and
      0 <= y[i] < C

    Returns a tuple of:
    - loss: Scalar giving the loss
    - dx: Gradient of the loss with respect to x
    """
    N = x.shape[0]
    correct_class_scores = x[np.arange(N), y]
    margins = np.maximum(0, x - correct_class_scores[:, np.newaxis] + 1.0)
    margins[np.arange(N), y] = 0
    loss = np.sum(margins) / N
    num_pos = np.sum(margins > 0, axis=1)
    dx = np.zeros_like(x)
    dx[margins > 0] = 1
    dx[np.arange(N), y] -= num_pos
    dx /= N
    return loss, dx
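As a sanity check (a sketch with random data): when the scores are tiny, nearly every wrong class has a margin of about 1, so the average hinge loss should come out close to C - 1:

import numpy as np

np.random.seed(0)
N, C = 100, 10
x = 0.001 * np.random.randn(N, C)   # small random scores
y = np.random.randint(C, size=N)

loss, dx = svm_loss(x, y)
print(loss)        # close to C - 1 = 9.0
print(dx.shape)    # (N, C)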

def softmax_loss(x, y):
    """
    Computes the loss and gradient for softmax classification.

    Inputs:
    - x: Input data, of shape (N, C) where x[i, j] is the score for the jth
      class for the ith input.
    - y: Vector of labels, of shape (N,) where y[i] is the label for x[i] and
      0 <= y[i] < C

    Returns a tuple of:
    - loss: Scalar giving the loss
    - dx: Gradient of the loss with respect to x
    """
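    # Sketch of one standard numerically stable implementation, filled in as an
    # assumption (not necessarily the exact code from the original post): shift
    # the scores before exponentiating so exp() cannot overflow, then the
    # gradient is simply probs with 1 subtracted at each correct class.
    shifted = x - np.max(x, axis=1, keepdims=True)
    Z = np.sum(np.exp(shifted), axis=1, keepdims=True)
    log_probs = shifted - np.log(Z)
    probs = np.exp(log_probs)
    N = x.shape[0]
    loss = -np.sum(log_probs[np.arange(N), y]) / N
    dx = probs.copy()
    dx[np.arange(N), y] -= 1
    dx /= N
    return loss, dx

With small random scores and C = 10 classes, this loss should come out near log(10) ≈ 2.3, which is the usual sanity check.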