cs231n assignment1 softmax

Softmax

The Softmax loss for one example is

$$L_i = -\log p_{y_i} = -\log\left(\frac{e^{f_{y_i}}}{\sum_j e^{f_j}}\right) = -f_{y_i} + \log\sum_j e^{f_j}$$

For the gradient derivation, see the article: 【学习笔记】cs231n中assignment1中的 Softmax exercise
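For reference, the per-class gradient that this derivation arrives at (and that the code below implements) is the standard softmax result:

$$\frac{\partial L_i}{\partial f_j} = \frac{e^{f_j}}{\sum_k e^{f_k}} - \mathbb{1}(j = y_i) = p_j - \mathbb{1}(j = y_i)$$

Since $f = x_i W$, column $j$ of the weight gradient is $(p_j - \mathbb{1}(j = y_i))\, x_i^T$, which is exactly the update applied to dW[:, j] in the loop below.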

Code:
The softmax_loss_naive() function in softmax.py

# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    num_train = X.shape[0]
    num_class = W.shape[1]
    for i in range(num_train):
        score = X[i].dot(W)
        score -= np.max(score)  # shift scores for numerical stability
        correct_score = score[y[i]]  # score of the correct class
        exp_sum = np.sum(np.exp(score))
        loss += np.log(exp_sum) - correct_score
        for j in range(num_class):
            if j == y[i]:
                dW[:, j] += np.exp(score[j]) / exp_sum * X[i] - X[i]
            else:
                dW[:, j] += np.exp(score[j]) / exp_sum * X[i]
    loss /= num_train
    loss += 0.5 * reg * np.sum(W * W)
    dW /= num_train
    dW += reg * W

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
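As a sanity check (not part of the required code), the analytic gradient can be compared against a central-difference numerical gradient on a small random problem. The sketch below assumes softmax_loss_naive is importable from cs231n.classifiers.softmax, as in the assignment skeleton; the problem sizes are arbitrary.

```python
import numpy as np
from cs231n.classifiers.softmax import softmax_loss_naive  # assumed path from the assignment skeleton

# small random problem: 5 samples, 4 features, 3 classes
np.random.seed(0)
W = np.random.randn(4, 3) * 0.001
X = np.random.randn(5, 4)
y = np.random.randint(3, size=5)

loss, dW = softmax_loss_naive(W, X, y, reg=0.0)

# numerical gradient via central differences
h = 1e-5
dW_num = np.zeros_like(W)
it = np.nditer(W, flags=['multi_index'])
while not it.finished:
    idx = it.multi_index
    old = W[idx]
    W[idx] = old + h
    loss_plus, _ = softmax_loss_naive(W, X, y, reg=0.0)
    W[idx] = old - h
    loss_minus, _ = softmax_loss_naive(W, X, y, reg=0.0)
    W[idx] = old  # restore
    dW_num[idx] = (loss_plus - loss_minus) / (2 * h)
    it.iternext()

print('max relative error:',
      np.max(np.abs(dW - dW_num) / (np.abs(dW) + np.abs(dW_num) + 1e-8)))
```

The relative error should be on the order of 1e-7 or smaller if the analytic gradient is correct.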

The softmax_loss_vectorized() function in softmax.py

# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    num_train = X.shape[0]

    score = X.dot(W)
    # subtract the max of each row (axis=1); score is still 500 x 10
    score -= np.max(score, axis=1)[:, np.newaxis]
    # scores of the correct classes, shape (500,)
    correct_score = score[np.arange(num_train), y]
    exp_score = np.exp(score)
    # sum_exp_score has shape (500,)
    sum_exp_score = np.sum(exp_score, axis=1)
    # compute the loss
    loss = np.sum(np.log(sum_exp_score) - correct_score)
    loss /= num_train
    loss += 0.5 * reg * np.sum(W * W)

    # compute the gradient
    margin = exp_score / sum_exp_score.reshape(num_train, 1)
    margin[np.arange(num_train), y] -= 1
    dW = X.T.dot(margin)
    dW /= num_train
    dW += reg * W

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
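As the notebook does, the vectorized implementation can be checked against the naive one: the two losses should agree to numerical precision and the Frobenius norm of the gradient difference should be close to zero. A minimal sketch, again assuming both functions are importable from cs231n.classifiers.softmax and that the data uses the usual 3073-dimensional bias-augmented features:

```python
import numpy as np
from cs231n.classifiers.softmax import softmax_loss_naive, softmax_loss_vectorized  # assumed import path

np.random.seed(1)
W = np.random.randn(3073, 10) * 0.0001
X = np.random.randn(500, 3073)
y = np.random.randint(10, size=500)

loss_naive, grad_naive = softmax_loss_naive(W, X, y, reg=5e-6)
loss_vec, grad_vec = softmax_loss_vectorized(W, X, y, reg=5e-6)

print('loss difference:', abs(loss_naive - loss_vec))                        # should be ~0
print('gradient difference:', np.linalg.norm(grad_naive - grad_vec, 'fro'))  # should be ~0
```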

Next, pick suitable hyperparameters.
In the softmax notebook:

# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
iters = 2000
for lr in learning_rates:
    for rs in regularization_strengths:
        softmax = Softmax()
        loss_hist = softmax.train(X_train, y_train, learning_rate=lr, reg=rs, num_iters=iters)
        plt.plot(loss_hist)
        plt.xlabel('Iteration number')
        plt.ylabel('Loss value')
        plt.show()

        y_train_pred = softmax.predict(X_train)
        acc_train = np.mean(y_train == y_train_pred)
        y_val_pred = softmax.predict(X_val)
        acc_val = np.mean(y_val == y_val_pred)

        results[(lr, rs)] = (acc_train, acc_val)

        if best_val < acc_val:
            best_val = acc_val
            best_softmax = softmax

# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

For this hyperparameter search you can sweep several values at once with a fairly large number of iterations, then gradually narrow the ranges based on the validation accuracy.
The final test-set accuracy reaches 0.38.
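With best_softmax selected on the validation set, the test accuracy is measured the same way; a short sketch assuming X_test and y_test come from the notebook's data-loading cell:

```python
y_test_pred = best_softmax.predict(X_test)
test_accuracy = np.mean(y_test == y_test_pred)
print('softmax on raw pixels final test set accuracy: %f' % test_accuracy)
```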

Inline Question

Inline Question 1

Why do we expect our loss to be close to -log(0.1)? Explain briefly.

$\color{blue}{\textit{Your Answer:}}$

Because W is initialized randomly, each of the 10 classes is equally likely, so the probability assigned to the correct class is about 1/10 and the loss is therefore close to -log(0.1).
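A quick numerical check of this value:

```python
import numpy as np
print(-np.log(0.1))  # ≈ 2.3026; the initial sanity-check loss should be close to this
```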

Inline Question 2 - True or False

Suppose the overall training loss is defined as the sum of the per-datapoint loss over all training examples. It is possible to add a new datapoint to a training set that would leave the SVM loss unchanged, but this is not the case with the Softmax classifier loss.

$\color{blue}{\textit{Your Answer:}}$
True.

$\color{blue}{\textit{Your Explanation:}}$
For the SVM loss, a newly added datapoint that is classified correctly with every margin satisfied contributes exactly 0, so the total loss can remain unchanged; for the Softmax loss, every datapoint contributes a strictly positive term regardless of whether it is classified correctly, so even if that term is close to 0, the overall loss still changes.
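A tiny numeric illustration (the scores below are made up): a new datapoint whose correct class wins every margin adds exactly 0 to the SVM loss, but still adds a positive amount to the Softmax loss.

```python
import numpy as np

scores = np.array([10.0, 1.0, -2.0])  # hypothetical scores; class 0 is correct and wins by a wide margin
y = 0

# multiclass SVM (hinge) loss with margin 1: all margins are satisfied, so the contribution is exactly 0
svm_loss = np.sum(np.maximum(0, scores - scores[y] + 1)) - 1  # subtract the j == y term
print('SVM loss:', svm_loss)  # 0.0

# softmax cross-entropy loss: strictly positive even for a confident correct prediction
p = np.exp(scores - np.max(scores))
p /= p.sum()
print('softmax loss:', -np.log(p[y]))  # small but > 0
```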
