cs231n assignment1 softmax

Softmax

The softmax loss for a single example is
$$L_{i} = -\log p_{y_{i}} = -\log \left(\frac{e^{f_{y_{i}}}}{\sum_{j} e^{f_{j}}}\right) = -f_{y_{i}} + \log \sum_{j} e^{f_{j}}$$

Gradient derivation: see the article 【学习笔记】cs231n中assignment1中的 Softmax exercise.
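The result of that derivation, with the softmax probabilities written as $p_j = \frac{e^{f_j}}{\sum_k e^{f_k}}$, is the per-example gradient the code below implements:

$$\frac{\partial L_i}{\partial W_{:,j}} = \left(p_j - \mathbb{1}[j = y_i]\right) x_i$$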

Code:
The softmax_loss_naive() function in softmax.py

# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    num_train = X.shape[0]
    num_class = W.shape[1]
    for i in range(num_train):
        score = X[i].dot(W)
        score -= np.max(score)  # shift by the max score for numerical stability
        correct_score = score[y[i]]  # score of the correct class
        exp_sum = np.sum(np.exp(score))
        loss += np.log(exp_sum) - correct_score
        for j in range(num_class):  # range, not the Python 2-only xrange
            if j == y[i]:
                dW[:, j] += np.exp(score[j]) / exp_sum * X[i] - X[i]
            else:
                dW[:, j] += np.exp(score[j]) / exp_sum * X[i]
    loss /= num_train
    loss += 0.5 * reg * np.sum(W * W)
    dW /= num_train
    dW += reg * W

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
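A quick sanity check for this implementation is the numerical gradient check the assignment uses elsewhere. The sketch below assumes the assignment's cs231n package (with grad_check_sparse) is on the path and uses random data of CIFAR-10 shapes; the notebook itself uses the real dev split.

import numpy as np
from cs231n.classifiers.softmax import softmax_loss_naive
from cs231n.gradient_check import grad_check_sparse

# Small random weights and a fake dev batch, just to exercise the code path.
W = np.random.randn(3073, 10) * 0.0001
X_dev = np.random.randn(500, 3073)
y_dev = np.random.randint(10, size=500)

loss, grad = softmax_loss_naive(W, X_dev, y_dev, 0.0)
print('loss: %f, expected roughly -log(0.1) = %f' % (loss, -np.log(0.1)))

# Compare the analytic gradient against numerical estimates at 10 random entries.
f = lambda w: softmax_loss_naive(w, X_dev, y_dev, 0.0)[0]
grad_check_sparse(f, W, grad, 10)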

The softmax_loss_vectorized() function in softmax.py

# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    num_train = X.shape[0]

    score = X.dot(W)
    # Subtract each row's max (axis=1) for numerical stability; score stays 500 x 10
    score -= np.max(score, axis=1)[:, np.newaxis]
    # Scores of the correct classes, shape (500,)
    correct_score = score[range(num_train), y]
    exp_score = np.exp(score)
    # Row-wise sum of exponentials, shape (500,)
    sum_exp_score = np.sum(exp_score, axis=1)
    # Compute the loss
    loss = np.sum(np.log(sum_exp_score) - correct_score)
    loss /= num_train
    loss += 0.5 * reg * np.sum(W * W)

    # Compute the gradient: softmax probabilities, minus 1 at the correct class
    margin = exp_score / sum_exp_score.reshape(num_train, 1)
    margin[np.arange(num_train), y] -= 1
    dW = X.T.dot(margin)
    dW /= num_train
    dW += reg * W

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
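To confirm the two implementations agree and to see the speedup, the notebook compares them roughly like this (a sketch reusing W, X_dev and y_dev from the setup above; timings will vary):

import time
import numpy as np
from cs231n.classifiers.softmax import softmax_loss_naive, softmax_loss_vectorized

tic = time.time()
loss_naive, grad_naive = softmax_loss_naive(W, X_dev, y_dev, 0.000005)
print('naive loss: %e computed in %fs' % (loss_naive, time.time() - tic))

tic = time.time()
loss_vectorized, grad_vectorized = softmax_loss_vectorized(W, X_dev, y_dev, 0.000005)
print('vectorized loss: %e computed in %fs' % (loss_vectorized, time.time() - tic))

# Both the losses and the gradients should match to numerical precision.
print('loss difference: %f' % np.abs(loss_naive - loss_vectorized))
print('gradient difference: %f' % np.linalg.norm(grad_naive - grad_vectorized, ord='fro'))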

Next, choose suitable hyperparameters in the softmax notebook:

# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
iters = 2000
for lr in learning_rates:
    for rs in regularization_strengths:
        softmax = Softmax()
        loss_hist = softmax.train(X_train, y_train, learning_rate=lr, reg=rs, num_iters=iters)
        # Plot the loss curve for this (lr, reg) combination
        plt.plot(loss_hist)
        plt.xlabel('Iteration number')
        plt.ylabel('Loss value')
        plt.show()

        # Evaluate on the training and validation sets
        y_train_pred = softmax.predict(X_train)
        acc_train = np.mean(y_train == y_train_pred)
        y_val_pred = softmax.predict(X_val)
        acc_val = np.mean(y_val == y_val_pred)

        results[(lr, rs)] = (acc_train, acc_val)

        # Keep the model with the best validation accuracy
        if best_val < acc_val:
            best_val = acc_val
            best_softmax = softmax

# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

This hyperparameter search can sweep several values of each parameter at once; use a fairly large number of iterations and then progressively narrow the ranges based on validation accuracy.
The final test-set accuracy reaches about 0.38.
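As an illustration of the narrowing strategy, a coarse starting grid might look like the sketch below (the exact values are hypothetical, not the ones used above); results and best_val come from the sweep cell:

# Hypothetical coarse grid as a starting point; refine around the best cell afterwards.
learning_rates = [1e-8, 1e-7, 5e-7, 1e-6]
regularization_strengths = [1e3, 5e3, 1e4, 2.5e4, 5e4]

# After the sweep, print the results and tighten the ranges around the winner,
# e.g. if (5e-7, 1e4) is best, retry lr in [2e-7, 1e-6] and reg in [5e3, 2.5e4]
# with more iterations.
for lr, rs in sorted(results):
    train_acc, val_acc = results[(lr, rs)]
    print('lr %e reg %e  train accuracy: %f  val accuracy: %f' % (lr, rs, train_acc, val_acc))
print('best validation accuracy achieved: %f' % best_val)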

Inline Question

Inline Question 1

Why do we expect our loss to be close to -log(0.1)? Explain briefly.

Your Answer:

Because W is initialized randomly, the scores carry no information, so each of the 10 classes is assigned roughly equal probability; the probability of the correct class is about 1/10 and the loss is therefore about -log(0.1).
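Numerically, $-\log(0.1) = \log 10 \approx 2.303$, so the first loss printed with the randomly initialized W should come out near 2.3.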

Inline Question 2 - True or False

Suppose the overall training loss is defined as the sum of the per-datapoint loss over all training examples. It is possible to add a new datapoint to a training set that would leave the SVM loss unchanged, but this is not the case with the Softmax classifier loss.

Your Answer:
True

Your Explanation:
For the SVM loss, a newly added datapoint contributes exactly zero loss as long as its correct-class score beats every other class score by at least the margin, so the total training loss can stay unchanged. For the softmax loss, the probability of the correct class is never exactly 1, so every datapoint contributes a strictly positive (even if tiny) loss, and adding a new datapoint always changes the total.
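A tiny numerical example (my own, not from the assignment): with scores $f = (10, -10, -10)$ and correct class 0, the SVM loss is $\max(0, -10 - 10 + 1) + \max(0, -10 - 10 + 1) = 0$, while the softmax loss is $-\log\frac{e^{10}}{e^{10} + 2e^{-10}} \approx 4 \times 10^{-9}$, small but strictly positive.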
