# cs231n Assignment 1: Softmax

## Softmax classifier

### Loss function

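The softmax (cross-entropy) loss for example $i$ is

$$L_i = -\log\left(\frac{e^{f_{y_i}}}{\sum_j e^{f_j}}\right) = -f_{y_i} + \log\sum_j e^{f_j}$$

where $f = x_i W$ is the vector of class scores and $y_i$ is the correct label.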

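The SVM only has eyes for its one favorite crush. Softmax drags out every backup candidate, scores all of them, and then normalizes the scores on top of that.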
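To make the comparison concrete, here is a minimal sketch that computes both losses for one example. The score values, the variable names, and the SVM margin of 1 are assumptions chosen for illustration; they do not come from the assignment code.

```python
import math

# Toy class scores for one example (illustrative values, not from the assignment).
scores = [3.2, 5.1, -1.7]
correct = 0  # index of the true class

# Multiclass SVM (hinge) loss, assuming a margin of 1:
# only classes that come within the margin of the correct score contribute.
hinge = sum(
    max(0.0, s - scores[correct] + 1.0)
    for j, s in enumerate(scores)
    if j != correct
)

# Softmax: every class gets a probability, and the probabilities sum to 1.
shifted = [s - max(scores) for s in scores]
exps = [math.exp(s) for s in shifted]
total = sum(exps)
probs = [e / total for e in exps]
cross_entropy = -math.log(probs[correct])

print("hinge loss:", hinge)                   # about 2.9
print("softmax probabilities:", probs)
print("cross-entropy loss:", cross_entropy)   # about 2.04
```

Here the third class sits far below the margin, so it contributes nothing to the hinge loss, while it still receives a small nonzero softmax probability.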

### Numerical stability
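The implementation in this section subtracts the largest score before exponentiating (`shift_scores`). Subtracting a constant from every score leaves the softmax probabilities unchanged, but it keeps the exponentials from overflowing. The sketch below illustrates this in plain Python. The helper names `softmax_unshifted` and `softmax_shifted` are hypothetical, and `math.exp` stands in for `np.exp` (which would return `inf` rather than raise, turning the ratio into `nan`).

```python
import math

def softmax_unshifted(scores):
    """Softmax computed directly; overflows for large scores."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_shifted(scores):
    """Softmax with the maximum subtracted first, so the largest exponent is 0."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

big_scores = [1000.0, 1001.0, 1002.0]

try:
    softmax_unshifted(big_scores)
    unshifted_overflowed = False
except OverflowError:
    # math.exp(1000.0) exceeds the range of a double.
    unshifted_overflowed = True

stable_probs = softmax_shifted(big_scores)
# The shift does not change the result: the same probabilities come out
# for scores that differ only by a constant offset.
reference_probs = softmax_unshifted([0.0, 1.0, 2.0])

print(unshifted_overflowed)
print(stable_probs)
print(reference_probs)
```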

```python
import numpy as np


def softmax_loss_naive(W, X, y, reg):
    """
    Softmax loss function, naive implementation (with loops)

    Inputs have dimension D, there are C classes, and we operate on minibatches
    of N examples.

    Inputs:
    - W: A numpy array of shape (D, C) containing weights.
    - X: A numpy array of shape (N, D) containing a minibatch of data.
    - y: A numpy array of shape (N,) containing training labels; y[i] = c means
      that X[i] has label c, where 0 <= c < C.
    - reg: (float) regularization strength

    Returns a tuple of:
    - loss as single float
    - gradient with respect to weights W; an array of same shape as W
    """
    # Initialize the loss and gradient to zero.
    loss = 0.0
    dW = np.zeros_like(W)

    #############################################################################
    # TODO: Compute the softmax loss and its gradient using explicit loops.     #
    # Store the loss in loss and the gradient in dW. If you are not careful     #
    # here, it is easy to run into numeric instability. Don't forget the        #
    # regularization!                                                           #
    #############################################################################
    # Get shapes
    num_classes = W.shape[1]
    num_train = X.shape[0]

    for i in range(num_train):
        scores = X[i].dot(W)
        # Shift so the largest score is 0; this avoids overflow in np.exp
        # without changing the softmax probabilities.
        shift_scores = scores - np.max(scores)
        exp_sum = np.sum(np.exp(shift_scores))
        loss += -shift_scores[y[i]] + np.log(exp_sum)
        for j in range(num_classes):
            softmax_output = np.exp(shift_scores[j]) / exp_sum
            if j == y[i]:
                # Correct class: gradient is (p_j - 1) * x_i
                dW[:, j] += (-1 + softmax_output) * X[i]
            else:
                # Other classes: gradient is p_j * x_i
                dW[:, j] += softmax_output * X[i]

    # Average over the minibatch and add L2 regularization.
    loss /= num_train
    loss += 0.5 * reg * np.sum(W * W)
    dW = dW / num_train + reg * W

    return loss, dW
```
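The two branches of the inner loop follow from differentiating the per-example loss with respect to the scores. A short derivation, where $p_j$ (the softmax probability of class $j$) and $f = x_i W$ are notation introduced here for illustration:

$$p_j = \frac{e^{f_j}}{\sum_k e^{f_k}}, \qquad L_i = -f_{y_i} + \log\sum_k e^{f_k}$$

$$\frac{\partial L_i}{\partial f_j} = p_j - \mathbb{1}[j = y_i]$$

$$\frac{\partial L_i}{\partial W_{:,j}} = \left(p_j - \mathbb{1}[j = y_i]\right) x_i$$

So the correct class receives $(p_j - 1)\,x_i$ and every other class receives $p_j\,x_i$. Averaging over the $N$ examples and differentiating the regularization term $\tfrac{1}{2}\,\mathrm{reg}\,\lVert W\rVert^2$ gives

$$\nabla_W L = \frac{1}{N}\sum_{i=1}^{N} \nabla_W L_i + \mathrm{reg}\cdot W$$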

Since the weight matrix W is initialized randomly with small values, the scores for all classes are close to zero, so the predicted probability of each class is approximately uniform at 1/10, where 10 is the number of classes. The cross-entropy for each example is therefore about -log(0.1) ≈ 2.30, and the initial loss should be close to that value.
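This sanity check can be reproduced on a toy problem. The sketch below is only an illustration under stated assumptions: the input dimension, the weight scale of 1e-4, the random input, and the label are made up here rather than taken from the assignment notebook.

```python
import math
import random

random.seed(0)

num_classes = 10  # number of classes, as in the text above
dim = 50          # toy input dimension (assumption, for illustration only)

# Small random weights (assumed scale 1e-4), so every score is close to zero.
W = [[random.gauss(0.0, 1.0) * 1e-4 for _ in range(num_classes)]
     for _ in range(dim)]
x = [random.uniform(-1.0, 1.0) for _ in range(dim)]  # one made-up input
y = 3                                                # arbitrary correct label

# scores[c] = x . W[:, c]
scores = [sum(x[d] * W[d][c] for d in range(dim)) for c in range(num_classes)]

# Same numerically stable loss as in the naive implementation.
max_score = max(scores)
shifted = [s - max_score for s in scores]
loss = -shifted[y] + math.log(sum(math.exp(s) for s in shifted))

expected = -math.log(1.0 / num_classes)  # -log(0.1)
print("loss:", loss)
print("-log(0.1):", expected)
```

With weights this small the two printed values should agree to a few decimal places. With larger weights the scores spread out and the initial loss drifts away from -log(0.1).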

softmax on raw pixels final test set accuracy: 0.334000