CS231n_2020 (2): Linear Classification
Intro to Linear classification
As discussed in the first section, kNN-based image classification is very expensive. We therefore develop a more powerful approach, one that will later extend naturally to neural networks and convolutional neural networks. It has two main components: the first is a score function that maps the raw data to class scores, together with a loss function; the second is to recast this as an optimization problem, i.e. finding the parameters that minimize the loss function.
Linear score function
Assume a training set of images $x_i \in R^D$, each paired with a label $y_i$, where $i = 1 \dots N$ and $y_i \in \{ 1 \dots K \}$. That is, we have N examples (each of dimensionality D) and K distinct classes. For example, in CIFAR-10 we have N = 50,000 training images, each with D = 32 x 32 x 3 = 3072 pixels, and K = 10 classes.
We then define a score function $f: R^D \mapsto R^K$ that maps the raw image pixels to class scores.
Linear classifier
$$f(x_i, W, b) = W x_i + b$$
In CIFAR-10, $x_i$ contains all the pixels of the i-th image flattened into a single [3072 x 1] column vector, W is a [10 x 3072] matrix, and b is a [10 x 1] vector.
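As a quick sanity check of these shapes, here is a minimal NumPy sketch of the score computation for one CIFAR-10-sized image; the arrays below are randomly generated placeholders, not trained parameters:

```python
import numpy as np

# hypothetical, randomly initialized parameters (not a trained classifier)
W = np.random.randn(10, 3072) * 0.001  # [K x D] weights
b = np.zeros((10, 1))                  # [K x 1] biases
x_i = np.random.randn(3072, 1)         # one flattened 32x32x3 image, [D x 1]

scores = W.dot(x_i) + b                # [10 x 1], one score per class
print(scores.shape)                    # (10, 1)
```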
Interpreting a linear classifier
Loss function
Multiclass SVM
Writing the score of the j-th class as $s_j = f(x_i, W)_j$, the Multiclass SVM loss for the i-th example is:
$$L_i = \sum_{j\neq y_i} \max(0, s_j - s_{y_i} + \Delta)$$
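For a concrete example (the numbers are made up for illustration): suppose the scores are $s = [13, -7, 11]$, the first class is the correct one ($y_i = 0$), and $\Delta = 10$. Then $L_i = \max(0, -7 - 13 + 10) + \max(0, 11 - 13 + 10) = 0 + 8 = 8$: only the third class violates the desired margin and contributes to the loss.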
Adding an L2 regularization penalty $R(W) = \sum_k\sum_l W_{k,l}^2$ gives the complete Multiclass SVM loss:
$$L = \underbrace{ \frac{1}{N} \sum_i L_i }_\text{data loss} + \underbrace{ \lambda R(W) }_\text{regularization loss}$$
That is,

$$L = \frac{1}{N} \sum_i \sum_{j\neq y_i} \left[ \max(0, f(x_i; W)_j - f(x_i; W)_{y_i} + \Delta) \right] + \lambda \sum_k\sum_l W_{k,l}^2$$
Code:
```python
import numpy as np

def L_i(x, y, W):
  """
  unvectorized version. Compute the multiclass svm loss for a single example (x,y)
  - x is a column vector representing an image (e.g. 3073 x 1 in CIFAR-10)
    with an appended bias dimension in the 3073-rd position (i.e. bias trick)
  - y is an integer giving index of correct class (e.g. between 0 and 9 in CIFAR-10)
  - W is the weight matrix (e.g. 10 x 3073 in CIFAR-10)
  """
  delta = 1.0 # see notes about delta later in this section
  scores = W.dot(x) # scores becomes of size 10 x 1, the scores for each class
  correct_class_score = scores[y]
  D = W.shape[0] # number of classes, e.g. 10
  loss_i = 0.0
  for j in range(D): # iterate over all wrong classes
    if j == y:
      # skip for the true class to only loop over incorrect classes
      continue
    # accumulate loss for the i-th example
    loss_i += max(0, scores[j] - correct_class_score + delta)
  return loss_i

def L_i_vectorized(x, y, W):
  """
  A faster half-vectorized implementation. half-vectorized
  refers to the fact that for a single example the implementation contains
  no for loops, but there is still one loop over the examples (outside this function)
  """
  delta = 1.0
  scores = W.dot(x)
  # compute the margins for all classes in one vector operation
  margins = np.maximum(0, scores - scores[y] + delta)
  # on y-th position scores[y] - scores[y] canceled and gave delta. We want
  # to ignore the y-th position and only consider margin on max wrong class
  margins[y] = 0
  loss_i = np.sum(margins)
  return loss_i

def L(X, y, W):
  """
  fully-vectorized implementation :
  - X holds all the training examples as columns (e.g. 3073 x 50,000 in CIFAR-10)
  - y is array of integers specifying correct class (e.g. 50,000-D array)
  - W are weights (e.g. 10 x 3073)
  """
  # evaluate loss over all examples in X without using any for loops
  # left as exercise to reader in the assignment
  pass
```
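To tie the per-example functions above back to the full regularized objective, here is a minimal sketch that still loops over the columns of X (so it does not give away the fully-vectorized exercise). The function name `full_loss_naive` and the regularization strength `lam` are hypothetical placeholders:

```python
import numpy as np

def full_loss_naive(X, y, W, lam=0.1):
  """
  Sketch: average the per-example SVM losses over the N columns of X
  and add the L2 penalty lambda * sum_k sum_l W_{k,l}^2.
  Assumes L_i_vectorized from the block above; lam is a made-up value.
  """
  N = X.shape[1]
  data_loss = sum(L_i_vectorized(X[:, i:i+1], y[i], W) for i in range(N)) / N
  reg_loss = lam * np.sum(W * W)
  return data_loss + reg_loss
```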
Softmax classifier
softmax function:
$$f_j(z) = \frac{e^{z_j}}{\sum_k e^{z_k}}$$
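A minimal sketch (with made-up scores) showing that softmax squashes arbitrary real-valued scores into a probability distribution that sums to 1:

```python
import numpy as np

z = np.array([1.0, 2.0, 3.0])        # hypothetical class scores
p = np.exp(z) / np.sum(np.exp(z))    # softmax probabilities
print(p)                             # approx. [0.09, 0.24, 0.67]
print(p.sum())                       # 1.0
```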
The loss for the i-th example is then

$$L_i = -\log\left(\frac{e^{f_{y_i}}}{ \sum_j e^{f_j} }\right) \hspace{0.5in} \text{or equivalently} \hspace{0.5in} L_i = -f_{y_i} + \log\sum_j e^{f_j}$$
Cross-entropy: the cross-entropy between a true distribution p and an estimated distribution q is defined as

$$H(p,q) = - \sum_x p(x) \log q(x)$$

so the Softmax classifier is minimizing the cross-entropy between the estimated class probabilities and the true distribution, which here puts all of its mass on the correct class ($p = [0, \dots, 1, \dots, 0]$).
Code
```python
import numpy as np

f = np.array([123, 456, 789]) # example with 3 classes and each having large scores
p = np.exp(f) / np.sum(np.exp(f)) # Bad: Numeric problem, potential blowup

# instead: first shift the values of f so that the highest number is 0:
f -= np.max(f) # f becomes [-666, -333, 0]
p = np.exp(f) / np.sum(np.exp(f)) # safe to do, gives the correct answer
```
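Building on the shift trick above, here is a minimal sketch of the per-example Softmax loss; the function name `softmax_loss_i`, the score vector `f`, and the class index `y` are made-up placeholders:

```python
import numpy as np

def softmax_loss_i(f, y):
  """Compute L_i = -log(e^{f_y} / sum_j e^{f_j}) using the max-shift for stability."""
  f = f - np.max(f)                          # shift so the largest score is 0
  log_probs = f - np.log(np.sum(np.exp(f)))  # log of the softmax probabilities
  return -log_probs[y]                       # negative log-probability of the correct class

f = np.array([3.2, 5.1, -1.7])               # hypothetical class scores
print(softmax_loss_i(f, y=0))                # loss when class 0 is the correct one
```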
SVM vs Softmax
Interactive Web Demo of Linear Classification
Summary
- Defined a score function mapping image pixels to class scores (a linear function in this note).
- Unlike kNN classification, once the parameters are learned the training set can be discarded, and new images can be classified quickly.
- Introduced two loss functions commonly used with linear classifiers: SVM and Softmax. The smaller the loss, the better the classifier's predictions on the training data.