SVM/hinge loss function

loss function

Assignment 1 of the CS231n course involves an SVM loss function; after some digging, it turns out to refer to the hinge loss. Its formula is:

$L_i = \sum_{j \neq y_i} \max(0,\; w_j^T x_i - w_{y_i}^T x_i + \Delta)$
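As a quick sanity check, consider a hypothetical single example (not from the assignment) with scores [3.2, 5.1, -1.7], correct class 0 and Δ = 1: the loss is max(0, 5.1 − 3.2 + 1) + max(0, −1.7 − 3.2 + 1) = 2.9 + 0 = 2.9. A minimal sketch:

import numpy as np

scores = np.array([3.2, 5.1, -1.7])  # hypothetical class scores for one example
correct = 0                          # index of the correct class
delta = 1.0

margins = np.maximum(0, scores - scores[correct] + delta)
margins[correct] = 0                 # the correct class does not contribute
print(margins.sum())                 # 2.9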

  • Loop-based (naive) implementation:
import numpy as np

def svm_loss_naive(W, X, y, reg):
  """
  Structured SVM loss function, naive implementation (with loops).

  Inputs have dimension D, there are C classes, and we operate on minibatches
  of N examples.

  Inputs:
  - W: A numpy array of shape (D, C) containing weights.
  - X: A numpy array of shape (N, D) containing a minibatch of data.
  - y: A numpy array of shape (N,) containing training labels; y[i] = c means
    that X[i] has label c, where 0 <= c < C.
  - reg: (float) regularization strength

  Returns a tuple of:
  - loss as single float
  - gradient with respect to weights W; an array of same shape as W
  """
  dW = np.zeros(W.shape) # initialize the gradient as zero

  # compute the loss and the gradient
  num_classes = W.shape[1]
  num_train = X.shape[0]
  loss = 0.0
  for i in range(num_train):
    scores = X[i].dot(W)
    correct_class_score = scores[y[i]]
    for j in range(num_classes):
      margin = scores[j] - correct_class_score + 1 # note delta = 1
      if margin > 0:
        if j != y[i]:
          loss += margin


  # Right now the loss is a sum over all training examples, but we want it
  # to be an average instead so we divide by num_train.
  loss /= num_train

  # Add regularization to the loss.
  loss += 0.5 * reg * np.sum(W * W)

  return loss, dW
  • Corresponding vectorized code (my own implementation):
def svm_loss_vectorized(W, X, y, reg):
  """
  Structured SVM loss function, vectorized implementation.

  Inputs and outputs are the same as svm_loss_naive.
  """
  loss = 0.0
  dW = np.zeros(W.shape) # initialize the gradient as zero

  scores = X.dot(W)
  num_train = X.shape[0]
  rows = np.arange(num_train)
  correct_class_score = scores[rows, y]
  margins = np.maximum(0, scores - correct_class_score.reshape(num_train, 1) + 1)
  margins[rows, y] = 0  # the correct class contributes no margin
  loss = np.sum(margins)
  loss /= num_train
  loss += 0.5 * reg * np.sum(W * W)
  # The gradient dW is derived and filled in below.

  return loss, dW
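To check that the naive and vectorized versions agree, a quick comparison on random data can be run. This is a minimal sketch with made-up shapes; svm_loss_naive and svm_loss_vectorized are the functions above (the gradient is ignored here since it is derived in the next section):

import numpy as np

np.random.seed(0)
D, C, N = 5, 3, 10                   # made-up dimensions: features, classes, examples
W = np.random.randn(D, C) * 0.01
X = np.random.randn(N, D)
y = np.random.randint(C, size=N)

loss_naive, _ = svm_loss_naive(W, X, y, reg=1e-3)
loss_vec, _ = svm_loss_vectorized(W, X, y, reg=1e-3)
print(loss_naive, loss_vec)          # the two losses should match up to float error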

gradient of loss function

The gradient of the hinge loss is derived as follows:
From the formula $L_i = \sum_{j \neq y_i} \max(0,\; w_j^T x_i - w_{y_i}^T x_i + \Delta)$, consider two cases,

  • $j \neq y_i$:
    Taking the partial derivative with respect to $w_j$, only the first term of $w_j^T x_i - w_{y_i}^T x_i + \Delta$ matters; the second term is a constant. If $w_j^T x_i - w_{y_i}^T x_i + \Delta > 0$, the derivative is $x_i$; if $w_j^T x_i - w_{y_i}^T x_i + \Delta \le 0$, the max returns the constant 0, so the derivative is 0. In summary, the partial derivative in this case is:
    $\nabla_{w_j} L_i = \mathbb{1}(w_j^T x_i - w_{y_i}^T x_i + \Delta > 0)\, x_i$
  • $j = y_i$:
    Taking the partial derivative with respect to $w_{y_i}$, only the second term of $w_j^T x_i - w_{y_i}^T x_i + \Delta$ matters; the first term is a constant (since $j \neq y_i$ inside the sum). If $w_j^T x_i - w_{y_i}^T x_i + \Delta > 0$, the derivative is $-x_i$; if $w_j^T x_i - w_{y_i}^T x_i + \Delta \le 0$, the max returns the constant 0, so the derivative is again 0. Note that because $w_{y_i}$ appears in every term of the sum over $j \neq y_i$, the positive margins accumulate, giving:

    $\nabla_{w_{y_i}} L_i = -\sum_{j \neq y_i} \mathbb{1}(w_j^T x_i - w_{y_i}^T x_i + \Delta > 0)\, x_i$
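One way to sanity-check this derivation is a numerical gradient check: estimate the gradient with central differences and compare it against the analytical gradient returned by the implementation below. This is a minimal sketch; numerical_gradient is a hypothetical helper, not the assignment's grad_check_sparse:

import numpy as np

def numerical_gradient(f, W, h=1e-5):
  # Central-difference estimate of df/dW, evaluated one entry of W at a time.
  grad = np.zeros_like(W)
  it = np.nditer(W, flags=['multi_index'])
  while not it.finished:
    ix = it.multi_index
    old = W[ix]
    W[ix] = old + h
    fxph = f(W)                      # f evaluated with this entry nudged up
    W[ix] = old - h
    fxmh = f(W)                      # f evaluated with this entry nudged down
    W[ix] = old                      # restore the original value
    grad[ix] = (fxph - fxmh) / (2 * h)
    it.iternext()
  return grad

# Example usage against the gradient-returning svm_loss_naive defined below:
# loss_fn = lambda W: svm_loss_naive(W, X, y, reg=0.0)[0]
# num_grad = numerical_gradient(loss_fn, W)
# _, ana_grad = svm_loss_naive(W, X, y, reg=0.0)
# print(np.max(np.abs(num_grad - ana_grad)))   # should be close to 0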

  • Loop-based implementation (with gradient)

def svm_loss_naive(W, X, y, reg):
  """
  Structured SVM loss function, naive implementation (with loops).

  Inputs have dimension D, there are C classes, and we operate on minibatches
  of N examples.

  Inputs:
  - W: A numpy array of shape (D, C) containing weights.
  - X: A numpy array of shape (N, D) containing a minibatch of data.
  - y: A numpy array of shape (N,) containing training labels; y[i] = c means
    that X[i] has label c, where 0 <= c < C.
  - reg: (float) regularization strength

  Returns a tuple of:
  - loss as single float
  - gradient with respect to weights W; an array of same shape as W
  """
  dW = np.zeros(W.shape) # initialize the gradient as zero

  # compute the loss and the gradient
  num_classes = W.shape[1]
  num_train = X.shape[0]
  loss = 0.0
  for i in range(num_train):
    scores = X[i].dot(W)
    correct_class_score = scores[y[i]]
    for j in range(num_classes):
      margin = scores[j] - correct_class_score + 1 # note delta = 1
      if margin > 0:
        if j != y[i]:
          loss += margin
          dW[:, y[i]] += -1 * X[i]
          dW[:, j] += 1 * X[i]


  # Right now the loss is a sum over all training examples, but we want it
  # to be an average instead so we divide by num_train.
  loss /= num_train
  dW /= num_train

  # Add regularization to the loss.
  loss += 0.5 * reg * np.sum(W * W)
  dW += reg*W

  return loss, dW
  • Vectorized implementation
def svm_loss_vectorized(W, X, y, reg):
  """
  Structured SVM loss function, vectorized implementation.

  Inputs and outputs are the same as svm_loss_naive.
  """
  loss = 0.0
  dW = np.zeros(W.shape) # initialize the gradient as zero

  scores = np.dot(X,W)
  num_train = X.shape[0]
  rows = np.arange(num_train)
  correct_class_score = scores[rows,y]
  margins = np.maximum(0,scores-np.reshape(correct_class_score,[num_train,1])+1)
  margins[rows,y] = 0
  loss = np.sum(margins)
  loss /= num_train
  loss += 0.5 * reg * np.sum(W * W)

  # Indicator matrix: 1 where a margin is positive, 0 elsewhere.
  margins01 = 1 * (margins > 0)
  # Each correct-class column accumulates minus the number of positive margins.
  margins01[rows, y] = -1 * np.sum(margins01, axis=1)
  dW = np.dot(X.T, margins01)
  dW /= num_train
  dW += reg * W

  return loss, dW
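As a rough check of both correctness and speed, the two gradient-returning versions can be compared on random data with CIFAR-10-like shapes. This is a minimal sketch with made-up dimensions; the actual timings depend on hardware:

import time
import numpy as np

np.random.seed(0)
W = np.random.randn(3073, 10) * 0.0001   # made-up shapes: 3073 features (bias folded in), 10 classes
X = np.random.randn(500, 3073)           # 500 examples
y = np.random.randint(10, size=500)

tic = time.time()
loss_n, grad_n = svm_loss_naive(W, X, y, reg=5e-6)
print('naive:      loss %f, time %fs' % (loss_n, time.time() - tic))

tic = time.time()
loss_v, grad_v = svm_loss_vectorized(W, X, y, reg=5e-6)
print('vectorized: loss %f, time %fs' % (loss_v, time.time() - tic))

# Both versions should agree up to floating-point error.
print('loss difference: %e' % abs(loss_n - loss_v))
print('grad difference: %e' % np.linalg.norm(grad_n - grad_v, ord='fro'))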
