CS231n Assignment 1: SVM

1. Loading the data

Same as in the softmax assignment.
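This step reads the raw CIFAR-10 images and carves out train / validation / test / dev splits. A minimal sketch, assuming load_CIFAR10 from cs231n.data_utils, the assignment's default dataset path, and the usual split sizes:

import numpy as np
from cs231n.data_utils import load_CIFAR10

# Load the raw CIFAR-10 data (adjust the path if the dataset lives elsewhere).
X_train, y_train, X_test, y_test = load_CIFAR10('cs231n/datasets/cifar-10-batches-py')

# Carve out validation, test and small development splits.
num_training, num_validation, num_test, num_dev = 49000, 1000, 1000, 500
X_val, y_val = X_train[num_training:num_training + num_validation], y_train[num_training:num_training + num_validation]
X_train, y_train = X_train[:num_training], y_train[:num_training]
mask = np.random.choice(num_training, num_dev, replace=False)
X_dev, y_dev = X_train[mask], y_train[mask]
X_test, y_test = X_test[:num_test], y_test[:num_test]

# Flatten each image into a row vector of length 32*32*3 = 3072.
X_train = np.reshape(X_train, (X_train.shape[0], -1))
X_val = np.reshape(X_val, (X_val.shape[0], -1))
X_test = np.reshape(X_test, (X_test.shape[0], -1))
X_dev = np.reshape(X_dev, (X_dev.shape[0], -1))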

2. Preprocessing the images

Subtract the mean image from every sample, then append a bias dimension (a constant 1) so the bias is folded into the weight matrix.
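A minimal sketch of this step, using the split variable names from above (the notebook's exact code may differ slightly):

# Compute the mean image over the training set and subtract it from every split.
mean_image = np.mean(X_train, axis=0)  # shape (3072,)
X_train = X_train - mean_image
X_val = X_val - mean_image
X_test = X_test - mean_image
X_dev = X_dev - mean_image

# Append a constant-1 column so the bias is absorbed into the weight matrix W.
X_train = np.hstack([X_train, np.ones((X_train.shape[0], 1))])
X_val = np.hstack([X_val, np.ones((X_val.shape[0], 1))])
X_test = np.hstack([X_test, np.ones((X_test.shape[0], 1))])
X_dev = np.hstack([X_dev, np.ones((X_dev.shape[0], 1))])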

3. Implementing the SVM classifier

np.max(a, axis=None, out=None, keepdims=False)

Returns the maximum of an array.

Takes at least one argument.

axis defaults to None, i.e. the maximum is taken over the flattened array; axis=0 gives the column-wise maximum.

np.maximum(X, Y, out=None)

Compares X and Y element-wise and keeps the larger of the two.

Takes at least two arguments; X and Y must be broadcastable against each other.


If a is a matrix,

a[a > 0] = 1 sets every element of a that is greater than 0 to 1 (boolean-mask assignment).
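A quick demonstration of the three operations above (values are illustrative):

import numpy as np

a = np.array([[1, -2], [3, 4]])
np.max(a)           # 4, axis=None: maximum of the flattened array
np.max(a, axis=0)   # array([3, 4]), column-wise maximum
np.maximum(a, 0)    # array([[1, 0], [3, 4]]), element-wise max; the scalar 0 is broadcast

a[a > 0] = 1        # boolean-mask assignment
# a is now array([[ 1, -2], [ 1,  1]])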

def svm_loss_naive(W, X, y, reg):
  """
  Structured SVM loss function, naive implementation (with loops).

  W: (D, C) weights, X: (N, D) data, y: (N,) labels, reg: regularization strength.
  Returns the loss and the gradient with respect to W.
  """
  dW = np.zeros(W.shape) # initialize the gradient as zero

  # compute the loss and the gradient
  num_classes = W.shape[1]
  num_train = X.shape[0]
  loss = 0.0
  for i in range(num_train):
    scores = X[i].dot(W)
    correct_class_score = scores[y[i]]
    for j in range(num_classes):
      if j == y[i]:
        continue
      margin = scores[j] - correct_class_score + 1  # delta = 1 here
      if margin > 0:  # the max(0, .) hinge is active
        loss += margin
        dW[:, y[i]] += -X[i]  # gradient w.r.t. the correct class column, shape (D,)
        dW[:, j] += X[i]      # gradient w.r.t. the incorrect class column

  # Right now the loss is a sum over all training examples, but we want it
  # to be an average instead so we divide by num_train.
  loss /= num_train
  dW /= num_train
  # Add regularization to the loss and the gradient.
  loss += 0.5 * reg * np.sum(W * W)
  dW += reg * W

  return loss, dW
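For reference, the per-example loss implemented by the double loop above is the multiclass hinge loss with margin \Delta = 1, and the two dW updates inside the inner loop are exactly its gradient:

L_i = \sum_{j \neq y_i} \max(0,\; s_j - s_{y_i} + \Delta), \qquad s = x_i W

\frac{\partial L_i}{\partial w_j} = \mathbb{1}[s_j - s_{y_i} + \Delta > 0]\, x_i \quad (j \neq y_i), \qquad
\frac{\partial L_i}{\partial w_{y_i}} = -\Big(\sum_{j \neq y_i} \mathbb{1}[s_j - s_{y_i} + \Delta > 0]\Big)\, x_i

The full loss then averages over the N training examples and adds the regularization term 0.5 \cdot reg \cdot \sum_{k,l} W_{k,l}^2, whose gradient is reg \cdot W.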


def svm_loss_vectorized(W, X, y, reg):
  """
  Structured SVM loss function, vectorized implementation.

  Inputs and outputs are the same as svm_loss_naive.
  """
  loss = 0.0
  dW = np.zeros(W.shape) # initialize the gradient as zero
  num_train = X.shape[0]
  num_class = W.shape[1]
  scores = X.dot(W)  # N by C
  correct_scores = scores[range(num_train), y].reshape(num_train, 1)  # N by 1
  margin = np.maximum(0, scores - correct_scores + 1)  # delta = 1
  margin[range(num_train), y] = 0  # the correct class contributes no loss, N by C
  loss = np.sum(margin) / num_train
  loss += 0.5 * reg * np.sum(W * W)

  # compute the gradient
  margin[margin > 0] = 1.0  # each active margin contributes +x_i to its class column
  margin[range(num_train), y] = -np.sum(margin, axis=1)  # and -x_i per active margin to the correct class
  dW = np.dot(X.T, margin) / num_train + reg * W

  return loss, dW
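A quick sanity check that the two implementations agree, following the pattern used in the notebook (W is a small random weight matrix, e.g. W = np.random.randn(3073, 10) * 0.0001, and 5e-6 is just an illustrative regularization strength):

import time

tic = time.time()
loss_naive, grad_naive = svm_loss_naive(W, X_dev, y_dev, 5e-6)
toc = time.time()
print('Naive loss: %e computed in %fs' % (loss_naive, toc - tic))

tic = time.time()
loss_vectorized, grad_vectorized = svm_loss_vectorized(W, X_dev, y_dev, 5e-6)
toc = time.time()
print('Vectorized loss: %e computed in %fs' % (loss_vectorized, toc - tic))

# The losses should match to numerical precision, and the gradients in Frobenius norm.
print('loss difference: %f' % (loss_naive - loss_vectorized))
print('gradient difference: %f' % np.linalg.norm(grad_naive - grad_vectorized, ord='fro'))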

4. Gradient check

ix = tuple([randrange(m) for m in x.shape])  # draws one random index into x

randrange([start,] stop[, step]) returns a randomly chosen integer from range(start, stop, step).
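For example (outputs are illustrative, since the draws are random):

from random import randrange
import numpy as np

randrange(10)        # an integer drawn uniformly from range(10), e.g. 7
randrange(2, 10, 2)  # one of 2, 4, 6, 8, e.g. 4

x = np.zeros((3073, 10))                     # e.g. the weight matrix being checked
ix = tuple([randrange(m) for m in x.shape])  # e.g. (517, 3): one random index per axis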

from cs231n.gradient_check import grad_check_sparse
loss, grad = svm_loss_naive(W, X_dev, y_dev, 0.0)  # analytic gradient to check against
f = lambda w: svm_loss_naive(w, X_dev, y_dev, 0.0)[0]
grad_numerical = grad_check_sparse(f, W, grad)

from random import randrange

def grad_check_sparse(f, x, analytic_grad, num_checks=10, h=1e-5):
  """
  Sample a few random elements and only return the numerical gradient
  in these dimensions.
  """

  for i in range(num_checks):
    ix = tuple([randrange(m) for m in x.shape])

    oldval = x[ix]
    x[ix] = oldval + h # increment by h
    fxph = f(x) # evaluate f(x + h)
    x[ix] = oldval - h # decrement by h
    fxmh = f(x) # evaluate f(x - h)
    x[ix] = oldval # reset

    grad_numerical = (fxph - fxmh) / (2 * h)
    grad_analytic = analytic_grad[ix]
    rel_error = abs(grad_numerical - grad_analytic) / (abs(grad_numerical) + abs(grad_analytic))
    print('numerical: %f analytic: %f, relative error: %e' % (grad_numerical, grad_analytic, rel_error))


5. Stochastic gradient descent (SGD)


def train(self, X, y, learning_rate=1e-3, reg=1e-5, num_iters=100,
          batch_size=200, verbose=False):
  num_train, dim = X.shape
  num_classes = np.max(y) + 1 # assume y takes values 0...K-1 where K is number of classes
  if self.W is None:
    # lazily initialize W
    self.W = 0.001 * np.random.randn(dim, num_classes)

  # Run stochastic gradient descent to optimize W
  loss_history = []
  for it in range(num_iters):
    X_batch = None
    y_batch = None

    # sample a random minibatch of training examples
    sample_index = np.random.choice(num_train, batch_size, replace=False)  # replace=False: sample without repetition
    X_batch = X[sample_index]
    y_batch = y[sample_index]

    # evaluate loss and gradient
    loss, grad = self.loss(X_batch, y_batch, reg)
    loss_history.append(loss)

    # gradient update step
    self.W += -learning_rate * grad


    if verbose and it % 100 == 0:
      print ('iteration %d / %d: loss %f' % (it, num_iters, loss))

  return loss_history
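A hedged usage sketch: the svm object used in the next section is created and trained roughly like this, assuming LinearSVM is the assignment's wrapper class (its loss method calls svm_loss_vectorized) imported from cs231n.classifiers:

from cs231n.classifiers import LinearSVM
import matplotlib.pyplot as plt

svm = LinearSVM()
loss_hist = svm.train(X_train, y_train, learning_rate=1e-7, reg=2.5e4,
                      num_iters=1500, verbose=True)

# The returned loss history should decrease over the iterations; plotting it is a useful check.
plt.plot(loss_hist)
plt.xlabel('Iteration number')
plt.ylabel('Loss value')
plt.show()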

6. Checking prediction accuracy

y_train_pred = svm.predict(X_train)
print("Training accuracy: %f" % (np.mean(y_train_pred == y_train)))
y_val_pred = svm.predict(X_val)
print("Validation accuracy: %f" % (np.mean(y_val_pred == y_val)))

def predict(self, X):
  """
  Predict labels for the data points in X using the trained weights.
  """
  scores = np.dot(X, self.W)          # N by C class scores
  y_pred = np.argmax(scores, axis=1)  # (N,): index of the highest-scoring class per row
  return y_pred

7. Tuning hyperparameters on the validation set

learning_rates=[1e-7,5e-5]
regularization_strengths=[5e4,1e5]
results={}
best_val=-1
best_svm=None # The LinearSVM object that achieved the highest validation rate
for rate in learning_rates:
    for reg in regularization_strengths:
        svm_new=LinearSVM()
        loss_hist = svm_new.train(X_train, y_train, learning_rate=rate, reg=reg, num_iters=1500)
        y_pred_val=svm_new.predict(X_val)
        y_pred_train=svm_new.predict(X_train)
        train_accuracy=np.mean(y_pred_train==y_train)
        val_accuracy=np.mean(y_pred_val==y_val)
        results[rate,reg]=(train_accuracy,val_accuracy)

        if val_accuracy>best_val:
            best_val=val_accuracy
            best_svm=svm_new

for lr,reg in sorted(results):
    train_accuracy,val_accuracy=results[(lr,reg)]
    print("lr %e reg %e \n train accuracy:%f val accuracy: %f"
          %(lr,reg,train_accuracy,val_accuracy))
print("best validation accuracy achieved during cross validation: %f" %best_val)

8. Final evaluation on the test set


y_test_pred=best_svm.predict(X_test)
test_accuracy=np.mean(y_test_pred==y_test)
print("SVM on raw pixels final test set accuracy: %f" %test_accuracy)

