## Introduction
- Video explanation (on Bilibili): https://www.bilibili.com/video/BV19z411b7u9?p=9&vd_source=b6763bca1e70e8b7ca1770e409d21089
- Goal: to understand in depth how the loss and gradient of a linear SVM are computed with two explicit loops. The full-size X (50000, 3072) and W (3072, 10) are scaled down to x_pic (5, 3) and w (3, 4).
- Application: this code computes the loss and gradient of a linear SVM. It can be combined with svm.ipynb and cs231n\classifiers\linear_classifier.py.
## Code
import numpy as np

# 1. Build toy picture data (RGB): five "pictures", each a (3, 1) array
#    whose pixels all equal the picture index + 1.
pic = np.ones((5, 3, 1))
for i in range(5):  # make five pictures
    pic[i] = pic[i] * (i + 1)
'''
print(pic)
[[[1.] [1.] [1.]]
 [[2.] [2.] [2.]]
 [[3.] [3.] [3.]]
 [[4.] [4.] [4.]]
 [[5.] [5.] [5.]]]
'''
# 2. Flatten each picture into one row -> x_pic has shape (5, 3).
x_pic = np.reshape(pic, (pic.shape[0], -1))
'''
print(x_pic)  # (5, 3): five pictures, each picture has three pixels
[[1. 1. 1.]
 [2. 2. 2.]
 [3. 3. 3.]
 [4. 4. 4.]
 [5. 5. 5.]]
'''
# 3. Weight matrix w with shape (3, 4); each column scores one class.
#    A real initialization would be: w = np.random.randn(3, 4) * 0.0001
w = np.array([[1, 0, 1, 0],
              [2, 1, 2, 1],
              [1, 2, 0, 1]])
'''
print(w)  # each column represents one class
[[1, 0, 1, 0]
 [2, 1, 2, 1]
 [1, 2, 0, 1]]
'''
# 4. Ground-truth labels (0 <= label < 4), one per picture.
#    Built directly — the original mutated the list while iterating over it,
#    which only worked by coincidence of these particular values.
y_pic = [1, 2, 3, 0, 1]
'''
print(y_pic)  # e.g. the label of the fifth picture is 1
[1, 2, 3, 0, 1]  # 0 <= c < C
'''
# 5. Compute the multiclass SVM (hinge) loss and its gradient with two
#    explicit loops, mirroring cs231n's svm_loss_naive.
dW = np.zeros(w.shape)           # gradient has the same shape as w: (3, 4)
num_classes = w.shape[1]         # 4
num_train_pic = x_pic.shape[0]   # 5
loss = 0.0
for i in range(num_train_pic):
    scores = x_pic[i].dot(w)     # (4,) class scores of picture i
    '''
    print(scores)
    [4. 3. 3. 2.]
    [8. 6. 6. 4.]
    [12. 9. 9. 6.]
    [16. 12. 12. 8.]
    [20. 15. 15. 10.]
    '''
    correct_class_score = scores[y_pic[i]]
    '''
    print(correct_class_score)
    3  6  6  16  15
    '''
    for j in range(num_classes):
        if j == y_pic[i]:
            continue  # the correct class contributes no loss term
        margin = scores[j] - correct_class_score + 1  # note: delta = 1
        # margin <= 0 means this wrong class is already scored low enough:
        # no loss contribution and no gradient contribution.
        if margin > 0:
            loss += margin
            dW[:, j] += x_pic[i]          # push the wrong-class score down
            dW[:, y_pic[i]] -= x_pic[i]   # push the correct-class score up
    '''
    running loss after each picture: 3, 7, 22, 22, 29
    '''
# Average over the training set (no regularization term in this demo).
loss = loss / num_train_pic
# BUG FIX: original read `dw = dw / num_train_pic` — `dw` was never
# defined (the gradient variable is `dW`), raising NameError.
dW = dW / num_train_pic
'''
print(dW * num_train_pic)  # gradient before averaging
[[11. -7. 5. -9.]
 [11. -7. 5. -9.]
 [11. -7. 5. -9.]]
'''
'''
print(dW)
[[11. -7. 5. -9.]
[11. -7. 5. -9.]
[11. -7. 5. -9.]]
'''
## The calculation process of dW