(Repost) CS231n Assignment2 Support Vector Machine

Begin


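This post walks through the first assignment of the CS231N course series: writing and training an SVM classifier, which is a supervised learning model.

Course page: CS231N course series on NetEase Cloud Classroom (网易云课堂)

Language: Python 3.6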

 

1 Linear classifier


 

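Take images as an example. A 32*32*3 image is 32 pixels high and 32 pixels wide with 3 color channels. Flattening it gives a 1*3072 vector, which is the feature vector of that image.

If we train on 1000 images, the input is a 1000*3072 matrix X.

We take the matrix product of X and the weight matrix W to get a score matrix. Equivalently, W multiplied by the transpose of one image's feature vector gives one column of scores.

Each score corresponds to one class. The higher the score, the more likely the image belongs to that class, so W is really a set of per-class weights.

Note: the CS231N figure for this step uses a single image as the example and transposes X to 3072*1. The idea is the same; in the program we compute X*W.

For more details, see the post "CS231N作业1KNN详解" (CS231N Assignment 1: KNN explained).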
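A minimal sketch of the score computation described above, with random numbers standing in for real images. The sizes (5 images, 10 classes) and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

num_images, dim, num_classes = 5, 32 * 32 * 3, 10

# Each row of X is one flattened 32*32*3 image (random stand-in data).
X = rng.standard_normal((num_images, dim))
# Each column of W holds the weights of one class.
W = 0.001 * rng.standard_normal((dim, num_classes))

scores = X.dot(W)                      # shape (num_images, num_classes)
predicted = np.argmax(scores, axis=1)  # highest-scoring class per image

# Single-image view: W transposed times one image as a column vector.
single = W.T.dot(X[0].reshape(dim, 1))  # shape (num_classes, 1)
```

The single-image column `single` contains the same numbers as the first row of `scores`, which is why the two ways of writing the product are interchangeable.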

 

 

 

2 Loss function


 

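Once we have the score of every image for every class, we need a loss that tells us how good the W matrix is.

The SVM loss works as follows.

For each image, take the score of each wrong class minus the score of the correct class, and keep the larger of that value and 0.

Normally the correct class should have the highest score, which means no loss: every wrong-class score minus the correct-class score is negative, which is less than 0, so the loss is 0.

For robustness, a margin of 1 is added: the correct class must beat every wrong class by at least 1 to incur no loss.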
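A small worked example of the per-image loss just described. The three scores and the choice of class 0 as the correct class are hypothetical numbers picked for illustration.

```python
import numpy as np

scores = np.array([3.2, 5.1, -1.7])  # hypothetical scores for 3 classes
correct = 0                          # index of the true class
delta = 1.0                          # the margin of 1 added for robustness

# max(0, wrong score - correct score + 1) for every class at once
margins = np.maximum(0.0, scores - scores[correct] + delta)
margins[correct] = 0.0               # the correct class itself does not count
loss_i = margins.sum()
print(loss_i)
```

Class 1 violates the margin (5.1 - 3.2 + 1 is about 2.9) while class 2 does not (-1.7 - 3.2 + 1 is negative), so the loss for this image is about 2.9.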

 

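After computing the loss of every image, we add the losses up (the code also divides by the number of images) to get the final loss.

The resulting formula has one problem: it puts no constraint on W itself, so different W matrices can produce exactly the same loss.

We therefore add a regularization term. The regularization coefficient controls how much W contributes to the total loss, and the penalty favors weights that are spread out rather than concentrated in a few entries.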
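Written out, the description above corresponds to the standard multiclass SVM (hinge) loss. The symbols in this sketch are assumptions introduced here, not the post's own notation: $s_{ij}$ is the score of image $i$ for class $j$ (entry $(i,j)$ of $XW$), $y_i$ is the correct class of image $i$, $N$ is the number of images, and $\lambda$ is the regularization coefficient (reg in the code).

```latex
L_i = \sum_{j \neq y_i} \max\bigl(0,\; s_{ij} - s_{i y_i} + 1\bigr)
\qquad
L = \frac{1}{N} \sum_{i=1}^{N} L_i \;+\; \lambda \sum_{k,l} W_{kl}^{2}
```

Implementations sometimes write the last term as $\tfrac{1}{2}\lambda \sum_{k,l} W_{kl}^{2}$. This only rescales $\lambda$ and makes the gradient of the term equal to $\lambda W$.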

 

The code is as follows:

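import numpy as np

def svm_loss_native(W, X, Y, reg):
    '''
    Compute the loss of the SVM classifier with explicit loops.
    Inputs:
        W (D, C): weights
            D is the dimension of the feature vector, C is the number of classes
        X (N, D): training features; axis 0 indexes the samples, axis 1 holds one sample's feature vector
            for 32*32 images, N is the number of samples and D = 32*32*3 pixel values form the feature vector
        Y (N,): integer training labels, one per sample
    Output:
        loss
    '''
    # Basic sizes
    num_train = X.shape[0]    # number of training samples
    num_classes = W.shape[1]  # number of classes
    loss = 0.0                # initialize the loss
    for i in range(num_train):  # loss of each training sample in turn
        score = X[i].dot(W)     # scores of sample i for every class

        # Accumulate the hinge loss over the wrong classes
        for j in range(num_classes):
            if j == Y[i]:
                continue
            # equivalent to max(0, score[j] - score[Y[i]] + 1)
            margin = score[j] - score[Y[i]] + 1
            if margin > 0:
                loss += margin
    loss /= num_train
    # Add regularization (0.5 factor as in the vectorized version, so its gradient is reg * W)
    loss += 0.5 * reg * np.sum(W * W)
    return loss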

  

 

 

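With that, the loss function is complete. The loss tells us how good the W matrix is, but if the loss is too large, how should each parameter be adjusted?

This is where the gradient and gradient descent come in.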

3 Gradient


 

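Gradient descent:

Start with a differentiable function and picture it as a mountain. The goal is to find the minimum of the function, that is, the foot of the mountain. The fastest way down is to find the steepest direction at the current position and walk downhill along it. In terms of the function, this means computing the gradient at the given point and moving in the opposite direction, which makes the function value drop fastest, because the gradient points in the direction in which the function value changes fastest.
Repeating this step, computing the gradient again and again, eventually reaches a local minimum, just like walking down the mountain. Computing the gradient is how we determine the steepest direction at each step.

The gradient plays the same role as a derivative: the derivative of the loss with respect to W describes how the loss changes as W changes.

If moving W one step forward increases the loss, the gradient dW is positive, and W should be moved in the opposite direction.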
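The descent procedure can be illustrated on a one-dimensional function before applying it to the SVM loss. This is a minimal sketch: the function f, the starting point and the learning rate are hypothetical choices made for illustration.

```python
def f(w):
    """A simple differentiable 'mountain' with its minimum at w = 3."""
    return (w - 3.0) ** 2

def df(w):
    """Derivative of f with respect to w."""
    return 2.0 * (w - 3.0)

w = 0.0              # hypothetical starting point
learning_rate = 0.1  # hypothetical step size

history = []
for step in range(50):
    grad = df(w)
    # Move against the gradient: a positive gradient means w must decrease.
    w = w - learning_rate * grad
    history.append(f(w))

print(w, history[-1])
```

Each step shrinks the distance to the minimum by a constant factor, so w approaches 3 and the recorded function values decrease steadily.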

 

 

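For this example, the loss function can be rewritten so that each term Lij is expressed through the columns of W.

 

 

Then take the partial derivative of each Lij with respect to Wj, the column of W for class j.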
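A sketch of the rewritten loss term and its partial derivatives. The notation is assumed for illustration: $w_j$ is column $j$ of W, $x_i$ is the feature vector of image $i$ written as a column, $y_i$ is its correct class, and $\mathbb{1}(\cdot)$ equals 1 when its condition holds and 0 otherwise.

```latex
L_{ij} = \max\bigl(0,\; w_j^{T} x_i - w_{y_i}^{T} x_i + 1\bigr), \qquad j \neq y_i

\frac{\partial L_{ij}}{\partial w_j} = \mathbb{1}\bigl(w_j^{T} x_i - w_{y_i}^{T} x_i + 1 > 0\bigr)\, x_i

\frac{\partial L_{ij}}{\partial w_{y_i}} = -\,\mathbb{1}\bigl(w_j^{T} x_i - w_{y_i}^{T} x_i + 1 > 0\bigr)\, x_i
```

In words: every wrong class $j$ whose margin is positive adds $x_i$ to column $j$ of the gradient and subtracts $x_i$ from the column of the correct class, which is what the two dW updates inside the loop code do.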

 

 

CODE2 LOSS & gradient, loop form

 

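import numpy as np

def svm_loss_native(W, X, Y, reg):
    '''
    Compute the loss and the gradient of the SVM classifier with explicit loops.
    Inputs:
        W (D, C): weights
            D is the dimension of the feature vector, C is the number of classes
        X (N, D): training features; axis 0 indexes the samples, axis 1 holds one sample's feature vector
            for 32*32 images, N is the number of samples and D = 32*32*3 pixel values form the feature vector
        Y (N,): integer training labels, one per sample
    Outputs:
        loss, dW (gradient, same shape as W)
    '''
    # Basic sizes
    num_train = X.shape[0]    # number of training samples
    num_classes = W.shape[1]  # number of classes
    loss = 0.0                # initialize the loss
    dW = np.zeros(W.shape)    # gradient accumulator
    for i in range(num_train):  # loss of each training sample in turn
        score = X[i].dot(W)     # scores of sample i for every class

        # Accumulate loss and gradient over the wrong classes
        for j in range(num_classes):
            if j == Y[i]:
                continue
            # equivalent to max(0, score[j] - score[Y[i]] + 1)
            margin = score[j] - score[Y[i]] + 1
            if margin > 0:
                loss += margin
                dW[:, Y[i]] -= X[i, :]  # column of the correct class
                dW[:, j] += X[i, :]     # column of the violating class
    loss /= num_train
    dW /= num_train
    # Add regularization; the 0.5 factor makes the gradient of this term exactly reg * W
    loss += 0.5 * reg * np.sum(W * W)
    dW += reg * W
    return loss, dW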

 

  

 

CODE3 LOSS & gradient, vectorized form

 

import numpy as np

def svm_loss_vectorized(W, X, Y, reg):
    # Same inputs and outputs as the loop version, without explicit loops
    loss = 0.0
    num_train = X.shape[0]
    scores = np.dot(X, W)                                  # (N, C)
    correct_class_score = scores[np.arange(num_train), Y]  # (N,)
    correct_class_score = np.reshape(correct_class_score, (num_train, -1))  # (N, 1)
    margin = scores - correct_class_score + 1.0
    margin[np.arange(num_train), Y] = 0.0  # the correct class contributes no loss
    margin[margin < 0] = 0.0               # max(0, .)
    loss += np.sum(margin) / num_train
    loss += 0.5 * reg * np.sum(W * W)

    # Gradient: 1 for every violated margin, minus the row count at the correct class
    margin[margin > 0] = 1.0
    row_sum = np.sum(margin, axis=1)
    margin[np.arange(num_train), Y] = -row_sum
    dW = 1.0 / num_train * np.dot(X.T, margin) + reg * W
    return loss, dW
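One way to gain confidence that a loss and its analytic gradient agree is a numerical gradient check. The sketch below is self-contained: svm_loss is a compact restatement of the vectorized loss using the 0.5 * reg convention, while numeric_grad, the random test data and the sizes are hypothetical helpers and values introduced only for this illustration.

```python
import numpy as np

def svm_loss(W, X, Y, reg):
    """Vectorized SVM loss and gradient (0.5 * reg * sum(W*W) convention)."""
    n = X.shape[0]
    scores = X.dot(W)
    correct = scores[np.arange(n), Y].reshape(n, 1)
    margin = np.maximum(0.0, scores - correct + 1.0)
    margin[np.arange(n), Y] = 0.0
    loss = margin.sum() / n + 0.5 * reg * np.sum(W * W)

    mask = (margin > 0).astype(float)          # 1 where the margin is violated
    mask[np.arange(n), Y] = -mask.sum(axis=1)  # minus the violation count
    dW = X.T.dot(mask) / n + reg * W
    return loss, dW

def numeric_grad(f, W, h=1e-5):
    """Centered finite differences, perturbing one entry of W at a time."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + h
        plus = f(W)
        W[idx] = old - h
        minus = f(W)
        W[idx] = old  # restore the entry
        grad[idx] = (plus - minus) / (2 * h)
    return grad

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 6))        # 20 toy samples, 6 features
Y = rng.integers(0, 4, size=20)         # labels for 4 classes
W = 0.01 * rng.standard_normal((6, 4))

loss, dW = svm_loss(W, X, Y, reg=0.1)
dW_num = numeric_grad(lambda w: svm_loss(w, X, Y, 0.1)[0], W)
max_diff = np.max(np.abs(dW - dW_num))
print('max abs difference:', max_diff)
```

With small random weights every margin stays well away from the kink of the hinge at 0, so the two gradients should agree to within floating-point error. Near a kink a finite-difference check can disagree even when the analytic gradient is correct.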

 

 

4 Training function


 

 

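Once we have the loss and the gradient, we can adjust the W matrix according to the gradient. This requires a few parameters for the TRAIN function.

Typically the following parameters are needed:

Number of iterations: how many training steps to run.

Learning rate: the coefficient applied to the gradient each time W is corrected.

Batch size: each step usually does not use all samples, so a subset of a given size is drawn.

The core idea is to repeatedly compute the loss and the gradient inside a loop, then adjust W with the following formula.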

 

self.W = self.W - learning_rate * grade

 

 

CODE4 Gradient descent

# Method of the SVM class; assumes numpy is imported as np
def train(self, X, Y, learning_rate=1e-3, reg=1e-5, num_iters=100, batch_size=200, verbose=False):
    '''
    Train the classifier with stochastic (mini-batch) gradient descent.
    Inputs:
    - learning_rate: learning rate
    - reg: regularization strength
    - num_iters: number of training iterations
    - batch_size: number of samples used at each step
    - verbose: if True, print progress
    Output:
    a list with the loss at every iteration
    '''
    num_train, dim = X.shape
    num_classes = np.max(Y) + 1

    # Initialize W with small random values
    # (the "if self.W is None:" guard is commented out, so W is reset on every call)
    self.W = 0.001 * np.random.randn(dim, num_classes)
    loss_history = []
    # Run num_iters training steps
    for it in range(num_iters):
        # Draw a random subset of the training samples
        batch_inx = np.random.choice(num_train, batch_size)
        X_batch = X[batch_inx, :]
        Y_batch = Y[batch_inx]

        # Compute loss and gradient on the batch
        loss, grade = self.loss(self.W, X_batch, Y_batch, reg)
        loss_history.append(loss)

        # Parameter update: a positive gradient means the loss grows with W,
        # so W moves in the opposite direction
        self.W = self.W - learning_rate * grade

        # Print progress
        if verbose and it % 100 == 0:
            print('iteration %d / %d : loss %f' % (it, num_iters, loss))
    return loss_history
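A note on the batch selection step: np.random.choice samples with replacement by default, so the same training sample can appear more than once in a batch. The sketch below contrasts the two modes using NumPy's Generator API; the array sizes and the random stand-in data are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
num_train, batch_size = 1000, 200

# Stand-in data: 1000 samples with 8 features and labels in [0, 10).
X = rng.standard_normal((num_train, 8))
Y = rng.integers(0, 10, size=num_train)

# Default behaviour (replace=True): an index may be drawn more than once.
idx_with = rng.choice(num_train, batch_size)
# replace=False: every index in the batch is distinct.
idx_without = rng.choice(num_train, batch_size, replace=False)

X_batch, Y_batch = X[idx_without], Y[idx_without]
```

Either mode works for stochastic gradient descent; sampling without replacement simply guarantees that a batch never contains duplicate samples.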

 

[Figure: output of the training run]

 

  

 

5 Prediction (predict)


 

 

After training we have a reasonably good W matrix, which we then use to predict on the test set and see how well the model performs.

# Method of the SVM class
def predict(self, X_train):
    # Compute the scores with the trained W matrix
    scores = X_train.dot(self.W)
    # The class with the highest score in each row is the prediction
    y_predict = np.argmax(scores, axis=1)
    return y_predict

Run the following code in the main function to check the predictions:

score1 = SVM1.predict(X_dev)
print('The predict result %f' % (np.mean(score1 == Y_dev)))
score1 = SVM1.predict(X_test)
print('The Test Data predict result %f' % (np.mean(score1 == Y_test)))

  

  

 

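The results: predicting on the training data itself gives an accuracy of 0.756, while the test set gives only 0.218, which is not great. Such a large gap suggests that the model does not generalize well with these settings.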

 

 

 

6 Hyperparameter tuning


 

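That completes a full SVM model library. How, then, do we automatically find a good learning rate and regularization strength?

We have to test candidate values one after another. The following program does this.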

 

# Hyperparameter search
# Two hyperparameters: learning rate and regularization strength
learning_rate = [2e-7, 0.75e-7, 1.5e-7, 1.25e-7, 0.75e-7]
regularization_strengths = [3e4, 3.25e4, 3.5e4, 3.75e4, 4e4]

results = {}
best_val = 0
best_svm = None
# Try every combination of learning rate and regularization strength
for rate in learning_rate:
    for regular in regularization_strengths:
        SVM2 = SVM()
        # Train
        SVM2.train(X_train, Y_train, learning_rate=rate, reg=regular, num_iters=1000)
        # Predict
        Y1 = SVM2.predict(X_train)
        Y2 = SVM2.predict(X_val)
        accuracy_train = np.mean(Y1 == Y_train)
        accuracy_val = np.mean(Y2 == Y_val)
        # Keep the model with the best validation accuracy
        if best_val < accuracy_val:
            best_val = accuracy_val
            best_svm = SVM2  # save the current model
        # Store the results
        results[rate, regular] = (accuracy_train, accuracy_val)
# Print the results
for lr, reg in sorted(results):
    accuracy_train, accuracy_val = results[(lr, reg)]
    print('lr:%e reg %e train accuracy: %f val accuracy : %f' % (lr, reg, accuracy_train, accuracy_val))

  

[Figure: output of the hyperparameter search]

 

7 Visualization


 

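Once we have the best W, it is sometimes useful to visualize it. The image of w shows where the weights are high or low, and it acts like a template for each class.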

import matplotlib.pyplot as plt

# Visualize the learned weights
w = best_svm.W[:, :]
w = w.reshape(32, 32, 3, 10)
w_min, w_max = np.min(w), np.max(w)
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']  # class names
for i in range(10):
    plt.subplot(2, 5, i + 1)
    # Rescale the weights of class i to the 0..255 range
    wimg = 255.0 * (w[:, :, :, i].squeeze() - w_min) / (w_max - w_min)
    plt.imshow(wimg.astype('uint8'))
    plt.axis('off')
    plt.title(classes[i])
plt.show()

[Figure: the learned weight templates, one per class]

 

Complete code (the first listing is data_utils, the helper module used to read the dataset)

from __future__ import print_function

from six.moves import cPickle as pickle
import numpy as np
import os
from matplotlib.pyplot import imread
import platform


def load_pickle(f):
    version = platform.python_version_tuple()
    if version[0] == '2':
        return pickle.load(f)
    elif version[0] == '3':
        return pickle.load(f, encoding='latin1')
    raise ValueError("invalid python version: {}".format(version))


def load_CIFAR_batch(filename):
    """ load single batch of cifar """
    with open(filename, 'rb') as f:
        datadict = load_pickle(f)
        X = datadict['data']
        Y = datadict['labels']
        X = X.reshape(10000, 3, 32, 32).transpose(0, 2, 3, 1).astype("float")
        Y = np.array(Y)
        return X, Y


def load_CIFAR10(ROOT):
    """ load all of cifar """
    xs = []
    ys = []
    for b in range(1, 6):
        f = os.path.join(ROOT, 'data_batch_%d' % (b,))
        X, Y = load_CIFAR_batch(f)
        xs.append(X)
        ys.append(Y)
    Xtr = np.concatenate(xs)
    Ytr = np.concatenate(ys)
    del X, Y
    Xte, Yte = load_CIFAR_batch(os.path.join(ROOT, 'test_batch'))
    return Xtr, Ytr, Xte, Yte


def get_CIFAR10_data(num_training=49000, num_validation=1000, num_test=1000,
                     subtract_mean=True):
    """
    Load the CIFAR-10 dataset from disk and perform preprocessing to prepare
    it for classifiers. These are the same steps as we used for the SVM, but
    condensed to a single function.
    """
    # Load the raw CIFAR-10 data
    cifar10_dir = 'datasets/cifar-10-batches-py'
    X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)

    # Subsample the data
    mask = list(range(num_training, num_training + num_validation))
    X_val = X_train[mask]
    y_val = y_train[mask]
    mask = list(range(num_training))
    X_train = X_train[mask]
    y_train = y_train[mask]
    mask = list(range(num_test))
    X_test = X_test[mask]
    y_test = y_test[mask]

    # Normalize the data: subtract the mean image
    if subtract_mean:
        mean_image = np.mean(X_train, axis=0)
        X_train -= mean_image
        X_val -= mean_image
        X_test -= mean_image

    # Transpose so that channels come first
    X_train = X_train.transpose(0, 3, 1, 2).copy()
    X_val = X_val.transpose(0, 3, 1, 2).copy()
    X_test = X_test.transpose(0, 3, 1, 2).copy()

    # Package data into a dictionary
    return {
        'X_train': X_train, 'y_train': y_train,
        'X_val': X_val, 'y_val': y_val,
        'X_test': X_test, 'y_test': y_test,
    }


def load_tiny_imagenet(path, dtype=np.float32, subtract_mean=True):
    """
    Load TinyImageNet. Each of TinyImageNet-100-A, TinyImageNet-100-B, and
    TinyImageNet-200 have the same directory structure, so this can be used
    to load any of them.

    Inputs:
    - path: String giving path to the directory to load.
    - dtype: numpy datatype used to load the data.
    - subtract_mean: Whether to subtract the mean training image.

    Returns: A dictionary with the following entries:
    - class_names: A list where class_names[i] is a list of strings giving the
      WordNet names for class i in the loaded dataset.
    - X_train: (N_tr, 3, 64, 64) array of training images
    - y_train: (N_tr,) array of training labels
    - X_val: (N_val, 3, 64, 64) array of validation images
    - y_val: (N_val,) array of validation labels
    - X_test: (N_test, 3, 64, 64) array of testing images.
    - y_test: (N_test,) array of test labels; if test labels are not available
      (such as in student code) then y_test will be None.
    - mean_image: (3, 64, 64) array giving mean training image
    """
    # First load wnids
    with open(os.path.join(path, 'wnids.txt'), 'r') as f:
        wnids = [x.strip() for x in f]

    # Map wnids to integer labels
    wnid_to_label = {wnid: i for i, wnid in enumerate(wnids)}

    # Use words.txt to get names for each class
    with open(os.path.join(path, 'words.txt'), 'r') as f:
        wnid_to_words = dict(line.split('\t') for line in f)
        for wnid, words in wnid_to_words.items():
            wnid_to_words[wnid] = [w.strip() for w in words.split(',')]
    class_names = [wnid_to_words[wnid] for wnid in wnids]

    # Next load training data.
    X_train = []
    y_train = []
    for i, wnid in enumerate(wnids):
        if (i + 1) % 20 == 0:
            print('loading training data for synset %d / %d' % (i + 1, len(wnids)))
        # To figure out the filenames we need to open the boxes file
        boxes_file = os.path.join(path, 'train', wnid, '%s_boxes.txt' % wnid)
        with open(boxes_file, 'r') as f:
            filenames = [x.split('\t')[0] for x in f]
        num_images = len(filenames)

        X_train_block = np.zeros((num_images, 3, 64, 64), dtype=dtype)
        y_train_block = wnid_to_label[wnid] * np.ones(num_images, dtype=np.int64)
        for j, img_file in enumerate(filenames):
            img_file = os.path.join(path, 'train', wnid, 'images', img_file)
            img = imread(img_file)
            if img.ndim == 2:
                ## grayscale file
                img.shape = (64, 64, 1)
            X_train_block[j] = img.transpose(2, 0, 1)
        X_train.append(X_train_block)
        y_train.append(y_train_block)

    # We need to concatenate all training data
    X_train = np.concatenate(X_train, axis=0)
    y_train = np.concatenate(y_train, axis=0)

    # Next load validation data
    with open(os.path.join(path, 'val', 'val_annotations.txt'), 'r') as f:
        img_files = []
        val_wnids = []
        for line in f:
            img_file, wnid = line.split('\t')[:2]
            img_files.append(img_file)
            val_wnids.append(wnid)
        num_val = len(img_files)
        y_val = np.array([wnid_to_label[wnid] for wnid in val_wnids])
        X_val = np.zeros((num_val, 3, 64, 64), dtype=dtype)
        for i, img_file in enumerate(img_files):
            img_file = os.path.join(path, 'val', 'images', img_file)
            img = imread(img_file)
            if img.ndim == 2:
                img.shape = (64, 64, 1)
            X_val[i] = img.transpose(2, 0, 1)

    # Next load test images
    # Students won't have test labels, so we need to iterate over files in the
    # images directory.
    img_files = os.listdir(os.path.join(path, 'test', 'images'))
    X_test = np.zeros((len(img_files), 3, 64, 64), dtype=dtype)
    for i, img_file in enumerate(img_files):
        img_file = os.path.join(path, 'test', 'images', img_file)
        img = imread(img_file)
        if img.ndim == 2:
            img.shape = (64, 64, 1)
        X_test[i] = img.transpose(2, 0, 1)

    y_test = None
    y_test_file = os.path.join(path, 'test', 'test_annotations.txt')
    if os.path.isfile(y_test_file):
        with open(y_test_file, 'r') as f:
            img_file_to_wnid = {}
            for line in f:
                line = line.split('\t')
                img_file_to_wnid[line[0]] = line[1]
        y_test = [wnid_to_label[img_file_to_wnid[img_file]] for img_file in img_files]
        y_test = np.array(y_test)

    mean_image = X_train.mean(axis=0)
    if subtract_mean:
        X_train -= mean_image[None]
        X_val -= mean_image[None]
        X_test -= mean_image[None]

    return {
        'class_names': class_names,
        'X_train': X_train,
        'y_train': y_train,
        'X_val': X_val,
        'y_val': y_val,
        'X_test': X_test,
        'y_test': y_test,
        'class_names': class_names,
        'mean_image': mean_image,
    }


def load_models(models_dir):
    """
    Load saved models from disk. This will attempt to unpickle all files in a
    directory; any files that give errors on unpickling (such as README.txt) will
    be skipped.

    Inputs:
    - models_dir: String giving the path to a directory containing model files.
      Each model file is a pickled dictionary with a 'model' field.

    Returns:
    A dictionary mapping model file names to models.
    """
    models = {}
    for model_file in os.listdir(models_dir):
        with open(os.path.join(models_dir, model_file), 'rb') as f:
            try:
                models[model_file] = load_pickle(f)['model']
            except pickle.UnpicklingError:
                continue
    return models


# Second listing: the SVM training script, which imports the data_utils module above
from dl.data_utils import load_CIFAR10
import numpy as np

classes = ['plane','car','bird','cat','deer','dog','frog','horse','ship','truck']
x_train, y_train, x_test, y_test = load_CIFAR10('dataset/cifar-10-batches-py')
x_train = np.reshape(x_train, (x_train.shape[0], -1))
x_test = np.reshape(x_test, (x_test.shape[0], -1))

def svm_loss_vectorized(W, X, Y, reg):
    """
    Compute loss and gradient; regularization is not applied for now (reg is unused)
    W: 10*3072
    X: num_train*3072
    """
    num_train = X.shape[0]
    scores = np.dot(X, W.T)
    correct_scores = scores[np.arange(num_train), Y]
    correct_scores = np.reshape(correct_scores, (num_train, -1))
    loss = scores - correct_scores + 1.0  # num_train*10 , num_train*1
    loss[loss < 0] = 0.0  # max(0, sj - syi + 1)
    loss[np.arange(num_train), Y] = 0.0  # zero out the entry of the correct class
    margin = loss
    loss = np.sum(loss, axis=1)  # Li
    loss = np.mean(loss)

    # Compute the gradient
    margin[margin > 0] = 1.0
    row_sum = np.sum(margin, axis=1)
    margin[np.arange(num_train), Y] = -row_sum
    dW = 1.0 / num_train * np.dot(margin.T, X)
    return loss, dW

class SVM(object):
    def train(self,X,Y,learning_rate=1e-7*0.9,reg=1e-5,num_iters=6000,batch_size=256,verbose=True):
        num_train, dim = X.shape
        num_classes = np.max(Y) + 1

        self.W = 0.001 * np.random.randn(num_classes, dim)
        loss_history = []
        for it in range(num_iters):
            x_batch = []
            y_batch = []
            batch_inx = np.random.choice(num_train,batch_size)
            x_batch = X[batch_inx,:]
            y_batch = Y[batch_inx]

            loss, grad = svm_loss_vectorized(self.W, x_batch, y_batch, reg)
            loss_history.append(loss)

            self.W = self.W - learning_rate*grad
            if verbose and it%100==0:
                print('iteration %d / %d : loss %f' % (it, num_iters, loss))

        return loss_history

    def predict(self, x_train):
        # Scores are (num_samples, num_classes) because W is stored as (num_classes, dim)
        scores = x_train.dot(self.W.T)
        y_pred = np.argmax(scores, axis=1)
        return y_pred

svm = SVM()
svm.train(x_train, y_train)
score1 = svm.predict(x_train)
print('The train data predict result %f' %(np.mean(score1 == y_train)))
score1 = svm.predict(x_test)
print('The Test Data predict result %f' %(np.mean(score1 == y_test)))


 
