This post walks through the first assignment of the CS231N course series: writing and training an SVM image classifier, a supervised learning model.
Course homepage: the CS231N course series on NetEase Cloud Classroom (网易云课堂)
Language: Python 3.6
1 Linear Classifier
Take images as an example. A 32*32*3 image (32 pixels high, 32 pixels wide, with 3 color channels) can be flattened into a 1*3072 vector, which serves as the image's feature vector.
If we want to train on 1000 images, the input is then a 1000*3072 matrix X.
We multiply X by a weight matrix W to obtain a score matrix: W times the transpose of one image's feature vector yields a column of scores.
Each score corresponds to one class, and the higher the score, the greater the chance the image belongs to that class, so W is really a set of per-class weights.
Note: the figure in the CS231N notes illustrates this with a single image, transposing X into a 3072*1 column; the idea is the same, but in our program we compute X*W.
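To make the shapes concrete, here is a minimal sketch of the score computation (the random data just stands in for real images; the shapes match the text above):

import numpy as np

num_train, num_classes = 1000, 10
X = np.random.randn(num_train, 32 * 32 * 3)             # each row is one flattened 32*32*3 image
W = 0.001 * np.random.randn(num_classes, 32 * 32 * 3)   # one row of weights per class

scores = X.dot(W.T)    # num_train*10: one score per image per class
print(scores.shape)    # (1000, 10)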
For more details, see the CS231N Assignment 1 KNN walkthrough.
2 Loss Function
Once we have every image's score for each class, we need to compute a loss that evaluates how good the W matrix is.
The SVM (hinge) loss for image i is Li = sum over all j != yi of max(0, sj - syi + 1), where sj is the score for class j and syi is the correct class's score.
For each image, we take each wrong class's score minus the correct class's score, compare the difference with 0, and keep the maximum.
Ideally the correct class scores highest, in which case every wrong-minus-correct difference is negative, smaller than 0, and the loss is taken to be 0.
To make the classifier more robust, a margin of 1 is added, so a wrong class must score at least 1 below the correct class before it contributes zero loss.
After computing the loss for every image, we accumulate them into a final loss, averaged over the training set.
Tidying this up exposes a problem: the loss does not account for W itself, and different W's can produce the same loss.
We therefore introduce a regularization term, giving L = (1/N) * sum_i Li + reg * sum(W^2). The regularization coefficient reg controls how much W contributes to the total loss; more spread-out, smaller weights are preferred.
The code is as follows.
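A minimal loop-form sketch of this loss, assuming W is 10*3072, X is num_train*3072, and Y holds integer class labels; the vectorized version, svm_loss_vectorized, appears in the complete listing at the end of this post:

import numpy as np

def svm_loss_naive(W, X, Y, reg):
    # Loop-form SVM loss: for each image, sum max(0, sj - syi + 1) over the wrong classes
    num_train = X.shape[0]
    num_classes = W.shape[0]
    loss = 0.0
    for i in range(num_train):
        scores = W.dot(X[i])              # 10 scores for image i
        correct_score = scores[Y[i]]
        for j in range(num_classes):
            if j == Y[i]:
                continue
            margin = scores[j] - correct_score + 1.0
            if margin > 0:
                loss += margin
    loss /= num_train                     # average over all images
    loss += reg * np.sum(W * W)           # L2 regularization
    return loss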
With that, a complete loss function is constructed: by looking at the loss we can tell how good the W matrix is. But if the loss is too large, how do we adjust each parameter?
This is where gradient descent and the concept of the gradient come in.
3 Gradient
Gradient descent:
First, we have a differentiable function; think of it as a mountain. Our goal is to find the function's minimum, i.e. the foot of the mountain. Following that scenario, the fastest way down is to find the steepest direction at the current position and walk downhill along it. For a function, this means finding the gradient at the given point and then moving in the opposite direction, which makes the function's value decrease fastest, because the gradient is the direction in which the function changes most quickly (more on this below).
So we apply this method repeatedly: keep recomputing the gradient and stepping downhill, and we eventually reach a local minimum, just like walking down the mountain. Computing the gradient is how we "measure" the steepest direction at each step.
The gradient behaves just like a derivative: the derivative of the loss with respect to each weight reflects the gradient.
If W moves one step forward and the loss increases, the gradient dW is positive, and W should move in the opposite direction.
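That "nudge one weight and watch the loss" picture is exactly a numeric (finite-difference) gradient. As a sketch of the idea (this helper is not from the original post; loss_fn stands for any loss as a function of W, e.g. the svm_loss_naive sketch above):

import numpy as np

def numeric_gradient(loss_fn, W, h=1e-5):
    # Approximate dLoss/dW by nudging each weight by h in both directions
    grad = np.zeros_like(W)
    it = np.nditer(W, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        old = W[idx]
        W[idx] = old + h
        loss_plus = loss_fn(W)
        W[idx] = old - h
        loss_minus = loss_fn(W)
        W[idx] = old                       # restore the original weight
        grad[idx] = (loss_plus - loss_minus) / (2 * h)
        it.iternext()
    return grad

# usage: grad = numeric_gradient(lambda W: svm_loss_naive(W, X, Y, 0.0), W)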
For the loss function in this example, each term can be rewritten as Li = sum over j != yi of max(0, wj·xi - wyi·xi + 1), where wj is row j of W and xi is image i's feature vector.
Taking the partial derivative of Li with respect to wj gives xi whenever that margin is positive (and 0 otherwise); for the correct class's row wyi, the derivative is -xi multiplied by the number of classes whose margin is positive.
CODE2: loss & gradient, loop form
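A minimal loop-form sketch of the loss together with this analytic gradient, under the same shape assumptions as above (W: 10*3072, X: num_train*3072):

import numpy as np

def svm_loss_naive_grad(W, X, Y, reg):
    # Loop-form SVM loss plus its gradient dW (same shape as W)
    num_train, num_classes = X.shape[0], W.shape[0]
    loss = 0.0
    dW = np.zeros_like(W)
    for i in range(num_train):
        scores = W.dot(X[i])
        correct_score = scores[Y[i]]
        for j in range(num_classes):
            if j == Y[i]:
                continue
            margin = scores[j] - correct_score + 1.0
            if margin > 0:
                loss += margin
                dW[j] += X[i]              # d(margin)/d(wj) = xi
                dW[Y[i]] -= X[i]           # d(margin)/d(wyi) = -xi per positive margin
    loss = loss / num_train + reg * np.sum(W * W)
    dW = dW / num_train + 2 * reg * W      # gradient of the regularizer
    return loss, dW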
CODE3: loss & gradient, vectorized matrix form. Instead of looping, we build the full num_train*10 margin matrix in one shot and convert it into an indicator matrix to form the gradient; see svm_loss_vectorized in the complete listing at the end of this post.
4 Training Function
Once we can compute the loss and the gradient, we can adjust the W matrix according to the gradient. This requires a few parameters for the train function.
The following are generally needed:
Number of iterations: how many training steps to loop for.
Learning rate: the coefficient by which each gradient step corrects the W matrix.
Batch size: each step may not use every sample; a subset is sampled.
The key point is to keep computing the loss and gradient inside the loop, then update W with the formula below.
self.W = self.W - learning_rate * grad
CODE4: gradient descent. The full training loop is the train method of class SVM in the complete listing at the end of this post.
During training, the loss is printed every 100 iterations, so you can watch it fall.
5 Prediction (predict)
After training the model we have a fairly good W matrix; we then use it to predict on the test set and see how the model performs. The predict method simply scores each sample and picks the class with the highest score (see class SVM in the complete listing).
The last few lines of the complete listing run this in the main script and report the prediction accuracy.
The results: predicting on the training set itself gives 0.756, but on the test set only 0.218, which is not great.
6 Hyperparameter Tuning
The above completes the whole SVM model. So how do we automatically find a good learning rate and regularization strength?
We need to test how well each candidate value performs; a program like the sketch below can do this.
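A minimal grid-search sketch, assuming the SVM class from the complete listing and a held-out validation split; the value grids and the x_val/y_val names are illustrative, not from the original post:

import numpy as np

learning_rates = [1e-8, 1e-7, 1e-6]
regularization_strengths = [1e4, 1e5]

best_acc, best_svm = -1.0, None
for lr in learning_rates:
    for reg in regularization_strengths:
        svm = SVM()
        svm.train(x_train, y_train, learning_rate=lr, reg=reg,
                  num_iters=1000, verbose=False)
        acc = np.mean(svm.predict(x_val) == y_val)    # validation accuracy
        print('lr %e reg %e -> validation accuracy %f' % (lr, reg, acc))
        if acc > best_acc:
            best_acc, best_svm = acc, svm
print('best validation accuracy: %f' % best_acc)

Note that svm_loss_vectorized below ignores reg for now, so the reg grid only starts to matter once regularization is added into the loss.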
Running it prints the accuracy achieved by each parameter combination, and we keep the best one.
7 Visualizing the Weights
Once we have the best W, it is sometimes worth visualizing it: reshaping each row of W back into an image shows where the weights are high and low, rather like a template for that class.
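A minimal visualization sketch with matplotlib, assuming a trained 10*3072 weight matrix (best_svm.W from the grid-search sketch above) and the classes list from the complete listing; the rescaling into 0..255 is just for display:

import matplotlib.pyplot as plt

W_img = best_svm.W.reshape(10, 32, 32, 3)     # one 32*32*3 template per class
w_min, w_max = W_img.min(), W_img.max()
for i in range(10):
    plt.subplot(2, 5, i + 1)
    # squash the weights into the 0..255 range so they render as an image
    img = 255.0 * (W_img[i] - w_min) / (w_max - w_min)
    plt.imshow(img.astype('uint8'))
    plt.axis('off')
    plt.title(classes[i])
plt.show()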
The weight images come out looking like blurry templates of each class.
Complete code (the first block is data_util, the utility module used to read the dataset):
from __future__ import print_function
from six.moves import cPickle as pickle
import numpy as np
import os
from matplotlib.pyplot import imread
import platform
def load_pickle(f):
version = platform.python_version_tuple()
if version[0] == '2':
return pickle.load(f)
elif version[0] == '3':
return pickle.load(f, encoding='latin1')
raise ValueError("invalid python version: {}".format(version))
def load_CIFAR_batch(filename):
""" load single batch of cifar """
with open(filename, 'rb') as f:
datadict = load_pickle(f)
X = datadict['data']
Y = datadict['labels']
X = X.reshape(10000, 3, 32, 32).transpose(0, 2, 3, 1).astype("float")
Y = np.array(Y)
return X, Y
def load_CIFAR10(ROOT):
""" load all of cifar """
xs = []
ys = []
for b in range(1, 6):
f = os.path.join(ROOT, 'data_batch_%d' % (b,))
X, Y = load_CIFAR_batch(f)
xs.append(X)
ys.append(Y)
Xtr = np.concatenate(xs)
Ytr = np.concatenate(ys)
del X, Y
Xte, Yte = load_CIFAR_batch(os.path.join(ROOT, 'test_batch'))
return Xtr, Ytr, Xte, Yte
def get_CIFAR10_data(num_training=49000, num_validation=1000, num_test=1000,
subtract_mean=True):
"""
Load the CIFAR-10 dataset from disk and perform preprocessing to prepare
it for classifiers. These are the same steps as we used for the SVM, but
condensed to a single function.
"""
# Load the raw CIFAR-10 data
cifar10_dir = 'datasets/cifar-10-batches-py'
X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)
# Subsample the data
mask = list(range(num_training, num_training + num_validation))
X_val = X_train[mask]
y_val = y_train[mask]
mask = list(range(num_training))
X_train = X_train[mask]
y_train = y_train[mask]
mask = list(range(num_test))
X_test = X_test[mask]
y_test = y_test[mask]
# Normalize the data: subtract the mean image
if subtract_mean:
mean_image = np.mean(X_train, axis=0)
X_train -= mean_image
X_val -= mean_image
X_test -= mean_image
# Transpose so that channels come first
X_train = X_train.transpose(0, 3, 1, 2).copy()
X_val = X_val.transpose(0, 3, 1, 2).copy()
X_test = X_test.transpose(0, 3, 1, 2).copy()
# Package data into a dictionary
return {
'X_train': X_train, 'y_train': y_train,
'X_val': X_val, 'y_val': y_val,
'X_test': X_test, 'y_test': y_test,
}
def load_tiny_imagenet(path, dtype=np.float32, subtract_mean=True):
"""
Load TinyImageNet. Each of TinyImageNet-100-A, TinyImageNet-100-B, and
TinyImageNet-200 have the same directory structure, so this can be used
to load any of them.
Inputs:
- path: String giving path to the directory to load.
- dtype: numpy datatype used to load the data.
- subtract_mean: Whether to subtract the mean training image.
Returns: A dictionary with the following entries:
- class_names: A list where class_names[i] is a list of strings giving the
WordNet names for class i in the loaded dataset.
- X_train: (N_tr, 3, 64, 64) array of training images
- y_train: (N_tr,) array of training labels
- X_val: (N_val, 3, 64, 64) array of validation images
- y_val: (N_val,) array of validation labels
- X_test: (N_test, 3, 64, 64) array of testing images.
- y_test: (N_test,) array of test labels; if test labels are not available
(such as in student code) then y_test will be None.
- mean_image: (3, 64, 64) array giving mean training image
"""
# First load wnids
with open(os.path.join(path, 'wnids.txt'), 'r') as f:
wnids = [x.strip() for x in f]
# Map wnids to integer labels
wnid_to_label = {wnid: i for i, wnid in enumerate(wnids)}
# Use words.txt to get names for each class
with open(os.path.join(path, 'words.txt'), 'r') as f:
wnid_to_words = dict(line.split('\t') for line in f)
    for wnid, words in wnid_to_words.items():
wnid_to_words[wnid] = [w.strip() for w in words.split(',')]
class_names = [wnid_to_words[wnid] for wnid in wnids]
# Next load training data.
X_train = []
y_train = []
for i, wnid in enumerate(wnids):
if (i + 1) % 20 == 0:
print('loading training data for synset %d / %d' % (i + 1, len(wnids)))
# To figure out the filenames we need to open the boxes file
boxes_file = os.path.join(path, 'train', wnid, '%s_boxes.txt' % wnid)
with open(boxes_file, 'r') as f:
filenames = [x.split('\t')[0] for x in f]
num_images = len(filenames)
X_train_block = np.zeros((num_images, 3, 64, 64), dtype=dtype)
y_train_block = wnid_to_label[wnid] * np.ones(num_images, dtype=np.int64)
for j, img_file in enumerate(filenames):
img_file = os.path.join(path, 'train', wnid, 'images', img_file)
img = imread(img_file)
if img.ndim == 2:
## grayscale file
img.shape = (64, 64, 1)
X_train_block[j] = img.transpose(2, 0, 1)
X_train.append(X_train_block)
y_train.append(y_train_block)
# We need to concatenate all training data
X_train = np.concatenate(X_train, axis=0)
y_train = np.concatenate(y_train, axis=0)
# Next load validation data
with open(os.path.join(path, 'val', 'val_annotations.txt'), 'r') as f:
img_files = []
val_wnids = []
for line in f:
img_file, wnid = line.split('\t')[:2]
img_files.append(img_file)
val_wnids.append(wnid)
num_val = len(img_files)
y_val = np.array([wnid_to_label[wnid] for wnid in val_wnids])
X_val = np.zeros((num_val, 3, 64, 64), dtype=dtype)
for i, img_file in enumerate(img_files):
img_file = os.path.join(path, 'val', 'images', img_file)
img = imread(img_file)
if img.ndim == 2:
img.shape = (64, 64, 1)
X_val[i] = img.transpose(2, 0, 1)
# Next load test images
# Students won't have test labels, so we need to iterate over files in the
# images directory.
img_files = os.listdir(os.path.join(path, 'test', 'images'))
X_test = np.zeros((len(img_files), 3, 64, 64), dtype=dtype)
for i, img_file in enumerate(img_files):
img_file = os.path.join(path, 'test', 'images', img_file)
img = imread(img_file)
if img.ndim == 2:
img.shape = (64, 64, 1)
X_test[i] = img.transpose(2, 0, 1)
y_test = None
y_test_file = os.path.join(path, 'test', 'test_annotations.txt')
if os.path.isfile(y_test_file):
with open(y_test_file, 'r') as f:
img_file_to_wnid = {}
for line in f:
line = line.split('\t')
img_file_to_wnid[line[0]] = line[1]
y_test = [wnid_to_label[img_file_to_wnid[img_file]] for img_file in img_files]
y_test = np.array(y_test)
mean_image = X_train.mean(axis=0)
if subtract_mean:
X_train -= mean_image[None]
X_val -= mean_image[None]
X_test -= mean_image[None]
return {
'class_names': class_names,
'X_train': X_train,
'y_train': y_train,
'X_val': X_val,
'y_val': y_val,
'X_test': X_test,
'y_test': y_test,
'mean_image': mean_image,
}
def load_models(models_dir):
"""
Load saved models from disk. This will attempt to unpickle all files in a
directory; any files that give errors on unpickling (such as README.txt) will
be skipped.
Inputs:
- models_dir: String giving the path to a directory containing model files.
Each model file is a pickled dictionary with a 'model' field.
Returns:
A dictionary mapping model file names to models.
"""
models = {}
for model_file in os.listdir(models_dir):
with open(os.path.join(models_dir, model_file), 'rb') as f:
try:
models[model_file] = load_pickle(f)['model']
except pickle.UnpicklingError:
continue
return models
from dl.data_utils import load_CIFAR10
import numpy as np
classes = ['plane','car','bird','cat','deer','dog','frog','horse','ship','truck']
x_train, y_train, x_test, y_test = load_CIFAR10('dataset/cifar-10-batches-py')
x_train = np.reshape(x_train, (x_train.shape[0], -1))
x_test = np.reshape(x_test, (x_test.shape[0], -1))
def svm_loss_vectorized(W, X, Y, reg):
    """
    Compute the SVM loss and gradient (regularization is omitted here for now).
    W: 10*3072
    X: num_train*3072
    """
num_train = X.shape[0]
scores = np.dot(X, W.T)
correct_scores = scores[np.arange(num_train), Y]
correct_scores = np.reshape(correct_scores, (num_train,-1))
loss = scores - correct_scores + 1.0 # num_train*10 , num_train*1
loss[loss < 0] = 0.0 # max(0,sj-syi+1)
    loss[np.arange(num_train), Y] = 0.0  # zero out the correct-class entries
margin = loss
loss = np.sum(loss, axis=1) # Li
loss = np.mean(loss)
#print('loss = ', loss)
    # compute the gradient
dW = np.zeros(W.shape)
    margin[margin > 0] = 1.0                      # indicator: 1 where a wrong class had positive margin
    row_sum = np.sum(margin, axis=1)              # count of positive margins per image
    margin[np.arange(num_train), Y] = -row_sum    # the correct class accumulates -count * xi
    dW = 1.0 / num_train * np.dot(margin.T, X)
return loss, dW
class SVM(object):
    def train(self, X, Y, learning_rate=1e-7*0.9, reg=1e-5, num_iters=6000, batch_size=256, verbose=True):
num_train, dim = X.shape
num_classes = np.max(Y) + 1
self.W = 0.001 * np.random.randn(num_classes, dim)
loss_history = []
for it in range(num_iters):
batch_inx = np.random.choice(num_train,batch_size)
x_batch = X[batch_inx,:]
y_batch = Y[batch_inx]
loss, grad = svm_loss_vectorized(self.W, x_batch, y_batch, reg)
loss_history.append(loss)
self.W = self.W - learning_rate*grad
if verbose and it%100==0:
print('iteration %d / %d : loss %f' % (it, num_iters, loss))
return loss_history
    def predict(self, x_train):
        # score every sample and pick the highest-scoring class
        scores = x_train.dot(self.W.T)
y_pred = np.argmax(scores, axis=1)
return y_pred
svm = SVM()
svm.train(x_train, y_train)
score1 = svm.predict(x_train)
print('The train data prediction accuracy %f' % (np.mean(score1 == y_train)))
score1 = svm.predict(x_test)
print('The test data prediction accuracy %f' % (np.mean(score1 == y_test)))