CS231n-assignment1-features

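This assignment compares an SVM with a two-layer net, both trained on image features.

We have seen that training a linear classifier on the raw pixels of the input images gives reasonable performance on an image classification task. In this exercise we show how performance changes when we train on features computed from the images rather than on the raw pixels.

The first few steps are similar to the previous notebooks.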
In[1]:

# __future__ imports belong at the top of the cell (a no-op on Python 3).
from __future__ import print_function

import random
import numpy as np
from cs231n.data_utils import load_CIFAR10
import matplotlib.pyplot as plt

#%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

%load_ext autoreload
%autoreload 2

Load the data
In[2]:

from cs231n.features import color_histogram_hsv, hog_feature

def get_CIFAR10_data(num_training=49000, num_validation=1000, num_test=1000):
    cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'

    X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)
    
    mask = list(range(num_training, num_training + num_validation))
    X_val = X_train[mask]
    y_val = y_train[mask]
    mask = list(range(num_training))
    X_train = X_train[mask]
    y_train = y_train[mask]
    mask = list(range(num_test))
    X_test = X_test[mask]
    y_test = y_test[mask]
    
    return X_train, y_train, X_val, y_val, X_test, y_test

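# Clear previously loaded variables so the data is not held in memory twice.
try:
    del X_train, y_train
    del X_test, y_test
    print('Clear previously loaded data.')
except NameError:
    # Nothing to clear on the first run.
    pass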

X_train, y_train, X_val, y_val, X_test, y_test = get_CIFAR10_data()

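Extract features

For each image we compute a Histogram of Oriented Gradients (HOG) and a color histogram. We form the final feature vector for each image by concatenating its HOG and color histogram feature vectors.

Roughly speaking, HOG should capture the texture of the image while ignoring color information, and the color histogram represents the color of the input image while ignoring texture. We therefore expect that using both together works better than using either alone. Verifying this assumption is a good exercise to try on your own.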
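The intuition that HOG responds to edge orientation and ignores color can be sketched with a simplified orientation histogram. This is not the `hog_feature` implementation from `cs231n.features`: the name `orientation_histogram` and the single whole-image histogram (no cells, no block normalization) are simplifying assumptions made only for illustration.

```python
import numpy as np

def orientation_histogram(gray, nbins=9):
    """Histogram of unsigned gradient orientations, weighted by gradient magnitude.

    Hypothetical helper: a whole-image simplification of the HOG idea.
    """
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    gy, gx = np.gradient(gray.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in degrees, folded into [0, 180].
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(angle, bins=nbins, range=(0.0, 180.0), weights=magnitude)
    total = hist.sum()
    return hist / total if total > 0 else hist

# A ramp that brightens from left to right has purely horizontal gradients.
ramp_x = np.tile(np.arange(16, dtype=np.float64), (16, 1))
# Its transpose brightens from top to bottom, so gradients are vertical.
ramp_y = ramp_x.T

hist_x = orientation_histogram(ramp_x)
hist_y = orientation_histogram(ramp_y)
print(np.argmax(hist_x), np.argmax(hist_y))
```

Because the input is a single-channel intensity image, two images with the same edges but different colors produce the same histogram, which is why a separate color feature is useful.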

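The hog_feature and color_histogram_hsv functions both operate on a single image and return a feature vector for that image. The extract_features function takes a set of images and a list of feature functions, evaluates each feature function on each image, and stores the results in a matrix in which each row is the concatenation of all feature vectors for a single image.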
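The loop described above can be sketched as follows. This is a minimal illustration, not the code in `cs231n.features`: `extract_features_sketch`, `mean_color` and `intensity_hist` are hypothetical names, and the two toy feature functions merely stand in for `hog_feature` and `color_histogram_hsv`.

```python
import numpy as np

def extract_features_sketch(imgs, feature_fns):
    """Apply every feature function to every image; one row per image."""
    rows = []
    for img in imgs:
        # Each feature function returns a 1-D vector; concatenate them for this image.
        rows.append(np.concatenate([np.ravel(fn(img)) for fn in feature_fns]))
    return np.stack(rows)

def mean_color(img):
    # Toy feature: average value of each of the 3 color channels.
    return img.mean(axis=(0, 1))

def intensity_hist(img, nbin=4):
    # Toy feature: normalized histogram of per-pixel mean intensity.
    gray = img.mean(axis=2)
    hist, _ = np.histogram(gray, bins=nbin, range=(0, 255))
    return hist / hist.sum()

rng = np.random.default_rng(0)
toy_imgs = rng.integers(0, 256, size=(5, 32, 32, 3)).astype(np.float64)
toy_feats = extract_features_sketch(toy_imgs, [mean_color, intensity_hist])
print(toy_feats.shape)  # (5, 7): 5 images, 3 + 4 features each
```

With one row per image, the later preprocessing steps (mean and standard deviation over `axis=0`) normalize each feature across the training set.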

For more background on HOG, see the Zhihu article on HOG features (HOG特征).
In[3]:

from cs231n.features import *

num_color_bins = 10 # Number of bins in the color histogram
feature_fns = [hog_feature, lambda img: color_histogram_hsv(img, nbin=num_color_bins)]
X_train_feats = extract_features(X_train, feature_fns, verbose=True)
X_val_feats = extract_features(X_val, feature_fns)
X_test_feats = extract_features(X_test, feature_fns)

# Preprocessing: Subtract the mean feature
mean_feat = np.mean(X_train_feats, axis=0, keepdims=True)
X_train_feats -= mean_feat
X_val_feats -= mean_feat
X_test_feats -= mean_feat

# Preprocessing: Divide by standard deviation. This ensures that each feature
# has roughly the same scale.
std_feat = np.std(X_train_feats, axis=0, keepdims=True)
X_train_feats /= std_feat
X_val_feats /= std_feat
X_test_feats /= std_feat

# Preprocessing: Add a bias dimension
X_train_feats = np.hstack([X_train_feats, np.ones((X_train_feats.shape[0], 1))])
X_val_feats = np.hstack([X_val_feats, np.ones((X_val_feats.shape[0], 1))])
X_test_feats = np.hstack([X_test_feats, np.ones((X_test_feats.shape[0], 1))])
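To test the hypothesis that the combined features beat either feature alone, one option is to slice the feature matrix back into its HOG and color blocks and train a classifier on each. The sketch below assumes that the features are concatenated in the order of `feature_fns` (HOG first, color histogram last) and that a bias column may have been appended; `split_feature_blocks` is a hypothetical helper, not part of the assignment code.

```python
import numpy as np

def split_feature_blocks(feats, num_color_bins, has_bias=True):
    """Split a (N, D) feature matrix into its HOG block and color-histogram block.

    Assumes column order [HOG..., color histogram..., optional bias].
    """
    body = feats[:, :-1] if has_bias else feats
    hog_block = body[:, :-num_color_bins]
    color_block = body[:, -num_color_bins:]
    return hog_block, color_block

# Toy matrix: 4 "HOG" columns, 3 "color" columns, 1 bias column.
toy = np.arange(2 * 8, dtype=np.float64).reshape(2, 8)
hog_block, color_block = split_feature_blocks(toy, num_color_bins=3)
print(hog_block.shape, color_block.shape)  # (2, 4) (2, 3)
```

In the notebook this could be called as `split_feature_blocks(X_train_feats, num_color_bins)`, after which each block (with a bias column re-appended) can be passed to the same training loop used for the full feature vector.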

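Train an SVM on the features

Using the multiclass SVM code developed earlier in the assignment, train an SVM on the features extracted above. This should give better results than training an SVM directly on raw pixels.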
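As a reminder of what the multiclass SVM optimizes, here is a minimal vectorized sketch of the hinge loss. It is not the `LinearSVM` code from the assignment: the function name, the margin `delta=1.0`, and the `reg * sum(W * W)` regularization scaling are assumptions, and your own implementation may scale the regularizer differently.

```python
import numpy as np

def multiclass_svm_loss(W, X, y, reg, delta=1.0):
    """Average multiclass hinge loss plus L2 regularization (illustrative sketch)."""
    num_train = X.shape[0]
    scores = X @ W                                        # (N, C) class scores
    correct = scores[np.arange(num_train), y][:, None]    # (N, 1) true-class score
    margins = np.maximum(0.0, scores - correct + delta)
    margins[np.arange(num_train), y] = 0.0                # the true class adds no loss
    return margins.sum() / num_train + reg * np.sum(W * W)

X_toy = np.array([[1.0, 2.0, 1.0], [0.5, -1.0, 1.0]])     # last column acts as the bias
y_toy = np.array([0, 2])
W_zero = np.zeros((3, 3))
print(multiclass_svm_loss(W_zero, X_toy, y_toy, reg=0.1))  # 2.0
```

With all-zero weights every wrong class violates the margin by exactly `delta`, so the loss equals the number of classes minus one, which is a handy sanity check before training.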
In[4]:

from cs231n.classifiers.linear_classifier import LinearSVM

learning_rates = [1e-3, 1e-2]
regularization_strengths = [0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]

results = {}
best_val = -1
best_svm = None

################################################################################
# TODO:                                                                        #
# Use the validation set to set the learning rate and regularization strength. #
# This should be identical to the validation that you did for the SVM; save    #
# the best trained classifier in best_svm. You might also want to play         #
# with different numbers of bins in the color histogram. If you are careful    #
# you should be able to get accuracy of near 0.44 on the validation set.       #
################################################################################
np.random.seed(0)

grid_search = [ (lr,reg) for lr in learning_rates for reg in regularization_strengths ]

for lr, reg in grid_search:
    # Create SVM model
    svm = LinearSVM()
    
    # Train phase
    svm.train(X_train_feats, y_train, learning_rate=lr, reg=reg, num_iters=2000,
            batch_size=200, verbose=False)
    y_train_pred = svm.predict(X_train_feats)
    # Train accuracy
    train_accuracy = np.mean( y_train_pred == y_train )
    
    # Validation phase
    y_val_pred = svm.predict(X_val_feats)
    # Validation accuracy
    val_accuracy = np.mean( y_val_pred == y_val )
    
    results[lr,reg] = (train_accuracy,val_accuracy)
    
    # Save best model
    if val_accuracy > best_val:
        best_val = val_accuracy
        best_svm = svm
################################################################################
#                              END OF YOUR CODE                                #
################################################################################

# Print out results.
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print('lr %e reg %e train accuracy: %f val accuracy: %f' % (
                lr, reg, train_accuracy, val_accuracy))
    
print('best validation accuracy achieved during cross-validation: %f' % best_val)

Evaluate your trained SVM on the test set
In[5]:

y_test_pred = best_svm.predict(X_test_feats)
test_accuracy = np.mean(y_test == y_test_pred)
print(test_accuracy)

In[6]:

examples_per_class = 8
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
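for cls, cls_name in enumerate(classes):
    # False positives: test images predicted as cls whose true label is different.
    idxs = np.where((y_test != cls) & (y_test_pred == cls))[0]
    # Raises ValueError if a class has fewer than examples_per_class false positives.
    idxs = np.random.choice(idxs, examples_per_class, replace=False)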
    for i, idx in enumerate(idxs):
        plt.subplot(examples_per_class, len(classes), i * len(classes) + cls + 1)
        plt.imshow(X_test[idx].astype('uint8'))
        plt.axis('off')
        if i == 0:
            plt.title(cls_name)
plt.show()

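The visualization shows which test images were misclassified.

Inline question 1:

Describe the misclassification results that you see. Do they make sense?

For some classes they make sense. For example, cats and dogs have similar outlines: both are animals with four legs and a tail. Images from these classes can therefore be confused with one another, and horses and deer can be included here as well. Cars and trucks are a similar case. The plane class is interesting because its misclassified examples resemble planes in color: they are often ship images, since the sky and the sea have very similar colors. However, some errors make no sense at all. In the bird class, for example, images of dogs, cats, planes and others are misclassified as birds. We can hypothesize that all of these images were classified as birds because of their color features.

In summary, the combination of HOG and color histogram feature vectors is not enough to separate all the classes correctly. The HOG descriptor is very useful because it takes edges into account, but we have to keep in mind the kinds of invariance that HOG may not provide, for example invariance to translation. In addition, HOG has several parameters that we could cross-validate to get better performance. The color histogram feature is useful in some cases (similar colors) but not in others. We can still keep this feature, although its importance may decrease when it is combined with other features such as HOG. Finally, we could use other types of descriptors, such as SIFT or LBP (texture), to improve the accuracy of our model.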
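The visual impression of which classes get confused can be backed with numbers by tabulating a confusion matrix. The helper below is a small NumPy sketch; `confusion_matrix` is a hypothetical name defined here (not an import from the assignment), and the toy labels are made up purely to show the layout.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """cm[i, j] counts samples whose true class is i and predicted class is j."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    # np.add.at accumulates correctly even when the same (i, j) pair repeats.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 0, 2])
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm)
```

In the notebook this could be called as `confusion_matrix(y_test, y_test_pred, len(classes))`. The off-diagonal entries of column k then count the false positives for class k, which are the images drawn in column k of the plot above.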

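Neural network on image features

Earlier in this assignment we saw that training a two-layer neural network on raw pixels achieved better classification performance than a linear classifier on raw pixels. Above, we saw that a linear classifier on image features outperforms a linear classifier on raw pixels.

For completeness, we should also try training a neural network on image features. This approach should outperform all previous approaches: you should easily be able to achieve over 55% classification accuracy on the test set; our best model achieves about 60% classification accuracy.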

In[7]:

# Preprocessing: Remove the bias dimension
print(X_train_feats.shape)
X_train_feats = X_train_feats[:, :-1]
X_val_feats = X_val_feats[:, :-1]
X_test_feats = X_test_feats[:, :-1]

print(X_train_feats.shape)

In[8]:

from cs231n.classifiers.fc_net import TwoLayerNet
from cs231n.solver import Solver

input_dim = X_train_feats.shape[1]
hidden_dim = 500
num_classes = 10

net = TwoLayerNet(input_dim, hidden_dim, num_classes)
best_net = None

################################################################################
# TODO: Train a two-layer neural network on image features. You may want to    #
# cross-validate various parameters as in previous sections. Store your best   #
# model in the best_net variable.                                              #
################################################################################
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

data = {'X_train': X_train_feats, 'y_train': y_train, 
        'X_val': X_val_feats, 'y_val': y_val,
        'X_test': X_test_feats, 'y_test': y_test}

solver = Solver(model=net, data=data, 
                update_rule='sgd',
                optim_config={
                      'learning_rate': 0.09,
                },
                lr_decay=0.95, num_epochs=20, 
                batch_size=100, print_every=100)
solver.train()
best_net = net

# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

(Iteration 1 / 9800) loss: 2.302590
(Epoch 0 / 20) train acc: 0.087000; val_acc: 0.105000
(Iteration 101 / 9800) loss: 2.276100
(Iteration 201 / 9800) loss: 1.789409
(Iteration 301 / 9800) loss: 1.596300
(Iteration 401 / 9800) loss: 1.542490
(Epoch 1 / 20) train acc: 0.495000; val_acc: 0.487000
(Iteration 501 / 9800) loss: 1.474890
(Iteration 601 / 9800) loss: 1.290177
(Iteration 701 / 9800) loss: 1.415911
(Iteration 801 / 9800) loss: 1.285058
(Iteration 901 / 9800) loss: 1.492515
(Epoch 2 / 20) train acc: 0.511000; val_acc: 0.521000
(Iteration 1001 / 9800) loss: 1.207404
(Iteration 1101 / 9800) loss: 1.468276
(Iteration 1201 / 9800) loss: 1.377914
(Iteration 1301 / 9800) loss: 1.333933
(Iteration 1401 / 9800) loss: 1.398797
(Epoch 3 / 20) train acc: 0.535000; val_acc: 0.518000
(Iteration 1501 / 9800) loss: 1.167448
(Iteration 1601 / 9800) loss: 1.126200
(Iteration 1701 / 9800) loss: 1.376040
(Iteration 1801 / 9800) loss: 1.427504
(Iteration 1901 / 9800) loss: 1.281142
(Epoch 4 / 20) train acc: 0.570000; val_acc: 0.566000
(Iteration 2001 / 9800) loss: 1.237984
(Iteration 2101 / 9800) loss: 1.136605
(Iteration 2201 / 9800) loss: 1.239995
(Iteration 2301 / 9800) loss: 1.084366
(Iteration 2401 / 9800) loss: 1.257484
(Epoch 5 / 20) train acc: 0.583000; val_acc: 0.539000
(Iteration 2501 / 9800) loss: 1.029579
(Iteration 2601 / 9800) loss: 1.238230
(Iteration 2701 / 9800) loss: 1.316464
(Iteration 2801 / 9800) loss: 1.212885
(Iteration 2901 / 9800) loss: 1.036326
(Epoch 6 / 20) train acc: 0.617000; val_acc: 0.563000
(Iteration 3001 / 9800) loss: 1.239503
(Iteration 3101 / 9800) loss: 1.150601
(Iteration 3201 / 9800) loss: 1.093215
(Iteration 3301 / 9800) loss: 1.081834
(Iteration 3401 / 9800) loss: 1.014506
(Epoch 7 / 20) train acc: 0.615000; val_acc: 0.569000
(Iteration 3501 / 9800) loss: 1.157668
(Iteration 3601 / 9800) loss: 0.915607
(Iteration 3701 / 9800) loss: 1.003897
(Iteration 3801 / 9800) loss: 1.398643
(Iteration 3901 / 9800) loss: 0.975194
(Epoch 8 / 20) train acc: 0.620000; val_acc: 0.582000
(Iteration 4001 / 9800) loss: 1.103585
(Iteration 4101 / 9800) loss: 1.042750
(Iteration 4201 / 9800) loss: 0.885345
(Iteration 4301 / 9800) loss: 1.010080
(Iteration 4401 / 9800) loss: 1.047311
(Epoch 9 / 20) train acc: 0.660000; val_acc: 0.604000
(Iteration 4501 / 9800) loss: 0.930028
(Iteration 4601 / 9800) loss: 0.969445
(Iteration 4701 / 9800) loss: 0.963516
(Iteration 4801 / 9800) loss: 0.995127
(Epoch 10 / 20) train acc: 0.678000; val_acc: 0.603000
(Iteration 4901 / 9800) loss: 1.168930
(Iteration 5001 / 9800) loss: 1.060691
(Iteration 5101 / 9800) loss: 0.915163
(Iteration 5201 / 9800) loss: 0.703763
(Iteration 5301 / 9800) loss: 0.932732
(Epoch 11 / 20) train acc: 0.669000; val_acc: 0.606000
(Iteration 5401 / 9800) loss: 0.899236
(Iteration 5501 / 9800) loss: 1.084170
(Iteration 5601 / 9800) loss: 1.047494
(Iteration 5701 / 9800) loss: 0.941226
(Iteration 5801 / 9800) loss: 0.831287
(Epoch 12 / 20) train acc: 0.666000; val_acc: 0.599000
(Iteration 5901 / 9800) loss: 0.945339
(Iteration 6001 / 9800) loss: 1.009333
(Iteration 6101 / 9800) loss: 0.758849
(Iteration 6201 / 9800) loss: 1.020670
(Iteration 6301 / 9800) loss: 1.067748
(Epoch 13 / 20) train acc: 0.695000; val_acc: 0.607000
(Iteration 6401 / 9800) loss: 0.899113
(Iteration 6501 / 9800) loss: 0.762420
(Iteration 6601 / 9800) loss: 0.841136
(Iteration 6701 / 9800) loss: 0.768584
(Iteration 6801 / 9800) loss: 0.875704
(Epoch 14 / 20) train acc: 0.699000; val_acc: 0.605000
(Iteration 6901 / 9800) loss: 0.937701
(Iteration 7001 / 9800) loss: 0.943159
(Iteration 7101 / 9800) loss: 0.914377
(Iteration 7201 / 9800) loss: 0.919783
(Iteration 7301 / 9800) loss: 0.898404
(Epoch 15 / 20) train acc: 0.730000; val_acc: 0.611000
(Iteration 7401 / 9800) loss: 0.814851
(Iteration 7501 / 9800) loss: 0.750340
(Iteration 7601 / 9800) loss: 0.874075
(Iteration 7701 / 9800) loss: 0.957134
(Iteration 7801 / 9800) loss: 0.782979
(Epoch 16 / 20) train acc: 0.726000; val_acc: 0.603000
(Iteration 7901 / 9800) loss: 0.840340
(Iteration 8001 / 9800) loss: 0.706303
(Iteration 8101 / 9800) loss: 0.928613
(Iteration 8201 / 9800) loss: 0.826599
(Iteration 8301 / 9800) loss: 0.683322
(Epoch 17 / 20) train acc: 0.748000; val_acc: 0.604000
(Iteration 8401 / 9800) loss: 0.671544
(Iteration 8501 / 9800) loss: 0.814938
(Iteration 8601 / 9800) loss: 0.726082
(Iteration 8701 / 9800) loss: 0.877572
(Iteration 8801 / 9800) loss: 0.840800
(Epoch 18 / 20) train acc: 0.703000; val_acc: 0.607000
(Iteration 8901 / 9800) loss: 0.894665
(Iteration 9001 / 9800) loss: 0.938692
(Iteration 9101 / 9800) loss: 0.988022
(Iteration 9201 / 9800) loss: 0.668710
(Iteration 9301 / 9800) loss: 0.787857
(Epoch 19 / 20) train acc: 0.737000; val_acc: 0.606000
(Iteration 9401 / 9800) loss: 0.844016
(Iteration 9501 / 9800) loss: 0.808054
(Iteration 9601 / 9800) loss: 0.918330
(Iteration 9701 / 9800) loss: 0.743772
(Epoch 20 / 20) train acc: 0.744000; val_acc: 0.614000
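The first line of the log reports a loss of 2.302590, which is a useful sanity check. Assuming the network ends in a softmax loss and starts from small random weights, its initial predictions are close to uniform over the 10 classes, so the expected loss is about -log(1/10) = log(10). The sketch below reproduces that value; `softmax_cross_entropy` is a hypothetical helper written for this check, not the loss function from the assignment.

```python
import numpy as np

def softmax_cross_entropy(scores, y):
    """Mean cross-entropy of softmax(scores) against integer labels y."""
    # Subtract the row maximum for numerical stability.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y].mean()

num_classes = 10
uniform_scores = np.zeros((4, num_classes))   # equal scores give uniform probabilities
labels = np.array([0, 3, 5, 9])
initial_loss = softmax_cross_entropy(uniform_scores, labels)
print(round(initial_loss, 4))  # 2.3026
```

An initial loss far from this value usually points to a problem with weight initialization or with the loss implementation.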

In[9]:

test_acc = (best_net.predict(X_test_feats) == y_test).mean()
print(test_acc)
