20161106#cs231n#1. Nearest Neighbor Classifier (Assignment1-KNN)

Course website

Image Classification: Data-driven Approach, k-Nearest Neighbor, train/val/test splits

Linear algebra: SVD

Fei-Fei Li's TED talk

The Image Classification Pipeline

The image classification pipeline. We’ve seen that the task in Image Classification is to take an array of pixels that represents a single image and assign a label to it. Our complete pipeline can be formalized as follows:
  1. Input: Our input consists of a set of N images, each labeled with one of K different classes. We refer to this data as the training set.
  2. Learning: Our task is to use the training set to learn what every one of the classes looks like. We refer to this step as training a classifier, or learning a model.
  3. Evaluation: In the end, we evaluate the quality of the classifier by asking it to predict labels for a new set of images that it has never seen before. We will then compare the true labels of these images to the ones predicted by the classifier. Intuitively, we’re hoping that a lot of the predictions match up with the true answers (which we call the ground truth).

In short:
  1. Feed in the training-set images and their labels.
  2. Have the machine learn a model from them, i.e., train a classifier.
  3. Use the classifier from step 2 to process the input test images and output the machine's predicted labels for them (see the interface sketch just below).
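
As a minimal sketch of this interface, using the KNearestNeighbor class and the flattened CIFAR-10 arrays (Xtr_rows, Ytr, Xte_rows, Yte) that the full code at the end of this post defines:

classifier = KNearestNeighbor()
classifier.train(Xtr_rows, Ytr)                  # steps 1-2: memorize images + labels
Yte_predict = classifier.predict(Xte_rows, k=1)  # step 3: predict labels for unseen images
print(np.mean(Yte_predict == Yte))               # evaluation: fraction predicted correctly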

Data-Driven Approach

provide the computer with many examples of each class and then develop learning algorithms that look at these examples and learn about the visual appearance of each class
In other words: use large amounts of data and develop learning algorithms so that the computer can read the data, extract what the examples of each class have in common, and then use those commonalities to make subsequent judgments.

Nearest Neighbor Classifier

The idea: compute the distance between every training matrix (image) and a given test matrix, then find the smallest of those distances. The training image with the smallest distance is the nearest neighbor; once found, assign its class to the test image.
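
For example, with the L1 (Manhattan) distance the course notes start with, the distance between two flattened images is the sum of absolute pixel differences (a tiny self-contained example with made-up pixel values):

import numpy as np

train_img = np.array([56., 32., 10., 18.])  # a made-up 4-pixel "image"
test_img = np.array([50., 35., 10., 20.])

# L1 distance: |56-50| + |32-35| + |10-10| + |18-20| = 6 + 3 + 0 + 2 = 11
print(np.sum(np.abs(train_img - test_img)))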

k-NN classifier

Each time we classify, we find the k nearest neighbors and take a (possibly weighted) vote among them to arrive at a more reasonable label.
For example, with k = 5:
if the 5 nearest neighbors are [cat, cat, cat, cat, dog], the test image should be labeled cat, since it correlates more strongly with cat. A minimal majority-vote sketch follows.
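
The vote can be taken with collections.Counter, the same mechanism predict_labels uses in the full code below:

from collections import Counter

closest_y = ['cat', 'cat', 'cat', 'cat', 'dog']  # labels of the k = 5 nearest neighbors
votes = Counter(closest_y)                       # Counter({'cat': 4, 'dog': 1})
print(votes.most_common(1)[0][0])                # prints 'cat', the majority label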

Code Walkthrough

  1. Read in a 4-D array (axis 0: all the training images; the remaining axes: image height, width, and the R/G/B channels), then reshape it down to a 2-D array (axis 0: all the training images; axis 1: pixels per image * 3, one value per RGB channel). Xtr.shape[0] is the total number of images.
  2. Initialize the output array with zeros. xrange() is used instead of range() because in Python 2 xrange yields values lazily, which is far more memory-efficient than range, which builds the whole list up front: for i in xrange(num_test):
  3. distances = np.sum(np.abs(self.Xtr - X[i,:]), axis=1): broadcasting subtracts the same test row from every training row; the absolute values summed along axis 1 give one L1 distance per training image.
  4. min_index = np.argmin(distances): the index of the smallest element in the distance array.
  5. Ypred[i] = self.ytr[min_index]: copy the label at that index into the prediction array. The sketch below puts these five steps together.
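
A standalone sketch of the L1 nearest-neighbor predict loop built from those five steps (Xtr and ytr are passed in directly here instead of being stored on self as in the class further down):

import numpy as np

def predict_nn_l1(Xtr, ytr, X):
    """Label each test row in X with the label of its L1-nearest training row."""
    num_test = X.shape[0]
    Ypred = np.zeros(num_test, dtype=ytr.dtype)            # step 2: zero-initialize output
    for i in xrange(num_test):                             # step 2: Python 2 lazy xrange
        distances = np.sum(np.abs(Xtr - X[i, :]), axis=1)  # step 3: one L1 distance per training row
        min_index = np.argmin(distances)                   # step 4: index of the nearest neighbor
        Ypred[i] = ytr[min_index]                          # step 5: copy its label
    return Ypred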

Hyperparameters超参数

The k-nearest neighbor classifier requires a setting for k. But what number works best? Additionally, we saw that there are many different distance functions we could have used: L1 norm, L2 norm, there are many other choices we didn’t even consider (e.g. dot products). These choices are called hyperparameters and they come up very often in the design of many Machine Learning algorithms that learn from data.
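
For concreteness, here is how the two distance functions named above differ on the same pair of vectors (a small sketch with made-up values):

import numpy as np

a = np.array([1., 2., 3.])
b = np.array([2., 4., 3.])

d1 = np.sum(np.abs(a - b))              # L1 norm: 1 + 2 + 0 = 3
d2 = np.sqrt(np.sum(np.square(a - b)))  # L2 norm: sqrt(1 + 4 + 0) = sqrt(5)
print('L1: %f  L2: %f' % (d1, d2))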

What is a hyperparameter?
Hyperparameters - wiki

My sense is that a hyperparameter is a parameter left undetermined by the learning itself, e.g. the k in kNN, the choice of distance function, relations between pixels, and so on; the possibilities are endless. I have only studied a little so far, so my understanding is shallow; I will dig into it gradually.
If we tune hyperparameters on the very same data from start to finish, we are likely to overfit (overfitting: a hypothesis fits the training data better than other hypotheses, but fits data outside the training set poorly).

To get a better algorithm, it is necessary to tune and test the hyperparameters step by step.

Validation Data

Set aside a small portion of the data to check the training results, i.e., carve a small piece out of the dataset to act as a fake test set. The benefit is that choices can be evaluated immediately without touching the real test set.

Cross-Validation

This is a form of hyperparameter tuning, i.e., a method for adjusting and correcting hyperparameters.
It is mainly used when data is scarce.
Split the data into N folds, then take turns using each fold as the validation set.
It makes the accuracy estimate more reliable, but it consumes a lot of time and compute.
It is not used for the final real prediction run, to avoid wasting resources, but it should be used when choosing suitable hyperparameters. (Personally I think it is well worth using, since it matters a great deal for accurate estimates.) A sketch follows below.
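
A sketch of 5-fold cross-validation for choosing k. Note this is my own illustration, not part of the assignment script at the end of this post (which uses a single hold-out split); it assumes the KNearestNeighbor class and the Xtr_rows/Ytr arrays defined there, and np.array_split for the folds:

import numpy as np

num_folds = 5
X_folds = np.array_split(Xtr_rows, num_folds)  # split the training rows into 5 folds
y_folds = np.array_split(Ytr, num_folds)

for k in [1, 5, 20, 100]:
    accs = []
    for f in xrange(num_folds):
        # Fold f plays validation; the remaining folds are the training data
        X_val, y_val = X_folds[f], y_folds[f]
        X_tr = np.concatenate(X_folds[:f] + X_folds[f + 1:])
        y_tr = np.concatenate(y_folds[:f] + y_folds[f + 1:])
        nn = KNearestNeighbor()
        nn.train(X_tr, y_tr)
        accs.append(np.mean(nn.predict(X_val, k=k) == y_val))
    print('k = %d, mean cross-validation accuracy: %f' % (k, np.mean(accs)))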

Limitations of the Nearest Neighbor Classifier (NNC)

Nearest neighbor works reasonably well when the data is low-dimensional, but for high-dimensional inputs like images it is of little use, and there are many confounding factors. Judging images by raw pixel differences is a poor idea: because every pixel contributes to the distance, two images can end up in the same class merely because their backgrounds or overall colors are similar. Still, the underlying idea is an important one.

Python Code

#coding=utf-8
import numpy as np
import cPickle as pickle
import os
from scipy.misc import imread
from collections import Counter


#Two file-loading functions copied from the assignment starter code; they are a good model to follow
def load_CIFAR_batch(filename):
    """ load single batch of cifar """
    #'rb': open the file for binary reading
    with open(filename, 'rb') as f:
        datadict = pickle.load(f)
        X = datadict['data']
        Y = datadict['labels']
        # Reshape X into a 4-D array and reorder the axes with transpose to (N, H, W, C)
        X = X.reshape(10000, 3, 32, 32).transpose(0,2,3,1).astype("float")
        Y = np.array(Y)
        return X, Y
def load_CIFAR10(ROOT):
    """ load all of cifar """
    xs = []
    ys = []
    for b in range(1,6):
        f = os.path.join(ROOT, 'data_batch_%d' % (b, ))
        X, Y = load_CIFAR_batch(f)
        xs.append(X)
        ys.append(Y)
    #concatenate(xs, axis=0) merges the batch arrays along the first axis into one array Xtr
    Xtr = np.concatenate(xs)
    Ytr = np.concatenate(ys)
    del X, Y
    #Load the test batch
    Xte, Yte = load_CIFAR_batch(os.path.join(ROOT, 'test_batch'))
    return Xtr, Ytr, Xte, Yte



class KNearestNeighbor(object):

    def __init__(self):
        pass

    def train(self, X, y):

        self.X_train = X
        self.y_train = y

    def predict(self, X, k=1, num_loops=1):
        """
        Predict labels for test data using this classifier.

        Inputs:
        - X: A numpy array of shape (num_test, D) containing test data consisting
             of num_test samples each of dimension D.
        - k: The number of nearest neighbors that vote for the predicted labels.
        - num_loops: Determines which implementation to use to compute distances
          between training points and testing points.

        Returns:
        - y: A numpy array of shape (num_test,) containing predicted labels for the
          test data, where y[i] is the predicted label for the test point X[i].
        """
        if num_loops == 0:
            dists = self.compute_distances_no_loops(X)
        elif num_loops == 1:
            dists = self.compute_distances_one_loop(X)
        elif num_loops == 2:
            dists = self.compute_distances_two_loops(X)
        else:
            raise ValueError('Invalid value %d for num_loops' % num_loops)

        return self.predict_labels(dists, k=k)

    def compute_distances_two_loops(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using a nested loop over both the training data and the
        test data.

        Inputs:
        - X: A numpy array of shape (num_test, D) containing test data.

        Returns:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          is the Euclidean distance between the ith test point and the jth training
          point.
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in xrange(num_test):
            for j in xrange(num_train):
                # Euclidean (L2) distance between the ith test and the jth training point;
                # the row difference is 1-D here, so sum over all elements (no axis argument)
                dists[i, j] = np.sqrt(np.sum(np.square(self.X_train[j, :] - X[i, :])))
        #######################################################################
        #                         END OF YOUR CODE                            #
        #######################################################################
        return dists

    def compute_distances_one_loop(self, X):

        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in xrange(num_test):
            # Broadcasting subtracts the ith test row from every training row at once
            dists[i, :] = np.sqrt(np.sum(np.square(self.X_train - X[i, :]), axis=1))
        #######################################################################
        #                         END OF YOUR CODE                            #
        #######################################################################
        return dists

    def compute_distances_no_loops(self, X):

        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        # Expand (x1 - x2)^2 = x1^2 - 2*x1*x2 + x2^2 and compute each term fully vectorized
        dists = np.multiply(np.dot(X, self.X_train.T), -2)    # cross term, shape (num_test, num_train)
        distssqx = np.sum(np.square(X), axis=1)               # test squared norms, shape (num_test,)
        distssqxtr = np.sum(np.square(self.X_train), axis=1)  # train squared norms, shape (num_train,)
        dists = np.add(dists, distssqx.reshape(-1, 1))        # broadcast test norms down each row
        dists = np.add(dists, distssqxtr)                     # broadcast train norms across each column
        dists = np.sqrt(dists)                                # take the root to match the loop versions
        #########################################################################
        #                         END OF YOUR CODE                              #
        #########################################################################
        return dists

    def predict_labels(self, dists, k=1):
        """
        Given a matrix of distances between test points and training points,
        predict a label for each test point.

        Inputs:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          gives the distance betwen the ith test point and the jth training point.

        Returns:
        - y: A numpy array of shape (num_test,) containing predicted labels for the
          test data, where y[i] is the predicted label for the test point X[i].
        """
        num_test = dists.shape[0]
        y_pred = np.zeros(num_test)
        for i in xrange(num_test):
            # A list of length k storing the labels of the k nearest neighbors to
            # the ith test point.
            # argsort orders training indices by increasing distance; take the k
            # nearest labels and flatten them into a plain 1-D array
            closest_y = self.y_train[np.argsort(dists[i, :])[:k]].flatten()
            # Counter tallies the votes; most_common(1) returns the winning label
            c = Counter(closest_y)
            y_pred[i] = c.most_common(1)[0][0]
        #########################################################################
        #                           END OF YOUR CODE                            #
        #########################################################################
        return y_pred

Xtr, Ytr, Xte, Yte = load_CIFAR10('data/cifar10/')
# Flatten each 32x32x3 image into a single row of 3072 values
Xtr_rows = Xtr.reshape(Xtr.shape[0], 32 * 32 * 3)
Xte_rows = Xte.reshape(Xte.shape[0], 32 * 32 * 3)
# Hold out the first 1000 training images as the validation set
Xval_rows = Xtr_rows[:1000, :]
Yval = Ytr[:1000]
Xtr_rows = Xtr_rows[1000:, :]
Ytr = Ytr[1000:]
# Evaluate several choices of k on the validation set
validation_accuracies = []
for k in [1, 5, 20, 100]:
    nn = KNearestNeighbor()
    nn.train(Xtr_rows, Ytr)
    Yval_predict = nn.predict(Xval_rows, k=k)
    acc = np.mean(Yval_predict == Yval)
    print 'k = %d, accuracy: %f' % (k, acc)
    validation_accuracies.append((k, acc))
print validation_accuracies

Result

[screenshot: validation accuracy output of the script above]
