Notes on Fei-Fei Li's Computer Vision Course (CS231n), Chapter 2

2 Image Classification Methods

Assignment: https://cs231n.github.io/assignments2018/assignment1/
Tools: Python + NumPy
NumPy reference: 《Python之Numpy详细教程》 (a detailed NumPy tutorial, in Chinese)

2.1 The Data-Driven Approach

Image classification algorithms require large amounts of data: they are data-driven methods.
Dataset: CIFAR-10, with 50,000 training images and 10,000 test images, each of size 32×32×3.

The first classifier: Nearest Neighbor (NN)

At training time, simply memorize all of the data and labels. At prediction time, find the training image most similar to the query image; its label is the prediction.
**Similarity metric:** the L1 distance or the L2 distance.
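The distance formulas shown on the slide are the standard definitions from the CS231n notes, comparing two images $I_1, I_2$ pixel by pixel:

$$
d_1(I_1, I_2) = \sum_p \left| I_1^p - I_2^p \right|, \qquad
d_2(I_1, I_2) = \sqrt{\sum_p \left( I_1^p - I_2^p \right)^2}
$$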
The code is as follows:

import numpy as np

class NearestNeighbor:
    def __init__(self):
        pass

    def train(self, X, y):
        # Memorize all of the training data and labels.
        self.Xtr = X
        self.ytr = y

    def predict(self, X):
        num_test = X.shape[0]
        Ypred = np.zeros(num_test, dtype=self.ytr.dtype)
        for i in range(num_test):
            # L1 distance from the i-th test image to every training image.
            distances = np.sum(np.abs(self.Xtr - X[i, :]), axis=1)
            min_index = np.argmin(distances)  # index of the nearest training image
            Ypred[i] = self.ytr[min_index]    # predict its label
        return Ypred
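A minimal usage sketch (the names Xtr_rows, Ytr, Xte_rows, Yte are assumptions here, standing for the CIFAR-10 images flattened into rows of shape (N, 3072) and their labels):

nn = NearestNeighbor()
nn.train(Xtr_rows, Ytr)          # Xtr_rows: (50000, 3072), Ytr: (50000,)
Yte_pred = nn.predict(Xte_rows)  # Xte_rows: (10000, 3072)
accuracy = np.mean(Yte_pred == Yte)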


2.2 KNN

The NN algorithm is not robust to noise, which motivates KNN: take the K nearest neighbors of the query point and decide its label by majority vote. The larger K is, the smoother the decision boundaries become.
Hyperparameters: K and the choice of distance metric. These cannot be learned from the training data; they are typically tuned on a held-out validation set or by cross-validation, as sketched below.
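A minimal sketch of choosing K on a held-out validation set, using the KNearestNeighbor class from the assignment code in section 2.4 (the split size and the candidate values of K are illustrative assumptions):

import numpy as np

# Hold out the last 1000 training examples for validation.
Xval_rows, yval = Xtr_rows[-1000:], Ytr[-1000:]
Xtrain_rows, ytrain = Xtr_rows[:-1000], Ytr[:-1000]

best_k, best_acc = None, 0.0
for k in [1, 3, 5, 10, 20, 50, 100]:
    knn = KNearestNeighbor()
    knn.train(Xtrain_rows, ytrain)
    yval_pred = knn.predict(Xval_rows, k=k)
    acc = np.mean(yval_pred == yval)
    if acc > best_acc:
        best_k, best_acc = k, acc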
KNN summary: KNN is not suitable for image classification. Pixel-wise distances carry little information about perceptual similarity between images, and the method suffers from the curse of dimensionality (covering the space densely requires exponentially many training examples).

2.3 Linear Classifiers

Neural networks are often compared to Lego: components of different kinds can be snapped together to build large convolutional networks, and the linear classifier is the most basic of these components.
The score function is f(x, W) = Wx + b, where x is a long (flattened) vector and b is the bias term. The bias does not interact with the training data; it only encodes data-independent preferences for certain classes. For example, when the dataset contains more cats than dogs, the bias element corresponding to the cat class will be higher than the others.
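A minimal NumPy sketch of this score function on CIFAR-10 shapes (W and b here are random stand-ins rather than trained weights):

import numpy as np

x = np.random.randn(3072)      # one 32x32x3 image flattened into a vector
W = np.random.randn(10, 3072)  # one row of weights per class
b = np.random.randn(10)        # one data-independent bias per class

scores = W.dot(x) + b          # ten class scores; the highest score wins
predicted_class = np.argmax(scores)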
The linear classifier can be viewed as template matching, where each class is allowed to learn only one template (one row of W). If a class contains several visual variants, the classifier is forced to average over all of the variants and recognize every one of them with that single template. Neural networks and other more complex models do not have this restriction and can therefore reach higher accuracy.
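To see the single-template limitation concretely, each row of a trained W can be reshaped back into an image. A sketch, assuming a trained weight matrix W of shape (10, 3072) for CIFAR-10:

import numpy as np
import matplotlib.pyplot as plt

def show_templates(W):
    """Render each row of W as a 32x32x3 template image."""
    for i, row in enumerate(W):
        img = row.reshape(32, 32, 3)
        # Rescale the weights into [0, 255] for display.
        img = 255.0 * (img - img.min()) / (img.max() - img.min())
        plt.subplot(1, W.shape[0], i + 1)
        plt.imshow(img.astype('uint8'))
        plt.axis('off')
    plt.show()

In the lecture, these averaged templates show telltale artifacts, such as a horse template with a head on each side.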
Another viewpoint goes back to images as points in a high-dimensional space: the linear classifier tries to draw linear decision boundaries (hyperplanes) that separate each class from the rest.

Cases a linear classifier cannot solve:

  • Parity (odd/even) partition problems
  • Multi-class problems where the class regions are not linearly separable
  • Multimodal data, e.g., a single class that appears in several disjoint regions of the space

2.4 Assignment

KNN

import numpy as np


class KNearestNeighbor(object):
    """ a kNN classifier with L2 distance """

    def __init__(self):
        pass

    def train(self, X, y):
        """
        Train the classifier. For k-nearest neighbors this is just
        memorizing the training data.

        Inputs:
        - X: A numpy array of shape (num_train, D) containing the training data
          consisting of num_train samples each of dimension D.
        - y: A numpy array of shape (num_train,) containing the training labels,
             where y[i] is the label for X[i].
        """
        self.X_train = X
        self.y_train = y

    def predict(self, X, k=1, num_loops=0):
        """
        Predict labels for test data using this classifier.

        Inputs:
        - X: A numpy array of shape (num_test, D) containing test data consisting
             of num_test samples each of dimension D.
        - k: The number of nearest neighbors that vote for the predicted labels.
        - num_loops: Determines which implementation to use to compute distances
          between training points and testing points.

        Returns:
        - y: A numpy array of shape (num_test,) containing predicted labels for the
          test data, where y[i] is the predicted label for the test point X[i].
        """
        if num_loops == 0:
            dists = self.compute_distances_no_loops(X)
        elif num_loops == 1:
            dists = self.compute_distances_one_loop(X)
        elif num_loops == 2:
            dists = self.compute_distances_two_loops(X)
        else:
            raise ValueError('Invalid value %d for num_loops' % num_loops)

        return self.predict_labels(dists, k=k)

    def compute_distances_two_loops(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using a nested loop over both the training data and the
        test data.

        Inputs:
        - X: A numpy array of shape (num_test, D) containing test data.

        Returns:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          is the Euclidean distance between the ith test point and the jth training
          point.
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            for j in range(num_train):
                #####################################################################
                # TODO:                                                             #
                # Compute the l2 distance between the ith test point and the jth    #
                # training point, and store the result in dists[i, j]. You should   #
                # not use a loop over dimension.                                    #
                #####################################################################
                # Subtract, square, sum over all pixels, then take the square root.
                dis = X[i, :] - self.X_train[j, :]
                distance = np.sqrt(np.sum(dis ** 2))
                dists[i, j] = distance
                #####################################################################
                #                       END OF YOUR CODE                            #
                #####################################################################
        return dists

    def compute_distances_one_loop(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using a single loop over the test data.

        Input / Output: Same as compute_distances_two_loops
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            #######################################################################
            # TODO:                                                               #
            # Compute the l2 distance between the ith test point and all training #
            # points, and store the result in dists[i, :].                        #
            #######################################################################
            dists[i, :] = np.sqrt(np.sum(np.square(self.X_train - X[i, :]), axis=1))
            #######################################################################
            #                         END OF YOUR CODE                            #
            #######################################################################
        return dists

    def compute_distances_no_loops(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using no explicit loops.

        Input / Output: Same as compute_distances_two_loops
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        #########################################################################
        # TODO:                                                                 #
        # Compute the l2 distance between all test points and all training      #
        # points without using any explicit loops, and store the result in      #
        # dists.                                                                #
        #                                                                       #
        # You should implement this function using only basic array operations; #
        # in particular you should not use functions from scipy.                #
        #                                                                       #
        # HINT: Try to formulate the l2 distance using matrix multiplication    #
        #       and two broadcast sums.                                         #
        #########################################################################
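        # This vectorized version uses the expansion
        # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 * x . y,
        # evaluated for every test/train pair at once via broadcasting.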
        sq_train = np.sum(np.square(self.X_train), axis=1)
        sq_test = np.sum(np.square(X), axis=1)
        mul = np.multiply(np.matmul(X, self.X_train.T), -2)
        sq_train = np.reshape(sq_train, (1, sq_train.shape[0]))
        sq_test = np.reshape(sq_test, (sq_test.shape[0], 1))
        dists = mul + sq_train + sq_test
        dists = np.sqrt(dists)
        #########################################################################
        #                         END OF YOUR CODE                              #
        #########################################################################
        return dists

    def predict_labels(self, dists, k=1):
        """
        Given a matrix of distances between test points and training points,
        predict a label for each test point.

        Inputs:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          gives the distance between the ith test point and the jth training point.

        Returns:
        - y: A numpy array of shape (num_test,) containing predicted labels for the
          test data, where y[i] is the predicted label for the test point X[i].
        """
        num_test = dists.shape[0]
        y_pred = np.zeros(num_test)
        for i in range(num_test):
            # A list of length k storing the labels of the k nearest neighbors to
            # the ith test point.
            closest_y = []
            #########################################################################
            # TODO:                                                                 #
            # Use the distance matrix to find the k nearest neighbors of the ith    #
            # testing point, and use self.y_train to find the labels of these       #
            # neighbors. Store these labels in closest_y.                           #
            # Hint: Look up the function numpy.argsort.                             #
            #########################################################################
            # Sort the distances from the ith test point in ascending order.
            idx = np.argsort(dists[i, :], axis=0)
            #########################################################################
            # TODO:                                                                 #
            # Now that you have found the labels of the k nearest neighbors, you    #
            # need to find the most common label in the list closest_y of labels.   #
            # Store this label in y_pred[i]. Break ties by choosing the smaller     #
            # label.                                                                #
            #########################################################################
            # Take the labels of the k nearest training points.
            closest_y = self.y_train[idx[:k]]
            # Majority vote. On ties, np.argmax picks the first maximum,
            # which corresponds to the smaller label, as required.
            y_pred[i] = np.argmax(np.bincount(closest_y))
            #########################################################################
            #                           END OF YOUR CODE                            #
            #########################################################################

        return y_pred

The assignment also covers SVM, softmax, and a neural network; it requires the linear SVM loss, the softmax loss, and backpropagation through the neural network.

Two-layer neural network:

Forward propagation

Forward propagation means computing and storing the model's intermediate variables (including the output) in order from the input layer to the output layer. For simplicity, assume the input is a single sample with feature vector $x \in \mathbb{R}^d$ and ignore the bias terms; the intermediate variables are then as follows.

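The equations were screenshots in the original post and cannot be recovered verbatim; the following is the standard forward pass for a two-layer network without bias terms, with assumed notation $W^{(1)} \in \mathbb{R}^{h \times d}$, $W^{(2)} \in \mathbb{R}^{q \times h}$, activation $\phi$, and loss $\ell$:

$$
z = W^{(1)} x, \qquad h = \phi(z), \qquad o = W^{(2)} h, \qquad L = \ell(o, y)
$$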

Backward propagation

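Under the same assumed notation, backpropagation applies the chain rule from the loss back to the parameters, reusing the variables stored during the forward pass:

$$
\frac{\partial L}{\partial o} = \ell'(o, y), \quad
\frac{\partial L}{\partial W^{(2)}} = \frac{\partial L}{\partial o}\, h^{\top}, \quad
\frac{\partial L}{\partial h} = {W^{(2)}}^{\top} \frac{\partial L}{\partial o}, \quad
\frac{\partial L}{\partial z} = \frac{\partial L}{\partial h} \odot \phi'(z), \quad
\frac{\partial L}{\partial W^{(1)}} = \frac{\partial L}{\partial z}\, x^{\top}
$$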
Code download link: https://download.csdn.net/download/qq_35494379/12329653
