【CS231n】Spring 2020 Assignments - Assignment1 - kNN


Preface

I did the assignments on Google Colaboratory, which is quite convenient.


1. Implement compute_distances_two_loops

This one is fairly simple. The only thing to watch is the shape of arrays like X_train; the code cell above already prints that they are two-dimensional arrays of shape (5000, 3072) and (500, 3072), as shown below.

(Figure: printed shapes of the training and test arrays)
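For context, the notebook flattens each 32x32x3 CIFAR-10 image into a 3072-dimensional row before any kNN code runs, which is where those shapes come from. Below is a minimal sketch of that reshaping, using made-up arrays of the same shapes (not the notebook's exact preprocessing cell):

import numpy as np

# Stand-ins for the subsampled CIFAR-10 arrays: 5000 training and 500 test
# images of 32x32 pixels with 3 color channels (random data, same shapes).
X_train = np.random.randint(0, 256, (5000, 32, 32, 3)).astype(np.float64)
X_test = np.random.randint(0, 256, (500, 32, 32, 3)).astype(np.float64)

# Flatten every image into a single row of 32 * 32 * 3 = 3072 values.
X_train = X_train.reshape(X_train.shape[0], -1)   # (5000, 3072)
X_test = X_test.reshape(X_test.shape[0], -1)      # (500, 3072)
print(X_train.shape, X_test.shape)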

It's simple enough, so here is the source:

def compute_distances_two_loops(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using a nested loop over both the training data and the
        test data.

        Inputs:
        - X: A numpy array of shape (num_test, D) containing test data.

        Returns:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          is the Euclidean distance between the ith test point and the jth training
          point.
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            for j in range(num_train):
                #####################################################################
                # TODO:                                                             #
                # Compute the l2 distance between the ith test point and the jth    #
                # training point, and store the result in dists[i, j]. You should   #
                # not use a loop over dimension, nor use np.linalg.norm().          #
                #####################################################################
                # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

                dists[i,j] = np.sqrt(np.sum(np.square(self.X_train[j] - X[i])))

                # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        return dists
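For reference, the notebook then builds the classifier and calls this method roughly like so (a sketch from memory; the surrounding cells may differ in detail):

from cs231n.classifiers import KNearestNeighbor

classifier = KNearestNeighbor()
classifier.train(X_train, y_train)   # kNN "training" just stores the data

dists = classifier.compute_distances_two_loops(X_test)
print(dists.shape)                   # expected: (500, 5000)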


2. Implement predict_labels

This one can be done just by following the given hints. numpy.argsort returns the indices that would sort an array in ascending order; for example, for the input [1, 4, 3] it returns [0, 2, 1]. Slice off the first k indices, then look up the corresponding labels in y_train. Finally, count the labels with np.bincount and take np.argmax to get the most common one; since np.argmax returns the first (and therefore smallest) index on a tie, this also breaks ties toward the smaller label, as the TODO asks.
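A quick standalone check of that argsort behavior (illustration only, not assignment code):

import numpy as np

row = np.array([1, 4, 3])
order = np.argsort(row)   # indices that would sort the row ascending
print(order)              # [0 2 1]
print(order[:2])          # indices of the 2 smallest entries: [0 2]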

Here is the source:

def predict_labels(self, dists, k=1):
        """
        Given a matrix of distances between test points and training points,
        predict a label for each test point.

        Inputs:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          gives the distance betwen the ith test point and the jth training point.

        Returns:
        - y: A numpy array of shape (num_test,) containing predicted labels for the
          test data, where y[i] is the predicted label for the test point X[i].
        """
        num_test = dists.shape[0]
        y_pred = np.zeros(num_test)
        for i in range(num_test):
            # A list of length k storing the labels of the k nearest neighbors to
            # the ith test point.
            closest_y = []
            #########################################################################
            # TODO:                                                                 #
            # Use the distance matrix to find the k nearest neighbors of the ith    #
            # testing point, and use self.y_train to find the labels of these       #
            # neighbors. Store these labels in closest_y.                           #
            # Hint: Look up the function numpy.argsort.                             #
            #########################################################################
            # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

            # Indices of the k training points closest to the ith test point.
            k_indices = np.argsort(dists[i])[:k]
            closest_y = self.y_train[k_indices]

            # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
            #########################################################################
            # TODO:                                                                 #
            # Now that you have found the labels of the k nearest neighbors, you    #
            # need to find the most common label in the list closest_y of labels.   #
            # Store this label in y_pred[i]. Break ties by choosing the smaller     #
            # label.                                                                #
            #########################################################################
            # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
            
            y_pred[i] = np.argmax(np.bincount(closest_y))  # ties go to the smaller label

            # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

        return y_pred
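A small standalone illustration of why np.bincount plus np.argmax also satisfies the tie-breaking requirement:

import numpy as np

closest_y = [2, 5, 5, 2, 7]        # labels 2 and 5 both appear twice
counts = np.bincount(closest_y)    # counts[label] = number of occurrences
print(counts)                      # [0 0 2 0 0 2 0 1]
print(np.argmax(counts))           # 2 -- the smaller of the tied labels wins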

3. Implement compute_distances_one_loop

This one partly uses vectorized computation to speed things up; what it really tests, I think, is how well you know Python and NumPy.
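The key mechanism is NumPy broadcasting: self.X_train - X[i] subtracts one (D,) test row from every row of the (num_train, D) training matrix at once. A tiny self-contained sketch with made-up shapes:

import numpy as np

train = np.random.randn(5, 3)   # pretend: 5 training images, 3 pixels each
x = np.random.randn(3)          # pretend: one test image

diff = train - x                                  # (5, 3): x broadcast over rows
row_dists = np.sqrt(np.sum(diff ** 2, axis=1))    # (5,) all distances at once

# Same result as an explicit loop over the training rows.
loop_dists = np.array([np.sqrt(np.sum((t - x) ** 2)) for t in train])
print(np.allclose(row_dists, loop_dists))         # True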

Here is the source:

def compute_distances_one_loop(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using a single loop over the test data.

        Input / Output: Same as compute_distances_two_loops
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            #######################################################################
            # TODO:                                                               #
            # Compute the l2 distance between the ith test point and all training #
            # points, and store the result in dists[i, :].                        #
            # Do not use np.linalg.norm().                                        #
            #######################################################################
            # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

            dists[i, :] = np.sqrt(np.sum(np.square(self.X_train - X[i]), axis=1))

            # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        return dists

4. Implement compute_distances_no_loops

This problem requires computing the entire distance matrix with vectorized operations only, without any explicit loops. That makes it more involved; what it mainly tests is your grasp of matrix operations from linear algebra.

First, look at the hint: we need one matrix multiplication and two broadcast sums. That may sound puzzling at first, so let's recall the definition of the L2 distance:

$$d_2(I_1, I_2) = \sqrt{\sum_{p}(I_1^p - I_2^p)^2}$$

For our purposes, $I$ is an image, and $I_1$, $I_2$ are a training image and a test image respectively. The assignment has already simplified things for us: each image is just a one-dimensional array. Suppose an image has $n$ pixels $P$, i.e. shape = (n,); then the formula above can be rewritten as:

$$d_2(I_1, I_2) = \sqrt{\sum_{n}(P_1 - P_2)^2}$$

Now go back to the hint: one multiplication and two sums. Suddenly it clicks, right? Expanding the square gives the following formula:

$$d_2(I_1, I_2) = \sqrt{\sum_{n}(P_1^2 - 2 P_1 P_2 + P_2^2)}$$

We use the single matrix multiplication for the cross term: multiply X_train by the transpose of X_test and then by 2, which yields a dists-shaped array containing only the $2 P_1 P_2$ terms.

The rest is straightforward: use np.square to square every entry of X_train and X_test, sum along the second dimension (axis=1) so that each image's squared pixel values collapse into a single number, and then add the three pieces together via broadcasting.

Note that in the dists array the rows correspond to X_test (the test set) and the columns to X_train (the training set), so a transpose is needed in a few places.
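Before dropping this into the assignment function, the expansion is easy to verify on small random matrices (a self-contained sketch, not notebook code):

import numpy as np

train = np.random.randn(6, 4)   # stand-in for X_train: 6 images, 4 pixels
test = np.random.randn(3, 4)    # stand-in for X_test:  3 images, 4 pixels

# ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2, built from one matrix multiply
# and two broadcast sums, then transposed so tests sit on the rows.
sq = (np.sum(train ** 2, axis=1).reshape(-1, 1)   # (6, 1)
      - 2 * train.dot(test.T)                     # (6, 3)
      + np.sum(test ** 2, axis=1))                # (3,) broadcasts across rows
vectorized = np.sqrt(sq).T                        # (3, 6)

# Brute-force reference with explicit loops.
reference = np.array([[np.sqrt(np.sum((te - tr) ** 2)) for tr in train]
                      for te in test])
print(np.allclose(vectorized, reference))         # True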

Here is the source (it's really just one line):

def compute_distances_no_loops(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using no explicit loops.

        Input / Output: Same as compute_distances_two_loops
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        #########################################################################
        # TODO:                                                                 #
        # Compute the l2 distance between all test points and all training      #
        # points without using any explicit loops, and store the result in      #
        # dists.                                                                #
        #                                                                       #
        # You should implement this function using only basic array operations; #
        # in particular you should not use functions from scipy,                #
        # nor use np.linalg.norm().                                             #
        #                                                                       #
        # HINT: Try to formulate the l2 distance using matrix multiplication    #
        #       and two broadcast sums.                                         #
        #########################################################################
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

        dists = np.sqrt((np.sum(np.square(self.X_train), axis=1).reshape(num_train, 1) - 2 * np.dot(self.X_train, X.T) + np.sum(np.square(X), axis=1))).T

        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        return dists

5. Cross-validation

Cross-validation is easy to grasp if you paid attention in lecture. The thing to note is that k_to_accuracies[k] holds the array of accuracies obtained for that k on the different validation folds.

num_folds = 5
k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]

X_train_folds = []
y_train_folds = []

X_train_folds = np.array_split(X_train, num_folds)
y_train_folds = np.array_split(y_train, num_folds)

k_to_accuracies = {}

for k in k_choices:
  k_accuracies = np.zeros(num_folds)  # accuracy on each fold for this k
  for f in range(num_folds):
    # train
    new_X_train = np.concatenate(([X_train_folds[i] for i in range(num_folds) if i != f]), axis=0)
    new_y_train = np.concatenate(([y_train_folds[i] for i in range(num_folds) if i != f]), axis=0)
    classifier.train(new_X_train, new_y_train)
    # pred
    new_X_test = X_train_folds[f]
    new_y_test = y_train_folds[f]
    dists = classifier.compute_distances_no_loops(new_X_test)
    y_test_pred = classifier.predict_labels(dists, k)
    num_correct = np.sum(y_test_pred == new_y_test)
    k_accuracies[f] = float(num_correct) / new_y_test.shape[0]  # accuracy on this validation fold
  k_to_accuracies[k] = k_accuracies
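With k_to_accuracies filled in, the notebook goes on to plot the accuracies and choose the best k; a minimal way to inspect the results (everything beyond k_to_accuracies is my own naming) is:

# Mean accuracy across the folds for each k, then the best-performing k.
for k in sorted(k_to_accuracies):
    print('k = %d, mean accuracy = %.4f' % (k, np.mean(k_to_accuracies[k])))

best_k = max(k_to_accuracies, key=lambda k: np.mean(k_to_accuracies[k]))
print('best k:', best_k)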


Summary

Honestly, these problems aren't that hard, but because I'm not very familiar with Python and NumPy, and had forgotten a few points of linear algebra, they still took me quite a long time.
