EECS 498-007/598-005 Assignment 1-2: K-Nearest Neighbors (k-NN)

我不会离散数学

Article link: https://blog.csdn.net/qq_55288004/article/details/120424982

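Note: I did this on Colab. It is a course assignment from the University of Michigan, Fall 2019; the assignment statement is at the link below.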
https://web.eecs.umich.edu/~justincj/teaching/eecs498/FA2019/assignment1.html

I'm posting the code here first; over the next few days I'll add my own understanding, takeaways, and some explanations.
https://github.com/Icy-liver/EECS-498-007-598-005-Assignment-1-2-K-Nearest-Neighbors-k-NN-

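I'm skipping the preliminary setup here, such as loading the data; if you need it, look in the links above. A reminder: you have to sign in to Colab, and you'll need a VPN for that.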

Let's get to the problems!

Compute distances: Naive implementation

Now that we have examined and prepared our data, it is time to implement the kNN classifier. We can break the process down into two steps:

Compute the (squared Euclidean) distances between all training examples and all test examples
Given these distances, for each test example find its k nearest neighbors and have them vote for the label to output
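
The second step is not implemented in this part of the post, so here is only a rough sketch of what the voting could look like. The function name `predict_labels_sketch`, the tiny `dists` matrix, and the labels are all made up for illustration, and breaking ties toward the smallest label is one possible choice, not something the assignment text above specifies.

```python
import torch

def predict_labels_sketch(dists, y_train, k=1):
    """Hypothetical helper: majority vote among the k nearest training points.

    dists:   (num_train, num_test) tensor of distances
    y_train: (num_train,) tensor of non-negative integer labels
    """
    num_test = dists.shape[1]
    y_pred = torch.zeros(num_test, dtype=torch.int64)
    for j in range(num_test):
        # Indices of the k smallest distances in column j
        _, idx = torch.topk(dists[:, j], k, largest=False)
        votes = y_train[idx]
        # bincount counts each label; argmax returns the most frequent one
        # (the first maximum, i.e. the smallest label, when there is a tie)
        y_pred[j] = torch.argmax(torch.bincount(votes))
    return y_pred

# Made-up example: 3 training points, 2 test points
dists = torch.tensor([[0.1, 5.0],
                      [0.2, 4.0],
                      [3.0, 0.5]])
y_train = torch.tensor([0, 0, 1])

pred_k1 = predict_labels_sketch(dists, y_train, k=1)
pred_k3 = predict_labels_sketch(dists, y_train, k=3)
```

With k=1 each test point simply copies the label of its closest training point; with k=3 all three training points vote.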

Let's begin with computing the distance matrix between all training and test examples. First we will implement a naive version of the distance computation, using explicit loops over the training and test sets:

NOTE: When implementing distance functions in this notebook, you may not use the torch.norm function (or its instance method variant x.norm); you may not use any functions from torch.nn or torch.nn.functional.

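As the problem statement says, we need to compute the Euclidean distance between every pair of images, and we will do it in three different ways. The Euclidean distance between two points x and y in n-dimensional space is:
$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
The squared distance that the assignment asks for is the same sum without the square root.
A quick reminder: when computing the Euclidean distance between images, each image is treated as an n-dimensional vector. (At first I couldn't work out the dimensions of the images and wasted a lot of time on detours.)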
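
To make the "image as an n-dimensional vector" point concrete, here is a minimal sketch for a single pair of images. The two tiny 3x2x2 tensors are made up for illustration; the real images in the assignment are larger.

```python
import torch

# Two hypothetical "images" with C=3, H=2, W=2 (illustration only)
img_a = torch.zeros(3, 2, 2)
img_b = torch.ones(3, 2, 2)

# Flatten each image into a vector with n = C*H*W = 12 components
vec_a = img_a.reshape(-1)
vec_b = img_b.reshape(-1)

# Squared Euclidean distance: sum of squared component differences
sq_dist = torch.sum((vec_a - vec_b) ** 2)
# Euclidean distance: square root of the sum above
dist = torch.sqrt(sq_dist)
```

Every one of the 12 components differs by 1, so the squared distance is 12 and the distance is its square root.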

Using two loops: compute_distances_two_loops

def compute_distances_two_loops(x_train, x_test):
  """
  Computes the squared Euclidean distance between each element of the training
  set and each element of the test set. Images should be flattened and treated
  as vectors.

  This implementation uses a naive set of nested loops over the training and
  test data.
  
  Inputs:
  - x_train: Torch tensor of shape (num_train, C, H, W)
  - x_test: Torch tensor of shape (num_test, C, H, W)

  Returns:
  - dists: Torch tensor of shape (num_train, num_test) where dists[i, j] is the
    squared Euclidean distance between the ith training point and the jth test
    point.
  """
  # Initialize dists to be a tensor of shape (num_train, num_test) with the
  # same datatype and device as x_train
  num_train = x_train.shape[0]
  num_test = x_test.shape[0]
  dists = x_train.new_zeros(num_train, num_test)
  ##############################################################################
  # TODO: Implement this function using a pair of nested loops over the        #
  # training data and the test data.                                           #
  #                                                                            #
  # You may not use torch.norm (or its instance method variant), nor any       #
  # functions from torch.nn or torch.nn.functional.                            #
  ##############################################################################
  # Replace "pass" statement with your code
  
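  for i in range(num_train):
    for j in range(num_test):
      # Squared distance, as the docstring specifies: no square root.
      # The sum runs over every element, so no explicit flattening is needed.
      dists[i, j] = torch.sum((x_train[i] - x_test[j]) ** 2)
  ##############################################################################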
  #                             END OF YOUR CODE                               #
  ##############################################################################
  return dists

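Nothing much to explain here: compute the entries of dists one at a time, following the formula.
The image plotted afterwards looks roughly like this: [image: visualization of the distance matrix]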

Using one loop: compute_distances_one_loop

def compute_distances_one_loop(x_train, x_test):
  """
  Computes the squared Euclidean distance between each element of the training
  set and each element of the test set. Images should be flattened and treated
  as vectors.

  This implementation uses only a single loop over the training data.

  Inputs:
  - x_train: Torch tensor of shape (num_train, C, H, W)
  - x_test: Torch tensor of shape (num_test, C, H, W)

  Returns:
  - dists: Torch tensor of shape (num_train, num_test) where dists[i, j] is the
    squared Euclidean distance between the ith training point and the jth test
    point.
  """
  # Initialize dists to be a tensor of shape (num_train, num_test) with the
  # same datatype and device as x_train
  num_train = x_train.shape[0]
  num_test = x_test.shape[0]
  dists = x_train.new_zeros(num_train, num_test)
  ##############################################################################
  # TODO: Implement this function using only a single loop over x_train.       #
  #                                                                            #
  # You may not use torch.norm (or its instance method variant), nor any       #
  # functions from torch.nn or torch.nn.functional.                            #
  ##############################################################################
  # Replace "pass" statement with your code

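  # Flatten each image into a row vector: (N, C, H, W) -> (N, C*H*W)
  new_x_train = x_train.reshape(num_train, -1)
  new_x_test = x_test.reshape(num_test, -1)
  for i in range(num_train):
    # Broadcasting subtracts training image i from every test image;
    # summing over dim=1 fills row i with squared distances (no sqrt).
    dists[i] = torch.sum((new_x_test - new_x_train[i]) ** 2, dim=1)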
  ##############################################################################
  #                             END OF YOUR CODE                               #
  ##############################################################################
  return dists

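The first step here is to flatten the images into vectors. Each image becomes one vector with 3 × 32 × 32 = 3072 elements, that is, a single row of 3072 values.
After that, dists is filled in one row at a time.
new_x_test - new_x_train[i] subtracts the image train[i] from every test image.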
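
The subtraction works because of broadcasting. Here is a small sketch with made-up numbers (2 training vectors and 3 test vectors of 4 features each, instead of real images) showing what one iteration of the loop computes:

```python
import torch

# Hypothetical flattened data: 2 training vectors, 3 test vectors, 4 features
new_x_train = torch.tensor([[0., 0., 0., 0.],
                            [1., 1., 1., 1.]])
new_x_test = torch.tensor([[1., 0., 0., 0.],
                           [1., 1., 0., 0.],
                           [2., 2., 2., 2.]])

# Shapes (3, 4) and (4,): the training vector is subtracted from every test row
diff = new_x_test - new_x_train[0]
# Summing the squares along dim=1 gives one squared distance per test vector
row = torch.sum(diff ** 2, dim=1)
```

Here `row` holds the squared distances from training vector 0 to the three test vectors (1, 2, and 16), which is exactly one row of dists.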

No loops: compute_distances_no_loops

def compute_distances_no_loops(x_train, x_test):
  """
  Computes the squared Euclidean distance between each element of the training
  set and each element of the test set. Images should be flattened and treated
  as vectors.

  This implementation should not use any Python loops. For memory-efficiency,
  it also should not create any large intermediate tensors; in particular you
  should not create any intermediate tensors with O(num_train*num_test)
  elements.

  Inputs:
  - x_train: Torch tensor of shape (num_train, C, H, W)
  - x_test: Torch tensor of shape (num_test, C, H, W)

  Returns:
  - dists: Torch tensor of shape (num_train, num_test) where dists[i, j] is the
    squared Euclidean distance between the ith training point and the jth test
    point.
  """
  # Initialize dists to be a tensor of shape (num_train, num_test) with the
  # same datatype and device as x_train
  num_train = x_train.shape[0]
  num_test = x_test.shape[0]
  dists = x_train.new_zeros(num_train, num_test)
  ##############################################################################
  # TODO: Implement this function without using any explicit loops and without #
  # creating any intermediate tensors with O(num_train * num_test) elements.   #
  #                                                                            #
  # You may not use torch.norm (or its instance method variant), nor any       #
  # functions from torch.nn or torch.nn.functional.                            #
  #                                                                            #
  # HINT: Try to formulate the Euclidean distance using two broadcast sums     #
  #       and a matrix multiply.                                               #
  ##############################################################################
  # Replace "pass" statement with your code
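  new_x_train = x_train.reshape(num_train, -1)
  new_x_test = x_test.reshape(num_test, -1)

  # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b, computed for all pairs at once
  s = torch.mm(new_x_train, new_x_test.t())        # (num_train, num_test)
  sq1 = torch.sum(new_x_train ** 2, dim=1)         # (num_train,)
  sq2 = torch.sum(new_x_test ** 2, dim=1)          # (num_test,)
  # Broadcast the column and row of squared norms; no sqrt (squared distance)
  dists = sq1.reshape(-1, 1) + sq2.reshape(1, -1) - 2 * s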
  ##############################################################################
  #                             END OF YOUR CODE                               #
  ##############################################################################
  return dists
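
The hint about two broadcast sums and a matrix multiply comes from expanding the square: ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a·b. As a sanity check, the sketch below compares that expanded form with a plain double loop on small random tensors. The shapes and the seed are arbitrary choices for illustration, and float64 is used only so that the comparison is not dominated by round-off.

```python
import torch

torch.manual_seed(0)
# Hypothetical small inputs shaped like (N, C, H, W)
x_train = torch.randn(5, 3, 4, 4, dtype=torch.float64)
x_test = torch.randn(3, 3, 4, 4, dtype=torch.float64)

a = x_train.reshape(5, -1)  # (5, 48)
b = x_test.reshape(3, -1)   # (3, 48)

# Reference: explicit double loop over all pairs
ref = torch.zeros(5, 3, dtype=torch.float64)
for i in range(5):
    for j in range(3):
        ref[i, j] = torch.sum((a[i] - b[j]) ** 2)

# Expanded form: two broadcast sums and one matrix multiply
sq_a = torch.sum(a ** 2, dim=1).reshape(-1, 1)  # (5, 1)
sq_b = torch.sum(b ** 2, dim=1).reshape(1, -1)  # (1, 3)
fast = sq_a + sq_b - 2 * torch.mm(a, b.t())     # (5, 3)

print(torch.allclose(ref, fast))
```

One thing to keep in mind: with floating-point round-off the expanded form can come out very slightly negative for nearly identical vectors, so if a square root is ever applied to it, clamping at zero first (for example with `clamp(min=0)`) avoids NaN values.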