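From two loops to no loops: vectorized pixel-matrix processing
Source of the problem: the kNN part of cs231n assignment1, which computes the Euclidean distance between every pair of test-data and train-data images.
Two loops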
def compute_distances_two_loops(self, X):
    """
    Compute the distance between each test point in X and each training point
    in self.X_train using a nested loop over both the training data and the
    test data.

    Inputs:
    - X: A numpy array of shape (num_test, D) containing test data.

    Returns:
    - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
      is the Euclidean distance between the ith test point and the jth training
      point.
    """
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        for j in range(num_train):
            #####################################################################
            # TODO:
            # Compute the l2 distance between the ith test point and the jth
            # training point, and store the result in dists[i, j]. You should
            # not use a loop over dimension.
            #####################################################################
            dists[i, j] = np.sqrt(np.sum(np.square(X[i, :] - self.X_train[j, :])))
    return dists
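One loop
for i in range(num_test):
    # Broadcasting subtracts X[i, :] from every row of self.X_train at once,
    # so the inner loop over j is no longer needed.
    dists[i, :] = np.sqrt(np.sum(np.square(self.X_train - X[i, :]), axis=1))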
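The one-loop line relies on broadcasting a single row of shape (D,) against a matrix of shape (num_train, D). A minimal sketch of that step follows; the toy arrays `X_train` and `x` are made up for illustration and are not the assignment's data.

```python
import numpy as np

# Toy data, made up for illustration: 4 training points with 3 features each.
X_train = np.array([[0., 0., 0.],
                    [1., 0., 0.],
                    [0., 2., 0.],
                    [1., 2., 2.]])
x = np.array([0., 0., 0.])  # one test point, shape (3,)

# Broadcasting: (4, 3) - (3,) -> (4, 3); x is subtracted from every row.
diff = X_train - x

# Summing squares along axis=1 collapses the feature axis, leaving one
# distance per training point: shape (4,).
row = np.sqrt(np.sum(np.square(diff), axis=1))

print(diff.shape, row.shape)  # (4, 3) (4,)
print(row)                    # [0. 1. 2. 3.]
```

The resulting vector has exactly the shape of `dists[i, :]`, which is why it can be assigned to a whole row in one statement.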
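No loops
# Cross term -2 * x . y for every (test, train) pair, shape (num_test, num_train)
temp_2xy = np.dot(X, self.X_train.T) * (-2)
# Squared norm of each test point as a column, shape (num_test, 1)
temp_x2 = np.sum(np.square(X), axis=1, keepdims=True)
# Squared norm of each training point, shape (num_train,)
temp_y2 = np.sum(np.square(self.X_train), axis=1)
# Broadcasting: (num_test, 1) + (num_test, num_train) + (num_train,)
dists = temp_x2 + temp_2xy + temp_y2
dists = np.sqrt(dists)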
How the no-loop version works (numpy matrix operations and broadcasting)
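The no-loop code can be read as the expansion of a squared difference. As a sketch, write $x_i$ for row $i$ of X and $y_j$ for row $j$ of self.X_train (these symbols are introduced here for illustration only):

```latex
\begin{aligned}
d_{ij}^2 &= \lVert x_i - y_j \rVert^2
          = \sum_{k=1}^{D} (x_{ik} - y_{jk})^2 \\
         &= \underbrace{\sum_{k=1}^{D} x_{ik}^2}_{\texttt{temp\_x2}[i]}
          \;\underbrace{-\,2 \sum_{k=1}^{D} x_{ik}\, y_{jk}}_{\texttt{temp\_2xy}[i,j]}
          \;+\; \underbrace{\sum_{k=1}^{D} y_{jk}^2}_{\texttt{temp\_y2}[j]}
\end{aligned}
```

The first term depends only on $i$, so it is kept as a column of shape (num_test, 1). The last term depends only on $j$, so it is kept as a vector of shape (num_train,). The middle term is the matrix product of X with the transpose of self.X_train, scaled by $-2$. Broadcasting then adds the three pieces into a (num_test, num_train) matrix.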
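import numpy as np
# a.shape = (num, D) = (6, 3): a holds 6 images of 3 pixels each and stands for train_data
# b.shape = (num, D) = (6, 3): b holds 6 images of 3 pixels each and stands for test_data
a = np.array([[1,2,3],[2,3,4],[3,4,5],[4,5,6],[5,6,7],[6,7,8]])
b = np.arange(18).reshape((6,3))
# Dot product of every row of a with every row of b, shape (6, 6)
ab = np.dot(a, b.T)
# Squared norm of each row of a as a column, shape (6, 1)
aa = np.sum(np.square(a), axis=1).reshape((6,1))
# Squared norm of each row of b, shape (6,). It must NOT be a (6, 1) column:
# with keepdims=True it would be added along rows, like aa, and give wrong distances.
bb = np.sum(np.square(b), axis=1)
# Broadcasting: (6, 1) - (6, 6) + (6,) -> (6, 6); dists[i, j] is the distance between a[i] and b[j]
dists = np.sqrt(aa - 2*ab + bb)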
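One way to gain confidence in the vectorized formula is to compare it with an explicit double loop on the same small arrays. The sketch below does that. The helper name `pairwise_l2` is hypothetical, and the `np.maximum(..., 0)` clamp is an addition not present in the original code: with floating-point data, rounding can leave tiny negative values where the true squared distance is zero.

```python
import numpy as np

def pairwise_l2(A, B):
    """Hypothetical helper: L2 distances between rows of A (m, D) and rows of B (n, D)."""
    aa = np.sum(np.square(A), axis=1, keepdims=True)  # (m, 1) column
    bb = np.sum(np.square(B), axis=1)                 # (n,) vector, broadcast along columns
    sq = aa - 2 * np.dot(A, B.T) + bb                 # (m, n) squared distances
    # Assumption: clamp at 0 so rounding error never produces sqrt of a negative number.
    return np.sqrt(np.maximum(sq, 0))

a = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7], [6, 7, 8]])
b = np.arange(18).reshape((6, 3))

# Reference result computed with an explicit double loop.
ref = np.zeros((6, 6))
for i in range(6):
    for j in range(6):
        ref[i, j] = np.sqrt(np.sum(np.square(a[i] - b[j])))

dists = pairwise_l2(a, b)
print(np.allclose(dists, ref))  # True
```

Since a[2] and b[1] are both [3, 4, 5], the entry dists[2, 1] is 0, which gives a quick sanity check on the row and column orientation.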