Computing n-dimensional Euclidean distance in Python: calculating Euclidean distance with numpy

Latest recommended article published on 2022-12-03 20:01:29

weixin_39884100

Tags: computing n-dimensional Euclidean distance in Python

I am new to Python, so this question might look trivial. However, I did not find a case similar to mine. I have a matrix of coordinates for 20 nodes. I want to compute the Euclidean distance between all pairs of nodes in this set and store the results in a pairwise matrix. For example, with 20 nodes, the end result should be a (20, 20) matrix holding the Euclidean distance between each pair of nodes. I tried to use a for loop to go through each element of the coordinate set and compute the Euclidean distance as follows:

import math
import numpy

ncoord=numpy.matrix('3225 318;2387 989;1228 2335;57 1569;2288 8138;3514 2350;7936 314;9888 4683;6901 1834;7515 8231;709 3701;1321 8881;2290 2350;5687 5034;760 9868;2378 7521;9025 5385;4819 5943;2917 9418;3928 9770')
n=20
c=numpy.zeros((n,n))
for i in range(0,n):
    for j in range(i+1,n):
        c[i][j]=math.sqrt((ncoord[i][0]-ncoord[j][0])**2+(ncoord[i][1]-ncoord[j][1])**2)

However, I am getting the error "input must be a square array". I wonder if anybody knows what is happening here.

Thanks

Solution

There are much, much faster alternatives to using nested for loops for this. I'll show you two different approaches - the first will be a more general method that will introduce you to broadcasting and vectorization, and the second uses a more convenient scipy library function.

1. The general way, using broadcasting & vectorization

One of the first things I'd suggest doing is switching to using np.array rather than np.matrix. Arrays are preferred for a number of reasons, most importantly because they can have >2 dimensions, and they make element-wise multiplication much less awkward.

import numpy as np

ncoord = np.array(ncoord)
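To see why the switch also matters for the error in the question, here is a minimal sketch using only the first three nodes. The names as_matrix, as_array, row_diff and dist_01 are made up for illustration, and the exact wording of the exception depends on the NumPy version. On a matrix, ncoord[i][0] is still a 1x2 row rather than a number, and ** means matrix power, which only works for square matrices.

```python
import numpy as np

# Small stand-in for the 20-node coordinate set (first three nodes only).
as_matrix = np.matrix('3225 318;2387 989;1228 2335')
as_array = np.array(as_matrix)

# On a matrix, [i] returns a 1x2 matrix, and [0] on that returns the same row.
print(as_matrix[0][0].shape)
# On an array, [i][0] is a single number.
print(as_array[0][0])

# ** on a matrix means matrix power, which requires a square matrix,
# so squaring a 1x2 row difference raises an exception.
row_diff = as_matrix[0][0] - as_matrix[1][0]
try:
    row_diff ** 2
    matrix_power_failed = False
except (ValueError, np.linalg.LinAlgError) as exc:
    matrix_power_failed = True
    print("matrix power failed:", exc)

# On an array, ** is element-wise, so the distance formula works as intended.
dist_01 = np.sqrt(((as_array[0] - as_array[1]) ** 2).sum())
print(dist_01)
```

With arrays, the original nested loop would also run if the indexing were written as ncoord[i, 0], but the vectorized approach avoids the loop altogether.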

With an array, we can eliminate the nested for loops by inserting a new singleton dimension and broadcasting the subtraction over it:

# indexing with None (or np.newaxis) inserts a new dimension of size 1
print(ncoord[:, :, None].shape)
# (20, 2, 1)

# by making the 'inner' dimensions equal to 1, i.e. (20, 2, 1) - (1, 2, 20),
# the subtraction is 'broadcast' over every pair of rows in ncoord
xydiff = ncoord[:, :, None] - ncoord[:, :, None].T
print(xydiff.shape)
# (20, 2, 20)

This is equivalent to looping over every pair of rows using nested for loops, but much, much faster!
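The equivalence can be checked on a small case. The sketch below uses only the first four nodes from the question; pts and xydiff_loop are illustrative names, not part of the original answer. It builds the same differences with explicit loops and compares them with the broadcast result.

```python
import numpy as np

# First four nodes from the question, used as a small test case.
pts = np.array([[3225, 318], [2387, 989], [1228, 2335], [57, 1569]])
n = pts.shape[0]

# Broadcast version: (n, 2, 1) - (1, 2, n) -> (n, 2, n)
xydiff = pts[:, :, None] - pts[:, :, None].T

# Loop version: xydiff_loop[i, :, j] holds pts[i] - pts[j]
xydiff_loop = np.zeros((n, 2, n), dtype=pts.dtype)
for i in range(n):
    for j in range(n):
        xydiff_loop[i, :, j] = pts[i] - pts[j]

# Both approaches should produce identical arrays.
print(np.array_equal(xydiff, xydiff_loop))
```

Note the index order: xydiff[i, :, j] holds the (x, y) difference between node i and node j, so the coordinate axis sits in the middle.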