Computing postage in Python with an expedited option, Python – Inefficient spatial distance calculation (how to speed it up)

World VIII

Published 2021-03-27 16:18:21

Tags: computing postage in Python with an expedited option

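The way you're doing this is slow because you're using an O(n²) algorithm: every row is compared against every other row.

Georgy's answer, while it introduces vectorization, does not address this fundamental inefficiency.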

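I'd recommend loading your data points into a kd-tree: this data structure provides a fast way of finding nearest neighbours in multiple dimensions. Building such a tree takes O(n log n) time and each query takes O(log n) on average, so running one query per point keeps the total time in O(n log n).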
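To make the complexity argument concrete, here is a minimal sketch comparing a brute-force all-pairs search with a kd-tree lookup. The random test data, the array sizes, and the choice of `cKDTree` (SciPy's compiled kd-tree class) are assumptions made for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
data = rng.random((500, 3))     # points to search in (illustrative random data)
queries = rng.random((200, 3))  # points whose nearest neighbour we want

# Brute force: compute every query-to-data distance, then take the minimum.
# Time and memory both grow with len(queries) * len(data).
brute_idx = cdist(queries, data).argmin(axis=1)

# kd-tree: build once, then look up each query point.
tree = cKDTree(data)
tree_dist, tree_idx = tree.query(queries, k=1)

# Both approaches should pick the same neighbour for every query point.
print(np.array_equal(brute_idx, tree_idx))
```

On inputs this small the two approaches take similar time; the gap widens as the number of points grows.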

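If your data is localized to a geographic region that can be well approximated by a plane, project your data and then perform the lookup in two dimensions. Otherwise, if your data is globally dispersed, convert it to 3D Cartesian coordinates on the sphere and perform the lookup there.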
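For the localized case, a rough sketch of the planar approach might look like the following. The equirectangular projection is only one possible choice and is an assumption here, as are the helper name `project_equirectangular` and the sample coordinates; a proper map projection for your region would likely be more accurate.

```python
import numpy as np
from scipy.spatial import cKDTree

R_EARTH_KM = 6371.0  # mean Earth radius in km

def project_equirectangular(lonlat, lat0_deg):
    """Map (lon, lat) in degrees to approximate planar (x, y) in km.

    Hypothetical helper: reasonable only for small regions near lat0_deg.
    """
    lon = np.radians(lonlat[:, 0])
    lat = np.radians(lonlat[:, 1])
    x = R_EARTH_KM * lon * np.cos(np.radians(lat0_deg))
    y = R_EARTH_KM * lat
    return np.column_stack((x, y))

# Illustrative (lon, lat) points within one small region
stations = np.array([[13.40, 52.52], [13.06, 52.40], [14.55, 52.35]])
queries = np.array([[13.10, 52.41], [14.50, 52.30]])

# Use the same reference latitude for both sets so they share one plane
lat0 = stations[:, 1].mean()
tree = cKDTree(project_equirectangular(stations, lat0))
dist_km, idx = tree.query(project_equirectangular(queries, lat0))
```

Here `idx` holds, for each query point, the row of `stations` that is closest in the projected plane, and `dist_km` is the approximate planar distance.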

An example of how you might do this for globally dispersed data is as follows:

#!/usr/bin/env python3

import numpy as np
import scipy as sp
import scipy.spatial

Rearth = 6371  # Mean Earth radius in km

#Generate uniformly-distributed lon-lat points on a sphere
#See: http://mathworld.wolfram.com/SpherePointPicking.html
def GenerateUniformSpherical(num):
    #Generate random variates
    pts = np.random.uniform(low=0, high=1, size=(num,2))
    #Convert to sphere space (values are in radians at this point)
    pts[:,0] = 2*np.pi*pts[:,0]          #0 to 2*pi (0-360 degrees)
    pts[:,1] = np.arccos(2*pts[:,1]-1)   #0 to pi (0-180 degrees)
    #Convert to degrees
    pts = np.degrees(pts)
    #Shift ranges to lon-lat
    pts[:,0] -= 180
    pts[:,1] -= 90
    return pts

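#Convert an (N,2) array of lon-lat degrees into an (N,3) array of XYZ
#coordinates (in km) on a sphere of radius Rearth
def ConvertToXYZ(lonlat):
    theta = np.radians(lonlat[:,0])+np.pi
    phi = np.radians(lonlat[:,1])+np.pi/2
    x = Rearth*np.cos(theta)*np.sin(phi)
    y = Rearth*np.sin(theta)*np.sin(phi)
    z = Rearth*np.cos(phi)
    return np.transpose(np.vstack((x,y,z)))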

#For each entry in qpts, find the nearest point in the kdtree
def GetNearestNeighbours(qpts,kdtree):
    pts3d = ConvertToXYZ(qpts)
    #See: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.spatial.KDTree.query.html#scipy.spatial.KDTree.query
    #p=2 implies Euclidean distance, eps=0 implies no approximation (slower)
    #The returned distances are straight-line 3D distances in km
    return kdtree.query(pts3d,p=2,eps=0)

#Generate uniformly-distributed test points on a sphere. Note that you'll want
#to find a way to extract your pandas columns into an array of width=2, height=N
#to match this format.
df1 = GenerateUniformSpherical(10000)
df2 = GenerateUniformSpherical(10000)

#Convert df2 into XYZ coordinates. WARNING! Do not alter df2_3d or kdtree will
#malfunction!
df2_3d = ConvertToXYZ(df2)

#Build a kd-tree from df2_3d
kdtree = sp.spatial.KDTree(df2_3d, leafsize=10) #Stick points in kd-tree for fast look-up

#Return the distance to, and index of, each of df1's nearest neighbour points
distance, indices = GetNearestNeighbours(df1,kdtree)
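
Because the lookup above runs in 3D Cartesian space, the distances it returns are straight-line chords through the sphere rather than distances along the surface. The nearest neighbour is the same either way, since arc length grows monotonically with chord length. If surface distances are needed, a conversion along these lines could be applied afterwards. The function name `chord_to_great_circle` and the sample values are assumptions for illustration, not part of the original answer.

```python
import numpy as np

R_EARTH_KM = 6371.0  # mean Earth radius in km, matching Rearth above

def chord_to_great_circle(chord_km, radius_km=R_EARTH_KM):
    """Convert straight-line (chord) distances to arc lengths on a sphere."""
    # Clip guards against rounding that pushes the ratio slightly above 1
    ratio = np.clip(np.asarray(chord_km) / (2.0 * radius_km), 0.0, 1.0)
    return 2.0 * radius_km * np.arcsin(ratio)

# Illustrative chord lengths: zero, a short hop, and the full diameter
chords = np.array([0.0, 100.0, 2.0 * R_EARTH_KM])
arcs = chord_to_great_circle(chords)
```

For short distances the chord and the arc are nearly identical, so the correction mostly matters for far-apart points. As for the input format, with a pandas DataFrame something like `df[["lon", "lat"]].to_numpy()` (column names hypothetical) would yield the N-by-2 array that the functions above expect.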
