Computing postage in Python with an expedited option, Python: inefficient spatial distance calculation (how to speed it up)

The way you are doing this is slow because you are using an O(n²) algorithm: every row is compared against every other row.
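To make the cost concrete, here is a minimal sketch of an all-pairs search. It assumes the slow code computed a haversine distance between every pair of rows; the names `haversine_km` and `brute_force_nearest` are hypothetical and not taken from the question.

```python
import math

EARTH_RADIUS_KM = 6371  # mean Earth radius in km

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error for antipodal points
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def brute_force_nearest(queries, candidates):
    """For each (lon, lat) query, scan every candidate.

    This performs len(queries) * len(candidates) distance evaluations.
    """
    result = []
    for qlon, qlat in queries:
        best_idx, best_dist = None, math.inf
        for i, (clon, clat) in enumerate(candidates):
            d = haversine_km(qlon, qlat, clon, clat)
            if d < best_dist:
                best_idx, best_dist = i, d
        result.append((best_idx, best_dist))
    return result

# Hypothetical toy data: one query, three candidates
nearest = brute_force_nearest([(0.0, 0.0)], [(10.0, 0.0), (1.0, 0.0), (0.0, -5.0)])
```

Vectorizing the inner loop makes each distance evaluation cheaper, but the number of evaluations still grows with the product of the two set sizes.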

Georgy's answer introduces vectorization, but it does not address this fundamental inefficiency.

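I suggest loading the data points into a kd-tree: this data structure provides a fast way of finding nearest neighbours in multiple dimensions. Constructing the tree takes O(n log n) time and each query takes O(log n), so running n queries takes O(n log n) time in total.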
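As a rough illustration of what that difference means, the arithmetic below compares the two growth rates for n = 10,000. It ignores constant factors and treats O(log n) per query as the typical case, so the numbers are order-of-magnitude estimates rather than measured timings.

```python
import math

n = 10_000  # hypothetical number of points in each set

all_pairs = n * n                      # distance evaluations for the O(n²) scan
tree_steps = round(n * math.log2(n))   # rough step count for n kd-tree queries

print(all_pairs)   # 100000000
print(tree_steps)  # 132877
```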

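If your data is localized to a geographic area that can be well approximated by a plane, project the data and then perform the lookup in two dimensions. Otherwise, if your data is globally dispersed, project into spherical cartesian coordinates and perform the lookup there.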
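For the localized case, a minimal sketch might look like the following. It assumes a simple equirectangular projection around a reference latitude, which is only reasonable for a small region away from the poles; a proper map projection suited to your area would be more accurate. The function name `project_equirectangular` and the sample coordinates are hypothetical.

```python
import numpy as np
from scipy.spatial import KDTree

EARTH_RADIUS_KM = 6371  # mean Earth radius in km

def project_equirectangular(lonlat, lat0_deg):
    """Approximate planar x/y in km for an (N, 2) array of lon/lat degrees.

    Assumption: all points lie in a small region near latitude lat0_deg.
    """
    lon = np.radians(lonlat[:, 0])
    lat = np.radians(lonlat[:, 1])
    x = EARTH_RADIUS_KM * lon * np.cos(np.radians(lat0_deg))
    y = EARTH_RADIUS_KM * lat
    return np.column_stack((x, y))

# Hypothetical data: a few points within about one degree of each other
candidates = np.array([[13.0, 52.0], [13.5, 52.2], [12.8, 51.9]])
queries = np.array([[13.45, 52.18], [12.85, 51.95]])

# Use one shared reference latitude so both sets land in the same plane
lat0 = candidates[:, 1].mean()
tree = KDTree(project_equirectangular(candidates, lat0))

# Approximate distance in km to, and index of, each query's nearest candidate
distance_km, indices = tree.query(project_equirectangular(queries, lat0))
```

Both point sets must be projected with the same reference latitude, otherwise their planar coordinates are not comparable.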

An example of how you might do this for globally dispersed data is as follows:

#!/usr/bin/env python3

import numpy as np
import scipy as sp
import scipy.spatial

Rearth = 6371  # Mean Earth radius in km

# Generate uniformly-distributed lon-lat points on a sphere
# See: http://mathworld.wolfram.com/SpherePointPicking.html
def GenerateUniformSpherical(num):
    # Generate random variates
    pts = np.random.uniform(low=0, high=1, size=(num,2))
    # Convert to sphere space
    pts[:,0] = 2*np.pi*pts[:,0]         # 0 to 2*pi radians (0-360 degrees)
    pts[:,1] = np.arccos(2*pts[:,1]-1)  # 0 to pi radians (0-180 degrees)
    # Convert to degrees
    pts = np.degrees(pts)
    # Shift ranges to lon-lat
    pts[:,0] -= 180
    pts[:,1] -= 90
    return pts

# Convert an (N,2) array of lon-lat degrees into an (N,3) array of XYZ coordinates in km
def ConvertToXYZ(lonlat):
    theta = np.radians(lonlat[:,0])+np.pi
    phi = np.radians(lonlat[:,1])+np.pi/2
    x = Rearth*np.cos(theta)*np.sin(phi)
    y = Rearth*np.sin(theta)*np.sin(phi)
    z = Rearth*np.cos(phi)
    return np.transpose(np.vstack((x,y,z)))

# For each entry in qpts, find the nearest point in the kdtree
def GetNearestNeighbours(qpts,kdtree):
    pts3d = ConvertToXYZ(qpts)
    # See: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.spatial.KDTree.query.html#scipy.spatial.KDTree.query
    # p=2 implies Euclidean distance, eps=0 implies no approximation (slower)
    # The returned distances are straight-line (chord) distances in km, not great-circle distances
    return kdtree.query(pts3d,p=2,eps=0)

# Generate uniformly-distributed test points on a sphere. Note that you'll want
# to find a way to extract your pandas columns into an array of width=2, height=N
# to match this format.
df1 = GenerateUniformSpherical(10000)
df2 = GenerateUniformSpherical(10000)

# Convert df2 into XYZ coordinates. WARNING! Do not alter df2_3d or kdtree will
# malfunction!
df2_3d = ConvertToXYZ(df2)

# Build a kd-tree from df2_3d
kdtree = sp.spatial.KDTree(df2_3d, leafsize=10)  # Stick points in kd-tree for fast look-up

# Return the distance to, and index of, each of df1's nearest neighbour points
distance, indices = GetNearestNeighbours(df1,kdtree)
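Because the lookup runs on 3D Cartesian coordinates, the distances it returns are straight-line chords through the sphere rather than distances along the surface. If surface distances are needed, a chord of length c on a sphere of radius R corresponds to an arc of length 2R·arcsin(c / 2R). The helper below is a small sketch of that conversion; the name `chord_to_arc_km` is hypothetical.

```python
import numpy as np

REARTH_KM = 6371  # mean Earth radius in km, matching Rearth in the example

def chord_to_arc_km(chord_km, radius_km=REARTH_KM):
    """Convert straight-line chord lengths into great-circle arc lengths (km)."""
    ratio = np.asarray(chord_km, dtype=float) / (2 * radius_km)
    # Clip to [0, 1] so rounding error cannot push arcsin out of its domain
    return 2 * radius_km * np.arcsin(np.clip(ratio, 0.0, 1.0))

# Sample chords: zero, a quarter of the way round, and the full diameter
arc_km = chord_to_arc_km(np.array([0.0, REARTH_KM * np.sqrt(2), 2 * REARTH_KM]))
```

For nearby points the chord and the arc are almost identical, so the conversion only matters when nearest neighbours can be far apart. For the pandas step mentioned in the code comments, something like `df[["lon", "lat"]].to_numpy()` (with hypothetical column names, longitude first) should produce an array in the expected N-by-2 layout.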
