Speeding up a geolocation algorithm in Python

I have a set of 100k geolocations (lat/lon) and a hexagonal grid (4k polygons). My goal is to count the points that fall within each polygon.

My current algorithm uses two nested for loops over all geo points and all polygons, which becomes really slow as I increase the number of polygons. How would you speed up the algorithm? I have uploaded a minimal example that creates 100k random geo points and uses 561 cells in the grid.

I also noticed that reading the GeoJSON file (with 4k polygons) takes some time. Maybe I should export the polygons to a CSV?

Solution

You don't need to explicitly test each hexagon to see whether a given point is located inside it.

Let's assume, for the moment, that all of your points fall somewhere within the bounds of your hexagonal grid. Because your hexagons form a regular lattice, you only really need to know which of the hexagon centers is closest to each point.

This can be computed very efficiently using a scipy.spatial.cKDTree:

import json

import numpy as np
from scipy.spatial import cKDTree

with open('/tmp/grid.geojson', 'r') as f:
    data = json.load(f)

verts = []
centroids = []

for hexagon in data['features']:
    # each hexagon ring is stored as 7 xy vertices; we drop the last one since
    # it's equal to the first, leaving a (6, 2) array
    xy = np.array(hexagon['geometry']['coordinates'][0][:6])
    verts.append(xy)

    # compute the centroid by taking the average of the vertex coordinates
    centroids.append(xy.mean(0))

verts = np.array(verts)
centroids = np.array(centroids)

# construct a k-D tree from the centroid coordinates of the hexagons
tree = cKDTree(centroids)

# generate 10000 normally distributed xy coordinates
sigma = 0.5 * centroids.std(0, keepdims=True)
mu = centroids.mean(0, keepdims=True)
gen = np.random.RandomState(0)
xy = (gen.randn(10000, 2) * sigma) + mu

# query the k-D tree to find which hexagon centroid is nearest to each point
distance, idx = tree.query(xy, 1)

# count the number of points that are closest to each hexagon centroid
counts = np.bincount(idx, minlength=centroids.shape[0])

Plotting the output:

from matplotlib import pyplot as plt

fig, ax = plt.subplots(1, 1, subplot_kw={'aspect': 'equal'})

# raw points in blue, hexagon centroids sized and coloured by their counts
# (ax.hold() was removed in matplotlib 3.0; artists are kept by default)
ax.scatter(xy[:, 0], xy[:, 1], 10, c='b', alpha=0.25, edgecolors='none')
ax.scatter(centroids[:, 0], centroids[:, 1], marker='h', s=(counts + 5),
           c=counts, cmap='Reds')
ax.margins(0.01)

plt.show()
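
If you want to convince yourself that "nearest centroid" really is the same as "containing hexagon" on a regular lattice, here is a minimal self-contained sketch that doesn't need the GeoJSON file. Everything in it is an assumption made for illustration: the helpers `hex_centers` and `inside_hexagon` are hypothetical, the lattice is a synthetic pointy-top grid with circumradius 1, and the points are sampled well inside the grid so that every point is covered by some cell.

```python
import numpy as np
from scipy.spatial import cKDTree


def hex_centers(n_rows, n_cols, radius=1.0):
    """Centers of a synthetic pointy-top hexagonal lattice (hypothetical helper)."""
    rows, cols = np.mgrid[0:n_rows, 0:n_cols]
    # odd rows are shifted by half a cell width
    x = np.sqrt(3) * radius * (cols + 0.5 * (rows % 2))
    y = 1.5 * radius * rows
    return np.column_stack([x.ravel(), y.ravel()])


def inside_hexagon(points, centers, radius=1.0, tol=1e-9):
    """True where points[i] lies inside the pointy-top hexagon at centers[i]."""
    d = points - centers
    # unit normals of the three pairs of parallel edges
    normals = np.array([[1.0, 0.0],
                        [0.5, np.sqrt(3) / 2],
                        [-0.5, np.sqrt(3) / 2]])
    apothem = np.sqrt(3) / 2 * radius
    return np.all(np.abs(d @ normals.T) <= apothem + tol, axis=1)


centers = hex_centers(20, 20)
lattice_tree = cKDTree(centers)

# sample points well inside the lattice so each one is covered by a cell
rng = np.random.RandomState(0)
pts = rng.uniform(low=[5.0, 5.0], high=[25.0, 20.0], size=(5000, 2))

# nearest-centroid assignment, exactly as in the answer's code
_, nearest = lattice_tree.query(pts, 1)
cell_counts = np.bincount(nearest, minlength=len(centers))

# explicit point-in-hexagon test against the assigned cell only
all_inside = bool(inside_hexagon(pts, centers[nearest]).all())
```

On this synthetic grid `all_inside` should come out `True`, because the Voronoi cell of each center in a regular hexagonal lattice is the hexagon itself. Note that this only holds if the grid is regular in the coordinate system you query in.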

I can think of several different ways you could handle points that fall outside your grid depending on how much accuracy you need:

You could exclude points that fall outside the outer bounding rectangle of your hexagon vertices (i.e. x < xmin, x > xmax etc.). However, this will fail to exclude points that fall within the 'gaps' along the edges of your grid.

Another straightforward option would be to set a cut-off on distance according to the spacing of your hexagon centers, which is equivalent to using a circular approximation for your outer hexagons.

If accuracy is crucial then you could define a matplotlib.path.Path corresponding to the outer vertices of your hexagonal grid, then use its .contains_points() method to test whether your points are contained within it. Compared to the other two methods, this would probably be slower and more fiddly to code.
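
As a rough sketch of the distance cut-off option, `cKDTree.query` accepts a `distance_upper_bound` argument; points with no centroid within that distance come back with an infinite distance and an index equal to `tree.n`. The tiny `centers` and `pts` arrays below are made up for illustration, and using the hexagon circumradius (`spacing / sqrt(3)`) as the cut-off is my assumption. Pick whatever radius suits your accuracy needs.

```python
import numpy as np
from scipy.spatial import cKDTree

# made-up stand-ins for the `centroids` and `xy` arrays used earlier
spacing = 2.0  # distance between neighbouring hexagon centers (assumed known)
h = np.sqrt(3)
centers = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, h], [3.0, h]])
pts = np.array([[0.1, 0.1],     # close to center 0
                [2.2, -0.3],    # close to center 1
                [1.0, 1.5],     # close to center 2
                [10.0, 10.0],   # far outside the grid
                [-3.0, 0.0]])   # outside, to the left of the grid

tree = cKDTree(centers)

# assumed cut-off: the circumradius of a hexagon with this center spacing
cutoff = spacing / np.sqrt(3)

dist, idx = tree.query(pts, 1, distance_upper_bound=cutoff)

# points with no center within the cut-off get idx == tree.n and dist == inf
inside = idx < tree.n

# count only the points that were matched to a real hexagon
counts = np.bincount(idx[inside], minlength=tree.n)
```

Here the two far-away points are dropped and each of the first three centers gets one point. A smaller cut-off (for example `spacing / 2`, the in-circle radius) would be stricter, but it would also discard points in the corners of the outer hexagons.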
