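1. What is H3?
H3 partitions the Earth's surface into identifiable cells: a latitude/longitude pair is encoded as the index of a cell in a hexagonal grid.
2. Why use H3?
2.1 Shortcomings of GEOHASH
- Cell shape is not the same at every precision level, and the step in precision from one level to the next is uneven (sometimes small, sometimes large).
- Cells at different latitudes can differ considerably in area.
- The 8 neighbors of a cell are not all the same distance from the center cell.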
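The last two points can be illustrated with a little arithmetic. The following is a rough sketch, not part of the original script: it assumes a spherical Earth with a round-number radius, treats a 6-character geohash cell as a small rectangle, and uses hypothetical helper names (`cell_size_m`, `neighbor_distances_m`).

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (assumed round value)

# A 6-character geohash has 30 bits: 15 for longitude, 15 for latitude.
DLNG_DEG = 360 / 2 ** 15
DLAT_DEG = 180 / 2 ** 15


def cell_size_m(lat_deg):
    """Approximate (height, width) in meters of a 6-character geohash cell."""
    height = math.radians(DLAT_DEG) * EARTH_RADIUS_M
    # A fixed span of longitude shrinks with the cosine of the latitude.
    width = math.radians(DLNG_DEG) * EARTH_RADIUS_M * math.cos(math.radians(lat_deg))
    return height, width


def neighbor_distances_m(lat_deg):
    """Center-to-center distances to edge-sharing and diagonal neighbors."""
    height, width = cell_size_m(lat_deg)
    return {
        "north_south": height,
        "east_west": width,
        "diagonal": math.hypot(height, width),
    }


for lat in (0.0, 51.5, 70.0):
    h, w = cell_size_m(lat)
    print(f"lat {lat:>4}: {h:6.0f} m x {w:6.0f} m, area {h * w / 1e6:.3f} km^2")
print(neighbor_distances_m(51.5))
```

The cell height stays constant while the width shrinks toward the poles, so the area changes with latitude, and the three kinds of neighbor sit at three different distances. In a hexagonal grid the six neighbors are (approximately) equidistant from the center.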
2.2 A brief look at H3's mapping principle
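Start from the interior angle sum of a regular polygon with $x$ sides, $\theta=(x-2)\times 180$ degrees, and the requirement that the angles meeting at a vertex add up to 360 degrees. Together these give $\frac{360}{y} = \frac{(x-2)\times 180}{x}$, where $y$ is the number of regular polygons meeting at a vertex. Solving for all integer combinations of $y$ and $x$ yields the regular polygons that can tile the plane: triangles, squares and hexagons.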
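The enumeration of $(x, y)$ combinations is easy to check by brute force. This is a minimal sketch; the function name `regular_tilings` and the search bound of 100 sides are illustrative choices, not from the original.

```python
def regular_tilings(max_sides=100):
    """Return (x, y) pairs where y regular x-gons fit exactly around a vertex."""
    solutions = []
    for x in range(3, max_sides + 1):
        # 360 / y = (x - 2) * 180 / x  rearranges to  y = 2x / (x - 2)
        if (2 * x) % (x - 2) == 0:
            solutions.append((x, 2 * x // (x - 2)))
    return solutions


print(regular_tilings())  # → [(3, 6), (4, 4), (6, 3)]
```

Since $y = 2 + \frac{4}{x-2}$, the value $x-2$ has to divide 4, which is why no polygon with more than six sides appears no matter how large the search bound is.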
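Of the regular polygons that can tile the plane, the hexagon has the most sides and is the closest to a circle, so in theory it is the best choice for some scenarios. Rather than relying on a traditional flat map projection, H3 tiles the globe itself with hexagons (the grid is built on an icosahedron, so each resolution also contains 12 pentagons) and uses a hierarchy of grids at multiple resolutions.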
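To get a feel for the multi-resolution hierarchy, the sketch below estimates how many cells each resolution has and how large they are on average. It is not taken from the original: the count formula `2 + 120 * 7**r` (122 base cells, each level subdividing by roughly 7) and the Earth surface area constant are assumptions to check against the H3 documentation, and the function names are hypothetical.

```python
EARTH_AREA_KM2 = 510_065_621.7  # approximate surface area of the Earth (assumed)


def h3_cell_count(resolution):
    """Total number of H3 cells at a resolution (0 to 15), pentagons included."""
    if not 0 <= resolution <= 15:
        raise ValueError("H3 resolutions run from 0 to 15")
    return 2 + 120 * 7 ** resolution


def average_cell_area_km2(resolution):
    """Mean cell area, obtained by dividing the Earth's area by the cell count."""
    return EARTH_AREA_KM2 / h3_cell_count(resolution)


for res in (0, 7, 9):
    print(res, h3_cell_count(res), round(average_cell_area_km2(res), 4))
```

Resolution 7, the level used in the script in section 4, works out to an average cell of roughly 5 km^2 under these assumptions.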
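3. What are H3's main applications?
- Optimizing ride pricing and dispatch (dynamic pricing)
- Visualizing and mining spatial map data
- Analyzing and optimizing an entire market
4. Uber H3 in practice: clustering UK traffic accident locations
The script can be run as-is in a notebook.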
import numpy as np
import pandas as pd
import folium
from h3 import h3  # h3-py 3.x API (geo_to_h3, h3_to_geo_boundary); 4.x renamed these functions
from sklearn.cluster import DBSCAN
from folium.plugins import HeatMap
def creat_map(clusters):
    """Draw one hexagon per H3 cell, shaded by its accident count."""
    map_fig = folium.Map(zoom_start=12)

    def color_choose(cnt):
        # Darker reds for cells with more accidents
        color_list = ['#FFC1C1', '#EEB4B4', '#FF6A6A', '#EE6363', '#CD5555', '#8B3A3A']
        if cnt <= 14:
            return color_list[0]
        elif cnt <= 17:
            return color_list[1]
        elif cnt <= 21:
            return color_list[2]
        elif cnt <= 25:
            return color_list[3]
        elif cnt <= 30:
            return color_list[4]
        else:
            return color_list[5]

    for info in clusters.values():
        points = info['geom']
        ac_cnt = info['count']
        tooltip = f'{ac_cnt} accidents'
        map_fig.add_child(
            folium.vector_layers.Polygon(
                locations=points,
                tooltip=tooltip,
                fill=True,
                fill_color=color_choose(ac_cnt),
                fill_opacity=0.4,
                weight=2,
                opacity=0.7
            ))
    # Fit the view to the data bounds (uses the module-level df built further down)
    max_lat = df.Latitude.max()
    min_lat = df.Latitude.min()
    max_lon = df.Longitude.max()
    min_lon = df.Longitude.min()
    map_fig.fit_bounds([[min_lat, min_lon], [max_lat, max_lon]])
    return map_fig
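file = './dftRoadSafety_Accidents_2016.csv'
# Read the ID columns as strings
column_types = {'Accident_Index': str, 'LSOA_of_Accident_Location': str}
uk_acc = pd.read_csv(file, dtype=column_types)
# Rows without coordinates cannot be indexed, so drop them first
uk_acc = uk_acc.dropna(subset=['Latitude', 'Longitude']).reset_index(drop=True)
# Convert latitude/longitude to H3 indexes
H3_LEVEL = 7  # H3 resolution

def lat_lng_2_h3(row):
    return h3.geo_to_h3(row['Latitude'], row['Longitude'], H3_LEVEL)

uk_acc['h3'] = uk_acc.apply(lat_lng_2_h3, axis=1)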
# DBSCAN clustering
## degrees -> radians: 1 * np.pi / 180
uk_acc['rad_lng'] = np.radians(uk_acc['Longitude'].values)
uk_acc['rad_lat'] = np.radians(uk_acc['Latitude'].values)
eps_in_meter = 50.0
EARTH_R = 6370996.8  # Earth radius in meters
# The haversine metric works in radians, so eps is given as meters / Earth radius
dbscan = DBSCAN(eps=eps_in_meter/EARTH_R, min_samples=10, metric='haversine')
uk_acc = uk_acc.loc[~uk_acc['rad_lat'].isna(), :].reset_index(drop=True)
uk_acc['cluster'] = dbscan.fit_predict(uk_acc[['rad_lat', 'rad_lng']])
# Keep only points assigned to a cluster (label -1 marks noise)
df = uk_acc[(uk_acc.cluster != -1)].reset_index(drop=True).copy()
uk_acc['cluster'].value_counts()
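The division of `eps_in_meter` by `EARTH_R` may look odd at first. The standalone sketch below shows the idea with the haversine formula written out by hand: distances come out in radians, and multiplying by the Earth radius turns them into meters. The two sample points are hypothetical, and `haversine_rad` is an illustrative helper, not the function scikit-learn uses internally.

```python
import math

EARTH_R = 6370996.8  # Earth radius in meters, same value as in the script


def haversine_rad(lat1, lng1, lat2, lng2):
    """Great-circle distance in radians between two points given in radians."""
    dlat = lat2 - lat1
    dlng = lng2 - lng1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlng / 2) ** 2)
    return 2 * math.asin(math.sqrt(a))


eps_rad = 50.0 / EARTH_R  # 50 m expressed as an angle

# Two hypothetical points, 0.0004 degrees of latitude apart
p1 = (math.radians(51.5000), math.radians(-0.1200))
p2 = (math.radians(51.5004), math.radians(-0.1200))

d_rad = haversine_rad(*p1, *p2)
d_m = d_rad * EARTH_R
print(f"{d_m:.1f} m apart, within eps: {d_rad <= eps_rad}")
```

The points are about 44.5 m apart, so with `eps_in_meter = 50.0` DBSCAN would treat them as neighbors.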
# Plot the aggregated data
clusters = dict()
for idx, row in df.iterrows():
    key = row['h3']
    if key in clusters:
        clusters[key]['count'] += 1
    else:
        clusters[key] = {'count': 1, 'geom': h3.h3_to_geo_boundary(h=key)}
# Keep only cells with at least 10 accidents
relevat_clusters = {
    k: v for (k, v) in clusters.items() if v['count'] >= 10
}
creat_map(relevat_clusters)
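The counting step amounts to a group-by on the H3 index followed by a threshold. One possible alternative to the row-by-row loop, shown here as a sketch on made-up data, uses `collections.Counter`. The index strings below are placeholders for illustration only (in the script they would come from `df['h3']`), and `MIN_ACCIDENTS` is a hypothetical name for the threshold of 10.

```python
from collections import Counter

# Placeholder H3 indexes, one entry per accident (illustrative values only)
accident_cells = (
    ["87195da49ffffff"] * 12
    + ["87195da4bffffff"] * 3
    + ["87194ad30ffffff"] * 10
)

MIN_ACCIDENTS = 10  # same threshold as in the script

# Count accidents per cell, then keep only the busy cells
counts = Counter(accident_cells)
relevant = {cell: n for cell, n in counts.items() if n >= MIN_ACCIDENTS}
print(relevant)
```

The cell boundary only needs to be looked up once per surviving cell, which keeps the number of boundary computations small.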
# Heat map
map_hooray = folium.Map(location=df.loc[0, ['Latitude', 'Longitude']].tolist(), zoom_start=14)
HeatMap(df[['Latitude', 'Longitude']].values.tolist()).add_to(map_hooray)
# Add one marker per clustered accident, labeled with its cluster id
for idx in range(df.shape[0]):
    folium.Marker(
        df.loc[idx, ['Latitude', 'Longitude']].tolist(),
        tooltip=str(df.loc[idx, 'cluster'])
    ).add_to(map_hooray)
map_hooray
References:
https://www.biaodianfu.com/uber-h3.html
- Parts of the script from the reference link have been modified.