起因
最近在做一个车联网项目,有一个场景是车辆定时上报当前所在经纬度等位置信息上报给平台,平台通过web页面在高德地图上展示车辆行驶路径。
由于车辆上报规则是每隔4s上报一次,一个小时也就是900个点,一天也就是21600个点,如果是10辆车就是216000个点,那如果是100辆车,甚至是10000辆车对于数据库存储来说会是一个灾难,对于渲染地图,过多的点,也减少页面的流畅度。
考虑到车辆直线行驶的时候,只需要记录第一个点和最后一个点即可,期间的点位都是没有必要的,如果拐弯的话,只需要记录一两个拐弯的点即可,这样就可以减少点数,进而减少存储或者绘图压力。
有没有一种合适的算法解决此问题呢,遂查阅资料,了解到一个算法----道格拉斯-普克算法
介绍
道格拉斯-普克算法 (Douglas–Peucker algorithm,亦称为拉默-道格拉斯-普克算法、迭代适应点算法、分裂与合并算法)是将曲线近似表示为一系列点,并减少点的数量的一种算法。它的优点是具有平移和旋转不变性,给定曲线与阈值后,抽样结果一定。—摘自百度百科
如果有8个点,如上图(1),抽稀步骤如下:
在曲线首尾两点间虚连一条直线,求出其余各点到该直线的距离,如右图(1)。
选到点到直线距离的最大者与阈值相比较,若大于阈值,则记录该点,否则将直线两端点间各点全部舍去,如右图(2),记录第4个点,然后根据地4个点,将点分成两段1-4,4-8
然后分别对1-4,4-8重复第1、2步操作,迭代操作,即仍选距离最大者与阈值比较,依次取舍,直到无点可舍去,最后得到满足给定精度限差的曲线点坐标,如图(3)、(4)依次保留第6点、第7点,舍去其他点,即完成线的化简。
结合步骤,这里有两点数学知识,一是两点确定一条直线方程,二是求点到直线的距离。
点到直线的距离公式如下
PHP代码
/**
* 点到直线的距离
* @param $a float
* @param $b float
* @param $c float
* @param $xy string 点坐标例如 "2,2"
* @return number
*/
public function getDistanceFromPointToLine($a, $b, $c, $xy)
{
$x = explode(",", $xy)[0];
$y = explode(",", $xy)[1];
return abs(($a * $x + $b * $y + $c) / sqrt($a * $a + $b * $b));
}
/**
* 根据两个点求直线方程 ax+by+c=0
* @param $xy1 string 点1,例如"1,1"
* @param $xy2 string 点2,例如"2,2"
* @return array
*/
public function getLineByPoint($xy1, $xy2)
{
$x1 = explode(",", $xy1)[0];//第一个点的横坐标
$y1 = explode(",", $xy1)[1];//第一个点的纵坐标
$x2 = explode(",", $xy2)[0];//第二个点的横坐标
$y2 = explode(",", $xy2)[1];//第二个点的横坐标
$a = $y2 - $y1;
$b = $x1 - $x2;
$c = ($y1 - $y2) * $x1 - $y1 * ($x1 - $x2);
return [$a, $b, $c];
}
/**
* 稀疏点
* @param $points array 参数为["1,2","2,3"]点集
* @param $max float 阈值,越大稀疏效果越好但是细节越差
* @return array
*/
public function sparePoints($points, $max)
{
if (count($points) < 3) {
return $points;
}
$xy1 = $points[0];//取第一个点
$xy2 = end($points);//取最后一个点
list($a, $b, $c) = $this->getLineByPoint($xy1, $xy2);//获取直线方程的a,b,c值
$ret = [];//最后稀疏以后的点集
$dmax = 0;//记录点到直线的最大距离
$split = 0;//分割位置
for ($i = 1; $i < count($points) - 1; $i++) {
$d = $this->getDistanceFromPointToLine($a, $b, $c, $points[$i]);
if ($d > $dmax) {
$split = $i;
$dmax = $d;
}
}
if ($tmp) {//如果存在点到首位点连成直线的距离大于max的,即需要再次划分
$child1 = $this->sparePoints(array_slice($points, 0, $split + 1), $max);//按split分成左边一份,递归
$child2 = $this->sparePoints(array_slice($points, $split), $max);//按split分成右边一份,递归
//因为child1的最后一个点和child2的第一个点,肯定是同一个(即为分割点),合并的时候,需要注意一下
$ret = array_merge($ret, $child1, array_slice($child2, 1));
return $ret;
} else {//如果不存在点到直线的距离大于阈值的,那么就直接是首尾点了
return [$points[0], end($points)];
}
}
效果
随机找了一段路,根据高德地图的驾驶导航提示,选取了80个经纬度(因为这些点都是高德算出来的重要点位,经过了一定的稀疏),如下:
$points = [
"117.212448,39.133785",
"117.212669,39.133667",
"117.213165,39.133297",
"117.213203,39.13327",
"117.213554,39.133099",
"117.213669,39.13295",
"117.213921,39.132462",
"117.214088,39.132126",
"117.214142,39.131962",
"117.214188,39.13176",
"117.214233,39.131397",
"117.21418,39.13055",
"117.214279,39.130459",
"117.214539,39.130375",
"117.214874,39.130188",
"117.216881,39.128716",
"117.217598,39.127995",
"117.217972,39.12759",
"117.218338,39.127178",
"117.218407,39.127071",
"117.218567,39.126911",
"117.219704,39.125702",
"117.219795,39.12561",
"117.220284,39.125114",
"117.220619,39.124802",
"117.221046,39.124348",
"117.221138,39.124245",
"117.221268,39.124092",
"117.222321,39.122955",
"117.222824,39.122406",
"117.222916,39.122311",
"117.223663,39.121544",
"117.2239,39.121452",
"117.224113,39.12159",
"117.224251,39.121677",
"117.225136,39.122208",
"117.225281,39.122292",
"117.225319,39.122311",
"117.226273,39.122875",
"117.226685,39.123127",
"117.227371,39.12352",
"117.227806,39.123779",
"117.228477,39.124134",
"117.228531,39.124161",
"117.228531,39.124161",
"117.228668,39.124187",
"117.228897,39.124325",
"117.229767,39.12479",
"117.230927,39.12545",
"117.231186,39.12561",
"117.231659,39.125908",
"117.231834,39.126026",
"117.232018,39.126186",
"117.232185,39.126362",
"117.232353,39.126583",
"117.232658,39.126972",
"117.232658,39.126972",
"117.233124,39.12748",
"117.233253,39.127609",
"117.233368,39.127689",
"117.233513,39.127762",
"117.233665,39.127823",
"117.233734,39.127846",
"117.233833,39.127865",
"117.233994,39.127888",
"117.234138,39.127892",
"117.234329,39.127884",
"117.234612,39.127838",
"117.234955,39.127754",
"117.235252,39.12767",
"117.236282,39.12738",
"117.237137,39.127129",
"117.237671,39.126961",
"117.237953,39.126949",
"117.238213,39.126865",
"117.238472,39.126793",
"117.2397,39.126434",
"117.242233,39.125698",
"117.243538,39.12532",
"117.243645,39.125298",
];
$data = sparePoints($points, "0.00001");
运行结果
[
"117.212448,39.133785",
"117.212669,39.133667",
"117.213203,39.13327",
"117.213554,39.133099",
"117.213669,39.13295",
"117.214088,39.132126",
"117.214188,39.13176",
"117.214233,39.131397",
"117.21418,39.13055",
"117.214279,39.130459",
"117.214539,39.130375",
"117.214874,39.130188",
"117.216881,39.128716",
"117.217598,39.127995",
"117.218338,39.127178",
"117.218407,39.127071",
"117.219704,39.125702",
"117.220284,39.125114",
"117.220619,39.124802",
"117.222824,39.122406",
"117.223663,39.121544",
"117.2239,39.121452",
"117.225136,39.122208",
"117.227806,39.123779",
"117.228531,39.124161",
"117.228668,39.124187",
"117.228897,39.124325",
"117.229767,39.12479",
"117.230927,39.12545",
"117.231659,39.125908",
"117.231834,39.126026",
"117.232018,39.126186",
"117.232185,39.126362",
"117.232658,39.126972",
"117.233253,39.127609",
"117.233368,39.127689",
"117.233513,39.127762",
"117.233734,39.127846",
"117.233994,39.127888",
"117.234329,39.127884",
"117.234612,39.127838",
"117.234955,39.127754",
"117.236282,39.12738",
"117.237671,39.126961",
"117.237953,39.126949",
"117.243645,39.125298"
]
前后图对比,图一为稀疏前,图二为稀疏后的
由于测试给的阈值为0.0001,可以看到地图点位由80个直接减少到46个点,并且保留了较多地图拐点细节。
因为通过高德拿到的路径经纬度本身就是稀疏过的,这算是进行了二次稀疏,能有这个效果,已经很不错了,在实际中,因为点数会更多,调整合适的阈值,稀疏效果会非常理想。