Reading Notes on "Map-Matching for Low-Sampling-Rate GPS Trajectories"

Overview

Map-matching is the process of aligning a sequence of observed user positions with the road network on a digital map. (The definition of map-matching.)

ST-Matching considers (1) the spatial geometric and topological structures of the road network and (2) the temporal/speed constraints of the trajectories. (The basis of the paper's algorithm.)

Typically a GPS trajectory consists of a sequence of points with latitude, longitude, and timestamp information. (In engineering practice, more factors are taken into account, as long as they improve the algorithm's accuracy.)

In practice there exists a large amount of low-sampling-rate (e.g., one point every 2 minutes) GPS trajectories. They are either application-logged data collected from ad-hoc location-based queries, or generated in scenarios where savings in energy cost and communication cost are desired. (The paper focuses on the low-sampling-rate scenario, which is also why it has to bring in shortest paths. In a high-sampling-rate scenario, consecutive points lie either on the same link or on adjacent links.)

In this paper we refer to low-sampling-rate as one point every 2 minutes or above. With such a sampling rate, the distance between two points may exceed 1300 m even if a vehicle's speed is only 40 km/h!

Shortest Path

The basic algorithm in shortest path computation is Dijkstra's algorithm. In practice, the A* algorithm [10] is often used as a more efficient alternative. The A* algorithm uses a heuristic function to guide the search toward the destination. Other strategies such as bidirectional search [12], search decomposition [13], and hierarchical search [14] are often used in real applications as well. (Introduces the related algorithms.)
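To make the role of the heuristic concrete, here is a minimal A* sketch. It is not the paper's implementation: the `graph` and `coords` dictionary layouts are hypothetical, and the straight-line heuristic assumes planar coordinates with every edge at least as long as the straight-line gap it spans.

```python
import heapq
import math


def a_star(graph, coords, source, target):
    """Return (cost, path) for a shortest path from source to target.

    graph:  {node: [(neighbor, edge_length), ...]}  (hypothetical layout)
    coords: {node: (x, y)} planar coordinates used by the heuristic
    """
    def heuristic(node):
        # Straight-line distance to the target: a lower bound on road distance.
        (x1, y1), (x2, y2) = coords[node], coords[target]
        return math.hypot(x2 - x1, y2 - y1)

    best = {source: 0.0}        # best known cost from source
    parent = {source: None}     # back-pointers for path recovery
    frontier = [(heuristic(source), source)]
    closed = set()
    while frontier:
        _, node = heapq.heappop(frontier)
        if node in closed:
            continue
        closed.add(node)
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return best[target], path[::-1]
        for neighbor, length in graph.get(node, []):
            cost = best[node] + length
            if cost < best.get(neighbor, math.inf):
                best[neighbor] = cost
                parent[neighbor] = node
                heapq.heappush(frontier, (cost + heuristic(neighbor), neighbor))
    return math.inf, []         # target unreachable
```

With a heuristic that always returns 0 the same loop degenerates to Dijkstra's algorithm, which is one way to see why A* is described as the more efficient alternative.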

Some pre-processing steps can be added to speed up the shortest path search. For example, the ALT algorithms in [6] combine landmarks and the triangle inequality to reach a tighter lower bound than the Euclidean distance used in the A* algorithm. Reach-based pruning [9] is another method to compute a lower bound for pruning purposes. (Preprocessing algorithms.)

Definitions

This is the paper with the clearest definition of its problem setting that I have come across.

Definition 1 (GPS Log): A GPS log is a collection of GPS points $L = \{p_1, p_2, \ldots, p_n\}$. Each GPS point $p_i \in L$ contains latitude $p_i.lat$, longitude $p_i.lng$ and timestamp $p_i.t$, as illustrated in the left part of Figure 3. (Defines the set of points.)

Definition 2 (GPS Trajectory): A GPS trajectory $T$ is a sequence of GPS points with the time interval between any consecutive GPS points not exceeding a certain threshold $\Delta T$, i.e. $T: p_1 \to p_2 \to \cdots \to p_n$, where $p_i \in L$ and $0 < p_{i+1}.t - p_i.t < \Delta T$ $(1 \le i < n)$. Figure 3 shows an example of a GPS trajectory. $\Delta T$ is the sampling interval. In this paper, we focus on low-sampling-rate GPS trajectories with $\Delta T \ge 2$ min. (Defines a trajectory.)

Definition 3 (Road Segment): A road segment $e$ is a directed edge that is associated with an id $e.eid$, a typical travel speed $e.v$, a length value $e.l$, a starting point $e.start$, an ending point $e.end$ and a list of intermediate points that describes the road using a polyline. Figure 4 shows several real road segments in Bing Map Search [2]. Note that a road may contain several road segments. (Defines a road segment.)

Definition 4 (Road Network): A road network is a directed graph $G(V, E)$, where $V$ is a set of vertices representing the intersections and terminal points of the road segments, and $E$ is a set of edges representing road segments. (Defines the road-network abstraction.)

Definition 5 (Path): Given two vertices $V_i$, $V_j$ in a road network $G$, a path $P$ is a set of connected road segments that start at $V_i$ and end at $V_j$, i.e. $P: e_1 \to e_2 \to \cdots \to e_n$, where $e_1.start = V_i$, $e_n.end = V_j$, $e_k.end = e_{k+1}.start$, $1 \le k < n$. (Defines a path.)

Given a raw GPS trajectory $T$ and a road network $G(V, E)$, find the path $P$ from $G$ that matches $T$ with its real path. (The problem statement.)
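The definitions above map almost directly onto data types. The following is a minimal sketch, not code from the paper: the class and function names are hypothetical, timestamps are assumed to be in seconds, and vertices are assumed to be identified by integer ids.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class GPSPoint:
    lat: float
    lng: float
    t: float  # timestamp, assumed to be in seconds


@dataclass
class RoadSegment:
    eid: int
    v: float      # typical travel speed
    l: float      # length
    start: int    # id of the starting vertex (assumed representation)
    end: int      # id of the ending vertex
    polyline: List[Tuple[float, float]] = field(default_factory=list)


def split_into_trajectories(log: List[GPSPoint], delta_t: float) -> List[List[GPSPoint]]:
    """Cut a GPS log into trajectories per Definition 2: consecutive points
    stay in the same trajectory only while 0 < gap < delta_t."""
    trajectories: List[List[GPSPoint]] = []
    for point in sorted(log, key=lambda p: p.t):
        if trajectories and 0 < point.t - trajectories[-1][-1].t < delta_t:
            trajectories[-1].append(point)
        else:
            trajectories.append([point])
    return trajectories


def is_path(segments: List[RoadSegment]) -> bool:
    """Definition 5: every segment must end where the next one starts."""
    return all(a.end == b.start for a, b in zip(segments, segments[1:]))
```

For example, a log with timestamps 0, 120, 240, 900, 1020 and `delta_t = 180` splits into one trajectory of three points and one of two.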

Algorithm Workflow

ST-Matching is composed of three major components: Candidate Preparation, Spatial and Temporal Analysis, and Result Matching.

Candidate Preparation

Given trajectory $T = p_1 \to p_2 \to \cdots \to p_n$, we first retrieve a set of candidate road segments within radius $r$ of each point $p_i$, $1 \le i \le n$. (Candidate set selection.)

Definition 6 (Line Segment Projection): The line segment projection of a point $p$ to a road segment $e$ is the point $c$ on $e$ such that $c = \arg\min_{\forall c_i \in e} dist(c_i, p)$, where $dist(c_i, p)$ returns the distance between $p$ and any point $c_i$ on $e$. (Defines the projection point.)
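Definition 6 can be sketched as a closest-point search along the polyline of a road segment. This is an illustration under simplifying assumptions: coordinates are planar (x, y) in meters rather than latitude/longitude, the polyline has at least two vertices, and the function names are hypothetical.

```python
import math


def project_to_segment(p, a, b):
    """Closest point to p on the straight piece a-b (planar coordinates)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return a  # degenerate piece: both ends coincide
    # Parameter of the orthogonal projection, clamped so that the result
    # stays on the piece instead of on its infinite extension.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def project_to_polyline(p, polyline):
    """arg min over all points of the polyline of dist(c, p)."""
    candidates = (project_to_segment(p, a, b)
                  for a, b in zip(polyline, polyline[1:]))
    return min(candidates, key=lambda c: math.dist(c, p))
```

The clamping step matters: without it a point beyond the end of a road segment would be projected onto a location that is not on the road at all.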

Spatial Analysis

Definition 7 (Observation Probability): The observation probability is defined as the likelihood that a GPS sampling point $p_i$ matches a candidate point $c_i^j$, computed based on the distance between the two points $dist(c_i^j, p_i)$. (Defines the observation probability, which follows a normal distribution.)
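A minimal sketch of an observation probability that follows a normal distribution of the point-to-candidate distance. The zero mean and the 20 m standard deviation are assumptions here, borrowed from the GPS noise model quoted in the Synthetic Trajectory Data section of these notes; the function name is hypothetical.

```python
import math


def observation_probability(distance_m, sigma=20.0, mu=0.0):
    """Normal density evaluated at the distance between a GPS point and
    one of its candidate points (sigma and mu are assumed values)."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-((distance_m - mu) ** 2) / (2.0 * sigma ** 2))
```

Under these assumptions a candidate 5 m from the GPS point scores higher than one 40 m away, so nearby road segments are preferred before any topology is considered.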

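The transmission probability part is debatable: when the shortest path between candidates is shorter than the straight-line distance between the original points, the definition no longer looks reasonable.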

Final probability value
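A minimal sketch of how the two spatial terms might combine. Neither formula is written out in these notes, so both forms are assumptions: the transmission probability is taken to be the ratio of the straight-line distance between two consecutive GPS points to the shortest-path length between their candidate points, and the final spatial value is taken to be the product of the observation probability and that ratio. The function names are hypothetical.

```python
def transmission_probability(euclid_m, shortest_path_m):
    """Assumed form: straight-line distance between consecutive GPS points
    divided by the shortest-path length between their candidates."""
    if shortest_path_m <= 0.0:
        raise ValueError("shortest-path length must be positive")
    return euclid_m / shortest_path_m


def spatial_score(observation_prob, euclid_m, shortest_path_m):
    """Assumed form: observation probability times transmission probability."""
    return observation_prob * transmission_probability(euclid_m, shortest_path_m)
```

Under this assumed form, `transmission_probability(1000, 1250)` is 0.8, while `transmission_probability(1000, 800)` is 1.25. A value above 1 is hard to read as a probability, which is the questionable case raised above.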

Temporal Analysis

Speed matching; this part is the distinctive contribution of the paper.

Definition of the average speed

 

Definition of the speed similarity measure

 

In short, the closer the travel speed is to the typical speed of the road segment, the higher the likelihood.
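A minimal sketch of the two temporal quantities. The formulas are not written out in these notes, so the forms below are assumptions: the average speed is the shortest-path length divided by the time gap between two consecutive points, and the similarity is the cosine similarity between the vector of typical segment speeds along that path and a constant vector holding the average speed. The function names are hypothetical.

```python
import math


def average_speed(segment_lengths, dt):
    """Path length between two candidates divided by the elapsed time."""
    return sum(segment_lengths) / dt


def temporal_score(segment_speeds, avg_speed):
    """Cosine similarity between (e_1.v, ..., e_k.v) and the constant
    vector (avg_speed, ..., avg_speed) of the same length (assumed form)."""
    dot = sum(v * avg_speed for v in segment_speeds)
    norm_a = math.sqrt(sum(v * v for v in segment_speeds))
    norm_b = math.sqrt(len(segment_speeds) * avg_speed ** 2)
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # no usable speed information
    return dot / (norm_a * norm_b)
```

One property of this assumed form is worth keeping in mind: because the second vector is constant, the cosine value depends only on how uniform the typical speeds along the path are. For a positive average speed, scaling it does not change the result, so an implementation that wants to penalize the gap between travel speed and typical speed directly would need a different similarity measure.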

Result Matching

The solution procedure is the same as the Viterbi algorithm.
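A minimal Viterbi-style dynamic program over the candidate sets, to show what the result-matching step amounts to. This is a sketch under assumptions: the function and parameter names are hypothetical, every GPS point is assumed to have at least one candidate, and scores are accumulated by summation along the sequence.

```python
def best_candidate_sequence(candidates, initial_score, edge_score):
    """Pick one candidate per GPS point so the total score is maximal.

    candidates:    list (one entry per GPS point) of lists of candidate ids
    initial_score: function(c) -> score of a candidate of the first point
    edge_score:    function(i, prev, cur) -> score of moving from candidate
                   prev of point i-1 to candidate cur of point i
    """
    score = [{c: initial_score(c) for c in candidates[0]}]
    back = [{}]
    for i in range(1, len(candidates)):
        score.append({})
        back.append({})
        for cur in candidates[i]:
            # Best predecessor for cur among the previous point's candidates.
            prev = max(candidates[i - 1],
                       key=lambda p: score[i - 1][p] + edge_score(i, p, cur))
            score[i][cur] = score[i - 1][prev] + edge_score(i, prev, cur)
            back[i][cur] = prev
    # Trace the back-pointers from the best final candidate.
    last = max(score[-1], key=score[-1].get)
    path = [last]
    for i in range(len(candidates) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]
```

Replacing the sums with products (or with sums of log-probabilities) gives the textbook HMM variant. Either way the point is that the choice is global: a candidate that looks best for one point in isolation can lose to one that connects better to its neighbors.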

Synthetic Trajectory Data

It first randomly selects two vertices in the road network and computes the top $K$ shortest paths between them. Then it randomly selects a trajectory from the $K$ paths as the ground truth, denoted as $G: e_1, e_2, \ldots, e_n$. The motivation behind this is that moving objects generally follow the direction from source to destination, but do not necessarily follow the shortest path strictly. Note that the time interval between any two neighboring points is not uniform. To retrieve a trajectory with the desired sampling interval, the simulator selects one road segment from every $k'$ segments on $G$. The adjustment of the sampling rate is therefore achieved by changing the value of $k'$. The simulator generates one GPS point with estimated timestamp information for each selected road segment. The points are produced to follow the zero-mean normal distribution with a standard deviation of 20 meters. (Machine-simulated test data.)
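The sampling part of the simulator can be sketched as follows. This is not the paper's simulator. The dictionary layout of a segment (`mid`, `l`, `v`), the planar coordinates in meters, taking the first segment of every group of $k'$, applying the noise independently per axis, and estimating the timestamp at the segment midpoint are all assumptions made for illustration.

```python
import random


def simulate_trajectory(path, k_prime, sigma=20.0, rng=None):
    """Generate noisy (x, y, t) points from a ground-truth path.

    path: list of dicts with 'mid' (x, y) in meters, 'l' length in meters
          and 'v' typical speed in m/s  (hypothetical layout)
    """
    rng = rng or random.Random()
    points = []
    elapsed = 0.0
    for index, seg in enumerate(path):
        travel = seg["l"] / seg["v"]      # time needed to cross the segment
        if index % k_prime == 0:          # one segment out of every k'
            x, y = seg["mid"]
            points.append((x + rng.gauss(0.0, sigma),
                           y + rng.gauss(0.0, sigma),
                           elapsed + travel / 2.0))
        elapsed += travel
    return points
```

Because segment lengths and speeds differ, the gaps between generated timestamps are not uniform, which matches the remark in the text. A larger `k_prime` yields fewer points and hence a lower sampling rate.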

Evaluation Criteria:

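Summary

This paper mainly discusses a map-matching strategy for the low-sampling-rate setting. Although it does not say so explicitly, the algorithm is in effect still an HMM model. In both the high- and low-sampling-rate settings, the transmission probability compares the Euclidean distance between the observed points with the shortest-path length: the closer the two are, the higher the probability. A distinctive feature of this paper is that it takes the free-flow travel speed of the roads into account and provides a formula for it. In addition, the paper defines its problem very clearly, and every metric comes with an explicit formula. Finally, the paper provides a method for simulating data, which avoids the huge workload of manual labeling.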
