Directed graph pathfinding algorithms
Traveling tourist
In the first part of the series, we constructed a knowledge graph of monuments located in Spain from the WikiData API. Now we’ll put on our graph data science goggles and explore various pathfinding algorithms available in the Neo4j Graph Data Science library. To top it off, we’ll look at a brute force solution for a Santa Claus problem. Now, you might wonder what a Santa Claus problem is. It is a variation of the traveling salesman problem, except we don’t require the solution to end in the same city where it started. This is because of Santa Claus’s ability to bend the space-time continuum and instantly fly back to the North Pole once he has finished delivering goodies.
Agenda
- Infer spatial network of monuments
- Load the in-memory projected graph with Cypher projection
- Weakly connected component algorithm
- Shortest path algorithm
- Yen’s k-shortest path algorithm
- Single source shortest paths algorithm
- Minimum spanning tree algorithm
- Random walk algorithm
- Traveling salesman problem
- Conclusion
Infer spatial network of monuments
Currently, we have no direct relationships between the monuments in our graph. We do, however, have their GPS locations, which allow us to identify which monuments are near each other. This way, we can infer a spatial network of monuments.
The process is very similar to inferring a similarity network. We usually don’t want to end up with a complete graph, where each node is connected to all the others. That would defeat the purpose of demonstrating pathfinding algorithms, because the shortest path between any two nodes would always be the direct relationship between them, that is, a straight line. In our case, we will connect each monument to the five closest monuments that are less than 100 kilometers away. These two numbers are entirely arbitrary. You can pick any others depending on your scenario.
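// Compare each pair of monuments once; the id filter means m1 only sees candidates with a smaller internal id
MATCH (m1:Monument),(m2:Monument)
WHERE id(m1) > id(m2)
// Distance between the two GPS points, in meters; keep pairs less than 100 km apart, closest first
WITH m1,m2, distance(m1.location_point,m2.location_point) as distance
ORDER BY distance ASC
WHERE distance < 100000
// Keep the five closest candidates for each monument
WITH m1,collect({node:m2,distance:distance})[..5] as nearest
UNWIND nearest as near
WITH m1, near, near.node as nearest_node
// Merge a NEAR relationship (matched in either direction) and store the distance on it
MERGE (m1)-[m:NEAR]-(nearest_node) SET m.distance = near.distance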
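To make the neighbor selection easier to reason about outside the database, here is a minimal Python sketch of the same idea: link each point to its k closest neighbors that lie within a distance cutoff. The function names, the haversine formula used for the distance, and the three sample coordinates are all assumptions made for illustration; they are not taken from the monument graph. Unlike the query above, the sketch considers every other point as a candidate, not only those with a smaller internal id.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))


def infer_near_edges(points, k=5, max_distance_m=100_000):
    """Link each point to its k closest neighbors within max_distance_m.

    Returns {frozenset({a, b}): distance_in_meters}, so every undirected
    link is stored only once.
    """
    edges = {}
    for name, loc in points.items():
        # All other points within the cutoff, closest first.
        candidates = sorted(
            (dist, other)
            for other, other_loc in points.items()
            if other != name
            and (dist := haversine_m(loc, other_loc)) < max_distance_m
        )
        for dist, other in candidates[:k]:
            edges[frozenset((name, other))] = dist
    return edges


# Hypothetical coordinates, chosen only to exercise the cutoff.
sample_points = {"A": (40.0, -3.0), "B": (40.1, -3.0), "C": (40.0, -1.0)}
for pair, dist in infer_near_edges(sample_points).items():
    print(sorted(pair), round(dist))
```

With the default 100 km cutoff, only A and B end up linked, because C is roughly 170 km away from both of them.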
Load the in-memory projected graph with Cypher projection
Let’s quickly recap how the GDS library works.
![Image for post](https://img-blog.csdnimg.cn/img_convert/e3c51f2d5dac9535f2c7e3f76e88c588.png)
The graph analytics pipeline consists of three steps. In the first step, the graph loader reads the stored graph from Neo4j and loads it as an in-memory projected graph. We can use either native projection or Cypher projection to load the projected graph. In the second step, we execute the graph algorithms in sequence. We can use the results of one graph algorithm as an input to another. In the third step, we store or stream the results back to Neo4j.
Here, we will use the Cypher projection to load the in-memory graph. I suggest you take a look at the official documentation for more details regarding how it works. In the node statement, we will describe all monuments in our graph and add their architectural style as a node label. Adding a custom node label will allow us to filter nodes by architectural style at algorithm execution time. In the relationship statement, we will describe all the links between monuments and include the distance property, which we will use as a relationship weight.
CALL gds.graph.create.cypher('monuments',
'MATCH (m:Monument)-[:ARCHITECTURE]->(a)
RETURN id(m) as id, collect(a.name) as labels',
'MATCH (m1:Monument)-[r:NEAR]-(m2:Monument)
RETURN id(m1) as source, id(m2) as target, r.distance as distance')
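As a rough mental model of what the two statements produce, the Python sketch below builds a small in-memory structure from rows shaped like the node query (`id`, `labels`) and the relationship query (`source`, `target`, `distance`), then restricts it to one label. The helper names, the sample ids, and the style names are hypothetical, and this is only an illustration of the data shape; it does not show how GDS stores the projected graph internally.

```python
from collections import defaultdict


def project(node_rows, rel_rows):
    """Build (labels, adjacency) from rows shaped like the two projection queries."""
    labels = {row["id"]: set(row["labels"]) for row in node_rows}
    adjacency = defaultdict(dict)
    for row in rel_rows:
        # The distance column plays the role of the relationship weight.
        adjacency[row["source"]][row["target"]] = row["distance"]
    return labels, adjacency


def filter_by_label(labels, adjacency, wanted):
    """Keep only nodes carrying `wanted` and the links between them."""
    keep = {node for node, node_labels in labels.items() if wanted in node_labels}
    return {
        node: {t: d for t, d in adjacency.get(node, {}).items() if t in keep}
        for node in keep
    }


# Hypothetical rows. An undirected MATCH returns each relationship once per
# direction, so every link appears twice in the relationship rows.
node_rows = [
    {"id": 0, "labels": ["Gothic"]},
    {"id": 1, "labels": ["Gothic"]},
    {"id": 2, "labels": ["Romanesque"]},
]
rel_rows = [
    {"source": 0, "target": 1, "distance": 500.0},
    {"source": 1, "target": 0, "distance": 500.0},
    {"source": 1, "target": 2, "distance": 900.0},
    {"source": 2, "target": 1, "distance": 900.0},
]

labels, adjacency = project(node_rows, rel_rows)
print(filter_by_label(labels, adjacency, "Gothic"))
```

Filtering on one label drops both the nodes of other styles and any links that touch them, which is the effect a label filter has when an algorithm runs on a subset of the projected graph.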
Weakly connected component algorithm
Even though the