A Short Practical Programmer's Guide to Graph Theory

weixin_26737625

Original article: https://towardsdatascience.com/a-short-practical-programmers-guide-to-graph-theory-bfc87bc52581

Graphs are very useful structures to work with in programming, since computer science problems can very often be represented as a graph and solved with one of many existing graph techniques. In addition, even when a graph is not used directly, approaching a problem through graph-based thinking and modelling can improve clarity and efficiency.

While graph theory is a deep and fascinating field, this article uses the following sections to cover the parts of graph theory most relevant to the programmer:

Graph/node-based thinking and approaches to search problems
Implementation of a graph with object-oriented programming
Different representations of graphs (adjacency lists, adjacency matrices)
Types of graphs and their implementations: directed and undirected graphs, weighted and unweighted graphs, cycle graphs, cyclic and acyclic graphs
Dijkstra’s algorithm, weaknesses, and alternatives
Applications of graph theory
Summary/Key Points

An undirected and unweighted graph is the simplest form of a graph (besides a single node). It consists only of two types of elements: nodes, which can be thought of as points, and edges, which connect these points together. There is no idea of distance/cost or direction, which is why it is undirected and unweighted.

For instance, consider the following search problem, and represent it as an undirected and unweighted graph.

You have a padlock with two digits, initialized to 00. Each move, you can turn one of the two wheels up or down by one (moving 0 up gives 1, moving 0 down gives 9; the wheels are circular). There are ‘dead combinations’: if the padlock ever shows one of those values, it locks permanently. Find the minimum number of moves needed to reach a target combination without locking, or determine that it cannot be reached at all.

Dead combinations: [10, 90, 12]. Target: 11.

First, we construct a ‘root’, or ‘head’ of the graph, which is necessary in scenarios where we need to generate the graph as we go. This is ‘00’, or the root case from which branching cases will be generated.

The four neighbors of node ‘00’ are ‘01’, ‘10’, ‘90’, and ‘09’, corresponding to various combinations of moving wheels up and down. Our graph already has five nodes and four edges.

For each newly added node, we will continue searching for its neighbors and add them to the graph, unless the node is a dead node.

When we have found the target combination, we can retrace our steps and count how many it takes to reach the root node again. Alternatively, we could have been keeping track of the steps each time a node is generated.

If you want to implement the solution to this problem, you can use a queue and breadth-first search, which is more efficient than actually coding a graph.
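
The queue-based approach can be sketched as follows. This is a minimal illustration rather than the author's own code: the function name `min_moves` and its parameters are assumptions made for the example, and combinations are stored as strings such as "00".

```python
from collections import deque

def min_moves(target, dead, start="00"):
    """Return the fewest moves from start to target, or -1 if unreachable."""
    dead = set(dead)
    if start in dead:
        return -1
    queue = deque([(start, 0)])          # (combination, moves so far)
    seen = {start}
    while queue:
        combo, moves = queue.popleft()
        if combo == target:
            return moves
        for i in range(len(combo)):      # pick a wheel
            for step in (1, -1):         # turn it up or down
                digit = (int(combo[i]) + step) % 10   # wheels wrap around
                nxt = combo[:i] + str(digit) + combo[i + 1:]
                if nxt not in seen and nxt not in dead:
                    seen.add(nxt)
                    queue.append((nxt, moves + 1))
    return -1

print(min_moves("11", ["10", "90", "12"]))  # prints 2 (00 -> 01 -> 11)
```

The graph is never stored explicitly: each combination's neighbors are generated on demand, and the `seen` set plays the role of the nodes already added to the graph.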

This was an example of using graph-based thinking. Since graphs are very ordered and clean structures, treating possibilities as nodes and their variants as neighbors can make even a complex search clean and understandable. Additionally, using graphs allows many well-studied methods from graph theory to be applied to speed up the search.

However, the padlock problem did not actually require a graph to be implemented. The most common way to implement a full graph structure is with two classes: a Node, the primary building block, and a Graph, which is composed of Nodes and provides an interface for accessing information about the graph as a whole.

For example, each element in the graph below can be represented in code as its own Node. The nodes are connected to one another through their Neighbors. If we were to call something like NodeA.Neighbors[1].Neighbors[1].Value, we should receive 2. This is because the second element (index 1) of Node A’s neighbors is Node C, and the second element of Node C’s neighbors is Node B, which has a value of 2. This kind of connectivity makes traversal easy.

A directed graph, a graph where edges only go in one direction, is easy to implement with this design. For example, if a one-way edge connects Node A to Node B, Node A’s neighbor would be Node B, but Node B would have no neighbors. Rephrased, neighbors only indicate outbound edges. Note that bidirectional connections can still exist in directed graphs, as a pair of edges pointing in opposite directions.
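
A minimal sketch of the Node and Graph design might look like this. The method names `add_node` and `add_edge`, the `directed` flag, and the node values 1, 2 and 3 are assumptions chosen for illustration; the attributes are lowercase to follow Python convention.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.neighbors = []          # outbound edges only

class Graph:
    def __init__(self):
        self.nodes = []

    def add_node(self, value):
        node = Node(value)
        self.nodes.append(node)
        return node

    def add_edge(self, a, b, directed=False):
        a.neighbors.append(b)        # a -> b
        if not directed:
            b.neighbors.append(a)    # undirected: also b -> a

g = Graph()
node_a, node_b, node_c = g.add_node(1), g.add_node(2), g.add_node(3)
g.add_edge(node_a, node_b)
g.add_edge(node_a, node_c)
g.add_edge(node_c, node_b)

# A's second neighbor is C, and C's second neighbor is B, whose value is 2.
print(node_a.neighbors[1].neighbors[1].value)  # prints 2

# A one-way edge: X points to Y, but Y has no neighbors.
node_x, node_y = g.add_node("x"), g.add_node("y")
g.add_edge(node_x, node_y, directed=True)
print(len(node_y.neighbors))  # prints 0
```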

If one were to traverse the graph according to neighbors and end up at Node C or Node F, they would be stuck because those nodes have no neighbors and hence, no outbound directions.

Alternatively, a graph can be represented more simply, at the cost of being less easy to traverse, with two lists representing the nodes (vertices) and the edges. These are sometimes called ‘adjacency lists’, since they express in list format the adjacencies between nodes (adjacencies being edges, and adjacent nodes being neighbors). Strictly speaking, a flat list of edges such as E below is usually called an edge list, and the term adjacency list is more often used for a per-node mapping to its neighbors.

V = [A, B, C, D, E, F]
E = [AB, AC, BC, CF, CE, DF, EA, FB, FD]

In the example above, V declares the nodes that exist, and E declares the edges from one node to another (AC means A→C). Since it is a compact and simple notation, graphs will often be presented this way.

Alternatively, it could be written as a dictionary (map), in which the key is a starting node and its value is a list of elements it points to.

adj_l = {A:[B,C], B:[C], C:[F,E], D:[F], E:[A], F:[B,D]}
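
As a small sketch of how the two notations relate, the dictionary can be derived from the V and E lists. Node names are written as one-character strings here, which is an assumption made so the snippet runs as Python.

```python
V = ["A", "B", "C", "D", "E", "F"]
E = ["AB", "AC", "BC", "CF", "CE", "DF", "EA", "FB", "FD"]

# Start every node with an empty neighbor list, then file each edge
# under its starting node ("AC" means A -> C).
adj_l = {v: [] for v in V}
for edge in E:
    src, dst = edge[0], edge[1]
    adj_l[src].append(dst)

print(adj_l["C"])  # prints ['F', 'E']
```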

Graphs, both directed and undirected, can contain cycles. A cycle graph is a graph consisting of only one cycle, in which there are no terminating nodes and one could keep traversing the graph indefinitely. A cyclic graph is a graph that contains at least one cycle, often several, so traversals can still go on forever but along more complex routes.

For example, within the complete cyclic graph, A→B→C→D→A is a four-cycle graph, and E→F→G→E is a three-cycle graph. More hidden is another four-cycle graph, B→E→F→G→B.

Groups of nodes in a directed graph in which every node can reach every other node by following edge directions are known as strongly connected components (strictly, the largest such groups). For instance, the nodes {E, F, G} of E→F→G→E are strongly connected because each of them has a directed path to the others. The nodes of B→E→F→G→B are strongly connected as well. Note that a direct edge between every pair is not required: in A→B→C→D→A there is no edge between B and D, yet each can still reach the other by following the cycle, so by the standard definition those nodes are strongly connected too.
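
To make the idea of nodes having paths to one another concrete, here is a hedged sketch that checks whether a group of nodes can all reach each other by following edge directions. The helper names `reachable` and `mutually_reachable` and the extra dead-end node H are hypothetical, added only for the example.

```python
def reachable(adj, start):
    """Return every node reachable from start by following edge directions."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def mutually_reachable(adj, group):
    """True if every node in group can reach every other node in group."""
    group = set(group)
    return all(group <= reachable(adj, node) for node in group)

# E -> F -> G -> E is a cycle; H is a hypothetical dead end hanging off G.
adj = {"E": ["F"], "F": ["G"], "G": ["E", "H"], "H": []}
print(mutually_reachable(adj, ["E", "F", "G"]))       # prints True
print(mutually_reachable(adj, ["E", "F", "G", "H"]))  # prints False
```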

On the other hand, as the name suggests, acyclic graphs are ones where no cycle exists, and any traversal long enough will eventually terminate. In the graph below, no matter which node you start on, a traversal will always terminate.
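
Whether a directed graph is cyclic or acyclic can be checked with a depth-first search. The sketch below assumes the graph is given as a dictionary from each node to its list of outbound neighbors; the function name `has_cycle` and both sample graphs are made up for illustration.

```python
def has_cycle(adj):
    """Return True if the directed graph contains at least one cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2     # unvisited, on current path, finished
    color = {node: WHITE for node in adj}

    def visit(node):
        color[node] = GRAY
        for nxt in adj.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:        # edge back into the current path
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in list(adj))

cyclic = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]}
acyclic = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(has_cycle(cyclic))   # prints True
print(has_cycle(acyclic))  # prints False
```

The recursive version is fine for small graphs; very deep graphs would need an iterative rewrite to stay within Python's recursion limit.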

In more complex problems, edges cannot always be naively treated as equally costly to travel. For instance, if you’re planning the best route from a start to an end destination, you’re not going to consider only the number of segments, but also the distance and the cost.

For instance, the path from S to E with the fewest edges is S→D→F→E, which requires only three edge traversals. However, that route takes a very small, crowded street. Alternatively, S→A→B→C→E takes four edges but covers most of the distance along a highway, so its overall cost is lower. When the notions of distance and cost are added to a graph, it becomes weighted.

To build this into our existing framework for graph structure, we can store, alongside each element in Neighbors, a number describing the cost to reach that neighbor. For instance, the neighbors may be stored as tuples [(n, c), (n, c)], where n represents the node and c represents the cost.
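
One way this could look in code is sketched below. The class name `WeightedNode`, the helper `path_cost`, and every cost value are assumptions chosen so that the four-edge highway route comes out cheaper than the three-edge street route.

```python
class WeightedNode:
    def __init__(self, value):
        self.value = value
        self.neighbors = []          # list of (node, cost) tuples

    def connect(self, other, cost):
        self.neighbors.append((other, cost))

def path_cost(path):
    """Add up the edge costs along a list of nodes."""
    total = 0
    for here, there in zip(path, path[1:]):
        # Look up the cost stored next to the matching neighbor.
        total += next(c for n, c in here.neighbors if n is there)
    return total

s, a, b, c, d, e, f = (WeightedNode(name) for name in "SABCDEF")
# Street route: few edges, but each one is expensive (hypothetical costs).
s.connect(d, 10); d.connect(f, 10); f.connect(e, 10)
# Highway route: one more edge, but cheap ones.
s.connect(a, 3); a.connect(b, 2); b.connect(c, 2); c.connect(e, 3)

print(path_cost([s, d, f, e]))     # prints 30
print(path_cost([s, a, b, c, e]))  # prints 10
```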

Often, graphs will also be presented in the form of a matrix, known as an ‘adjacency matrix’. This is not as compact as an adjacency list, but can represent weighted graphs more naturally. In the matrix, each row and column represents a node, and the cell located at (x, y) represents the edge y→x (or vice versa, it’s a matter of notation). If there is no edge, the value is 0. If there is, the value is the cost of that edge.

Adjacency matrices also make it easy to look up whether an edge exists and what it costs, even for unweighted graphs: it is a single indexed access, whereas adjacency lists and the object-oriented representation require scanning a node’s neighbors. Note that undirected graphs have symmetric adjacency matrices. Since matrices are also easy to manipulate, many graph operations and algorithms are commonly implemented on adjacency matrices.
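
A small sketch of building such a matrix with plain nested lists follows. The three nodes and their edge costs are hypothetical, and the convention used here is that matrix[row][column] holds the cost of the edge from the row node to the column node.

```python
nodes = ["A", "B", "C"]
index = {name: i for i, name in enumerate(nodes)}   # node name -> row/column
edges = [("A", "B", 4), ("B", "C", 1), ("A", "C", 7)]  # (from, to, cost)

n = len(nodes)
matrix = [[0] * n for _ in range(n)]                # 0 means "no edge"
for src, dst, cost in edges:
    matrix[index[src]][index[dst]] = cost
    matrix[index[dst]][index[src]] = cost           # undirected: mirror it

for row in matrix:
    print(row)
# prints:
# [0, 4, 7]
# [4, 0, 1]
# [7, 1, 0]

print(matrix[index["B"]][index["C"]])  # cost lookup by index, prints 1
```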

Various algorithms have been created to find the shortest path in weighted graphs, like Dijkstra’s algorithm (pronounced ‘dike-strah’). Essentially, Dijkstra’s algorithm is very similar to the brute-force style search discussed earlier with the padlock problem, but it chooses which node to explore next by accumulated cost rather than by order of discovery. The rough outline of the algorithm is as follows:

Begin at the start node and initialize a list (priority queue) to keep track of which nodes to process.
At each iteration of the algorithm, take the first element of the list. Process the element by finding all its neighboring nodes that haven’t been processed before.
For each neighbor, calculate the total distance/cost to reach that node from the start node. Put these neighbor nodes into the list such that the nodes with the lowest costs are at the front.
Repeat until the end node has been processed.
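
The outline above can be sketched with Python's heapq module serving as the priority queue. This is one possible implementation under stated assumptions, not a canonical one: the graph is a dictionary mapping each node to a list of (neighbor, cost) pairs, all costs are non-negative, and the sample costs are invented.

```python
import heapq

def dijkstra(graph, start, end):
    """Return (total cost, path) for the cheapest route, or (inf, [])."""
    queue = [(0, start, [start])]    # (cost so far, node, path taken)
    done = set()                     # nodes already processed
    while queue:
        cost, node, path = heapq.heappop(queue)   # cheapest entry first
        if node in done:
            continue
        done.add(node)
        if node == end:
            return cost, path
        for nxt, weight in graph.get(node, []):
            if nxt not in done:
                heapq.heappush(queue, (cost + weight, nxt, path + [nxt]))
    return float("inf"), []

# Hypothetical costs: the street route S-D-F-E is short but slow,
# the highway route S-A-B-C-E has more edges but is cheaper overall.
graph = {
    "S": [("D", 4), ("A", 2)],
    "D": [("F", 5)],
    "F": [("E", 6)],
    "A": [("B", 3)],
    "B": [("C", 2)],
    "C": [("E", 2)],
}
print(dijkstra(graph, "S", "E"))  # prints (9, ['S', 'A', 'B', 'C', 'E'])
```

Storing the whole path in each queue entry keeps the sketch short; a larger implementation would more commonly record each node's predecessor and rebuild the path at the end.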

There are plenty more in-depth resources about Dijkstra’s algorithm, but its main difference from a brute-force search is that it always processes the node with the smallest accumulated cost first. Provided no edge has a negative cost, this guarantees that the first time a node is processed, the cheapest route to it has been found, so weights are taken into account without redundant searching.

While powerful in many instances, Dijkstra’s algorithm is naive in that it only processes the nodes that currently hold the lowest costs, in the hope that the complete path will also have a similarly small cost, when this may not be the case at all. This can be a problem in large graphs.

For instance, consider this grid of nodes, where each connection has the same cost to traverse; Dijkstra’s algorithm (slightly varying depending on implementation) will search through all the light nodes before arriving at the end node E. It’s like pouring a bucket of water at the location of node S and hoping it eventually spreads to the target node.

The A* (‘A Star’) algorithm and many other variants address these weaknesses by adding enhancements such as stronger memory and a sense of direction (in A*, a heuristic estimate of the remaining distance to the target), so that the search favors nodes that appear to lead toward the goal. Machine learning, particularly reinforcement learning, also features in more recent approaches to efficient graph traversal. In reinforcement learning, states and transition probabilities are often represented as a graph that an agent traverses.
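
As a hedged sketch of how A* adds direction, the version below searches a grid where every step costs 1 and uses the Manhattan distance to the goal as its estimate of the remaining cost. The function name `a_star`, the grid representation, and the `walls` parameter are assumptions made for this example.

```python
import heapq

def a_star(start, goal, width, height, walls=frozenset()):
    """Return (path length, cells expanded) on a unit-cost grid, or None."""
    def h(cell):                     # Manhattan distance: never overestimates
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    queue = [(h(start), 0, start)]   # (estimated total, cost so far, cell)
    best = {start: 0}                # cheapest known cost to each cell
    expanded = 0
    while queue:
        _, cost, cell = heapq.heappop(queue)
        if cost > best[cell]:        # stale queue entry, skip it
            continue
        expanded += 1
        if cell == goal:
            return cost, expanded
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nxt[0] < width and 0 <= nxt[1] < height
            if not inside or nxt in walls:
                continue
            new_cost = cost + 1
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(queue, (new_cost + h(nxt), new_cost, nxt))
    return None

length, expanded = a_star((0, 0), (4, 4), 5, 5)
print(length)  # prints 8
```

Replacing `h` with a function that always returns 0 turns this back into Dijkstra’s algorithm, which is a convenient way to compare how many cells each variant expands.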

Graphs and graph-based thinking can be used in many other computer science problems, even when it is not obvious. Any time you approach a difficult problem, attempting to represent it using vertices and edges can inspire new ideas, simplify and reduce the problem, or even be one solution to the problem.

Some applications of graph theory in computer science include:

Modelling of complex networks, like social networks or the simulation of a disease like the coronavirus. Each node can represent one person or a population, and edges can represent the probability or ease of transmission. In this model, we can try to identify or form circular, closed graphs.
Organization & anything hierarchical. Graphs don’t have to be loopy and cyclical; they can also express a hierarchy. For instance, if you were to create an API for a local library to access books by various content, you’d want to create a graph. If you wanted to create a site map for your website, you’d use a graph. Graph databases are types of databases that specifically rely on the graph’s organized hierarchies to store data.
Any problem that involves an agent travelling between many locations or states is most likely represented well with a graph. Using graphs can help reduce the complexity of many programming problems.
A service like Google Maps, which tells you the best route to take, considering not only distance but traffic time, elevation, tolls, etc. It is, essentially, finding the best path in a massive weighted graph (imagine a node every few feet or so, and a graph spanning the Earth).
Graph theory was involved in the proof of the Four-Color Theorem, which became the first major theorem to be proved with the help of a computer.
In Natural Language Processing, a division of machine learning that handles the modelling of language, weighted graph representations of words and text are extremely valuable because they can provide insight into, for example, words that belong to a similar cluster (‘apples’, ‘oranges’) or mean similar things through distance.

Key Points

Graphs are composed of a set of nodes, also called vertices, and a set of edges, the connections between the nodes.
Two representations of graphs include adjacency lists and adjacency matrices. The latter supports easier indexing and manipulation but takes up more space than the former.
Full graph structures can be implemented using Node objects, which have a value and a set of neighbors.
Directed graphs have direction. Weighted graphs apply the idea of distance or cost to each edge. Cyclic graphs contain cycles that can be traversed indefinitely.
Dijkstra’s algorithm is used to find the shortest distance between two nodes in a weighted graph. It is usually effective but somewhat naive, which is why there exists a host of other algorithms dedicated to finding the best graph traversal.
Graphs, both as an implementation and as a thinking paradigm, can be applied to a very wide set of computer science and programming problems.