by Vardan Grigoryan (vardanator)

How to think in graphs: An illustrative introduction to Graph Theory and its applications

Graph theory represents one of the most important and interesting areas in computer science. But at the same time it’s one of the most misunderstood (at least it was to me).

Understanding, using and thinking in graphs makes us better programmers. At least that’s how we’re supposed to think. A graph is a set of vertices V and a set of edges E, comprising an ordered pair G=(V, E).

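As a minimal sketch of that definition, a graph can be written down directly as a pair of collections. The vertex names and the variable names `V`, `E` and `G` are illustrative choices for this example only.

```python
# A graph G = (V, E): a set of vertices and a set of edges.
V = {"a", "b", "c", "d"}

# Each undirected edge is an unordered pair, so a frozenset models it well.
E = {frozenset(pair) for pair in [("a", "b"), ("b", "c"), ("c", "a")]}

# The ordered pair (V, E) is the graph itself.
G = (V, E)

# Every edge must join vertices that belong to V.
assert all(edge <= V for edge in E)

print(len(V), "vertices,", len(E), "edges")
```

Note that vertex "d" has no edges at all and is still part of the graph, which is why V has to be stored separately from E.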

While trying to study graph theory and implement some algorithms, I regularly got stuck, just because it was so boring.

The best way to understand something is to understand its applications. In this article, we’re going to demonstrate various applications of graph theory. But more importantly, these applications will contain detailed illustrations.

This approach might seem too detailed to seasoned programmers, but believe me: as someone who was once there and tried to understand graph theory, I find that detailed explanations are always preferable to succinct definitions.

So, if you’ve been looking for a “graph theory and everything about it tutorial for absolute unbelievable dummies”, then you’ve come to the right place. Or at least I hope so. So let’s get started and dive in.

Disclaimers

DISCLAIMER 1: I am not an expert in CS, algorithms, data structures and especially in graph theory. I am not involved in any project for the companies discussed in this article. Solutions to the problems are not final and could be improved drastically. If you find any issue or something unreasonable, you are more than welcome to leave a comment. If you work at one of the mentioned companies or are involved in corresponding software projects, please respond with the actual solution (it will be helpful to others). To all others, be patient readers, this is a pretty LONG article.

DISCLAIMER 2: This article is somewhat different in the way information is presented. Sometimes it might seem to digress a bit from the sub-topic, but patient readers will eventually find themselves with a complete understanding of the bigger picture.

DISCLAIMER 3: This article is written for a broad audience of programmers. While having junior programmers as the target audience, I hope it will be interesting to experienced professionals as well.

Seven Bridges of Königsberg

Let’s start with something that I used to regularly encounter in graph theory books that discuss “the origins of graph theory”: the Seven Bridges of Königsberg (not really sure, but you can pronounce it as “qyonigsberg”). There were seven bridges in Kaliningrad, connecting two big islands surrounded by the Pregolya river and two portions of mainland divided by the same river.

In the 18th century this was called Königsberg (part of Prussia) and the area above had a lot more bridges. The problem, or rather the brain teaser, with Königsberg’s bridges was to walk through the city crossing each of the seven bridges exactly once. They didn’t have an internet connection at that time, so it should have been entertaining. Here’s the illustrated view of the seven bridges of Königsberg in the 18th century.

Try it. See if you can walk through the city by crossing each bridge only once.

  • There should not be any uncrossed bridges.
  • Each bridge must not be crossed more than once.

If you are familiar with this problem, you know that it’s impossible. Even if you tried hard before and try even harder now, you’ll eventually give up.

Sometimes it’s reasonable to give up fast. That’s how Euler solved this problem: he gave up pretty soon. Instead of trying to solve it, he adopted a different approach and tried to prove that it’s not possible to walk through the city by crossing each bridge one and only one time.

Let’s try to understand how Euler was thinking and how he came up with the solution (even if there isn’t a solution, that still needs a proof). That is a real challenge here, because walking through the thought process of such a venerable mathematician is kind of dishonorable. (So venerable that Knuth and friends dedicated their book to Leonhard Euler.) We will rather pretend to “think like Euler”. Let’s start with picturing the impossible.

There are four distinct places, two islands and two parts of mainland. And seven bridges. It’s interesting to find out if there is any pattern regarding the number of bridges connected to islands or mainland (we will use the term “land” to refer to the four distinct places).

At first glance, there seems to be some sort of a pattern. There is an odd number of bridges connected to each land. If you have to cross each bridge once, then you can enter a land and leave it if it has 2 bridges.

It’s easy to see in the illustrations above that if you enter a land by crossing one bridge, you can always leave the land by crossing its second bridge. Whenever a third bridge appears, you won’t be able to leave a land once you enter it by crossing all its bridges. If you try to generalize this reasoning for a single piece of land, you’ll be able to show that, in case of an even number of bridges it’s always possible to leave the land and in case of an odd number of bridges it isn’t. Try it in your mind!

Let’s add a new bridge to see how the number of overall connected bridges changes and whether it solves the problem.

Now that two pieces of land have an even number of bridges (4 and 4) and two have an odd number (3 and 5), let’s draw a new route with the addition of this new bridge.

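The bridge counting can be sketched in a few lines of code. The land labels below are made up for this example, and it is an assumption that the extra bridge joins the two river banks (any bridge between two three-bridge lands produces the same 4, 4, 3, 5 counts).

```python
from collections import Counter

# The seven bridges as pairs of lands. The labels are hypothetical:
# "A" is the island with five bridges, "B" and "C" are the two river
# banks, "D" is the second island.
bridges = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]

def bridges_per_land(bridge_list):
    """Count how many bridges touch each piece of land."""
    counts = Counter()
    for land_1, land_2 in bridge_list:
        counts[land_1] += 1
        counts[land_2] += 1
    return counts

def odd_lands(bridge_list):
    """Return the lands that have an odd number of bridges, sorted."""
    counts = bridges_per_land(bridge_list)
    return sorted(land for land, n in counts.items() if n % 2 == 1)

# Assumption: the temporarily added bridge joins the two river banks.
with_extra_bridge = bridges + [("B", "C")]

print(dict(bridges_per_land(bridges)), odd_lands(bridges))
print(dict(bridges_per_land(with_extra_bridge)), odd_lands(with_extra_bridge))
```

With the original seven bridges all four lands come out odd. With the extra bridge only two of them do.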

We saw that how many lands have an even or an odd number of bridges played a role in determining whether a solution was possible. Here’s a question: does the number of bridges solve the problem? Should it be even all the time? It turns out that’s not the case. That’s what Euler did: he found a way to show that the number of bridges matters. And more interestingly, the number of pieces of land with an odd number of connected bridges also matters. That’s when Euler started to “convert” lands and bridges into something we know as graphs. Here’s what a graph representing the Königsberg bridges problem could look like (note that our “temporarily” added bridge isn’t there).

One important thing to note is the generalization/abstraction of a problem. Whenever you solve a specific problem, the most important thing is to generalize the solution for similar problems. In this particular case, Euler’s task was to generalize the bridge crossing problem to be able to solve similar problems in the future, i.e. for all the bridges in the world. Visualization also helps to view the problem from a different angle. The following graphs are all various representations of the same Königsberg bridge problem shown above.

So yes, visually graphs are a good choice for picturing problems. But now we need to find out how the Königsberg problem can be solved using graphs. Pay attention to the number of lines coming out of each circle. And yes, let’s name them as seasoned professionals would: from now on we will call the circles vertices and the lines connecting them edges. You might’ve seen letter notations, V for (vendetta?) vertex, E for edge.

The next important thing is the so-called degree of a vertex, the number of edges incident to the vertex. In our example above, the number of bridges connected to each land can be expressed as the degree of the corresponding graph vertex.

In his endeavor, Euler showed that the possibility of a walk through a graph (city) traversing each edge (bridge) one and only one time is strictly dependent on the degrees of vertices (lands). The path consisting of such edges is called (in his honor) an Euler path. The length of an Euler path is the number of edges. Get ready for some strict language.

An Euler path of a finite undirected graph G(V, E) is a path such that every edge of G appears on it once. If G has an Euler path, then it is called an Euler graph. [1]

Theorem. A finite undirected connected graph is an Euler graph if and only if exactly two vertices are of odd degree or all vertices are of even degree. In the latter case, every Euler path of the graph is a circuit, and in the former case, none is. [1]

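The theorem translates almost word for word into a check. The following is a sketch only: the function names, the land labels and the choice of the extra bridge are assumptions made for this example, and the graph is given as a plain list of vertex pairs.

```python
from collections import Counter, defaultdict

def is_connected(vertices, edges):
    """Depth-first search from one vertex must reach every vertex."""
    vertices = set(vertices)
    if not vertices:
        return True
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen == vertices

def has_euler_path(vertices, edges):
    """Apply the theorem: connected, and 0 or exactly 2 odd-degree vertices."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    odd = sum(1 for v in vertices if degree[v] % 2 == 1)
    return is_connected(vertices, edges) and odd in (0, 2)

lands = ["A", "B", "C", "D"]
konigsberg = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
              ("A", "D"), ("B", "D"), ("C", "D")]

print(has_euler_path(lands, konigsberg))                 # four odd vertices
print(has_euler_path(lands, konigsberg + [("B", "C")]))  # two odd vertices
```

The check only says whether an Euler path exists. Actually constructing one is a separate algorithm that this sketch does not attempt.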

I used “Euler path” instead of “Eulerian path” just to be consistent with the referenced book’s [1] definition. If you know someone who differentiates Euler path and Eulerian path, and Euler graph and Eulerian graph, let them know to leave a comment.

First of all, let’s clarify the new terms in the above definition and theorem.

  • Undirected graph - a graph that doesn’t have a particular direction for edges.
  • Directed graph - a graph in which edges have a particular direction.
  • Connected graph - a graph where there is no unreachable vertex. There must be a path between every pair of vertices.
  • Disconnected graph - a graph where there are unreachable vertices. There is not a path between every pair of vertices.
  • Finite graph - a graph with a finite number of nodes and edges.
  • Infinite graph - a graph where an end of the graph in a particular direction(s) extends to infinity.

We’ll discuss some of these terms in the coming paragraphs.

Graphs can be directed or undirected, and that’s one of the interesting properties of graphs. You must’ve seen the popular Facebook vs Twitter example for directed and undirected graphs. A Facebook friendship relation may be easily represented as an undirected graph, because if Alice is friends with Bob, then Bob must be friends with Alice, too. There is no direction; both are friends with each other.

Also note the vertex labeled “Patrick”. It is kind of special (he’s got no friends), as it doesn’t have any incident edges. It is still a part of the graph, but in this case we say that the graph is not connected: it is a disconnected graph (the same goes for “John”, “Ashot” and “Beth”, as they are interconnected with each other but separated from the others). In a connected graph there is no unreachable vertex; there must be a path between every pair of vertices.

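Whether a graph is connected can be checked by starting at any vertex and collecting everything reachable from it. In the sketch below only the names come from the text; the exact friendships and the function names are assumptions made for illustration.

```python
from collections import deque

# Undirected friendships as an adjacency mapping (assumed for this sketch).
friends = {
    "Alice": {"Bob"},
    "Bob": {"Alice"},
    "Patrick": set(),  # no incident edges
    "John": {"Ashot", "Beth"},
    "Ashot": {"John", "Beth"},
    "Beth": {"John", "Ashot"},
}

def reachable_from(graph, start):
    """Breadth-first search: every vertex that has a path from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        for neighbour in graph[queue.popleft()]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

def is_connected(graph):
    """Connected means a single search reaches all vertices."""
    if not graph:
        return True
    first = next(iter(graph))
    return reachable_from(graph, first) == set(graph)

print(is_connected(friends))
print(sorted(reachable_from(friends, "John")))
```

A search that starts at "Patrick" never leaves "Patrick", which is exactly what makes this graph disconnected.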

Contrary to the Facebook example, if Alice follows Bob on Twitter, that doesn’t require Bob to follow Alice back. So a “follow” relation must have a direction indicator, showing which vertex (user) has a directed edge (follows) to the other vertex.

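In code, the difference between the two relations comes down to whether an edge is stored in one direction or in both. The `Graph` class below is a hypothetical helper written for this example, not an API of either service.

```python
from collections import defaultdict

class Graph:
    """Adjacency-set graph; `directed` decides whether edges are mirrored."""

    def __init__(self, directed=False):
        self.directed = directed
        self.adjacent = defaultdict(set)

    def add_edge(self, source, target):
        self.adjacent[source].add(target)
        if self.directed:
            self.adjacent[target]  # only make sure the target vertex exists
        else:
            self.adjacent[target].add(source)

    def has_edge(self, source, target):
        return target in self.adjacent.get(source, set())

# Friendship: one call creates the relation in both directions.
friendships = Graph(directed=False)
friendships.add_edge("Alice", "Bob")

# Follow: the relation exists only from the follower to the followed.
follows = Graph(directed=True)
follows.add_edge("Alice", "Bob")

print(friendships.has_edge("Bob", "Alice"))
print(follows.has_edge("Bob", "Alice"))
```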

Now that we know what a finite connected undirected graph is, let’s get back to Euler’s graph:

So why did we discuss the Königsberg bridges problem and Euler graphs in the first place? Well, it’s not so boring, and by investigating the problem and the foregoing solution we touched on the elements behind graphs (vertex, edge, directed, undirected) while avoiding a dry theoretical approach. And no, we are not done with Euler graphs and the problem above yet.

We should now move on to the computer representation of graphs as that is the topic of interest for us programmers. By representing a graph in a computer program, we will be able to devise an algorithm for tracing graph path(s), and therefore find out if it is an Euler path. Before that, try to think of a good application for an Euler graph (besides fiddling around with bridges).

Graph representation: Intro

Now this is quite a tedious task, so be patient. Remember the fight between arrays and linked lists? Use arrays if you need fast element access, use lists if you need fast element insertion/deletion, etc. I hardly believe you ever struggled with something like “how to represent lists”. Well, in the case of graphs the actual representation is a real bother, because first you have to decide how exactly you are going to represent a graph. And believe me, you are not going to like this. Adjacency list, adjacency matrix, maybe edge lists? Toss a coin.

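To make the three options concrete, here is the same small undirected graph stored in each of them. The graph itself (four vertices numbered 0 to 3) and the variable names are made up for this sketch.

```python
# Edge list: simply the pairs of connected vertices.
edge_list = [(0, 1), (0, 2), (1, 2), (2, 3)]
vertex_count = 4

# Adjacency list: for each vertex, the vertices it is connected to.
adjacency_list = [[] for _ in range(vertex_count)]
for u, v in edge_list:
    adjacency_list[u].append(v)
    adjacency_list[v].append(u)

# Adjacency matrix: matrix[u][v] is 1 when an edge joins u and v.
adjacency_matrix = [[0] * vertex_count for _ in range(vertex_count)]
for u, v in edge_list:
    adjacency_matrix[u][v] = 1
    adjacency_matrix[v][u] = 1

print(adjacency_list)
for row in adjacency_matrix:
    print(row)
```

Roughly speaking, the matrix answers "is there an edge between u and v?" with a single lookup but always takes space proportional to the square of the vertex count, while the adjacency list takes space proportional to the number of vertices plus edges and makes it cheap to walk over a vertex's neighbours.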

You should have tossed hard, because we are starting with a tree. You must have seen a binary tree (or BT for short) at least once (the following is not a binary search tree).

Just because it consists of vertices and edges, it’s a graph. You may also recall how a binary tree is most commonly represented (at least in textbooks).

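A textbook-style representation could be sketched like this: every node stores an item and references to its left and right children. The class name, the field names and the sample values are assumptions for this example and do not reproduce the author's own listing.

```python
class BinTreeNode:
    """One node of a binary tree: an item plus optional left/right children."""

    def __init__(self, item, left=None, right=None):
        self.item = item
        self.left = left
        self.right = right

def in_order(node):
    """Visit the left subtree, then the node, then the right subtree."""
    if node is None:
        return []
    return in_order(node.left) + [node.item] + in_order(node.right)

# A small tree built by hand; it follows no ordering rule,
# so it is a binary tree but not a binary search tree.
root = BinTreeNode(
    1,
    BinTreeNode(7, BinTreeNode(2), BinTreeNode(6)),
    BinTreeNode(9),
)

print(in_order(root))
```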

It might seem too basic for people who are already familiar with binary trees, but I still have to illustrate it to make sure we are on the same page (note that we are still dealing with pseudocode).

If you are new to trees, read the pseudocode above carefully, then follow the steps in the illustration below.

A binary tree is a simple “collection” of nodes, each of which has left and right child nodes, while a binary search tree is much more useful, as it applies one simple rule which allows fast key lookups. Binary search trees (BST) keep their keys in sorted order. You are free to implement your BT with any rule you want (although it might change its name based on the rule, for instance, min-heap or max-heap). The most important expectation for a BST is that it satisfies the binary search property (that’s where the name comes from). Each node’s key must be greater than any key in its left sub-tree and less than any key in its right sub-tree.

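The binary search property is easiest to see in code. What follows is a minimal, unbalanced sketch with hypothetical names (`Node`, `insert`, `contains`); it keeps unique keys only and is not meant as a production implementation.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key and return the (possibly new) root; duplicates are ignored."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)    # smaller keys go left
    elif key > root.key:
        root.right = insert(root.right, key)  # greater keys go right
    return root

def contains(root, key):
    """Walk down one branch, discarding the other subtree at every step."""
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False

root = None
for key in [8, 3, 10, 1, 6, 14]:
    root = insert(root, key)

print(contains(root, 6), contains(root, 7))
```

Each comparison in `contains` rules out a whole subtree, which is where the fast lookup comes from as long as the tree stays reasonably balanced. This sketch makes no attempt to rebalance, so inserting already sorted keys degrades it into a linked list.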

I’d like to point out a very interesting detail regarding the phrase “greater than”, which is crucial to understanding how BSTs function. If you change the property to “greater than or equal”, your BST will be able to store duplicate keys when inserting new nodes; otherwise it will keep only nodes with unique keys. You can find really good articles on the web about binary search trees. We won’t be providing a full implementation of a binary search tree, but for the sake of consistency, we’ll illustrate a simple binary search tree here.

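The effect of switching from “greater than” to “greater than or equal” can be sketched with a single flag. The `allow_duplicates` parameter and the helper names are assumptions introduced for this example; equal keys are sent to the left subtree, so a node's key stays greater than or equal to every key on its left.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key, allow_duplicates=False):
    """Insert key; an equal key goes left only when duplicates are allowed."""
    if root is None:
        return Node(key)
    if key < root.key or (key == root.key and allow_duplicates):
        root.left = insert(root.left, key, allow_duplicates)
    elif key > root.key:
        root.right = insert(root.right, key, allow_duplicates)
    return root

def in_order(root):
    """Return the keys in sorted order."""
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)

def build(keys, allow_duplicates):
    """Insert all keys one by one into an initially empty tree."""
    root = None
    for key in keys:
        root = insert(root, key, allow_duplicates)
    return root

keys = [5, 3, 5, 8, 3]
print(in_order(build(keys, allow_duplicates=False)))
print(in_order(build(keys, allow_duplicates=True)))
```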

Intro to Graph representation and binary trees (Airbnb example)

Trees are very useful data structures. You might not have implemented a tree from scratch in your projects. But you’ve probably used them even without noticing. Let’s look at an artificial yet valuable example and try to answer the “why” question, “Why use a binary search tree in the first place”.

As you’ve noticed, there is a “search” in binary search tree. So basically, everything that needs a fast lookup should be placed in a binary search tree. “Should” doesn’t mean “must”: the most important thing to keep in mind in programming is to solve a problem with the proper tools. There are tons of cases where a simple linked list with its O(N) lookup might be preferable to a BST with its O(log N) lookup.

Typically we would use a library implementation of a BST, most likely std::set or std::map in C++. However, in this tutorial we are free to reinvent our own wheel. BSTs are implemented in almost any general-purpose programming language library. You can find them in the corresponding documentation of your favorite language. Approaching a “real-life example”, here’s the problem we’ll try to tackle: Airbnb Home Search.
