sql语句集合里有集合
The set theory is very important in order to understand data and databases. While you could live without it and still be a good SQL developer, understanding it will surely help a lot. So, let’s dive into the matter.
集合论对于理解数据和数据库非常重要。 尽管您可以不用它而仍然可以成为一名优秀SQL开发人员,但了解它肯定会有所帮助。 因此,让我们深入探讨这个问题。
集合论–数学与逻辑 (Set Theory – Math & Logic)
I guess you remember these lessons from high school. To me, this was one of the most boring parts of my education, because many things sounded so obvious and you just had new notation and operators to work with sets – again pretty obvious one. While most people won’t use that knowledge later in their life, that’s not the case for those who are into databases.
我想你还记得高中的这些课。 对我来说,这是我教育中最无聊的部分之一,因为听起来很多事情都很明显,而且您只有新的符号和运算符才能使用集合,这也是非常明显的。 虽然大多数人在以后的生活中不会使用这些知识,但是对于那些进入数据库的人却不是这样。
集合论中的集合是什么? (What are Sets in the Set Theory?)
Sets can contain really anything. Let’s start from the simplest possible set and that is the empty set – S = {}. As you can see, the empty set doesn’t contain any data. We don’t know anything about that set since it’s undefined – there are no data types or data values.
集可以包含任何内容。 让我们从最简单的集合开始,那就是空集合– S = {} 。 如您所见,空集不包含任何数据。 由于未定义该集合,因此我们一无所知–没有数据类型或数据值。
We could have a set of numbers S = {1, 2, 3}, or T = {1, 1, 2, 3, 1001}, or U = {3, 1, 2}. This is much more interesting, because we have values defined, and we can also tell that all of these sets contain numerical data.
我们可以有一组数字 S = {1、2、3}或T = {1、1、2、3、1001}或U = {3、1、2} 。 这更有趣,因为我们定义了值,并且我们还可以说所有这些集合都包含数值数据。
Note. Two sets are equal if they contain same element, no matter how they are ordered. In our example, sets S = {1, 2, 3} & U = {3, 1, 2}, are the same.
注意。 如果两组包含相同的元素,则无论它们如何排序,都是相等的。 在我们的示例中,集合S = {1,2,3}&U = {3,1,2}是相同的。
Sets could also contain strings, e.g. A = {“Jack”, “Jill”, “John”}, B = {“Zagreb”, “Belgrade”, “New York”, “Berlin”, “Moscow”}.
集还可以包含字符串,例如A = {“ Jack”,“ Jill”,“ John”} , B = {“ Zagreb”,“ Belgrade”,“ New York”,“ Berlin”,“ Moscow”} 。
Note: In the set theory, a set can contain anything, and the set elements even don’t have to be of the same type.
注意:在集合理论中,集合可以包含任何内容,并且集合元素甚至不必具有相同的类型。
This is also a set: C = {1, “Jack”, 3.14, 2020/02/14}. It contains 4 separate information, and in this case, they have different data types.
这也是一个集合: C = {1,“ Jack”,3.14,2020/02/14} 。 它包含4个单独的信息,在这种情况下,它们具有不同的数据类型。
We’re more interested in sets that contain structures/records/tuples. Let’s take a look at one such example country = {(1, Deutschland, Germany, DEU), (2, Srbija, Serbia, SRB), (3, Hrvatska, Croatia, HRV), (4, United States of America, United States of America, USA), (5, Polska, Poland, POL), (6, España, Spain, ESP), (7, Rossiya, Russia, RUS)}. We have a list of 7 structures containing data for 7 different countries. This is something we’ve already met, and these data are actually the contents of the country table from our model.
我们对包含结构/记录/元组的集合更感兴趣。 让我们看一个这样的示例国家= {(1,Deutschland,Germany,DEU),(2,Srbija,塞尔维亚,SRB),(3,Hrvatska,克罗地亚,HRV),(4,美利坚合众国,美国美国,(5个,波兰,波兰,POL),(6个,西班牙,西班牙,ESP),(7个,Rossiya,俄罗斯,RUS)} 。 我们有7个结构的列表,其中包含7个不同国家/地区的数据。 这是我们已经满足的,这些数据实际上是我们模型中国家/地区表的内容。
该模型 (The Model)
Let’s remind ourselves of the model we’re using in this article series.
让我们回想一下本系列文章中使用的模型。
If we’re talking from the perspective of the set theory, you can look at each table as one set. Same stands for query results. Technically the result of each query is a new table and you’ll treat it in the same manner as the regular table – this query result is also a set; you can write new queries using this query as a table etc.
如果我们从集合论的角度进行讨论,则可以将每个表看作一个集合。 相同代表查询结果。 从技术上讲,每个查询的结果都是一个新表,您将以与常规表相同的方式对待它-该查询结果也是一个集合; 您可以使用此查询作为表格等来编写新查询。
集合论和维恩图 (Set Theory and Venn Diagrams)
In SQL Server we have 3 important operators at our disposal – UNION (ALL), INTERSECT, and EXCEPT. They return the result of related operators from the set theory (on the picture below).
在SQL Server中,我们有3个重要的运算符可供使用-UNION(ALL),INTERSECT和EXCEPT。 他们从集合论中返回相关算子的结果(下图)。
The easiest way to explain this is:
解释这一点的最简单方法是:
- UNION – Returns elements from both sets (if there are duplicates, they are in the final set, only once) UNION –返回两个集合中的元素(如果有重复,则它们在最终集合中,仅一次)
- UNION ALL – Same as the UNION operator, but will contain all duplicates UNION ALL –与UNION运算符相同,但将包含所有重复项
- INTERSECT – Returns a set containing elements that are present in both sets INTERSECT –返回一个包含两个集合中都存在的元素的集合
- EXCEPT/MINUS (difference) – A MINUS B is a set containing elements from the set A that are not elements of the set B (so A MINUS (A INTERSECT B)) EXCEPT / MINUS(差异)– A MINUS B是一个集合,其中包含来自集合A的元素,而不是集合B的元素(因此A MINUS(A INTERSECT B))
We won’t analyze situations where sets don’t have any common elements (A UNION B = all elements from A and B, A INTERSECT B = {}, A EXCEPT B = A, B EXCEPT A = B) and where set A = set B (A UNION B = A = B, A INTRSECT B = A = B, A EXCEPT B = B EXCEPT A = {}).
我们不会分析集合没有任何公共元素的情况(A UNION B =来自A和B的所有元素,A INTERSECT B = {},A EXCEPT B = A,B EXCEPT A = B)以及集合A =集合B(A UNION B = A = B,A INTRSECT B = A = B,A除B = B外,A = {})。
集合论与SQL (Set Theory and SQL)
We talked a lot about the set theory so far, and now it’s time for some practice. We’ll write down a few queries which will show how UNION (ALL), INTERSECT and EXCEPT operators work.
到目前为止,我们已经谈论了很多关于集合论的知识,现在是时候进行一些实践了。 我们将写下一些查询,这些查询将显示UNION(ALL),INTERSECT和EXCEPT运算符的工作方式。
#1首先,我们将测试两个单独的查询并分析它们返回的结果集 (#1 First we’ll test two separate queries and analyze the result set they return)
-- list all customers with exactly 3 calls
select customer.*
from customer
where id in (
select customer.id
from customer
inner join call on customer.id = call.customer_id
group by customer.id
having count(*) = 3
);
-- list all customers from Berlin
select customer.*
from customer
inner join city on customer.city_id = city.id
where city.city_name = 'Berlin';
You should notice a few things:
您应该注意以下几点:
- The first query returns all customers having exactly 3 calls 第一个查询返回恰好有3个致电的所有客户
- The second query returns all customers from Berlin 第二个查询返回柏林的所有客户
- Both queries return the same columns, but rows returned are not the same. This is important because you can use operators working with sets only if these two sets are composed of elements with the same structure 这两个查询返回相同的列,但返回的行不同。 这很重要,因为只有当这两个集合由具有相同结构的元素组成时,才可以使用运算符处理集合
- Each result set has 2 rows. “Bakery” is present in both result sets, and each set has one other row 每个结果集都有2行。 两个结果集中都存在“面包店”,并且每个结果集都有另一行
#2 UNION和UNION ALL (#2 UNION and UNION ALL)
Now we’ll use two available UNION operators. Any of these operators (UNION (ALL), INTERSECT, EXCEPT) is used in a way you just place it between queries.
现在,我们将使用两个可用的UNION运算符。 这些运算符(UNION(ALL),INTERSECT,EXCEPT)中的任何一种都以将其放在查询之间的方式使用。
-- UNION
-- list all customers with exactly 3 calls
select customer.*
from customer
where id in (
select customer.id
from customer
inner join call on customer.id = call.customer_id
group by customer.id
having count(*) = 3
)
UNION
-- list all customers from Berlin
select customer.*
from customer
inner join city on customer.city_id = city.id
where city.city_name = 'Berlin';
-- UNION ALL
-- list all customers with exactly 3 calls
select customer.*
from customer
where id in (
select customer.id
from customer
inner join call on customer.id = call.customer_id
group by customer.id
having count(*) = 3
)
UNION ALL
-- list all customers from Berlin
select customer.*
from customer
inner join city on customer.city_id = city.id
where city.city_name = 'Berlin';
For these two result sets returned, you should notice the following:
对于返回的这两个结果集,您应该注意以下几点:
- Each query, the first one using UNION, and the second one using UNION ALL returns 1 result set 每个查询,第一个使用UNION的查询,第二个使用UNION ALL的查询,返回1个结果集
- The result set returned by the UNION query returned all rows returned by the two queries used. The only difference is that the duplicated row had been eliminated UNION查询返回的结果集返回了所使用的两个查询返回的所有行。 唯一的区别是重复的行已被消除
- The query using UNION ALL returned all rows from both queries, without removing duplicates 使用UNION ALL的查询返回了两个查询的所有行,而没有删除重复项
- The UNION is used more often, and you’ll probably use it when you have a few complex queries and you simply want to “join” their results without writing a single more complex query UNION的使用频率更高,如果您有一些复杂的查询,而您只是想“合并”它们的结果而无需编写单个更复杂的查询,则可能会使用UNION
#3交集 (#3 INTERSECT)
The INTERSECT should return elements/rows which appear in both sets.
INTERSECT应该返回出现在两个集合中的元素/行。
-- INTERSECT
-- list all customers with exactly 3 calls
select customer.*
from customer
where id in (
select customer.id
from customer
inner join call on customer.id = call.customer_id
group by customer.id
having count(*) = 3
)
INTERSECT
-- list all customers from Berlin
select customer.*
from customer
inner join city on customer.city_id = city.id
where city.city_name = 'Berlin';
Everything went as expected and you can see that the “Bakery” row was returned as a result.
一切都按预期进行,您可以看到结果返回了“面包店”行。
#4除 (#4 EXCEPT)
The EXCEPT operator returns all elements/rows from the first set, except those that are in the second set.
EXCEPT运算符返回第一组中的所有元素/行,第二组中的元素/行除外。
-- A EXCEPT B
-- list all customers with exactly 3 calls
select customer.*
from customer
where id in (
select customer.id
from customer
inner join call on customer.id = call.customer_id
group by customer.id
having count(*) = 3
)
EXCEPT
-- list all customers from Berlin
select customer.*
from customer
inner join city on customer.city_id = city.id
where city.city_name = 'Berlin';
-- B EXCEPT A
-- list all customers from Berlin
select customer.*
from customer
inner join city on customer.city_id = city.id
where city.city_name = 'Berlin'
EXCEPT
-- list all customers with exactly 3 calls
select customer.*
from customer
where id in (
select customer.id
from customer
inner join call on customer.id = call.customer_id
group by customer.id
having count(*) = 3
);
The most important thing you should notice here is:
您在这里应注意的最重要的事情是:
- A EXCEPT B is not the same as B EXCEPT A (A and B are names of the sets) 除B外,除B外,B与B不同(A和B是集合的名称)
- The first query returns all customers having exactly 3 calls except those from Berlin, while the second query finds and returns customers from Berlin except those with exactly 3 calls 第一个查询返回除柏林以外的所有有3个电话的客户,第二个查询查找和返回除柏林有3个电话的客户。
为什么要理解集合论? (Why Should You Understand the Set Theory?)
Theory in IT is not so “hot” as the practice is. Same stands for the set theory. Still, understanding what lies in the background of the operations you run is essential for better understanding not only these operators but databases themselves.
IT理论并不像实践那样“热门”。 集合理论也是如此。 尽管如此,了解您所运行的操作的背景是什么,对于更好地理解这些操作员以及数据库本身都是至关重要的。
目录 (Table of contents)
sql语句集合里有集合