--Step1:准备数据,创建表,插入数据,注意数据的设计,每个表是3种情况,目的是看字段为NULL时,各种连接的处理方式
create table A (idA INT, c1 int, c2 int);
create table B (idB INT, c1 int, c2 int);
create table C (idC INT, c1 int, c2 int);
create table D (idD INT, c1 int, c2 int);
insert into A values(1,11,21);
insert into A values(1,NULL,22);
insert into A values(1,13,NULL);
insert into B values(2,11,21);
insert into B values(2,NULL,22);
insert into B values(2,13,NULL);
insert into C values(3,11,21);
insert into C values(3,NULL,22);
insert into C values(3,13,NULL);
insert into D values(1,11,21);
insert into D values(1,NULL,22);
insert into D values(1,13,NULL);
--step2:查看多表连接的查询计划,了解多表连接的连接顺序
test=# explain select * from A, B, C, D;
QUERY PLAN
------------------------------------------------------------------------------
1 Nested Loop (cost=0.00..177149876477.15 rows=14164684960000 width=48)
2 -> Nested Loop (cost=0.00..91314442.90 rows=7301384000 width=36)
3 -> Nested Loop (cost=0.00..47108.65 rows=3763600 width=24)
4 -> Seq Scan on a (cost=0.00..29.40 rows=1940 width=12)
5 -> Materialize (cost=0.00..39.10 rows=1940 width=12)
6 -> Seq Scan on b (cost=0.00..29.40 rows=1940 width=12)
7 -> Materialize (cost=0.00..39.10 rows=1940 width=12)
8 -> Seq Scan on c (cost=0.00..29.40 rows=1940 width=12)
9 -> Materialize (cost=0.00..39.10 rows=1940 width=12)
10 -> Seq Scan on d (cost=0.00..29.40 rows=1940 width=12)
(10 rows)
分析:
1) 查询计划的行号1、2、3…是为了表述方便增加的
2) 每行缩进表示嵌套关系,如第2行、第9行,表示在同一层,而第1行是2、9行连接的方式和结果;第4、5行表示同一层,第2行是4、5行连接的方式和结果
3) 从4、5行可以看出,A和B先连接,生成新的关系(第3行表示);第3行再和第7行的C连接,生成新的关系(第2行表示);第2行再和第9行的D连接,生成最终的关系
4) 所以,连接顺序为AB->ABC->ABCD,这符合左深树的连接方式
5) A/B/C/D四表之间,观察建表和插入数据的语句,没有任何特殊之处,只是在where子句中,出现时有前后次序。例如,可以观察“explain select * from D, B, A, C;”的查询计划,其连接顺序为:DB->DBA->DBAC
6) 第1、2、3行,分别是关系每次连接的结果,其行头,都是“Nested Loop”,表明使用嵌套循环连接算法进行连接
--step3:查看多表连接的查询计划,了解带有连接条件的多表连接的连接顺序和连接算法的选择
test=# explain select * from A, B, C, D where A.c1=B.c1 and B.c1=C.c1 AND C.c1=D.c1;
QUERY PLAN
------------------------------------------------------------------------------
1 Merge Join (cost=541.37..27731.14 rows=1770590 width=48)
2 Merge Cond: (a.c1 = c.c1)
3 -> Merge Join (cost=270.68..562.65 rows=18818 width=24)
4 Merge Cond: (a.c1 = b.c1)
5 -> Sort (cost=135.34..140.19 rows=1940 width=12)
6 Sort Key: a.c1
7 -> Seq Scan on a (cost=0.00..29.40 rows=1940 width=12)
8 -> Sort (cost=135.34..140.19 rows=1940 width=12)
9 Sort Key: b.c1
10 -> Seq Scan on b (cost=0.00..29.40 rows=1940 width=12)
11 -> Materialize (cost=270.68..609.70 rows=18818 width=24)
12 -> Merge Join (cost=270.68..562.65 rows=18818 width=24)
13 Merge Cond: (c.c1 = d.c1)
14 -> Sort (cost=135.34..140.19 rows=1940 width=12)
15 Sort Key: c.c1
16 -> Seq Scan on c (cost=0.00..29.40 rows=1940 width=12)
17 -> Sort (cost=135.34..140.19 rows=1940 width=12)
18 Sort Key: d.c1
19 -> Seq Scan on d (cost=0.00..29.40 rows=1940 width=12)
(19 rows)
分析:
1) 查询计划的行号1、2、3…是为了表述方便增加的
2) 第3、12行,分别表明AB连接、CD连接,然后第1行表明AB连接后的新关系作为一个整体和CD生成的新关系进行连接;这样的连接顺序,符合紧密树的连接方式
3) 对比“step2”,本查询语句带有了连接条件,第5、8、14、17行均依据连接条件进行了排序,然后采用了归并算法(Merge Join)进行连接(第1、3、12行)
--step4:查看多表连接的查询计划,了解带有连接条件的多表连接的连接顺序和连接算法的选择,对比step3,观察连接条件发生变化对于查询计划的影响
test=# explain select * from A, B, C, D where A.c1=B.c1 and B.c2=C.c1 AND C.c2=D.c1;
QUERY PLAN
------------------------------------------------------------------------------
Merge Join (cost=3797.43..30450.31 rows=1770590 width=48)
Merge Cond: (b.c2 = c.c1)
-> Sort (cost=1898.72..1945.76 rows=18818 width=24)
Sort Key: b.c2
-> Merge Join (cost=270.68..562.65 rows=18818 width=24)
Merge Cond: (a.c1 = b.c1)
-> Sort (cost=135.34..140.19 rows=1940 width=12)
Sort Key: a.c1
-> Seq Scan on a (cost=0.00..29.40 rows=1940 width=12)
-> Sort (cost=135.34..140.19 rows=1940 width=12)
Sort Key: b.c1
-> Seq Scan on b (cost=0.00..29.40 rows=1940 width=12)
-> Sort (cost=1898.72..1945.76 rows=18818 width=24)
Sort Key: c.c1
-> Merge Join (cost=270.68..562.65 rows=18818 width=24)
Merge Cond: (c.c2 = d.c1)
-> Sort (cost=135.34..140.19 rows=1940 width=12)
Sort Key: c.c2
-> Seq Scan on c (cost=0.00..29.40 rows=1940 width=12)
-> Sort (cost=135.34..140.19 rows=1940 width=12)
Sort Key: d.c1
-> Seq Scan on d (cost=0.00..29.40 rows=1940 width=12)
(22 rows)
分析:
1) 连接条件与setp3比,有一定变化,主要是连接的列不再是各表的c1列
2) 与step3的查询计划相比,没有什么特殊变化