postgresql explain query中的materialize

最新推荐文章于 2024-01-17 11:34:18 发布

wh62592855

最新推荐文章于 2024-01-17 11:34:18 发布

阅读量4.8k

点赞数 2

分类专栏： PostgreSQL 文章标签： postgresql query greenplum nested filter sql

本文链接：https://blog.csdn.net/wanghai__/article/details/6660299

版权

PostgreSQL 专栏收录该内容

55 篇文章 1 订阅

订阅专栏

今天在GREENPLUM的执行计划中看到了materialize，不知道这个步骤是做什么的，查了下文档，没有相关解释。没事，GREENPLUM不行还有POSTGRESQL呢，GOOGLE了一把，找到了答案。

What does materialize do? I'm joining two tables, not views or anything like that.

A materialize node means the output of whatever is below it in the tree (which can be a scan, or a full set of joins or something like that) is materalized into memory before the upper node is executed. This is usually done when the outer node needs a source that it can re-scan for some reason or other.

So in your case, the planner is determining that the result of a scan on one of your tables will fit in memory, and it till make it possible to choose an upper join operation that requires rescans while still being cheaper.

具体的SQL如下

aligputf8=# select count(1) from ttt1;
count
-------
10000
(1 row)

aligputf8=#
aligputf8=# select count(1) from ttt2;
count
-------
10000
(1 row)

aligputf8=# explain select * from ttt1 as a,ttt2 as b where a.id=3;
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
Gather Motion 6:1 (slice2; segments: 6) (cost=0.00..1409.07 rows=1667 width=8)
   -> Nested Loop (cost=0.00..1409.07 rows=1667 width=8)
         -> Broadcast Motion 1:6 (slice1; segments: 1) (cost=0.00..137.07 rows=6 width=4)
               -> Seq Scan on ttt1 a (cost=0.00..137.00 rows=1 width=4)
                     Filter: id = 3
         -> Seq Scan on ttt2 b (cost=0.00..112.00 rows=1667 width=4)
(6 rows)

aligputf8=# explain select * from ttt1 as a,ttt2 as b where a.id=3;
aligputf8=#
aligputf8=# explain select * from ttt1 as a,ttt2 as b where a.id<3000;
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
Gather Motion 6:1 (slice2; segments: 6) (cost=122.00..3599268.93 rows=4998334 width=8)
   -> Nested Loop (cost=122.00..3599268.93 rows=4998334 width=8)
         -> Broadcast Motion 6:6 (slice1; segments: 6) (cost=0.00..346.93 rows=2999 width=4)
               -> Seq Scan on ttt1 a (cost=0.00..137.00 rows=500 width=4)
                     Filter: id < 3000
         -> Materialize (cost=122.00..222.00 rows=1667 width=4)
               -> Seq Scan on ttt2 b (cost=0.00..112.00 rows=1667 width=4)
(7 rows)

我的理解就是PG为了加快nested loop循环的速度，把b表的数据缓存在了内存中。我们可以看到前面一个SQL，只对b表数据进行1次扫描，因此并不需要进行缓存；后者需要进行2999次扫描。

wh62592855

关注

2
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
postgresql explain query中的materialize

今天在GREENPLUM的执行计划中看到了materialize，不知道这个步骤是做什么的，查了下文档，没有相关解释。没事，GREENPLUM不行还有POSTGRESQL呢，GOOGLE了一把，找到了答案。Q:What does materialize do? I'm j
复制链接

扫一扫

专栏目录