PG外部表代价估算的参数的理解

To execute a multi-table query, postgres_fdw fetches each foreign table using a single-table SELECT statement and then join them on the local server.

In version 9.5 or earlier, even if the foreign tables are stored in the same remote server, postgres_fdw fetches them individually and joins them.
In version 9.6 or later, postgres_fdw has been improved and can execute the remote join operation on the remote server when the foreign tables are on the same server and the use_remote_estimate option is on.

In version 9.5 or earlier, the sort operation, such as ORDER BY, is processed on the local server, i.e. the local server fetches all the target rows from the remote server prior to the sort operation.
In version 9.6 or later, postgres_fdw can execute the SELECT statements with an ORDER BY clause on the remote server if possible.

In version 9.6 or earlier, similar to the sort operation mentioned in the previous subsection, the aggregate functions such as AVG() and cont() are processed on the local server;
In version 10 or later, postgres_fdw executes the SELECT statement with the aggregate function on the remote server if possible.

外部表相关的参数

use_remote_estimate (boolean)
This option, which can be specified for a foreign table or a foreign server, controls whether postgres_fdw issues remote EXPLAIN commands to obtain cost estimates. A setting for a foreign table overrides any setting for its server, but only for that table. The default is false.

fdw_startup_cost (floating point)
This option, which can be specified for a foreign server, is a floating point value that is added to the estimated startup cost of any foreign-table scan on that server. This represents the additional overhead of establishing a connection, parsing and planning the query on the remote side, etc. The default value is 100.

fdw_tuple_cost (floating point)
This option, which can be specified for a foreign server, is a floating point value that is used as extra cost per-tuple for foreign-table scans on that server. This represents the additional overhead of data transfer between servers. You might increase or decrease this number to reflect higher or lower network delay to the remote server. The default value is 0.01.

When use_remote_estimate is true, postgres_fdw obtains row count and cost estimates from the remote server and then adds fdw_startup_cost and fdw_tuple_cost to the cost estimates. When use_remote_estimate is false, postgres_fdw performs local row count and cost estimation and then adds fdw_startup_cost and fdw_tuple_cost to the cost estimates. This local estimation is unlikely to be very accurate unless local copies of the remote table’s statistics are available. Running ANALYZE on the foreign table is the way to update the local statistics; this will perform a scan of the remote table and then calculate and store statistics just as though the table were local. Keeping local statistics can be a useful way to reduce per-query planning overhead for a remote table — but if the remote table is frequently updated, the local statistics will soon be obsolete.

测试环境

本地:192.168.10.11 PostgreSQL版本14.7
远程:192.168.10.14 PostgreSQL版本11.5

测试过程


192.168.10.14
[postgres@du104 ~]$ psql
psql (11.5)
Type "help" for help.

postgres=# create table t1(a integer, b text, c boolean);
CREATE TABLE
postgres=# insert into t1(a,b,c)   select s.id, chr((32+random()*94)::integer), random() < 0.01   from generate_series(1,100000) as s(id)   order by random();
INSERT 0 100000
postgres=# create table t2(a integer, b text, c boolean);
CREATE TABLE
postgres=# insert into t2(a,b,c)   select s.id, chr((32+random()*94)::integer), random() < 0.01   from generate_series(1,1000000) as s(id)   order by random();
INSERT 0 1000000
postgres=#



192.168.10.11
[postgres@du101 ~]$ psql
psql (14.7)
Type "help" for help.

postgres=#create extension postgres_fdw;
CREATE EXTENSION
postgres=# CREATE SERVER s_pg11 FOREIGN DATA WRAPPER postgres_fdw;
CREATE SERVER
postgres=# alter server s_pg11 options ( add hostaddr '192.168.10.14', add port '1921', add dbname 'postgres');
ALTER SERVER
postgres=# create user mapping for postgres server s_pg11 options(user 'postgres',password 'postgres');
CREATE USER MAPPING
postgres=# CREATE FOREIGN TABLE ft1 (a integer, b text, c boolean) SERVER s_pg11 options(schema_name 'public', table_name 't1');
CREATE FOREIGN TABLE
postgres=# CREATE FOREIGN TABLE ft2 (a integer, b text, c boolean) SERVER s_pg11 options(schema_name 'public', table_name 't2');
CREATE FOREIGN TABLE
postgres=#
postgres=# select * from pg_stats where tablename in ('ft1','ft2')\gx
(0 rows)
postgres=# analyze ft1;
AANALYZE
postgres=# analyze ft2;
ANALYZE
postgres=# select count(*) from pg_stats where tablename in ('ft1','ft2');
 count
-------
     6
(1 row)

postgres=# select * from pg_foreign_server;
  oid  | srvname | srvowner | srvfdw | srvtype | srvversion | srvacl |                     srvoptions
-------+---------+----------+--------+---------+------------+--------+----------------------------------------------------
 16397 | s_pg11  |       10 |  16393 |         |            |        | {hostaddr=192.168.10.14,port=1921,dbname=postgres}
(1 row)

postgres=# select * from information_schema.foreign_server_options;
 foreign_server_catalog | foreign_server_name | option_name | option_value
------------------------+---------------------+-------------+---------------
 postgres               | s_pg11              | hostaddr    | 192.168.10.14
 postgres               | s_pg11              | port        | 1921
 postgres               | s_pg11              | dbname      | postgres
(3 rows)

postgres=#

postgres=# explain analyze select * from ft2 where a<100;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Foreign Scan on ft2  (cost=100.00..17027.00 rows=100 width=7) (actual time=178.633..178.659 rows=99 loops=1)
 Planning Time: 0.098 ms
 Execution Time: 179.590 ms
(3 rows)

postgres=# explain analyze select * from ft1 ,ft2 where ft1.a=ft2.a and ft1.a<100;
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1895.66..40071.49 rows=82 width=14) (actual time=24.648..3985.056 rows=99 loops=1)
   Hash Cond: (ft2.a = ft1.a)
   ->  Foreign Scan on ft2  (cost=100.00..34525.00 rows=1000000 width=7) (actual time=0.955..3780.879 rows=1000000 loops=1)
   ->  Hash  (cost=1794.64..1794.64 rows=82 width=7) (actual time=17.378..17.379 rows=99 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 12kB
         ->  Foreign Scan on ft1  (cost=100.00..1794.64 rows=82 width=7) (actual time=17.316..17.328 rows=99 loops=1)
 Planning Time: 0.255 ms
 Execution Time: 3985.892 ms
(8 rows)

postgres=# ALTER SERVER s_pg11 OPTIONS (ADD use_remote_estimate 'true');
ALTER SERVER
postgres=# select * from pg_foreign_server;
  oid  | srvname | srvowner | srvfdw | srvtype | srvversion | srvacl |                                 srvoptions

-------+---------+----------+--------+---------+------------+--------+-----------------------------------------------------------------------------
 16397 | s_pg11  |       10 |  16393 |         |            |        | {hostaddr=192.168.10.14,port=1921,dbname=postgres,use_remote_estimate=true}
(1 row)

postgres=# select * from information_schema.foreign_server_options;
 foreign_server_catalog | foreign_server_name |     option_name     | option_value
------------------------+---------------------+---------------------+---------------
 postgres               | s_pg11              | hostaddr            | 192.168.10.14
 postgres               | s_pg11              | port                | 1921
 postgres               | s_pg11              | dbname              | postgres
 postgres               | s_pg11              | use_remote_estimate | true
(4 rows)

postgres=# explain analyze select * from ft2 where a<100;
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Foreign Scan on ft2  (cost=1100.00..10745.33 rows=100 width=7) (actual time=193.415..193.426 rows=99 loops=1)
 Planning Time: 5.469 ms
 Execution Time: 194.014 ms
(3 rows)

postgres=# explain analyze select * from ft1 ,ft2 where ft1.a=ft2.a and ft1.a<100;
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Foreign Scan  (cost=2794.22..12960.56 rows=98 width=14) (actual time=230.967..230.977 rows=99 loops=1)
   Relations: (ft1) INNER JOIN (ft2)
 Planning Time: 9.611 ms
 Execution Time: 231.647 ms
(4 rows)

postgres=#

结论

1.外部表也可以通过analyze收集统计信息到本地环境;
2.当对外部server开启参数use_remote_estimate后,优化器在远程服务器上执行远程join操作,且相关的统计信息更准确;

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值