之前写的文章,现在放出来, 2019.5.23
在验证有关sequence的问题时,构造了个sql语句:select * from tx1 where id2 = nextval('seq1'); 预想的和实际执行结果不一致。测试用例如下:
create sequence seq1;
select nextval('seq1');
create table tx1(id1 int, id2 int);
insert into tx1 select generate_series(1,100), random()*102;
explain verbose select * from tx1 where id2 = nextval('seq1');;
select * from tx1 where id2 = nextval('seq1');
postgres=# explain verbose select * from tx1 where id2 = nextval('seq1');;
QUERY PLAN
------------------------------------------------------------
Seq Scan on public.tx1 (cost=0.00..43.90 rows=11 width=8)
Output: id1, id2
Filter: (tx1.id2 = nextval('seq1'::regclass))
(3 rows)
postgres=# select * from tx1 where id2 = nextval('seq1');
id1 | id2
-----+-----
56 | 57
(1 row)
按照个人理解,nextval('seq1') 如果按照先求出值,这个值应该是2,那么执行若有结果,则id2对应的应该是2, 但是执行结果却不是2。
向pg社区发送了个邮件。pg社区邮件列表归档位置:https://www.postgresql.org/list/
这个是发送的邮件及社区回复:
https://www.postgresql.org/message-id/58540bf0.1474c.16615c4ec95.Coremail.jluwln%40163.com
个人理解上出现了偏差,也就是说nextval设计就是如此,不管什么时候insert 还是select调用这个nextval,背后都向下执行了一把,而不是先把这个nextval转化为常量; 从执行计划也可以看出,tx1.id2 = nextval('seq1'::regclass) 而不是直接将其替换为常量。
另外,如果等值判断右值是个函数,则仍不会先计算出函数值,仍会每次遍历表行的内容与该函数进行判断。
postgres=# explain verbose select * from t1 where id2= add(23);
QUERY PLAN
---------------------------------------------------------------------------------
Data Node Scan on t1 "_REMOTE_TABLE_QUERY_" (cost=0.00..117.25 rows=2 width=8)
Output: t1.id1, t1.id2
Primary node/s: dn1
Node/s: dn1, dn2
Remote query: SELECT id1, id2 FROM ONLY public.t1 t1 WHERE true
Coordinator quals: (t1.id2 = add(23))
(6 rows)
函数定义如下:
CREATE OR REPLACE FUNCTION add (id int)
RETURNS integer AS $total$
declare
total integer;
BEGIN
--SELECT nextval('seq1') into total;
insert into t2 select id;
RETURN id;
END;
$total$ LANGUAGE plpgsql;
假设t1表有100行,则执行一次 select * from t1 where id2= add(23) 会向t2 表插入100行数据。
sql 语句中含有函数表达式,对性能影响较大。
这个影响函数的性质:
/*
* Symbolic values for provolatile column: these indicate whether the result
* of a function is dependent *only* on the values of its explicit arguments,
* or can change due to outside factors (such as parameter variables or
* table contents). NOTE: functions having side-effects, such as setval(),
* must be labeled volatile to ensure they will not get optimized away,
* even if the actual return value is not changeable.
*/
#define PROVOLATILE_IMMUTABLE 'i' /* never changes for given input */
#define PROVOLATILE_STABLE 's' /* does not change within a scan */
#define PROVOLATILE_VOLATILE 'v' /* can change even within a scan */