一、表结构
注意两张表的id字段没有唯一约束。
postgres=# \d a Table "public.a" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- id | integer | | not null | v | text | | | postgres=# \d b Table "public.b" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- id | integer | | not null | v | text | | | postgres=#
二、两条SQL语句
-- 语句一 select a.* from a where exists(select 1 from b where a.id = b.id); -- 语句二 select a.* from a join b on a.id = b.id ;
三、执行计划
语句一:使用Hash Semi Join
链接,SEMI JOIN
表示当A表中的记录,在B表上产生符合条件之后就返回,不会再继续查找B表记录了,所以如果B表有重复,也不会产生重复的多条记录。
语句二:使用Hash Join
链接,JOIN
表示当A表中的记录,在B表上产生符合条件之后还需要查看符合条件的有几条,如果B表有重复,那么返回的A表记录也就有重复。
postgres=# explain select a.* from a where exists(select 1 from b where a.id = b.id); QUERY PLAN ------------------------------------------------------------- Hash Semi Join (cost=1.04..2.11 rows=2 width=36) Hash Cond: (a.id = b.id) -> Seq Scan on a (cost=0.00..1.03 rows=3 width=36) -> Hash (cost=1.02..1.02 rows=2 width=4) -> Seq Scan on b (cost=0.00..1.02 rows=2 width=4) (5 rows) postgres=# explain select a.* from a join b on a.id = b.id ; QUERY PLAN ------------------------------------------------------------- Hash Join (cost=1.04..2.11 rows=2 width=36) Hash Cond: (a.id = b.id) -> Seq Scan on a (cost=0.00..1.03 rows=3 width=36) -> Hash (cost=1.02..1.02 rows=2 width=4) -> Seq Scan on b (cost=0.00..1.02 rows=2 width=4) (5 rows) postgres=#
四、实际运行
postgres=# table a; id | v ----+--- 1 | a 2 | a 3 | a (3 rows) postgres=# table b; id | v ----+--- 1 | b 2 | b 1 | b (3 rows) postgres=# select a.* from a where exists(select 1 from b where a.id = b.id); id | v ----+--- 1 | a 2 | a (2 rows) postgres=# select a.* from a join b on a.id = b.id ; id | v ----+--- 1 | a 1 | a 2 | a (3 rows) postgres=#