Deduplication is a common requirement in databases; typical cases include single-column deduplication, multi-column deduplication, and whole-row deduplication. In PostgreSQL we can use a different method for each of these requirements.
1. Single-column deduplication
Single-column deduplication is probably the most common case: remove the rows whose value in a given column is duplicated, keeping either the newest or the oldest record as required.
—Create test data
bill=# create table test1(id int primary key, c1 int, c2 timestamp);
CREATE TABLE
bill=# insert into test1 select generate_series(1,1000000), random()*1000, clock_timestamp();
INSERT 0 1000000
bill=# create index idx_test1 on test1(c1,id);
CREATE INDEX
—Method 1: aggregation with NOT IN
Keep only the row with the largest id for each c1 value and delete everything else:
bill=# explain delete from test1 where id not in (select max(id) from test1 group by c1);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Delete on test1 (cost=30609.23..48515.23 rows=500000 width=6)
-> Seq Scan on test1 (cost=30609.23..48515.23 rows=500000 width=6)
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
-> GroupAggregate (cost=0.42..30606.73 rows=1001 width=8)
Group Key: test1_1.c1
-> Index Only Scan using idx_test1 on test1 test1_1 (cost=0.42..25596.72 rows=1000000 width=8)
(7 rows)
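The keep-the-latest semantics of method 1 (one surviving row per c1 value, the one with the largest id) can be illustrated outside the database. A minimal Python sketch, with hypothetical sample rows:

```python
# Method 1 semantics: for each c1 value keep only the row with the max id,
# mirroring DELETE ... WHERE id NOT IN (SELECT max(id) ... GROUP BY c1).
def dedup_keep_latest(rows):
    """rows: list of (id, c1) tuples; returns the surviving rows."""
    max_id = {}
    for rid, c1 in rows:
        if c1 not in max_id or rid > max_id[c1]:
            max_id[c1] = rid
    keep = set(max_id.values())          # the ids that survive the delete
    return [r for r in rows if r[0] in keep]

rows = [(1, 10), (2, 10), (3, 20), (4, 20), (5, 30)]
print(dedup_keep_latest(rows))  # [(2, 10), (4, 20), (5, 30)]
```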
—Method 2: window function with IN
Rank the rows within each c1 group with row_number(); every row whose rank is not 1 is a duplicate, and its id can then be fed to a DELETE via an IN subquery. The plan below is for the select that identifies the duplicate ids:
bill=# explain select id from (select row_number() over(partition by c1 order by id) as rn, id from test1) t where t.rn<>1;
QUERY PLAN
--------------------------------------------------------------------------------------------------
Subquery Scan on t (cost=0.42..55596.72 rows=995000 width=4)
Filter: (t.rn <> 1)
-> WindowAgg (cost=0.42..43096.72 rows=1000000 width=16)
-> Index Only Scan using idx_test1 on test1 (cost=0.42..25596.72 rows=1000000 width=8)
(4 rows)
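The window-function approach numbers rows within each c1 partition in id order; rows with a number other than 1 are duplicates. A sketch of that ranking in Python (sample data hypothetical; note that because the SQL orders by id ascending, this variant keeps the oldest id per group):

```python
from itertools import groupby

# Emulate row_number() over (partition by c1 order by id): sort by (c1, id),
# number rows within each c1 group, and collect ids with rn <> 1 for deletion.
def duplicate_ids(rows):
    """rows: list of (id, c1); returns ids whose rn within their c1 group > 1."""
    dupes = []
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))
    for c1, group in groupby(ordered, key=lambda r: r[1]):
        for rn, (rid, _) in enumerate(group, start=1):
            if rn != 1:
                dupes.append(rid)
    return dupes

rows = [(1, 10), (2, 10), (3, 20), (4, 20), (5, 30)]
print(duplicate_ids(rows))  # [2, 4]
```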
—Method 3: cursor traversal
Walk the table with a cursor ordered by (c1, id), comparing each row's c1 with the previous row's and deleting the current row when they match (so the first row of each group survives):
bill=# do language plpgsql $$
bill$# declare
bill$# v_rec record;
bill$# v_c1 int;
bill$# cur1 cursor for select c1,id from test1 order by c1,id for update;
bill$# begin
bill$# for v_rec in cur1 loop
bill$# if v_rec.c1 = v_c1 then
bill$# delete from test1 where current of cur1;
bill$# end if;
bill$# v_c1 := v_rec.c1;
bill$# end loop;
bill$# end;
bill$# $$;
DO
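The cursor loop above makes a single ordered pass over the table, deleting any row whose c1 repeats the previous row's. The same one-pass logic in Python (a sorted list stands in for the cursor; names are illustrative):

```python
# One ordered pass, as in the plpgsql cursor loop: rows sorted by (c1, id);
# drop the current row whenever its c1 equals the previous row's c1.
def cursor_dedup(rows):
    """rows: list of (c1, id); returns the rows that survive the pass."""
    survivors = []
    prev_c1 = None                     # like the initially-NULL v_c1
    for c1, rid in sorted(rows):
        if c1 != prev_c1:              # first row of a new c1 group: keep it
            survivors.append((c1, rid))
        prev_c1 = c1                   # remember c1 for the next iteration
    return survivors

rows = [(10, 1), (10, 2), (20, 3), (30, 5), (20, 4)]
print(cursor_dedup(rows))  # [(10, 1), (20, 3), (30, 5)]
```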
Of the three approaches above, method 2 is the most efficient, followed by method 3.