An Example of Updating Duplicate Data in Postgres

weixin_34378045

Original link: https://my.oschina.net/Kenyon/blog/115662

A colleague needed to update the duplicate rows in a small table holding roughly 100,000 rows.

Sample background data:

[postgres@localhost ~]$ psql
psql (9.2.3)
Type "help" for help.

postgres=# create table t_kenyon(id int,regguid text);
CREATE TABLE
postgres=# insert into t_kenyon values(1,'a'),(1,'a');
INSERT 0 2
postgres=# insert into t_kenyon values(2,'bb'),(2,'bb'),(2,'bb');
INSERT 0 3
postgres=# insert into t_kenyon values(3,'cc'),(3,'cc'),(3,'cc'),(4,'dd'),(5,'ee');
INSERT 0 5
postgres=# insert into t_kenyon values(1,'xx');
INSERT 0 1

postgres=# select * from t_kenyon order by id;
 id | regguid 
----+---------
  1 | a
  1 | a
  1 | xx
  2 | bb
  2 | bb
  2 | bb
  3 | cc
  3 | cc
  3 | cc
  4 | dd
  5 | ee
(11 rows)

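Requirement:
For rows that share both the same id and the same regguid, keep the regguid of one row in each group and set it to '0' on all the others (the resulting table is shown after the update below).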
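
Before changing anything, it can help to list the groups the requirement targets. The following is a minimal sketch, assuming the t_kenyon table and the sample rows shown above; the column alias copies is made up for illustration.

```sql
-- List each (id, regguid) pair that occurs more than once,
-- together with the number of copies in the group.
select id, regguid, count(*) as copies
from t_kenyon
group by id, regguid
having count(*) > 1
order by id;
```

With the sample data this should report three groups: (1, 'a') with 2 copies, (2, 'bb') with 3 and (3, 'cc') with 3. Keeping one row per group leaves 1 + 2 + 2 = 5 rows to be set to '0'.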

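This can be done with the table's primary key; if the table has no primary key, ctid can be used instead. The subquery matches on both id and regguid, so a row such as (1, 'xx'), whose regguid is not duplicated, is left untouched. The SQL is as follows:

postgres=# update t_kenyon a set regguid = '0' where ctid != (select min(ctid) from t_kenyon b where a.id=b.id and a.regguid=b.regguid group by id having count(1)>1);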
UPDATE 5

postgres=# select * from t_kenyon order by id;
 id | regguid 
----+---------
  1 | a
  1 | xx
  1 | 0
  2 | bb
  2 | 0
  2 | 0
  3 | cc
  3 | 0
  3 | 0
  4 | dd
  5 | ee
(11 rows)
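
The same result can probably also be reached with a window function, which may be easier to extend when the grouping rule changes. This is a sketch of an alternative formulation, not the statement used above; the names d, row_ctid and rn are made up for illustration, and it is meant to be run in place of the update above on the original data.

```sql
-- Number the rows inside each (id, regguid) group in ctid order.
-- The first row of a group (rn = 1) keeps its value; the rest are set to '0'.
update t_kenyon a
set regguid = '0'
from (
    select ctid as row_ctid,
           row_number() over (partition by id, regguid order by ctid) as rn
    from t_kenyon
) d
where a.ctid = d.row_ctid
  and d.rn > 1;
```

If the table does have a primary key, say a hypothetical column named pk, the same statement should work with pk substituted for ctid in both the subquery and the join condition.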

postgres=# vacuum  full  analyze t_kenyon;
VACUUM

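After a large update, run a vacuum at the end to reclaim the dead row versions the update leaves behind, and the job is done.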
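
VACUUM FULL rewrites the whole table and holds an ACCESS EXCLUSIVE lock while it runs, which is acceptable for a small table like this one. As a hedged sketch of a lighter-weight option, the statistics view pg_stat_user_tables can show roughly how many dead row versions the update left, and a plain VACUUM can make that space reusable without blocking normal reads and writes.

```sql
-- Approximate live and dead row-version counts tracked by the statistics collector.
select relname, n_live_tup, n_dead_tup, last_vacuum
from pg_stat_user_tables
where relname = 't_kenyon';

-- Plain VACUUM marks dead row versions as reusable; ANALYZE refreshes planner statistics.
vacuum analyze t_kenyon;
```

The counts in the view are estimates and may lag slightly behind the actual table contents.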
