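I. Rows are identical and every row appears twice
Two test tables are prepared here: table one is the original table, and table two (salaries2 in the statements below) is the copy that contains the duplicated rows.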
#1. Deduplicate with the ReplacingMergeTree engine
CREATE TABLE salaries3
ENGINE = ReplacingMergeTree
ORDER BY (emp_no, salary, from_date, to_date) AS
SELECT *
FROM salaries2;
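One caveat with this approach: ReplacingMergeTree only collapses rows that share the same ORDER BY key when parts are merged in the background, and the timing of those merges is not guaranteed, so salaries3 can still return duplicates right after it is created. A minimal sketch of forcing the merge and checking the result, assuming the two tables above exist as written:

```sql
-- Force a merge now so rows with the same ORDER BY key are collapsed.
-- This rewrites the data and can be expensive on large tables.
OPTIMIZE TABLE salaries3 FINAL;

-- Alternative: deduplicate at query time without waiting for a merge.
SELECT count() FROM salaries3 FINAL;

-- Row count of the source table, for comparison.
-- Under the premise of this section it should be twice the count above.
SELECT count() FROM salaries2;
```

OPTIMIZE ... FINAL is a one-off fix; for a table that keeps receiving data, querying with FINAL is usually the safer way to read deduplicated results.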
#2. Deduplicate by aggregation (GROUP BY); suitable for tables with only a few columns
CREATE TABLE salaries4
ENGINE = MergeTree
ORDER BY emp_no AS
SELECT
    emp_no,
    salary,
    from_date,
    to_date
FROM salaries2
GROUP BY
    emp_no,
    salary,
    from_date,
    to_date;
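To confirm that the aggregation removed every duplicate, the result can be grouped on the same columns and checked for any key that still occurs more than once. The sketch below assumes salaries4 was created as above; salaries5 is a hypothetical table name used only to illustrate a variant that avoids listing each column by hand.

```sql
-- Any row returned here is a combination that still appears more than once.
-- An empty result means salaries4 is fully deduplicated.
SELECT
    emp_no,
    salary,
    from_date,
    to_date,
    count() AS cnt
FROM salaries4
GROUP BY
    emp_no,
    salary,
    from_date,
    to_date
HAVING cnt > 1;

-- Variant for wider tables (salaries5 is a hypothetical name):
-- DISTINCT over all columns gives the same result without a column list.
CREATE TABLE salaries5
ENGINE = MergeTree
ORDER BY emp_no AS
SELECT DISTINCT *
FROM salaries2;
```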
Tables in production usually have an update-time column. When rows are updated, see my other post for how to remove the stale versions.
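For orientation only, one common way to keep just the newest version of each row is to pass the update column to ReplacingMergeTree as its version argument. This is a sketch under assumptions and not necessarily the method the other post describes: the table name salaries_versioned, the update_time column, the column types, and the choice of (emp_no, from_date) as the row key are all hypothetical.

```sql
-- Hypothetical schema: update_time and all column types are assumptions.
CREATE TABLE salaries_versioned
(
    emp_no      UInt32,
    salary      UInt32,
    from_date   Date,
    to_date     Date,
    update_time DateTime
)
ENGINE = ReplacingMergeTree(update_time)  -- version column: the largest value wins
ORDER BY (emp_no, from_date);             -- assumed key that identifies one logical row

-- FINAL returns, for each (emp_no, from_date), the row with the largest update_time,
-- even if the background merge has not run yet.
SELECT *
FROM salaries_versioned FINAL;
```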