由于项目需求,平常每个月会有四五百万的数据,到季度考核时数据更是恐怖到两三千万条数据,postgresql数据库研发人员已经优化多次,后面只能进行分区分表优化。
研发优化的过程分为短期和长期
短期够用户使用,查询等功能需要运维或者驻场数据库查询数据,比如按年、按季度、按月导表,查询当前表的数据,或者将其他状态的数据移走,系统默认查询当天的数据等等
长期,使用索引,分区分表等。
今天讲讲我自己使用的分区分表,分区分表简单概述就是主表只是展示数据,实际数据存储都在子表中,但是逻辑是相通的,即对主表或子表进行增删改,在数据库中两个都会有影响。查询的性能可以大幅度提高。
分区分表,比较灵活,可以利用时间(年月日等等)进行分区表,后者利用某个字段的排序(长度、大小等等)进行分表,比较灵活,我们这次用的就是某个字段最后一位数进行分表。
第一步:创建主表和子表
--创建主表
CREATE TABLE "public"."jg_jgxw_check_action" (
"cd_id" varchar(100) COLLATE "pg_catalog"."default" NOT NULL,
"record_unique_identity" varchar(100) COLLATE "pg_catalog"."default",
"supervise_item_implement_code" varchar(3000) COLLATE "pg_catalog"."default",
"supervise_item_check_icode" varchar(3000) COLLATE "pg_catalog"."default",
"check_action_name" varchar(2000) COLLATE "pg_catalog"."default",
"check_action_code" varchar(50) COLLATE "pg_catalog"."default",
"implement_institution" varchar(100) COLLATE "pg_catalog"."default",
;
--创建子表
create table jg_jgxw_check_action_202301 () inherits(jg_jgxw_check_action);
create table jg_jgxw_check_action_202302 () inherits(jg_jgxw_check_action);
create table jg_jgxw_check_action_202303 () inherits(jg_jgxw_check_action);
create table jg_jgxw_check_action_202304 () inherits(jg_jgxw_check_action);
create table jg_jgxw_check_action_202305 () inherits(jg_jgxw_check_action);
create table jg_jgxw_check_action_202306 () inherits(jg_jgxw_check_action);
create table jg_jgxw_check_action_202307 () inherits(jg_jgxw_check_action);
create table jg_jgxw_check_action_202308 () inherits(jg_jgxw_check_action);
create table jg_jgxw_check_action_202309 () inherits(jg_jgxw_check_action);
create table jg_jgxw_check_action_202310 () inherits(jg_jgxw_check_action);
第二步:创建主表和子表之间的路由函数(逻辑关系)
--创建主表子表间的关系,通过cd-id的最后以为数据进行区分
CREATE OR REPLACE FUNCTION "public"."alarm_partition_trigger"()
RETURNS "pg_catalog"."trigger" AS $BODY$
BEGIN
IF "right"(NEW.cd_id, 1) = '1'
THEN
INSERT INTO jg_jgxw_check_action_202301 VALUES (NEW.*);
ELSIF "right"(NEW.cd_id, 1) = '2'
THEN
INSERT INTO jg_jgxw_check_action_202302 VALUES (NEW.*);
ELSIF "right"(NEW.cd_id, 1) = '3'
THEN
INSERT INTO jg_jgxw_check_action_202303 VALUES (NEW.*);
ELSIF "right"(NEW.cd_id, 1) = '4'
THEN
INSERT INTO jg_jgxw_check_action_202304 VALUES (NEW.*);
ELSIF "right"(NEW.cd_id, 1) = '5'
THEN
INSERT INTO jg_jgxw_check_action_202305 VALUES (NEW.*);
ELSIF "right"(NEW.cd_id, 1) = '6'
THEN
INSERT INTO jg_jgxw_check_action_202306 VALUES (NEW.*);
ELSIF "right"(NEW.cd_id, 1) = '7'
THEN
INSERT INTO jg_jgxw_check_action_202307 VALUES (NEW.*);
ELSIF "right"(NEW.cd_id, 1) = '8'
THEN
INSERT INTO jg_jgxw_check_action_202308 VALUES (NEW.*);
ELSIF "right"(NEW.cd_id, 1) = '9'
THEN
INSERT INTO jg_jgxw_check_action_202309 VALUES (NEW.*);
ELSIF "right"(NEW.cd_id, 1) = '0'
THEN
INSERT INTO jg_jgxw_check_action_202310 VALUES (NEW.*);
end if;
return null;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
第三步:建立挂在分区(将主表和子表联系起来)
CREATE TRIGGER insert_almart_partition_trigger
BEFORE INSERT ON jg_jgxw_check_action
FOR EACH ROW EXECUTE PROCEDURE alarm_partition_trigger();
插入数据到主表或者子表,都可发现主表子表都有数据,大功告成!
当然由于数据量比较大,可以对分表分区进行优化,给每个分区表增加一个特定的约束,以防止全表查询扫描查询时间过长
特别注意:主表和子表都必须要有索引,大幅提高速度,子表仅作约束,从45秒减至20秒,所有主表子表添加索引,从20秒减至0.05秒。
--分区表做独立约束提高查询效率(没做前45秒,做完之后只要20秒)
ALTER TABLE jg_jgxw_check_action_202301
ADD CONSTRAINT jg_jgxw_check_action_202301_check_cd_id_key
CHECK (right(cd_id,1)='1');
ALTER TABLE jg_jgxw_check_action_202302
ADD CONSTRAINT jg_jgxw_check_action_202302_check_cd_id_key
CHECK (right(cd_id,1)='2');
ALTER TABLE jg_jgxw_check_action_202303
ADD CONSTRAINT jg_jgxw_check_action_202303_check_cd_id_key
CHECK (right(cd_id,1)='3');
ALTER TABLE jg_jgxw_check_action_202304
ADD CONSTRAINT jg_jgxw_check_action_202304_check_cd_id_key
CHECK (right(cd_id,1)='4');
ALTER TABLE jg_jgxw_check_action_202305
ADD CONSTRAINT jg_jgxw_check_action_202305_check_cd_id_key
CHECK (right(cd_id,1)='5');
ALTER TABLE jg_jgxw_check_action_202306
ADD CONSTRAINT jg_jgxw_check_action_202306_check_cd_id_key
CHECK (right(cd_id,1)='6');
ALTER TABLE jg_jgxw_check_action_202307
ADD CONSTRAINT jg_jgxw_check_action_202307_check_cd_id_key
CHECK (right(cd_id,1)='7');
ALTER TABLE jg_jgxw_check_action_202308
ADD CONSTRAINT jg_jgxw_check_action_202308_check_cd_id_key
CHECK (right(cd_id,1)='8');
ALTER TABLE jg_jgxw_check_action_202309
ADD CONSTRAINT jg_jgxw_check_action_202309_check_cd_id_key
CHECK (right(cd_id,1)='9');
ALTER TABLE jg_jgxw_check_action_202310
ADD CONSTRAINT jg_jgxw_check_action_202310_check_cd_id_key
CHECK (right(cd_id,1)='0');
也可按日期进行分表,按日期可借鉴进阶数据库系列(十三):PostgreSQL 分区分表-CSDN博客