亿级Mysql数据表的复杂逻辑执行脚本

1. 问题背景

1.1 全国各级地方,包括省、市、县的学生,他们用小天才电话手表给其他地方的学生发消息,接收消息也会有记录,这些消息的明细记录在消息明细表中。只有成功发送消息,对方还要成功接收消息,有至少1条发送,一条接收成功的消息,互动才算成功。
1.2 每个学生的户籍地是不一样的,只能从他们的入学登记表中查询出来户籍地,入学登记表与消息明细表通过学生ID关联。
1.3 入学登记表中查询出来户籍地可能登记到县、市、省,还需要地区关系映射表转换成省。
1.4 现在需要统计互动成功的学生,他们向户籍地在本省的学生互动成功、向户籍地在其他生活上的学生互动成功的比例
1.5 消息明细表每天流水过千万,单张表过亿;入学登记表单张表数据量过亿。

2. 建表语句

-- 地区关系映射表
create table `region`(
	`id` int(11) not null AUTO_INCREMENT COMMENT '主键',
	`province_id` int(11) COMMENT '省级代码',
	`province_name` varchar(50) COMMENT '省级名称',
	`city_id` int(11) COMMENT '市级代码',
	`city_name` varchar(50) COMMENT '市级名称',
	`county_id` int(11) COMMENT '县级代码',
	`county_name` varchar(50) COMMENT '县级名称',
	`level` varchar(50) COMMENT '地区级别,省-1,市-2,县-3',
	PRIMARY KEY(`id`) USING BTREE
)ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 ROW_FORMAT_DYNAMIC;

-- 消息明细表
create table `messages_detail`(
	`id` int(11) not null AUTO_INCREMENT COMMENT '主键',
	`student_id` int(11) COMMENT '学号',
	`student_name` varchar(50) COMMENT '姓名',
	`status` varchar(5) COMMENT '状态,0-发送消息,1-接收消息',
	`message_id` int(11) COMMENT '消息流水号',
	`create_time` datetime  DEFAULT NULL COMMENT '消息时间',
	PRIMARY KEY(`id`) USING BTREE,
	INDEX index_create_time (`create_time`),
	INDEX index_message_id (`message_id`)
)ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 ROW_FORMAT_DYNAMIC;

-- 入学登记表
create table `enrollment_form`(
	`id` int(11) not null AUTO_INCREMENT COMMENT '主键',
	`student_id` int(11) COMMENT '学号',
	`student_name` varchar(50) COMMENT '姓名',
	`status` varchar(5) COMMENT '状态,0-正常,1-毕业',
	`region_id` int(11) COMMENT '户籍地ID',
	`create_time` datetime  DEFAULT NULL COMMENT '时间',
	PRIMARY KEY(`id`) USING BTREE,
	INDEX index_message_id (`message_id`)
)ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 ROW_FORMAT_DYNAMIC;

3. 联表查询

因本省的学生互动较多,普通联表查询,查询出本省学生互动时较多,查询本省学生互动时耗时30分钟,但是查询与其他省学生互动时,耗时较久,约90分钟。
思路
因为只有成功发送消息,对方还要成功接收消息,有至少1条发送,一条接收成功的消息,互动才算成功。发送多条消息的时候,流水号会有多条,但是消息明细表的状态一定至少有一个0-发送消息,与一个1-接收消息,因此通过消息流水号与状态两个字段进行去重,如果有2条记录,则代表互动成功。

3.1 本省学生互动

-- 1层
select count(*) from (
-- 2层
select mi, count(*) as cn from (
-- 3(select md.message_id as mi, md.status as sts from messages_detail md join enrollment_form ef 
on md.student_id=ef.student_id
join
(select province_id as lowId, province_id as highId from region where level=1
union
select city_id as lowId, province_id as highId from region where level=2
union
select county_id as lowId, province_id as highId from region where level=3
) as rg
on ef.region_id = rg.lowId
-- 限定消息日期,与消息类型为0-发送
where md.create_time='20240801' and md.status =0 and rg.highId ='湖北省的ID')
-- 进行两个字段的去重
union
(select md.message_id as mi, md.status as sts from messages_detail md join enrollment_form ef 
on md.student_id=ef.student_id
join
(select province_id as lowId, province_id as highId from region where level=1
union
select city_id as lowId, province_id as highId from region where level=2
union
select county_id as lowId, province_id as highId from region where level=3
) as rg
on ef.region_id = rg.lowId
-- 限定消息日期,与消息类型为1-接收消息
where md.create_time='20240801' and md.status =1 and rg.highId ='湖北省的ID')
) group by mi
) as c
where c.cn=2;

3.2 本省学生与其他省学生互动

-- 1层
select count(*) from (
-- 2层
select mi, count(*) as cn from (
-- 3(select md.message_id as mi, md.status as sts from messages_detail md join enrollment_form ef 
on md.student_id=ef.student_id
join
(select province_id as lowId, province_id as highId from region where level=1
union
select city_id as lowId, province_id as highId from region where level=2
union
select county_id as lowId, province_id as highId from region where level=3
) as rg
on ef.region_id = rg.lowId
-- 限定消息日期,与消息类型为0-发送
where md.create_time='20240801' and md.status =0 and rg.highId ='湖北省的ID')
-- 进行两个字段的去重
union
(select md.message_id as mi, md.status as sts from messages_detail md join enrollment_form ef 
on md.student_id=ef.student_id
join
(select province_id as lowId, province_id as highId from region where level=1
union
select city_id as lowId, province_id as highId from region where level=2
union
select county_id as lowId, province_id as highId from region where level=3
) as rg
on ef.region_id = rg.lowId
-- 限定消息日期,与消息类型为1-接收消息
where md.create_time='20240801' and md.status =1 and rg.highId !='湖北省的ID')
) group by mi
) as c
where c.cn=2;

4. 脚本

通过建立临时表,查询效率较高,耗时60分钟。
脚本名称,region_stu_connect.sh

#!/bin/sh
if[$# -ne 2];then
	echo"\nUSAGE:`basename $0`<省份><互动日期>\n"
	exit
fi

date

set session transaction isolation level READ UNCOMMITTED;

-- 获取地区
select province_id as lowId from region where province_id = '$1'
union
select city_id as lowId from region where province_id = '$1'
union
select county_id as lowId from region where province_id = '$1'
into temp tmp_region with no log;

-- 获取消息类型为0-发送消息的记录
select student_id, message_id from messages_detail where create_time='$2' and status ='0' and exists (
	select 1 from enrollment_form where region_id in (
		select lowId from tmp_region ) 
	and messages_detail.student_id = enrollment_form.student_id
) into temp tmp_messages_detail_out with no log;

-- 关联消息类型为0-发送消息的记录,获取接收到消息的记录
select a.message_id , a.student_id,b.student_id as sid from tmp_messages_detail_out a 
left join messages_detail b 
on b.create_time='$2' and b.message_id =a.message_id and b.status=1 
into temp tmp_messages_detail_in with no log;

-- 两张临时表进行关联
select distinct a.message_id , a.student_id,'$1' as region_id ,a.sid ,c.province_id from tmp_messages_detail_in a,enrollment_form b, region c where b.student_id=a.sid and (
c.city_id=b.region_id or c.county_id=b.region_id )
into temp tmp_messages_detail with no log;

select region_id,province_id,count(*) from tmp_messages_detail group by region_id,province_id;
!
date
exit 0

执行命令
chmod +x ./region_stu_connect.sh
nohup ./region_stu_connect.sh 湖北省的ID 20240801 > region_stu_connect_20240801.sh > &1 &
  • 8
    点赞
  • 4
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值