mysql当日、月数据的统计(完整率)

1.场景说明

平台每天都会接收分公司上报的指标(mq发消息给平台,平台解析入库,分别进device表和kpi表)。kpi表是其他指标表,device表是设备信息指标表,每天要统计分公司上报指标的完整率,一类指标发送一条就算发送,重复发送还是只算一条。其中,device表只存一类指标,kpi表存27类指标,27类指标根据字段kpi_id区分。

现在需要统计分公司每天的指标完整率和当月的指标完整率。

2.sql实现

2.1当天的指标完整率

2.1.1设备指标当天推送情况统计
SELECT
	count( 1 ) 
FROM
	monitor_device_info 
WHERE
	DATE_FORMAT( now( ), '%Y-%m-%d' ) = DATE_FORMAT( updated_at, '%Y-%m-%d' ) 
	AND corporation_id = #{bcId}

上面那条sql是我第一次使用的sql,需要遍历分公司的id做参数,连续查询多次,效率较低。

SELECT
	(IF(COUNT(d.device_id) > 0,1,0)) AS kpi_num,
	bc.bc_id,
	bc.bc_short_name 
FROM
	public_org_busicorp bc
	LEFT JOIN ( SELECT * FROM monitor_device_info WHERE DATE_FORMAT(NOW(), '%Y-%m-%d' ) = DATE_FORMAT( updated_at, '%Y-%m-%d' ) ) d ON d.corporation_id = bc.bc_id 
WHERE
	bc.bc_id > 0 AND bc.bc_id < 99 
GROUP BY
	bc.bc_id 
ORDER BY
	bc.bc_desc

这条是改良后的sql,可以一次查询获取所有分公司设备信息类指标的当天推送情况,效率提升。

但是后来发现这样写是有问题的,如果今天设备第一次插入,那么updated_at是null,但是你不能说分公司今天没有推送指标,所以我对sql进行了一下改良

SELECT 
	( IF ( COUNT( d.device_id ) > 0,1,0 ) ) AS kpi_num,
	bc.bc_id,
	bc.bc_short_name 
FROM
	public_org_busicorp bc
LEFT JOIN (
	SELECT
		* 
	FROM
		monitor_device_info 
	WHERE
		DATE_FORMAT( now(), '%Y-%m-%d' ) = DATE_FORMAT( IF ( updated_at IS NULL, created_at, updated_at ), '%Y-%m-%d' ) 
) d ON d.corporation_id = bc.bc_id 
WHERE
	bc.bc_id > 0 
	AND bc.bc_id < 99  
GROUP BY
	bc.bc_id 
ORDER BY
	bc.bc_desc
2.1.2其他指标当天推送情况统计
SELECT
	COUNT( DISTINCT h.kpi_id ) AS kpi_num,
	bc.bc_id,
	bc.bc_short_name 
FROM
	public_org_busicorp bc
LEFT JOIN ( 
    SELECT 
    	* 
    FROM 
    	monitor_statistics_history 
    WHERE 
    	DATE_FORMAT(now(), '%Y-%m-%d' ) = DATE_FORMAT( kpi_time, '%Y-%m-%d' ) 
) h ON h.bc_id = bc.bc_id 
WHERE
	bc.bc_id > 0 
	AND bc.bc_id < 99 
GROUP BY
	bc.bc_id 
ORDER BY
	bc.bc_desc

后来随着数据量的增大,发现接口耗时较长,排查发现是sql查询效率、耗时长导致的,所以又对sql进行了优化,速度的提升还是比较明显的。优化后的sql如下:

SELECT
	if(h.kpi_num is null,0,h.kpi_num) kpi_num,
	bc.bc_id,
	bc.bc_short_name
FROM
	public_org_busicorp bc
LEFT JOIN (
	SELECT
		bc_id,
		COUNT( DISTINCT kpi_id ) AS kpi_num
	FROM
		monitor_statistics_history
	WHERE
		DATE_FORMAT( NOW( ), '%Y-%m-%d' ) = DATE_FORMAT( kpi_time, '%Y-%m-%d' )
	GROUP BY
		CRC32(bc_id)
) h ON h.bc_id = bc.bc_id
WHERE
	bc.bc_id > 0 
	AND bc.bc_id < 99
GROUP BY
	CRC32(bc.bc_id)
ORDER BY
	bc.bc_desc

这条改良后的sql查询速度提升主要来自两个方面:

①左连接的数据量大大减少,在进行左连接之前,对子查询进行了group by操作,使得子查询的数据得以整理,数据量大大减少;

②group by的字段如果是varchar类型,当表的数据量达到万级别的时候,就会明显感觉到sql语句的缓慢。所以使用了CRC32()函数对目标字段取int数,并按此int来分组,查询速度会得到极大的提升;

2.2当月指标完整率

2.2.1设备指标当月推送情况统计
SELECT
	COUNT(DISTINCT d.corporation_id ,DATE_FORMAT(d.updated_at, '%Y-%m-%d' )) AS kpi_num,
	bc.bc_id,
	bc.bc_short_name 
FROM
	public_org_busicorp bc
LEFT JOIN ( 
    SELECT 
    	* 
    FROM 
    	monitor_device_info 
    WHERE 
    	DATE_FORMAT(NOW(), '%Y-%m' ) = DATE_FORMAT( updated_at, '%Y-%m' ) 
) d ON d.corporation_id = bc.bc_id 
WHERE
	bc.bc_id > 0 AND bc.bc_id < 99 
GROUP BY
	bc.bc_id 
ORDER BY
	bc.bc_desc

这是第一次写的,根据省公司id和日期的“年月日”去重查询作为当天设备指标推送的依据。

因为上面发现了设备指标统计sql的漏洞,所以,月统计的sql也要修改,而且要注意,这条sql里面有两处使用updated_at的地方,都要做修改。

SELECT
	COUNT(DISTINCT d.corporation_id ,DATE_FORMAT(IF ( d.updated_at IS NULL, d.created_at, d.updated_at ), '%Y-%m-%d' )) AS kpi_num,
	bc.bc_id,
	bc.bc_short_name 
FROM
	public_org_busicorp bc
	LEFT JOIN ( SELECT * FROM monitor_device_info WHERE DATE_FORMAT(now(), '%Y-%m' ) = DATE_FORMAT( IF ( updated_at IS NULL, created_at, updated_at ), '%Y-%m' )  ) d ON d.corporation_id = bc.bc_id 
WHERE
	bc.bc_id > 0 AND bc.bc_id < 99 
GROUP BY
	bc.bc_id 
ORDER BY
	bc.bc_desc
2.2.2其他指标当月推送情况统计
SELECT
	COUNT( DISTINCT h.kpi_id, DATE_FORMAT( h.kpi_time, '%Y-%m-%d' ) ) AS kpi_num,
	bc.bc_id,
	bc.bc_short_name 
FROM
	public_org_busicorp bc
	LEFT JOIN ( SELECT * FROM monitor_statistics_history WHERE DATE_FORMAT( NOW( ), '%Y-%m' ) = DATE_FORMAT( kpi_time, '%Y-%m' ) ) h ON h.bc_id = bc.bc_id 
WHERE
	bc.bc_id > 0 
	AND bc.bc_id < 99 
GROUP BY
	bc.bc_id 
ORDER BY
	bc.bc_desc

统计思路与设备指标一样。

按照同样的优化思路,优化后的sql如下:

SELECT
	h.kpi_num,
	bc.bc_id,
	bc.bc_short_name
FROM
	public_org_busicorp bc
LEFT JOIN (
	SELECT
		bc_id,
		COUNT( DISTINCT kpi_id, DATE_FORMAT( kpi_time, '%Y-%m-%d' ) ) AS kpi_num
	FROM
		monitor_statistics_history
	WHERE
		DATE_FORMAT( NOW( ), '%Y-%m' ) = DATE_FORMAT( kpi_time, '%Y-%m' )
	GROUP BY
		CRC32(bc_id)
) h ON h.bc_id = bc.bc_id
WHERE
	bc.bc_id > 0 
	AND bc.bc_id < 99
GROUP BY
	CRC32(bc.bc_id)
ORDER BY
	bc.bc_desc

注意一点,查出来的kpi_num有可能是null,求和运算的时候需要用三目运算符处理一下。

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值