1.背景
设备状态历史表,按天分区是Duplicate 模型,页面上有明细查询。但是查一个月数据翻页翻到17万页,每页10条数据,就会卡住不动。
doris配置:测试环境 3BE ,32g内存,4线程
2.sql分析
字段比较多,下面是列表分页执行的Doris-sql,时间范围是一个月
select
id,
epid,
epname,
matrix_no as matrixNo,
cmu_no as cmuNo,
esc_no as escNo,
producer,
report_time as reportTime,
inner_temp as innerTemp,
cls1_daily_cha_nrg as cls1DailyChaNrg,
cls1_daily_dis_nrg as cls1DailyDisNrg,
cls1_month_cha_nrg as cls1MonthChaNrg,
cls1_month_dis_nrg as cls1MonthDisNrg,
cls1_year_cha_nrg as cls1YearChaNrg,
cls1_year_dis_nrg as cls1YearDisNrg,
cls1_total_cha_nrg as cls1TotalChaNrg,
cls1_total_dis_nrg as cls1TotalDisNrg,
dcdc_running1 as dcdcRunning1,
dcdc_bus_vol1 as dcdcBusVol1,
dcdc_bus_curr1 as dcdcBusCurr1,
dcdc_vol1 as dcdcVol1,
dcdc_curr1 as dcdcCurr1,
dcdc_insul1 as dcdcInsul1,
cls1_pack_num as cls1PackNum,
cls1_dev_status as cls1DevStatus,
cls1_vol as cls1Vol,
cls1_curr as cls1Curr,
cls1_soc as cls1Soc,
cls1_soh as cls1Soh,
cls1_soe as cls1Soe,
cls1_dod as cls1Dod,
cls1_io_power as cls1IoPower,
cls1_enable_cha_nrg as cls1EnableChaNrg,
cls1_enable_dis_nrg as cls1EnableDisNrg,
cls1_pack_max_temp as cls1PackMaxTemp,
cls1_pack_max_temp_pos as cls1PackMaxTempPos,
cls1_pack_min_temp as cls1PackMinTemp,
cls1_pack_min_temp_pos as cls1PackMinTempPos,
cls1_pack_min_vol as cls1PackMinVol,
cls1_pack_min_vol_pos as cls1PackMinVolPos,
cls1_pack_max_vol as cls1PackMaxVol,
cls1_pack_max_vol_pos as cls1PackMaxVolPos,
cls2_daily_cha_nrg as cls2DailyChaNrg,
cls2_daily_dis_nrg as cls2DailyDisNrg,
cls2_month_cha_nrg as cls2MonthChaNrg,
cls2_month_dis_nrg as cls2MonthDisNrg,
cls2_year_cha_nrg as cls2YearChaNrg,
cls2_year_dis_nrg as cls2YearDisNrg,
cls2_total_cha_nrg as cls2TotalChaNrg,
cls2_total_dis_nrg as cls2TotalDisNrg,
dcdc_running2 as dcdcRunning2,
dcdc_bus_vol2 as dcdcBusVol2,
dcdc_bus_curr2 as dcdcBusCurr2,
dcdc_vol2 as dcdcVol2,
dcdc_curr2 as dcdcCurr2,
dcdc_insul2 as dcdcInsul2,
cls2_pack_num as cls2PackNum,
cls2_dev_status as cls2DevStatus,
cls2_vol as cls2Vol,
cls2_curr as cls2Curr,
cls2_soc as cls2Soc,
cls2_soh as cls2Soh,
cls2_soe as cls2Soe,
cls2_dod as cls2Dod,
cls2_io_power as cls2IoPower,
cls2_enable_cha_nrg as cls2EnableChaNrg,
cls2_enable_dis_nrg as cls2EnableDisNrg,
cls2_pack_max_temp as cls2PackMaxTemp,
cls2_pack_max_temp_pos as cls2PackMaxTempPos,
cls2_pack_min_temp as cls2PackMinTemp,
cls2_pack_min_temp_pos as cls2PackMinTempPos,
cls2_pack_min_vol as cls2PackMinVol,
cls2_pack_min_vol_pos as cls2PackMinVolPos,
cls2_pack_max_vol as cls2PackMaxVol,
cls2_pack_max_vol_pos as cls2PackMaxVolPos
from t_sacu_esc_state_his
where partition_date >= '2024-03-12 00:00:00' and partition_date<='2024-04-10 23:59:59'
and report_time between '2024-03-12 00:00:00' and '2024-04-10 23:59:59'
order by report_time desc
limit 10 offset 1733800;
查询了65秒
3.问题原因分析
上图sql查询了70多个字段,doris是列存查这么多字段对其排序性能影响比较大。改为只查询三个字段,只需要4.7秒
4.解决方案
4.1 建立物化视图
- 建立一个包含report_time,partition_date,id的
create materialized view mv_1 as
select partition_date,report_time,id from t_sacu_esc_state_his order by partition_date desc,report_time desc,id desc;
2. 查看是否建立完成
SHOW ALTER TABLE ROLLUP FROM isoc;
4.2改写sql
改成分两次查询,第一次查询出分区编号与id,第二次根据分区编号加id查
查询id+分区编号只花费了一秒
select partition_date,
id
from t_sacu_esc_state_his
where
report_time between '2024-03-12 00:00:00' and '2024-04-10 23:59:59'
order by report_time
limit 10 offset 1733800;
查看是否命中物化视图
看到命中了mv_1这个物化视图
根据id加分区编号查询其他详情数据
select *
from t_sacu_esc_state_his
where partition_date = '2024-04-06' and id in('_20240411105704563103300','_20240411105704573152300','_20240411105704563103000','_20240411105704573152500',
'_20240411105704563102800','_20240411105704573153100','_20240411105704573153300','_20240411105704563102500','_20240411105704573153900','_20240411113802866110040')
只花费了0.128s