ClickHouse数据查询速度让我有点小小的震撼

背景

现在有ck中有三张表,spu_app_orchestration,spu_app_orch_group,spu_upgrade_group,计算所有解决方案中升级时间的最长时间和最短时间,每个解决方案有都多次编排,时间的总时间为每次编排的配置时间+应用时间+脚本时间之和

sql层

1.我需要以sp_version分组查询spu_app_orchestration,得到每一个sp_version对应的spu_app_orchestration的id集合(每个spu_app_orch_id就属于一次编排)

SELECT 
    sp_version,
    groupArray(id) AS id_list 
FROM spu_app_orchestration 
WHERE project_id =-1 
GROUP by sp_version ;

2.每次的编排的总时间=配置时间+应用时间+脚本时间

配置时间:通过spu_app_orch_id到spu_upgrade_group表中查询,多个配置开始时间和结束时之差的和

 

SELECT 
         round(sum(
        if(dateDiff('second', start_date, last_state_date) >= 0, 
           dateDiff('second', start_date, last_state_date) / 60.0, 
           0)
    ), 2) AS cfg_time
FROM 
    spu_upgrade_group
WHERE 
    project_id = -1
    AND spu_app_orch_id = 114
    AND type NOT IN ('DB_DATA_BACKUP', 'FM_TOOL', 'GATEWAY_COMPARE', 'EXPORT_RESULT');

应用时间和脚本时间:通过spu_app_orch_id到spu_app_orch_group表中查询,多个配置开始时间和结束时之差的和

SELECT 
    round(sum(IF(dateDiff('second', s.app_start_date, s.app_last_state_date) >= 0, 
                 dateDiff('second', s.app_start_date, s.app_last_state_date) / 60.0, 
                 0)), 2) as app_time,
    round(sum(IF(dateDiff('second', s.db_start_date, s.db_last_state_date) >= 0, 
                 dateDiff('second', s.db_start_date, s.db_last_state_date) / 60.0, 
                 0)), 2) as db_time
FROM 
    spu_app_orch_group s 
WHERE 
s.project_id =-1
AND 
s.spu_app_orch_id = 114;

在java层

最开始

三层for循环

耗时:17s
public List<BuildUpgradeProjectDto> getBuildUpgradeProject() {
    // 格式化时间,精确到秒
    DateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    List<BuildUpgradeProjectDto> buildUpgradeProjectDtos = new ArrayList<>();

    // 获取log表中有哪些type为7且store_status为S的记录
    List<HCPackageLogPO> logPOS = hcPackageLogPOMapperEx.selectLogByTypeCode(LogType.BUILD_UPGRADE, StoreStatus.STORED);
    System.out.println("第一个循环开始:" + sdf.format(new Date()));
    for (HCPackageLogPO po : logPOS) {
        BuildUpgradeProjectDto buildUpgradeProjectDto = new BuildUpgradeProjectDto();

        Long projectId = po.getProjectId();
        buildUpgradeProjectDto.setProjectId(projectId);

        Date uploadDate = po.getUploadDate();
        buildUpgradeProjectDto.setLatestUploadTime(uploadDate);

        String projectName = hcProjectInfoPOMapperEx.selectProjectNameByProjectId(projectId);
        buildUpgradeProjectDto.setProjectName(projectName);

        // 获取解决方案数量
        List<Map<String, Object>> solutionAndIdList = clickHouseMapper.selectSolutionAndIdList(projectId);
        if (CollectionUtils.isNotEmpty(solutionAndIdList)) {
            buildUpgradeProjectDto.setSolutionPlanUpgradeNum(solutionAndIdList.size());
            // 获取最长时间和最短时间,每个解决方案的每次编排的时间之和,每次编排时间=配置+应用+脚本
            Double maxTime = 0.0;
            Double miniTime = 0.0;
            System.out.println("第二个循环开始:" + sdf.format(new Date()));
            for (int i = 0; i < solutionAndIdList.size(); i++) {
                Map<String, Object> map = solutionAndIdList.get(i);
                // 每个解决方案
                // 获取 id_list 并进行类型转换
                Object idListObject = map.get("id_list");
                List<Long> spuAppOrchIdList = new ArrayList<>();
                if (idListObject instanceof BigInteger[]) {
                    BigInteger[] idArray = (BigInteger[]) idListObject;
                    for (BigInteger id : idArray) {
                        spuAppOrchIdList.add(id.longValue());
                    }
                }
                Double totalSpuTime = 0.0;
                System.out.println("第三个循环开始:" + sdf.format(new Date()));
                for (Long spuAppOrchId : spuAppOrchIdList) {
                    //每次编排
                    // 配置时间
                    Double cfgTime = clickHouseMapper.selectConfigureTime(projectId, spuAppOrchId);
                    // 应用时间和脚本时间
                    Map<String, Double> appAndDbTime = clickHouseMapper.selectAppAndDbTime(projectId, spuAppOrchId);
                    Double appTime = appAndDbTime.get("appTime");
                    Double dbTIme = appAndDbTime.get("dbTime");
                    totalSpuTime += cfgTime + appTime + dbTIme;
                }
                if (i == 0) {
                    maxTime = totalSpuTime;
                    miniTime = totalSpuTime;
                } else {
                    if (totalSpuTime > maxTime) {
                        maxTime = totalSpuTime;
                    }
                    if (totalSpuTime < miniTime) {
                        miniTime = totalSpuTime;
                    }
                }
                System.out.println("第三个循环结束:" + sdf.format(new Date()));
            }
            System.out.println("第二个循环结束:" + sdf.format(new Date()));
            buildUpgradeProjectDto.setMiniTime(miniTime + "min");
            buildUpgradeProjectDto.setMaximum(maxTime + "0min");
        } else {
            buildUpgradeProjectDto.setSolutionPlanUpgradeNum(0);
            buildUpgradeProjectDto.setMiniTime("0min");
            buildUpgradeProjectDto.setMaximum("0min");
        }

        buildUpgradeProjectDtos.add(buildUpgradeProjectDto);
    }
    System.out.println("第一个循环结束:" + sdf.format(new Date()));
    return buildUpgradeProjectDtos;
}

第一次优化

  1. 批量获取项目名称:通过一次性查询所有项目ID对应的名称,避免在循环中多次查询数据库。
  2. 并行流处理:使用 parallelStream 来并行处理日志记录和解决方案数据,充分利用多核CPU的性能。
  3. 线程安全的累加器:使用 DoubleAdder 来进行线程安全的时间累加操作。
耗时:10s

public List<BuildUpgradeProjectDto> getBuildUpgradeProject() {
        // 格式化时间,精确到秒
        DateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        List<BuildUpgradeProjectDto> buildUpgradeProjectDtos = new ArrayList<>();

        // 获取log表中有哪些type为7且store_status为S的记录
        List<HCPackageLogPO> logPOS = hcPackageLogPOMapperEx.selectLogByTypeCode(LogType.BUILD_UPGRADE, StoreStatus.STORED);
        System.out.println("第一个循环开始:" + sdf.format(new Date()));

        // 批量获取项目名称,减少单独查询
        Map<Long, String> projectNames = logPOS.stream()
                .map(HCPackageLogPO::getProjectId)
                .distinct()
                .collect(Collectors.toMap(
                        projectId -> projectId,
                        projectId -> hcProjectInfoPOMapperEx.selectProjectNameByProjectId(projectId)
                ));

        // 并行处理日志记录
        buildUpgradeProjectDtos = logPOS.parallelStream().map(po -> {
            BuildUpgradeProjectDto buildUpgradeProjectDto = new BuildUpgradeProjectDto();

            Long projectId = po.getProjectId();
            buildUpgradeProjectDto.setProjectId(projectId);

            Date uploadDate = po.getUploadDate();
            buildUpgradeProjectDto.setLatestUploadTime(uploadDate);

            String projectName = projectNames.get(projectId);
            buildUpgradeProjectDto.setProjectName(projectName);

            // 获取解决方案数量
            List<Map<String, Object>> solutionAndIdList = clickHouseMapper.selectSolutionAndIdList(projectId);
            if (CollectionUtils.isNotEmpty(solutionAndIdList)) {
                buildUpgradeProjectDto.setSolutionPlanUpgradeNum(solutionAndIdList.size());
                // 获取最长时间和最短时间,每个解决方案的每次编排的时间之和,每次编排时间=配置+应用+脚本
                DoubleAdder maxTime = new DoubleAdder();
                DoubleAdder miniTime = new DoubleAdder();
                miniTime.add(Double.MAX_VALUE);

                solutionAndIdList.parallelStream().forEach(map -> {
                    // 每个解决方案
                    // 获取 id_list 并进行类型转换
                    Object idListObject = map.get("id_list");
                    List<Long> spuAppOrchIdList = new ArrayList<>();
                    if (idListObject instanceof BigInteger[]) {
                        BigInteger[] idArray = (BigInteger[]) idListObject;
                        for (BigInteger id : idArray) {
                            spuAppOrchIdList.add(id.longValue());
                        }
                    }
                    DoubleAdder totalSpuTime = new DoubleAdder();

                    spuAppOrchIdList.parallelStream().forEach(spuAppOrchId -> {
                        //每次编排
                        // 配置时间
                        Double cfgTime = clickHouseMapper.selectConfigureTime(projectId, spuAppOrchId);
                        // 应用时间和脚本时间
                        Map<String, Double> appAndDbTime = clickHouseMapper.selectAppAndDbTime(projectId, spuAppOrchId);
                        Double appTime = appAndDbTime.get("appTime");
                        Double dbTime = appAndDbTime.get("dbTime");
                        totalSpuTime.add(cfgTime + appTime + dbTime);
                    });

                    synchronized (this) {
                        if (totalSpuTime.doubleValue() > maxTime.doubleValue()) {
                            maxTime.reset();
                            maxTime.add(totalSpuTime.doubleValue());
                        }
                        if (totalSpuTime.doubleValue() < miniTime.doubleValue()) {
                            miniTime.reset();
                            miniTime.add(totalSpuTime.doubleValue());
                        }
                    }
                });

                buildUpgradeProjectDto.setMiniTime(miniTime.doubleValue() + "min");
                buildUpgradeProjectDto.setMaximum(maxTime.doubleValue() + "min");
            } else {
                buildUpgradeProjectDto.setSolutionPlanUpgradeNum(0);
                buildUpgradeProjectDto.setMiniTime("0min");
                buildUpgradeProjectDto.setMaximum("0min");
            }

            return buildUpgradeProjectDto;
        }).collect(Collectors.toList());

        System.out.println("第一个循环结束:" + sdf.format(new Date()));
        return buildUpgradeProjectDtos;
    }

第二次优化

想到大部分的耗时还是在for循环里面,ck的查询速度非常快,为什么不废弃for循环直接改造sql直接返回所有解决方案的总耗时?

耗时:200ms

sql:

        SELECT
            ROUND(SUM(cfg_time + app_time + db_time), 2) AS total_time
        FROM
            (SELECT
                 sp_version,
                 id AS spu_app_orch_id
             FROM
                 spu_app_orchestration
             WHERE
                 project_id = #{projectId,jdbcType=BIGINT}
            ) AS a
                JOIN
            (SELECT
                 spu_app_orch_id,
                 ROUND(SUM(
                               IF(dateDiff('second', start_date, last_state_date) >= 0,
                                  dateDiff('second', start_date, last_state_date) / 60.0,
                                  0)
                           ), 2) AS cfg_time
             FROM
                 spu_upgrade_group
             WHERE
                 project_id = #{projectId,jdbcType=BIGINT}
               AND type NOT IN ('DB_DATA_BACKUP', 'FM_TOOL', 'GATEWAY_COMPARE', 'EXPORT_RESULT')
             GROUP BY
                 spu_app_orch_id
            ) AS b
            ON
                a.spu_app_orch_id = b.spu_app_orch_id
                JOIN
            (SELECT
                 s.spu_app_orch_id,
                 ROUND(SUM(IF(dateDiff('second', s.app_start_date, s.app_last_state_date) >= 0,
                              dateDiff('second', s.app_start_date, s.app_last_state_date) / 60.0,
                              0)), 2) AS app_time,
                 ROUND(SUM(IF(dateDiff('second', s.db_start_date, s.db_last_state_date) >= 0,
                              dateDiff('second', s.db_start_date, s.db_last_state_date) / 60.0,
                              0)), 2) AS db_time
             FROM
                 spu_app_orch_group s
             GROUP BY
                 s.spu_app_orch_id
            ) AS c
            ON
                b.spu_app_orch_id = c.spu_app_orch_id
        GROUP BY
            a.sp_version;
    public List<BuildUpgradeProjectDto> getBuildUpgradeProject() {
        List<BuildUpgradeProjectDto> buildUpgradeProjectDtos = new ArrayList<>();

        // 获取log表中有哪些type为7且store_status为S的记录
        List<HCPackageLogPO> logPOS = hcPackageLogPOMapperEx.selectLogByTypeCode(LogType.BUILD_UPGRADE, StoreStatus.STORED);
        // 并行处理日志记录
        buildUpgradeProjectDtos = logPOS.parallelStream().map(po -> {
            BuildUpgradeProjectDto buildUpgradeProjectDto = new BuildUpgradeProjectDto();

            Long projectId = po.getProjectId();
            buildUpgradeProjectDto.setProjectId(projectId);

            Date uploadDate = po.getUploadDate();
            buildUpgradeProjectDto.setLatestUploadTime(uploadDate);

            String projectName = hcProjectInfoPOMapperEx.selectProjectNameByProjectId(projectId);
            buildUpgradeProjectDto.setProjectName(projectName);

            // 获取解决方案数量
            List<Map<String, Object>> solutionAndIdList = clickHouseMapper.selectSolutionAndIdList(projectId);
            if (CollectionUtils.isNotEmpty(solutionAndIdList)) {
                buildUpgradeProjectDto.setSolutionPlanUpgradeNum(solutionAndIdList.size());
                // // 获取最长时间和最短时间,每个解决方案的每次编排的时间之和,每次编排时间=配置+应用+脚本
                List<Double> spuTimes = clickHouseMapper.selectSpuTime(projectId);
                
                buildUpgradeProjectDto.setMiniTime(Collections.min(spuTimes));
                buildUpgradeProjectDto.setMaximum(Collections.max(spuTimes));
            } else {
                buildUpgradeProjectDto.setSolutionPlanUpgradeNum(0);
                buildUpgradeProjectDto.setMiniTime(0.00);
                buildUpgradeProjectDto.setMaximum(0.00);
            }

            return buildUpgradeProjectDto;
        }).collect(Collectors.toList());
        return buildUpgradeProjectDtos;
    }

结论

在使用ck作为数据储存数据库时,查询速度非常快,可以尽量不使用循环,能用sql查的就用sql查

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值