Why Is This Query Endpoint So Slow, and How Do We Optimize It? (1)

 Background

 Today I ran into two scenarios where query endpoints were noticeably slow. Let's analyze each one and see how it was optimized.

Scenario 1: import-file preview. The goal is to take the enterprises and tags from an uploaded file, check whether each enterprise and tag exists, and return the resulting associations for display, so the operator can review them (and adjust if needed) before committing the update. With 1,700+ imported rows, each associated with 3 tags on average, this preview endpoint took nearly 7 minutes.

Scenario 2: enterprise list query, joining the enterprise table, the tag-relation table, and the tag table. With 220k+ enterprise rows, the test environment (29 tag-relation rows) responded within 3s, but production (9,000 tag-relation rows) simply never returned.

Scenario 1

Rough logic:
1. Parse the file into a map of enterprise -> tag list.
2. Iterate over the enterprises one by one, querying the enterprise info and the tag info (tag plus its category), filtering out enterprises that don't exist.
With 1,700+ enterprise rows, this took nearly 7 minutes.
The logic itself is fine, and every query hits an index. The problem is the per-row iteration: each row triggers 3 kinds of database queries (enterprise, tag, tag category), and with an average of 3 tags per row that works out to roughly 1700 * 3 * 3 = 15,300 database round trips.
With a few hundred rows this is tolerable; once the volume grows, it falls apart.
Optimization 1:
Idea: since every query already hits an index, we can batch the lookups to cut the number of round trips, and therefore the total time.
In step 2, replace the per-enterprise loop with chunked iteration, 500 enterprises per chunk, and query tags and categories together in one joined lookup. Round trips: the loop runs ceil(1700 / 500) = 4 times, with 1 enterprise query and 1 tag-plus-category query per chunk, so 4 * 2 = 8 in total.
Round trips drop from 15,300 to 8, a reduction of over 1,900x.
Result: 2-3s, a 100-200x speedup. The gains would shrink as data volume grows, but since the enterprise data set won't get particularly large, this is good enough.
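The chunking arithmetic above can be sketched in isolation. This is a minimal standalone illustration, not the project code: `partition` is a hypothetical helper mirroring what the skip/limit loop in the optimized method does, and the output shows why 1,700 keys at 500 per chunk cost only 4 iterations (8 round trips at 2 queries per chunk).

```java
import java.util.ArrayList;
import java.util.List;

public class BatchMath {
    // Split the key list into fixed-size chunks; each chunk becomes one IN (...) query.
    static <T> List<List<T>> partition(List<T> keys, int size) {
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += size) {
            chunks.add(new ArrayList<>(keys.subList(i, Math.min(i + size, keys.size()))));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 1700; i++) keys.add(i);
        List<List<Integer>> chunks = partition(keys, 500);
        // 4 chunks; 2 queries per chunk (enterprises, tags+categories) = 8 round trips
        System.out.println(chunks.size() + " chunks, " + chunks.size() * 2 + " round trips");
    }
}
```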
Optimization 2:
Idea: multithreading + caching. The tag lookup can sit behind a cache, since tags rarely change; and if a further speedup is needed, the chunks can be handed to multiple threads to process in parallel and then merged.
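The caching half of this idea can be sketched with a plain ConcurrentHashMap. This is an illustrative sketch, not the project code: `loadTagFromDb` is a hypothetical stand-in for the real bizKeyEnterpriseTagService lookup. The point is that because tags rarely change, each tag name hits the database at most once per process.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class TagCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> loadTagFromDb; // stand-in for the real tag query
    final AtomicInteger dbHits = new AtomicInteger();

    TagCache(Function<String, String> loader) { this.loadTagFromDb = loader; }

    // First call for a tag name loads from the DB; later calls are served from memory.
    String get(String tagName) {
        return cache.computeIfAbsent(tagName, name -> {
            dbHits.incrementAndGet();
            return loadTagFromDb.apply(name);
        });
    }

    public static void main(String[] args) {
        TagCache tags = new TagCache(name -> "row:" + name);
        tags.get("high-tech");
        tags.get("high-tech"); // cache hit, no second DB call
        System.out.println("db hits: " + tags.dbHits.get()); // 1
    }
}
```

A real implementation would also need invalidation or expiry for when tags are edited; a library like Caffeine handles that, but the sketch shows the principle.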
Code before optimization:
 public List<KeyEnterpriseRelationTagImportPreviewVO> importRelationTagPreview(MultipartFile file) {
        //... parse the file into List<Object> objects
 
				// credit code -> tag name list
        Map<String, List<String>> creditCodeAndTagMap = objects.stream().map(object -> (KeyEnterpriseRelationTagImportExcel) object)
        .filter(object -> StringUtils.isNotBlank(object.getCreditCode()))
        .collect(Collectors.toMap(KeyEnterpriseRelationTagImportExcel::getCreditCode, value -> {
            List<String> tags = Lists.newArrayList();
            if (StringUtils.isNotEmpty(value.getTagName())) {
                tags.addAll(Lists.newArrayList(value.getTagName().split(SPLIT_CHART)));
            }
            if (StringUtils.isNotEmpty(value.getCompanyTypeName())) {
                tags.addAll(Lists.newArrayList(value.getCompanyTypeName().split(SPLIT_CHART)));
            }
            return tags;
        }));
				// iterate per enterprise, querying enterprise info and its tag list
        List<KeyEnterpriseRelationTagImportPreviewVO> voList = Lists.newArrayList();
        for (Map.Entry<String, List<String>> entry : creditCodeAndTagMap.entrySet()) {
            KeyEnterpriseRelationTagImportPreviewVO previewVO = new KeyEnterpriseRelationTagImportPreviewVO();
            String creditNo = entry.getKey();
            BizKeyEnterprise bizKeyEnterprise = bizKeyEnterpriseService.getOne(new LambdaQueryWrapper<BizKeyEnterprise>()
                    .select(BizKeyEnterprise::getCreditNo,
                            BizKeyEnterprise::getCompanyName)
                    .eq(BizKeyEnterprise::getCreditNo, creditNo));
            // enterprise doesn't exist: skip it (could be flagged for the user instead)
            if (bizKeyEnterprise == null) {
                continue;
            }
            voList.add(previewVO);
            EnterpriseDTO enterpriseDTO = new EnterpriseDTO();
            enterpriseDTO.setCreditNo(bizKeyEnterprise.getCreditNo());
            enterpriseDTO.setName(bizKeyEnterprise.getCompanyName());
            previewVO.setInfo(enterpriseDTO);
            List<String> value = entry.getValue();
            if (CollectionUtils.isEmpty(value)) {
                continue;
            }
            List<KeyEnterpriseTagAndCategoryVO> tagList = value.stream().distinct().map(tagName -> {
                if (StringUtils.isBlank(tagName)) {
                    return null;
                }
                BizKeyEnterpriseTag tag = bizKeyEnterpriseTagService.getOne(new LambdaQueryWrapper<BizKeyEnterpriseTag>()
                        .eq(BizKeyEnterpriseTag::getName, tagName)
                        .eq(BizKeyEnterpriseTag::getDisabled, false)
                );
                if (tag != null) {
                    KeyEnterpriseTagAndCategoryVO keyEnterpriseTagAndCategoryVO = new KeyEnterpriseTagAndCategoryVO();
                    keyEnterpriseTagAndCategoryVO.setCreditNo(bizKeyEnterprise.getCreditNo());
                    keyEnterpriseTagAndCategoryVO.setTagId(tag.getId());
                    keyEnterpriseTagAndCategoryVO.setTagName(tag.getName());

                    BizKeyEnterpriseTagCategory category = bizKeyEnterpriseTagCategoryService.getOne(new LambdaQueryWrapper<BizKeyEnterpriseTagCategory>()
                            .eq(BizKeyEnterpriseTagCategory::getId, tag.getCategoryId()));
                    if (category != null) {
                        keyEnterpriseTagAndCategoryVO.setCategoryId(category.getId());
                        keyEnterpriseTagAndCategoryVO.setCategoryName(category.getName());
                    }
                    return keyEnterpriseTagAndCategoryVO;
                }
                return null;
            }).filter(Objects::nonNull).collect(Collectors.toList());
            previewVO.setTagList(tagList);
        }
        return voList;
    }
 After optimization:
  public List<KeyEnterpriseRelationTagImportPreviewVO> importRelationTagPreview(MultipartFile file) {
       //... parse the file into List<Object> objects

				// credit code -> tag name list
        Map<String, List<String>> creditCodeAndTagMap = objects.stream().map(object -> (KeyEnterpriseRelationTagImportExcel) object)
                .filter(object -> StringUtils.isNotBlank(object.getCreditCode()) || (StringUtils.isNotBlank(object.getTagName()) && StringUtils.isNotBlank(object.getCompanyTypeName())))
                .collect(Collectors.toMap(KeyEnterpriseRelationTagImportExcel::getCreditCode, value -> {
                    List<String> tags = Lists.newArrayList();
                    if (StringUtils.isNotEmpty(value.getTagName())) {
                        tags.addAll(Lists.newArrayList(value.getTagName().split(SPLIT_CHART)));
                    }
                    if (StringUtils.isNotEmpty(value.getCompanyTypeName())) {
                        tags.addAll(Lists.newArrayList(value.getCompanyTypeName().split(SPLIT_CHART)));
                    }
                    return tags;
                }));

        List<KeyEnterpriseRelationTagImportPreviewVO> voList = Lists.newArrayList();
        Set<String> creditNos = creditCodeAndTagMap.keySet();
        // optimization: replace the per-row loop with chunked lookups of enterprise info and tags
        int forCount = creditNos.size() % LIMIT == 0 ? creditNos.size() / LIMIT : creditNos.size() / LIMIT + 1;
        for (int i = 0; i < forCount; i++) {
            List<String> limitCreditNos = creditNos.stream().skip((long) i * LIMIT).limit(LIMIT).collect(Collectors.toList());
            List<BizKeyEnterprise> bizKeyEnterprises = bizKeyEnterpriseService.list(new LambdaQueryWrapper<BizKeyEnterprise>()
                    .select(BizKeyEnterprise::getCreditNo,
                            BizKeyEnterprise::getCompanyName)
                    .in(BizKeyEnterprise::getCreditNo, limitCreditNos)
                    .groupBy(BizKeyEnterprise::getCreditNo)
            );

            if (CollectionUtils.isEmpty(bizKeyEnterprises)) {
                continue;
            }
            // dedupe tag names across the chunk
            Set<String> tagNames = Sets.newHashSet();
            bizKeyEnterprises.forEach(bizKeyEnterprise -> {
                tagNames.addAll(creditCodeAndTagMap.get(bizKeyEnterprise.getCreditNo()));
            });
            List<KeyEnterpriseTagAndCategoryVO> tagList = bizKeyEnterpriseTagService.listTagName(tagNames, null);
            Map<String, KeyEnterpriseTagAndCategoryVO> tagNameAndTagVO = tagList.stream().collect(Collectors.toMap(KeyEnterpriseTagAndCategoryVO::getTagName, Function.identity()));
            for (BizKeyEnterprise bizKeyEnterprise : bizKeyEnterprises) {
                KeyEnterpriseRelationTagImportPreviewVO previewVO = new KeyEnterpriseRelationTagImportPreviewVO();
                voList.add(previewVO);
                EnterpriseDTO enterpriseDTO = new EnterpriseDTO();
                enterpriseDTO.setName(bizKeyEnterprise.getCompanyName());
                enterpriseDTO.setCreditNo(bizKeyEnterprise.getCreditNo());
                previewVO.setInfo(enterpriseDTO);
                List<String> tagNameList = creditCodeAndTagMap.get(bizKeyEnterprise.getCreditNo());
                if (CollectionUtils.isEmpty(tagNameList)) {
                    continue;
                }
                previewVO.setTagList(tagNameList.stream().map(tagName -> {
                    KeyEnterpriseTagAndCategoryVO vo = tagNameAndTagVO.get(tagName);
                    if(vo == null) return null;
                    vo.setCreditNo(bizKeyEnterprise.getCreditNo());
                    return vo;
                }).filter(Objects::nonNull).collect(Collectors.toList()));
            }
        }
        return voList;
    }
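The multithreading half of optimization 2 was not implemented above, but it could look roughly like this: hand each chunk to a worker thread, then merge the per-chunk results in order. This is a hedged sketch under assumptions, not the project code: `processChunk` is a hypothetical stand-in for the batched enterprise-and-tag lookup done for one slice of credit codes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class ParallelChunks {
    // Hypothetical stand-in for the batched enterprise + tag lookup for one slice.
    static List<String> processChunk(List<String> creditNos) {
        return creditNos.stream().map(no -> "vo:" + no).collect(Collectors.toList());
    }

    // Split the credit codes into chunks, process each chunk on the pool, merge in order.
    static List<String> run(List<String> creditNos, int chunkSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<List<String>>> futures = new ArrayList<>();
            for (int i = 0; i < creditNos.size(); i += chunkSize) {
                List<String> chunk = creditNos.subList(i, Math.min(i + chunkSize, creditNos.size()));
                futures.add(CompletableFuture.supplyAsync(() -> processChunk(chunk), pool));
            }
            List<String> merged = new ArrayList<>();
            for (CompletableFuture<List<String>> f : futures) {
                merged.addAll(f.join());
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> creditNos = new ArrayList<>();
        for (int i = 0; i < 1700; i++) creditNos.add("no" + i);
        System.out.println(run(creditNos, 500, 4).size()); // 1700
    }
}
```

Whether this is worth the extra complexity depends on the data volume; at 2-3s the batched version alone is already acceptable here.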

Scenario 2

Rough logic: an enterprise list endpoint that filters enterprises by some enterprise fields, tag, and tag category, and returns selected enterprise fields along with the associated tags and tag categories.

Before the SQL optimization, the query always LEFT JOINed the tag tables whether or not they were needed. With current production volumes, that is 220,000 * 9,000 = 1,980,000,000, nearly 2 billion row comparisons in the worst case 😂. No wonder it never responded.
At first someone suggested adding an index to the relation table. That alone doesn't help here, because the selectivity is too low: there are only 71 tags spread over 9,000 relation rows. To demonstrate, I created the index and ran EXPLAIN:
+----+-------------+-------+------------+--------+--------------------------------+---------+---------+----------------------------------+--------+----------+------------------------------------------------+
| id | select_type | table | partitions | type   | possible_keys                  | key     | key_len | ref                              | rows   | filtered | Extra                                          |
+----+-------------+-------+------------+--------+--------------------------------+---------+---------+----------------------------------+--------+----------+------------------------------------------------+
|  1 | SIMPLE      | dci   | NULL       | ALL    | credit_no_IDX                  | NULL    | NULL    | NULL                             | 204510 |   100.00 | Using temporary; Using filesort                |
|  1 | SIMPLE      | bketr | NULL       | ALL    | credit_no_idx                  | NULL    | NULL    | NULL                             |   8514 |   100.00 | Range checked for each record (index map: 0x2) |
|  1 | SIMPLE      | bket  | NULL       | eq_ref | PRIMARY                        | PRIMARY | 8       | bketr.tag_id                     |      1 |   100.00 | Using index                                    |
+----+-------------+-------+------------+--------+--------------------------------+---------+---------+----------------------------------+--------+----------+------------------------------------------------+
EXPLAIN result without the index:
+----+-------------+-------+------------+--------+--------------------------------+---------+---------+----------------------------------+--------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type   | possible_keys                  | key     | key_len | ref                              | rows   | filtered | Extra                                              |
+----+-------------+-------+------------+--------+--------------------------------+---------+---------+----------------------------------+--------+----------+----------------------------------------------------+
|  1 | SIMPLE      | dci   | NULL       | ALL    | credit_no_IDX                  | NULL    | NULL    | NULL                             | 204510 |   100.00 | Using temporary; Using filesort                    |
|  1 | SIMPLE      | bketr | NULL       | ALL    | NULL                           | NULL    | NULL    | NULL                             |   8514 |   100.00 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | bket  | NULL       | eq_ref | PRIMARY                        | PRIMARY | 8       | bketr.tag_id                     |      1 |   100.00 | Using index                                        |
+----+-------------+-------+------------+--------+--------------------------------+---------+---------+----------------------------------+--------+----------+----------------------------------------------------+
Comparing the two plans: the index is recognized after creation (it appears in possible_keys), but it is never used — with such low selectivity, MySQL prefers a full table scan over the index.
Looking at the SQL itself, the result set doesn't actually need any tag columns. (Returning tags inline would be a separate optimization: since enterprise to tag is 1:n, a single SQL statement could aggregate each enterprise's tags into one field, but that is costly and hurts readability, so instead we first query the enterprise rows and then fetch each enterprise's tags afterwards.) So when no tag filter is applied, the join isn't needed at all. By using MyBatis's <if> tag to append the joined tables dynamically, the no-filter case only scans the 220k+ enterprise rows — thousands of times faster than 2 billion.
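The two-step pattern mentioned above (query the enterprise page first, then fetch tags for just those enterprises) can be sketched as follows. This is an assumption-heavy illustration, not the project code: `fetchTagRows` is a hypothetical stand-in for one batched relation-table query keyed by the page's credit codes, which avoids both the big join and an N+1 loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TwoStepList {
    // Hypothetical stand-in for: select credit_no, tag name from relation join tag
    // where credit_no in (...) — one batched query for the whole page.
    static List<String[]> fetchTagRows(List<String> pageCreditNos) {
        return Arrays.asList(
                new String[]{"c1", "tagA"},
                new String[]{"c1", "tagB"},
                new String[]{"c2", "tagA"});
    }

    // Group the flat relation rows by credit_no so each enterprise gets its tag list.
    static Map<String, List<String>> tagsByEnterprise(List<String> pageCreditNos) {
        Map<String, List<String>> byNo = new HashMap<>();
        for (String[] row : fetchTagRows(pageCreditNos)) {
            byNo.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return byNo;
    }

    public static void main(String[] args) {
        Map<String, List<String>> m = tagsByEnterprise(Arrays.asList("c1", "c2"));
        System.out.println(m.get("c1")); // [tagA, tagB]
    }
}
```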

mapper.xml
Before:
   select dci.credit_no, dci.company_name, dci.company_type_name, dci.company_sub_type, dci.street_name, dci.industry_platform_name, dci.industry_type_name
        from dwd_company_info dci
             left join biz_key_enterprise_tag_relation bketr on dci.credit_no =bketr.credit_no and bketr.source_type=0
             left join biz_key_enterprise_tag bket on bketr.tag_id = bket.id
        <include refid="pageKeyEnterpriseBaseDataWhere"/>
        group by dci.credit_no
After:
   select dci.credit_no, dci.company_name, dci.company_type_name, dci.company_sub_type, dci.street_name, dci.industry_platform_name, dci.industry_type_name
        from dwd_company_info dci
			      <if test="(req.tagIds!= null and req.tagIds.size() > 0) or (req.categoryId!= null and req.categoryId!= '')">
             left join biz_key_enterprise_tag_relation bketr on dci.credit_no =bketr.credit_no and bketr.source_type=0
             left join biz_key_enterprise_tag bket on bketr.tag_id = bket.id
		        </if>
        <include refid="pageKeyEnterpriseBaseDataWhere"/>
        group by dci.credit_no

At this point someone asked: once a tag filter is applied (a non-indexed column), aren't we back to scanning 2 billion rows? (Clearly they've forgotten the MySQL optimizer exists! Time to walk them through it, and review it myself 😄)
Example: filter on tag category id = 4. The EXPLAIN result:
 explain select dci.credit_no, dci.company_name, dci.company_type_name, dci.company_sub_type, dci.street_name, dci.industry_platform_name, dci.industry_type_name from dwd_company_info dci left join biz_key_enterprise_tag_relation bketr on dci.credit_no =bketr.credit_no and bketr.source_type=0 left join biz_key_enterprise_tag bket on bketr.tag_id = bket.id WHERE bket.category_id = 4 group by dci.credit_no;
+----+-------------+-------+------------+--------+--------------------------------+--------------------------------+---------+-------------------------------------+------+----------+------------------------------+
| id | select_type | table | partitions | type   | possible_keys                  | key                            | key_len | ref                                 | rows | filtered | Extra                        |
+----+-------------+-------+------------+--------+--------------------------------+--------------------------------+---------+-------------------------------------+------+----------+------------------------------+
|  1 | SIMPLE      | bketr | NULL       | ALL    | NULL                           | NULL                           | NULL    | NULL                                |90000 |    10.00 | Using where; Using temporary |
|  1 | SIMPLE      | bket  | NULL       | eq_ref | PRIMARY                        | PRIMARY                        | 8       | enterprise-workflow.bketr.tag_id    |    1 |    10.00 | Using where                  |
|  1 | SIMPLE      | dci   | NULL       | ref    | dwd_company_info_credit_no_IDX | dwd_company_info_credit_no_IDX | 1023    | enterprise-workflow.bketr.credit_no |    1 |   100.00 | Using index condition        |
+----+-------------+-------+------------+--------+--------------------------------+--------------------------------+---------+-------------------------------------+------+----------+------------------------------+
The plan shows the MySQL optimizer being fairly smart: it first filters the relation table down, joins the result to the tag table by id (hitting the primary key), then joins that result to the enterprise table (hitting the ordinary credit_no index), scanning only about 9,000 rows in total.
It does not join everything first and filter afterwards, as one might imagine. The optimizer decided which order is cheaper (though it cannot pick the optimal plan in every case).

🤗 Summary

Neither scenario was sped up purely by tuning SQL; both were fixed by improving the application logic. Scenario 1: reduce the number of database round trips. Scenario 2: drop the unnecessary join and let MySQL's own optimizer do its job.

Scenario 2 also shows the value of verifying an optimization against the actual execution plan with EXPLAIN.
