EasyExcel导出数据,解决慢sql,漏数据,重复数据问题(一)
大家思考一下,在导出excel时是否会出现如下几个常见问题
- 慢sql问题
- 漏数据,缺数据问题
- 数据重复
那到底该如何解决呢?下面我们一起来看看我的实现吧!
一、示例1,走时间范围索引
- controller入口
- 分页查询2000条数据,分批次导出。
- 使用 where id > ? limit a 。而不是用 limit a, b 的查询方式导出,可以解决mysql深分页问题。
- 解决了由于where 条件, order by 排序 带来的慢sql问题
- 解决了导出数据过程中重复数据,遗漏数据的问题
/**
*下载自动续费
* @author youlu
* @date 2022/11/21 14:32
* @param response
* @return com.smy.ucc.common.JsonMessage
*/
@GetMapping("/downloadRenewalSign")
public void downloadRenewalSign(HttpServletResponse response, RenewalSignAdminReq req) throws Exception {
this.checkDownloadParam(req);
response.setContentType("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
response.setCharacterEncoding("utf-8");
// 这里URLEncoder.encode可以防止中文乱码 当然和easyexcel没有关系
String fileName = URLEncoder.encode("橡树会员自动续费", "UTF-8");
response.setHeader("Content-disposition", "attachment;filename=" + fileName + ".xlsx");
ExcelWriter excelWriter = EasyExcel.write(response.getOutputStream(), RenewalSignAdminResp.class).build();
WriteSheet writeSheet = EasyExcel.writerSheet("自动续费").build();
req.setStartCreateTime(DateUtil.parseDate(req.getStartCreateTimeStr() + " 00:00:00", "yyyy-MM-dd HH:mm:ss"));
req.setEndCreateTime(DateUtil.parseDate(req.getEndCreateTimeStr() + " 23:59:59", "yyyy-MM-dd HH:mm:ss"));
int pageSize = 2000;
boolean firstFlag = true;
while (true) {
/**
* 该查询有漏数据的情况,因此下面有个补偿查询
* firstFlag = true。首次查询,where create_datetime >= ? and create_datetime <= ? and 其他条件 order by create_time , id limit 2000
* = false。非首次,where create_datetime > ? and create_datetime <= ? and 其他条件 order by create_time , id limit 2000
*/
List<RenewalSignAdminResp> data = cbsRenewalService.queryRenewalSignListForDownload(req, firstFlag, pageSize);
if (CollectionUtils.isEmpty(data)) {
break;
}
Date lastCreateTime = data.get(data.size() - 1).getCreateTime();
String startId = data.get(data.size() - 1).getId();
//补偿同一时间段内并发的数据 where create_time = ? and id > ? and 其他筛选条件
List<RenewalSignAdminResp> otherDatas = cbsRenewalService.queryRenewalSignListByCreateTimeAndId(req, lastCreateTime, startId);
data.addAll(otherDatas);
excelWriter.write(data, writeSheet);
req.setStartCreateTime(lastCreateTime);
if (firstFlag) {
firstFlag = false;
}
}
excelWriter.finish();
}
- 实体(只贴了部分)
/**
*
* @author youlu
* @date 2022/9/22 10:19
* @return
*/
@Getter
@Setter
public class RenewalSignAdminResp implements Serializable {
private static final long serialVersionUID = 472574503670404785L;
/**
* 会员类型,
* @see VipType
*/
@ExcelProperty(value = "会员类型", index = 2, converter = VipTypeStringConverter.class)
private String vipType;
/**
* 签约状态,1:待签约、2:签约失败、3:签约中、4:已解约
* @see RenewalStatusEnum
*/
@ExcelProperty(value = "签约状态", index = 6, converter = RenewalStatusStringConverter.class)
private String renewalStatus;
//创建日期
@ExcelProperty(value = "创建日期", index = 12)
@DateTimeFormat("yyyy-MM-dd HH:mm:ss")
@JsonFormat(pattern = "yyyy-MM-dd HH:mm:ss",locale = "zh",timezone="GMT+8")
private Date createTime;
//修改日期
@ExcelProperty(value = "修改日期", index = 13)
@DateTimeFormat("yyyy-MM-dd HH:mm:ss")
@JsonFormat(pattern = "yyyy-MM-dd HH:mm:ss",locale = "zh",timezone="GMT+8")
private Date updateTime;
}
这里要注意converter 的使用,例如:vipType在数据库中设定的是枚举值“1”,“2”,“3”,“4” 但是我们导出数据期望是其代表的含义描述,因此要转换,关注其convertToExcelData方法,转换如下:
package com.smy.cbs.easyexcel.write;
import com.alibaba.excel.converters.Converter;
import com.alibaba.excel.enums.CellDataTypeEnum;
import com.alibaba.excel.metadata.CellData;
import com.alibaba.excel.metadata.GlobalConfiguration;
import com.alibaba.excel.metadata.property.ExcelContentProperty;
import com.smy.cbs.enums.vip.VipType;
/**
* String and string converter
*
* @author youlu
*/
public class VipTypeStringConverter implements Converter<String> {
@Override
public Class supportJavaTypeKey() {
return String.class;
}
@Override
public CellDataTypeEnum supportExcelTypeKey() {
return CellDataTypeEnum.STRING;
}
/**
* 这里是读的时候会调用 不用管
*
* @param cellData
* NotNull
* @param contentProperty
* Nullable
* @param globalConfiguration
* NotNull
* @return
*/
@Override
public String convertToJavaData(CellData cellData, ExcelContentProperty contentProperty,
GlobalConfiguration globalConfiguration) {
return cellData.getStringValue();
}
/**
* 这里是写的时候会调用 不用管
*
* @param value
* NotNull
* @param contentProperty
* Nullable
* @param globalConfiguration
* NotNull
* @return
*/
@Override
public CellData convertToExcelData(String value, ExcelContentProperty contentProperty,
GlobalConfiguration globalConfiguration) {
return new CellData(VipType.getVipTypeDescByCode(value));
}
}
- service|Dao层实现
@Override
public List<RenewalSignAdminResp> queryRenewalSignListForDownload(RenewalSignAdminReq req, boolean firstFlag, int pageSize) {
return cbsRenewalSignInnerService.queryRenewalSignListForDownload(req, firstFlag, pageSize);
}
@Override
public List<RenewalSignAdminResp> queryRenewalSignListByCreateTimeAndId(RenewalSignAdminReq req, Date createTime, String startId) {
return cbsRenewalSignInnerService.queryRenewalSignListByCreateTimeAndId(req,createTime, startId);
}
- sql层查询
<select id="queryRenewalSignListForDownload" resultMap="signAdminMap">
select
<include refid="Base_Column_List" />
from t_renewal_sign
<where>
<choose>
<when test="firstFlag">
and create_time >= #{req.startCreateTime}
</when>
<otherwise>
and create_time > #{req.startCreateTime}
</otherwise>
</choose>
and create_time <= #{req.endCreateTime}
and renewal_status = #{req.renewalStatus}
<if test="req.custNo != null and req.custNo != ''">
and cust_no = #{req.custNo}
</if>
<if test="req.id != null and req.id != ''">
and id = #{req.id}
</if>
<if test="req.vipType != null and req.vipType != ''">
and vip_type = #{req.vipType}
</if>
</where>
order by create_time , id
limit #{pageSize}
</select>
- firstFlag 用于区分是否是第一次查询
- order by create_time , id 的排序可以用于解决慢sql问题。
根据这个查询可以解决: 慢sql问题 和 数据重复的问题。但依然解决不了漏数据的问题,因此引进下面这条sql专门查询在相同时间点内生成的数据。
<select id="queryRenewalSignListByCreateTimeAndId" resultMap="signAdminMap">
select
<include refid="Base_Column_List" />
from t_renewal_sign
<where>
and create_time = #{createTime}
and id > #{startId}
and renewal_status = #{req.renewalStatus}
<if test="req.custNo != null and req.custNo != ''">
and cust_no = #{req.custNo}
</if>
<if test="req.id != null and req.id != ''">
and id = #{req.id}
</if>
<if test="req.vipType != null and req.vipType != ''">
and vip_type = #{req.vipType}
</if>
</where>
- 这条sql的意义就是补偿,查询在相同时间点内生成的数据,解决漏数据问题
二、示例2,走主键索引(推荐)
上面示例1的方法每页查询完会将上一页最后一条数据的时间做为下一页查询的起始时间(压缩时间范围),并且根据游标id去查询数据。可能处理起来会比较麻烦。
其实用下面的方式查询会更便捷,。 只需要将上一页的最后一条数据的id 做为下一面查询的起始id去查,不走时间范围索引,直接走主键索引。type类型为range。
select * from t_order where create_datetime >= '起始时间'
and create_datetime <= '截止时间'
and 其他条件
and id >''
order by id
limit 2000