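Not long ago, while testing the feature that adds an application to an organization (similar to a mini program: every user under the selected organization gets permission to use the added application), I accidentally selected the root node, and the insert stayed pending. Tracing it on the backend showed the insert ran for about 270 s over 26,510 rows (roughly 26K). A look at the code showed one big for loop that called save on each entity individually:
That is painfully slow, so I set out to optimize it.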
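Optimization 1: try JPA batch insert
I am not very familiar with JPA, and batch insert likely needs extra configuration. After a round of searching, it turned out the following settings are needed:
1. Add to application.properties (convert to yml yourself if needed)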
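# Batch size
spring.jpa.properties.hibernate.jdbc.batch_size=500
# Tells Hibernate that the JDBC driver returns correct affected-row counts for batched updates (required for version checks)
spring.jpa.properties.hibernate.jdbc.batch_versioned_data=true
# Order INSERT statements by entity so they can be grouped into batches
spring.jpa.properties.hibernate.order_inserts=true
# Order UPDATE statements by entity so they can be grouped into batches
spring.jpa.properties.hibernate.order_updates=true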
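2. Append to the database JDBC URL (this example uses ShenTong, a Chinese-made database; rewriteBatchedStatements comes from MySQL Connector/J, so for other databases check what the driver supports and adapt accordingly)
#rewriteBatchedStatements=TRUE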
oa-server.datasource.url=jdbc:oscar://x.x.x.x:2003/OSRDB?useSSL=false&rewriteBatchedStatements=TRUE
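To make the effect of hibernate.jdbc.batch_size more concrete, here is a minimal plain-JDBC sketch of the kind of statement batching these settings ask Hibernate to do. It is an illustration only, not Hibernate's actual code: the class JdbcBatchSketch, the table app_org_user, and the column user_id are hypothetical names.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class JdbcBatchSketch {
    /**
     * Inserts the given ids using JDBC statement batching and returns the
     * number of executeBatch() calls, i.e. how many batches were sent.
     */
    static int insertInBatches(Connection conn, List<Integer> userIds, int batchSize) {
        // Hypothetical table and column, for illustration only.
        String sql = "INSERT INTO app_org_user (user_id) VALUES (?)";
        int batchesSent = 0;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (Integer id : userIds) {
                ps.setInt(1, id);
                ps.addBatch();            // queue the row instead of executing it
                pending++;
                if (pending == batchSize) {
                    ps.executeBatch();    // send the whole batch at once
                    batchesSent++;
                    pending = 0;
                }
            }
            if (pending > 0) {            // send the final, partially filled batch
                ps.executeBatch();
                batchesSent++;
            }
        } catch (SQLException e) {
            throw new IllegalStateException("Batch insert failed", e);
        }
        return batchesSent;
    }
}
```

With 26,510 rows and a batch size of 500, this loop sends 54 batches (53 full ones plus a final batch of 10) instead of 26,510 individual statements.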
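The code was first changed as follows:
Looking at the source of the batch save:
Stepping into it shows that it simply calls the single-entity save method in a loop:
The single-entity save is slow because of the isNew check. Several classes implement this method, and every insert goes through it to decide whether the entity is new or an update (by checking whether the id is null, or by checking the version attribute), which wastes a lot of time:
JPA's support for batch insert leaves a lot to be desired. To get around this check, the obvious fix is to override the save method.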
AbstractEntityInformation.isNew (checks whether the id is null)
public boolean isNew(T entity) {
    ID id = this.getId(entity);
    Class<ID> idType = this.getIdType();
    if (!idType.isPrimitive()) {
        return id == null;
    } else if (id instanceof Number) {
        return ((Number)id).longValue() == 0L;
    } else {
        throw new IllegalArgumentException(String.format("Unsupported primitive id type %s!", idType));
    }
}
JpaMetamodelEntityInformation.isNew (checks whether the version attribute is null)
/*
 * (non-Javadoc)
 * @see org.springframework.data.repository.core.support.AbstractEntityInformation#isNew(java.lang.Object)
 */
@Override
public boolean isNew(T entity) {
    if (versionAttribute == null || versionAttribute.getJavaType().isPrimitive()) {
        return super.isNew(entity);
    }
    BeanWrapper wrapper = new DirectFieldAccessFallbackBeanWrapper(entity);
    Object versionValue = wrapper.getPropertyValue(versionAttribute.getName());
    return versionValue == null;
}
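The reason the outcome of isNew matters: SimpleJpaRepository.save calls EntityManager.persist when isNew returns true and EntityManager.merge otherwise, and merging an entity that is not in the persistence context typically costs an extra SELECT before the INSERT. The following is a minimal sketch of that dispatch in plain Java. The class SaveDispatchSketch and its Entity stand-in are hypothetical and only model the decision, not the real Spring Data classes.

```java
class SaveDispatchSketch {
    /** Hypothetical entity stand-in with a nullable wrapper-type id. */
    static class Entity {
        final Integer id;

        Entity(Integer id) {
            this.id = id;
        }
    }

    /** Mirrors the non-primitive branch of AbstractEntityInformation.isNew. */
    static boolean isNew(Entity entity) {
        return entity.id == null;
    }

    /** Names the EntityManager operation that a save(...) call would dispatch to. */
    static String operationFor(Entity entity) {
        return isNew(entity) ? "persist" : "merge";
    }
}
```

An entity whose id is still null maps to persist, while one with an id already assigned maps to merge. Calling persist directly skips this decision.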
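Optimization 2: a custom Repository with its own save method to bypass the isNew check; inserting the 26K rows now takes 68 s
Official documentation on custom repositories: Custom Implementations for Spring Data Repositories
Customizing the base repository: Customize the Base Repository
Example 39. Custom repository base class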
class MyRepositoryImpl<T, ID>
    extends SimpleJpaRepository<T, ID> {

    private final EntityManager entityManager;

    MyRepositoryImpl(JpaEntityInformation entityInformation,
                     EntityManager entityManager) {
        super(entityInformation, entityManager);
        // Keep the EntityManager around to used from the newly introduced methods.
        this.entityManager = entityManager;
    }

    @Transactional
    public <S extends T> S save(S entity) {
        // implementation goes here
    }
}
Here I add a new batchSave method to replace the previous batch save:
1. Add the interface ApplicationCompanyOrganizationUserBatchSaveRepository, which extends the JpaRepository and JpaSpecificationExecutor interfaces
@Repository
public interface ApplicationCompanyOrganizationUserBatchSaveRepository extends JpaRepository<ApplicationCompanyOrganizationUser, Integer>, JpaSpecificationExecutor<ApplicationCompanyOrganizationUser> {
    @Transactional
    List<ApplicationCompanyOrganizationUser> batchSave(Iterable<ApplicationCompanyOrganizationUser> entities);
}
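2. Add an implementation class that implements batchSave (note: the implementation class must be registered in the application's startup class, for example through the repositoryBaseClass attribute of @EnableJpaRepositories, as the official documentation describes)
The new implementation class: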
package com.easemob.oa.persistence.jpa.impl;

import com.easemob.oa.models.entity.ApplicationCompanyOrganizationUser;
import com.easemob.oa.persistence.jpa.ApplicationCompanyOrganizationUserBatchSaveRepository;
import com.easemob.oa.persistence.jpa.ApplicationCompanyOrganizationUserRepository;
import com.google.common.collect.Lists;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.data.jpa.repository.support.JpaEntityInformation;
import org.springframework.data.jpa.repository.support.SimpleJpaRepository;
import org.springframework.data.repository.NoRepositoryBean;
import org.springframework.util.Assert;
import javax.persistence.EntityManager;
import javax.transaction.Transactional;
import java.io.Serializable;
import java.util.Iterator;
import java.util.List;

/**
 * @Author turnflys
 * @Date 1/12/21 10:51 PM
 */
@NoRepositoryBean
public class ApplicationCompanyOrganizationRepositoryImpl<T, ID extends Serializable> extends SimpleJpaRepository<ApplicationCompanyOrganizationUser,Integer> implements ApplicationCompanyOrganizationUserBatchSaveRepository{

    // Persistence context
    private final EntityManager em;
    private final JpaEntityInformation<T, ?> entityInformation;

    public ApplicationCompanyOrganizationRepositoryImpl(JpaEntityInformation<T, ?> entityInformation, EntityManager entityManager) {
        super((JpaEntityInformation<ApplicationCompanyOrganizationUser, ?>) entityInformation,entityManager);
        this.entityInformation = entityInformation;
        this.em = entityManager;
    }

    @Override
    @Transactional
    public List<ApplicationCompanyOrganizationUser> batchSave(Iterable<ApplicationCompanyOrganizationUser> entities) {
        Iterator<ApplicationCompanyOrganizationUser> iterator = entities.iterator();
        int index = 0;
        while (iterator.hasNext()){
            // persist directly, skipping the isNew check that save performs
            em.persist(iterator.next());
            index++;
            // flush and clear every 1000 entities to keep the persistence context small
            if (index % 1000 == 0){
                em.flush();
                em.clear();
            }
        }
        // flush whatever is left over after the last full chunk
        if (index % 1000 != 0){
            em.flush();
            em.clear();
        }
        List<ApplicationCompanyOrganizationUser> lists = Lists.newArrayList();
        entities.forEach(lists::add);
        return lists;
    }
}
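The persist/flush/clear loop above can be checked without a database. The following is a minimal sketch in plain Java that reproduces the same chunking logic against a counting stand-in. PersistenceContext and CountingContext are hypothetical types that model only the three EntityManager calls used here, and chunkSize replaces the hard-coded 1000.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkedFlushSketch {
    /** Hypothetical stand-in for the persist/flush/clear subset of EntityManager. */
    interface PersistenceContext {
        void persist(Object entity);
        void flush();
        void clear();
    }

    /** Counts calls instead of touching a database. */
    static class CountingContext implements PersistenceContext {
        int persisted;
        int flushes;
        int clears;

        public void persist(Object entity) { persisted++; }
        public void flush() { flushes++; }
        public void clear() { clears++; }
    }

    /** Persists every entity, flushing and clearing after each full chunk. */
    static <E> List<E> batchSave(PersistenceContext ctx, Iterable<E> entities, int chunkSize) {
        List<E> saved = new ArrayList<>();
        int pending = 0;
        for (E entity : entities) {
            ctx.persist(entity);
            saved.add(entity);
            pending++;
            if (pending == chunkSize) {
                ctx.flush();   // push the queued inserts to the database
                ctx.clear();   // detach them so the context does not keep growing
                pending = 0;
            }
        }
        if (pending > 0) {     // final partial chunk
            ctx.flush();
            ctx.clear();
        }
        return saved;
    }
}
```

For 26,510 entities and a chunk size of 1000, this performs 27 flushes (26 full chunks plus one of 510). Since hibernate.jdbc.batch_size is set to 500 earlier, each full flush would span two JDBC batches; a common choice is to make the flush interval equal to the batch size, though whether that helps here is untested.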
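3. In the main service implementation class, switch to the new storage method on the new repository
Optimization 3: introduce multithreading
See my other article: JPA Batch Insert Is Too Slow and How to Optimize It: Extracting a Shared Generic batchSave Method and Introducing Multithreading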
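As a rough idea of what a multithreaded variant could look like, the sketch below splits the list into partitions and hands each one to a worker thread. It is not the code from the referenced article. The class ParallelBatchSketch and the saver callback are hypothetical; in a real application the callback would have to run a batch save in its own transaction, because an EntityManager is not safe to share between threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class ParallelBatchSketch {
    /** Splits the list into consecutive sublists of at most the given size. */
    static <E> List<List<E>> partition(List<E> items, int size) {
        List<List<E>> parts = new ArrayList<>();
        for (int i = 0; i < items.size(); i += size) {
            parts.add(items.subList(i, Math.min(i + size, items.size())));
        }
        return parts;
    }

    /** Runs the saver on each partition in a fixed thread pool; returns the task count. */
    static <E> int saveInParallel(List<E> items, int partitionSize, int threads,
                                  Consumer<List<E>> saver) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (List<E> part : partition(items, partitionSize)) {
                futures.add(pool.submit(() -> saver.accept(part)));
            }
            for (Future<?> future : futures) {
                future.get();  // wait for completion and surface any failure
            }
            return futures.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while saving", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A partition failed to save", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With 26,510 items and a partition size of 5000, this submits six tasks. Note that each partition then commits independently, so a failure in one partition does not roll back the others.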