Using the List and Set collections
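Problem
Data is processed by multiple threads, but every worker thread returns the same result. In principle, each thread should return a different result set.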
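Investigation
First check whether the processing logic inside the thread is at fault. The main logic of call() in InsuranceDetailCallable is correct, so the likely cause is that the List insuranceDetails passed to each worker thread contains the same data, which makes every thread produce the same result.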
/**
* Data processing: operate_type, link_id
*/
private class InsuranceDetailCallable implements Callable<Set<Long>> {
private CountDownLatch countDownLatch;
private List<FinalStatementInsuranceDetail> insuranceDetails;
public InsuranceDetailCallable(CountDownLatch countDownLatch,
List<FinalStatementInsuranceDetail> insuranceDetails) {
this.countDownLatch = countDownLatch;
this.insuranceDetails = insuranceDetails;
}
@Override
public Set<Long> call() {
Set<Long> result = new HashSet<>();
try {
// code omitted
} catch (Exception e) {
System.out.println("Worker thread exception: " + e.getMessage());
} finally {
countDownLatch.countDown();
}
return result;
}
}
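The parent thread groups the fetched list by the statementId field into a Map, then splits the Map's keys into chunks, giving a List<List<Long>>. Each worker thread processes the data for one chunk. The problem is in this code: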
List<List<Long>> groupList = Stream.iterate(0, n -> n + 1).limit(set.size())
.parallel().map(a -> set.stream().skip((long) a * finalSize).limit(finalSize).parallel().collect(Collectors.toList())).filter(CollUtil::isNotEmpty).collect(Collectors.toList());
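The Set being split (the key set of the grouped Map) has no defined encounter order. When a stream over such a source runs in parallel, skip() and limit() are not required to drop or keep the first n elements; they may pick any n. As a result, some of the Lists in groupList contain the same keys, and the worker threads end up processing the same data. A List is ordered, so skip() and limit() on its stream select contiguous, non-overlapping windows. Converting the Set to a List fixes the problem.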
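The chunking step can be checked in isolation. The sketch below is a minimal illustration, not the project's code: the class name ChunkOverlapCheck and the helpers chunk and hasOverlap are hypothetical. It splits an ordered List with skip/limit (sequential inner stream) and then verifies that no key lands in two chunks.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper class, for illustration only.
class ChunkOverlapCheck {

    // Split an ordered list into chunks of chunkSize using skip/limit.
    // The source is a List, so the stream has a defined encounter order
    // and each skip/limit pair selects one contiguous window.
    static List<List<Long>> chunk(List<Long> keys, int chunkSize) {
        int chunkCount = (keys.size() + chunkSize - 1) / chunkSize; // ceiling division
        return IntStream.range(0, chunkCount)
                .mapToObj(i -> keys.stream()
                        .skip((long) i * chunkSize)
                        .limit(chunkSize)
                        .collect(Collectors.toList()))
                .collect(Collectors.toList());
    }

    // Returns true if some key appears in more than one chunk.
    static boolean hasOverlap(List<List<Long>> chunks) {
        Set<Long> seen = new HashSet<>();
        for (List<Long> chunk : chunks) {
            for (Long key : chunk) {
                if (!seen.add(key)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Long> keys = new ArrayList<>();
        for (long i = 1; i <= 10; i++) {
            keys.add(i);
        }
        List<List<Long>> chunks = chunk(keys, 3);
        System.out.println(chunks);             // [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
        System.out.println(hasOverlap(chunks)); // false
    }
}
```

Running hasOverlap against the groupList produced from the Set is a quick way to confirm whether the duplicated worker results come from the chunking step.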
/**
* Parent-thread code before the fix
*/
public void testMaintainNormalOperateType() {
// fetch the data that has not been processed yet
List<FinalStatementInsuranceDetail> insuranceDetailList = insuranceDetailService.listOperateTypeIsNull();
if (CollUtil.isEmpty(insuranceDetailList)) {
return;
}
Map<Long, List<FinalStatementInsuranceDetail>> mapList = insuranceDetailList.stream().collect(Collectors.groupingBy(FinalStatementInsuranceDetail::getStatementId));
Set<Long> set = mapList.keySet();
int size = set.size() / 100;
if (size == 0) {
size = set.size();
}
int finalSize = size;
List<List<Long>> groupList = Stream.iterate(0, n -> n + 1).limit(set.size())
.parallel().map(a -> set.stream().skip((long) a * finalSize).limit(finalSize).parallel().collect(Collectors.toList())).filter(CollUtil::isNotEmpty).collect(Collectors.toList());
}
Parent-thread code after the fix
public void testMaintainNormalOperateType() {
long start = System.currentTimeMillis();
List<FinalStatementInsuranceDetail> insuranceDetailList = insuranceDetailService.listOperateTypeIsNull(true);
if (CollUtil.isEmpty(insuranceDetailList)) {
return;
}
Map<Long, List<FinalStatementInsuranceDetail>> mapList = insuranceDetailList.stream()
.collect(Collectors.groupingBy(FinalStatementInsuranceDetail::getStatementId));
List<Long> list = new ArrayList<>(mapList.keySet());
try {
int size = list.size() / 100;
if (size == 0) {
size = list.size();
}
int finalSize = size;
List<List<Long>> groupList = Stream.iterate(0, n -> n + 1).limit(list.size())
.parallel().map(a -> list.stream().skip((long) a * finalSize).limit(finalSize)
.parallel().collect(Collectors.toList())).filter(CollUtil::isNotEmpty).collect(Collectors.toList());
ExecutorService executorService = Executors.newFixedThreadPool(groupList.size());
CountDownLatch countDownLatch = new CountDownLatch(groupList.size());
List<Future<Set<Long>>> futureList = new ArrayList<>();
for (List<Long> statementIdList : groupList) {
List<FinalStatementInsuranceDetail> insuranceDetails = new ArrayList<>();
for (Long statementId : statementIdList) {
insuranceDetails.addAll(mapList.getOrDefault(statementId, new ArrayList<>()));
}
InsuranceDetailCallable callable = new InsuranceDetailCallable(countDownLatch, insuranceDetails);
Future<Set<Long>> future = executorService.submit(callable);
futureList.add(future);
}
executorService.shutdown();
for (Future<Set<Long>> future : futureList) {
Set<Long> setLong = future.get();
System.out.println("Output: " + setLong.size() + " " + JSONObject.toJSONString(setLong));
}
} catch (Exception e) {
System.out.println("Parent thread exception: " + e.getMessage());
}
long end = System.currentTimeMillis();
System.out.println("Elapsed (s) = " + (end - start) / 1000);
}
Another option is the ListUtils.partition(List list, int size) method from the Apache Commons Collections jar.
<!-- corresponding Maven dependency -->
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-collections4</artifactId>
<version>4.1</version>
</dependency>
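With that dependency on the classpath, the stream expression can be replaced by a call such as ListUtils.partition(list, finalSize), which returns consecutive sublists of the given size (the last one may be shorter). For readers who prefer not to add the dependency, the sketch below reproduces the same chunking with the JDK only. It is an approximation written for illustration, not the library's implementation, and the names PartitionSketch and partition are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical JDK-only stand-in for list partitioning, for illustration only.
class PartitionSketch {

    // Split list into consecutive chunks of at most size elements.
    // Each chunk is copied, so later changes to the source list do not affect it.
    static <T> List<List<T>> partition(List<T> list, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int from = 0; from < list.size(); from += size) {
            int to = Math.min(from + size, list.size());
            result.add(new ArrayList<>(list.subList(from, to)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> statementIds = Arrays.asList(11L, 12L, 13L, 14L, 15L);
        System.out.println(partition(statementIds, 2)); // [[11, 12], [13, 14], [15]]
    }
}
```

Because the chunks are cut by index from an ordered List, every key ends up in exactly one chunk, which is the property the worker threads depend on.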
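Summary
A List is ordered and may contain duplicates. A Set holds unique elements, and a hash-based Set such as HashSet gives no guarantee about iteration order.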
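The two properties can be seen in a few lines. This is a small sketch with hypothetical names (ListVsSetSketch, dedupKeepingOrder). It also shows one way to get both uniqueness and a stable order by going through a LinkedHashSet, which is an addition for illustration and not something the code above uses. No output is shown for iterating the HashSet itself, because its order is not guaranteed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, for illustration only.
class ListVsSetSketch {

    // Remove duplicates while keeping first-seen order, then return a List
    // so that index-based or skip/limit chunking is well defined.
    static List<Long> dedupKeepingOrder(List<Long> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<Long> ids = Arrays.asList(3L, 1L, 3L, 2L);
        Set<Long> unique = new HashSet<>(ids);

        System.out.println(ids.size());             // 4: the List keeps the duplicate
        System.out.println(unique.size());          // 3: the Set drops it
        System.out.println(dedupKeepingOrder(ids)); // [3, 1, 2]
    }
}
```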