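Recently I needed to process the log tables that are generated every week and write the results to another table. The smaller log tables hold about 3 million records and the larger ones more than 10 million, so I decided to process the data with multiple threads. A few points to note when using a thread pool: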
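1. At the entry point, start a new thread to run the job and return a result right away; progress is then tracked through the log table;
2. Set a dedicated thread-naming rule so these threads can be told apart from automatically named ones;
3. Use ThreadPoolExecutor directly rather than creating the pool through the Executors class;
4. Use the blocking behavior of Future to determine the point at which all threads have finished;
5. Consider whether a mechanism for interrupting execution is needed;
6. Wherever operations can be combined into batch operations, combine them.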
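Point 1 can be sketched roughly as follows. This is a minimal illustration, not the original entry-point code: the class name `AsyncEntrySketch`, the thread-name prefix `LogJobEntry-`, and the in-memory `jobStatus` field (standing in for the status recorded in the log table) are all assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicReference;

public class AsyncEntrySketch {

    // Hypothetical stand-in for the status column of the log table.
    static final AtomicReference<String> jobStatus = new AtomicReference<>("NEW");

    /** Entry point: starts the job on its own named thread and returns immediately. */
    static Thread startJob(String weekNo, Runnable job) {
        jobStatus.set("RUNNING");
        Thread worker = new Thread(() -> {
            try {
                job.run();
                jobStatus.set("DONE");
            } catch (RuntimeException e) {
                // Record the failure so it can be found later via the log record.
                jobStatus.set("FAILED: " + e.getMessage());
            }
        }, "LogJobEntry-" + weekNo);
        worker.start();
        return worker;
    }

    /** Only for the demo: waits for the worker and reads the recorded status. */
    static String awaitStatus(Thread worker) {
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return jobStatus.get();
    }

    public static void main(String[] args) {
        Thread worker = startJob("2024W01", () -> { /* long-running processing goes here */ });
        System.out.println("request accepted, worker thread = " + worker.getName());
        System.out.println("final status = " + awaitStatus(worker));
    }
}
```

The caller gets control back as soon as `startJob` returns; whether the job succeeded is read later from the recorded status rather than from the return value.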
Reference code:
// 1. Calculate the number of threads (one per chunk of SPLIT_NUM records, rounded up)
int threadNum = totalCount / StatConstant.SPLIT_NUM;
if (threadNum * StatConstant.SPLIT_NUM < totalCount) {
    threadNum++;
}
// 2. Start the threads
List<Future<Integer>> futureList = new ArrayList<>();
ThreadFactory threadFactory = new ThreadFactoryBuilder()
        .setNameFormat("LogHandlerThread-%d")
        .build();
ExecutorService executorService = new ThreadPoolExecutor(threadNum, threadNum, 0L, TimeUnit.SECONDS,
        new ArrayBlockingQueue<Runnable>(threadNum), threadFactory);
for (int i = 0; i < threadNum; i++) {
    int begin = i * StatConstant.SPLIT_NUM;
    int end = (i + 1) * StatConstant.SPLIT_NUM;
    // The last chunk ends at the total record count
    if (i == threadNum - 1) {
        end = totalCount;
    }
    Future<Integer> future = executorService.submit(new LogHandlerThread(begin, end, weekNo, applicationContext));
    futureList.add(future);
}
// 3. Record the thread results
boolean finalResult = true;
for (int i = 0; i < futureList.size(); i++) {
    try {
        Future<Integer> future = futureList.get(i);
        // get() blocks until this task has finished
        Integer result = future.get();
        handleCount += ((result == null) ? 0 : result);
    } catch (Exception e) {
        weekLog.setMessage(weekLog.getMessage() + "###" + "(ThreadNum=" + i + ")" + e.getMessage());
        finalResult = false;
    }
}
executorService.shutdown();
// 4. Run other tasks...
public class LogHandlerThread implements Callable<Integer> {
    public LogHandlerThread(Integer begin, Integer end, String weekNo, ApplicationContext applicationContext) {
        // initialization..
    }
    @Override
    public Integer call() {
        // processing..
    }
}
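Point 5 (an interruption mechanism) is not shown in the reference code. One possible shape, sketched under assumptions: the whole run gets a deadline, `Future.get` is called with the remaining time, and on timeout or failure the outstanding tasks are cancelled. The class name `CancellableBatchSketch`, the `runChunks` helper, its parameters, and the `Thread.sleep` standing in for real work are all hypothetical. Unlike the reference code, the pool size here is capped separately from the number of chunks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CancellableBatchSketch {

    /** Processes [0, totalCount) in chunks; returns the rows handled, or -1 on timeout or failure. */
    static int runChunks(int totalCount, int splitNum, int poolSize,
                         long timeoutMillis, long workMillisPerChunk) {
        int chunkCount = (totalCount + splitNum - 1) / splitNum; // ceiling division
        if (chunkCount == 0) {
            return 0; // nothing to do; also avoids creating a zero-capacity queue
        }
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(chunkCount)); // queue can hold every chunk
        List<Future<Integer>> futures = new ArrayList<>();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            for (int i = 0; i < chunkCount; i++) {
                final int begin = i * splitNum;
                final int end = Math.min(begin + splitNum, totalCount);
                futures.add(pool.submit(() -> {
                    Thread.sleep(workMillisPerChunk); // simulated work; reacts to interruption
                    return end - begin;
                }));
            }
            int handled = 0;
            for (Future<Integer> f : futures) {
                long remaining = Math.max(deadline - System.nanoTime(), 0L);
                handled += f.get(remaining, TimeUnit.NANOSECONDS);
            }
            return handled;
        } catch (TimeoutException | ExecutionException e) {
            for (Future<Integer> f : futures) {
                f.cancel(true); // interrupt running tasks, drop queued ones
            }
            return -1;
        } catch (InterruptedException e) {
            for (Future<Integer> f : futures) {
                f.cancel(true);
            }
            Thread.currentThread().interrupt();
            return -1;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("within deadline: " + runChunks(25, 10, 2, 5000, 1));
        System.out.println("past deadline:   " + runChunks(25, 10, 1, 50, 2000));
    }
}
```

Cancellation with `cancel(true)` only helps if the task itself responds to interruption, so a real `call()` would need to check the interrupt flag between batches or use interruptible operations.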
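Along the way I also ran into a deadlock (org.springframework.dao.DeadlockLoserDataAccessException: PreparedStatementCallback; SQL [INSERT IGNORE INTO tb(...) VALUES (..)]; Deadlock found when trying to get lock; try restarting transaction;). I had intended to improve performance with batch initialization, but when several threads update the table and lock the same rows, deadlocks do occur easily. There is no particularly good workaround. This suggests the business logic may need some adjustment to avoid this kind of multithreaded conflict, and resolving the conflict through better design should come first.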
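One design-level adjustment of this kind, offered only as a sketch and not as what was actually done here, is to route records by key instead of by row range, so that any given row is written by exactly one thread. The class `KeyPartitionSketch` and the method `partitionByKey` are hypothetical names, and the string keys stand in for whatever column identifies the target row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KeyPartitionSketch {

    /** Assigns each key to exactly one of `workers` buckets, so equal keys always share a bucket. */
    static List<List<String>> partitionByKey(List<String> keys, int workers) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String key : keys) {
            // floorMod keeps the index non-negative even for negative hash codes
            int idx = Math.floorMod(key.hashCode(), workers);
            buckets.get(idx).add(key);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<List<String>> buckets = partitionByKey(Arrays.asList("a", "b", "a", "c", "b"), 2);
        for (int i = 0; i < buckets.size(); i++) {
            System.out.println("worker " + i + " -> " + buckets.get(i));
        }
    }
}
```

Each bucket would then be handed to one task. This removes contention between threads on the same row, but it does not by itself guarantee that the database never reports a deadlock, so keeping a retry for the failed batch may still be worthwhile.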