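Background
The original job was optimized to run on multiple threads. Because the business logic is fairly complex, each thread calls the existing single-record processing method unchanged, and that method is slow.
Code 1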
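public void job() throws InterruptedException {
    int threadNum = 8;
    // Shared result list; every access is guarded by synchronized (res)
    List<Integer> res = new ArrayList<>();
    CountDownLatch countDownLatch = new CountDownLatch(threadNum);
    for (int i = 0; i < threadNum; i++) {
        new Thread() {
            @Override
            public void run() {
                int[] ints = new int[3000];
                try {
                    for (int anInt : ints) {
                        // doSomething (the sleep stands in for the slow single-record method)
                        Thread.sleep(35);
                        synchronized (res) {
                            res.add(anInt);
                        }
                    }
                } catch (Exception e) {
                    // Do not swallow failures silently
                    e.printStackTrace();
                } finally {
                    countDownLatch.countDown();
                }
            }
        }.start();
    }
    // Wait for every worker, then insert all results in one batch
    countDownLatch.await();
    batchInsert(res);
}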
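Code 2
To avoid an OOM, I wrote a second version: the worker threads produce the data, and the main thread loops once a minute to write whatever has accumulated to the database.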
public void job() {
    int threadNum = 8;
    // Shared buffer; every access is guarded by synchronized (res)
    List<Integer> res = new ArrayList<>();
    // One entry per finished worker; Vector is thread-safe
    List<Boolean> runRes = new Vector<>();
    for (int i = 0; i < threadNum; i++) {
        new Thread() {
            @Override
            public void run() {
                int[] ints = new int[3000];
                try {
                    for (int anInt : ints) {
                        // doSomething
                        Thread.sleep(35);
                        synchronized (res) {
                            res.add(anInt);
                        }
                    }
                } catch (Exception e) {
                    // Do not swallow failures silently
                    e.printStackTrace();
                } finally {
                    runRes.add(true);
                }
            }
        }.start();
    }
    while (true) {
        synchronized (res) {
            if (res.size() > 0) {
                batchInsert(res);
            }
            res.clear();
            if (runRes.size() == threadNum) {
                // All workers have finished and their data has been inserted: end the job
                return;
            }
        }
        try {
            // Run once per minute
            Thread.sleep(1000 * 60);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
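Code 3
Code 2 is still risky: the worker threads produce data at very different rates in the test and production environments, so it is hard to decide how often the main thread should consume.
The next optimization is to insert once every thousand records. A blocking queue comes to mind, but the problem is how to notify the main thread when all the worker threads are blocked. The only idea I had was, again, to have the main thread poll the state of the workers, plus a CountDownLatch to exit the loop.
That approach is not very elegant either, and it shares the problem of Code 2: the polling interval cannot be pinned down. Too short, and the main thread checks the worker state too often; too long, and time is wasted.
In the end I decided to use condition variables to coordinate the main thread and the worker threads.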
public void job() {
    long sta = System.currentTimeMillis();
    int threadNum = 4;
    List<Integer> res = new ArrayList<>();
    CountDownLatch countDownLatch = new CountDownLatch(threadNum);
    ReentrantLock reentrantLock = new ReentrantLock();
    // notFull: workers wait here while the list is full
    Condition notFull = reentrantLock.newCondition();
    // full: the main thread waits here until a batch is ready or a worker exits
    Condition full = reentrantLock.newCondition();
    int batchNum = 10000;
    for (int i = 0; i < threadNum; i++) {
        new Thread() {
            @Override
            public void run() {
                int[] ints = new int[6000];
                try {
                    for (int anInt : ints) {
                        // doSomething
                        Thread.sleep(3);
                        reentrantLock.lock();
                        try {
                            while (res.size() == batchNum) {
                                System.out.println("Batch is full, waking the main thread and blocking this thread-" + Thread.currentThread().getName());
                                full.signalAll();
                                notFull.await();
                                System.out.println("Worker thread woken up-" + Thread.currentThread().getName());
                            }
                            res.add(anInt);
                        } finally {
                            reentrantLock.unlock();
                        }
                    }
                } catch (Exception e) {
                    // Do not swallow failures silently
                    e.printStackTrace();
                } finally {
                    reentrantLock.lock();
                    try {
                        System.out.println("Worker thread finished, waking the main thread so it can re-check the state-" + Thread.currentThread().getName());
                        // Count down while holding the lock, before signalling,
                        // so the main thread cannot miss the exit of the last worker
                        countDownLatch.countDown();
                        full.signalAll();
                    } finally {
                        reentrantLock.unlock();
                    }
                }
            }
        }.start();
    }
    while (true) {
        reentrantLock.lock();
        try {
            // Wait on a condition check instead of a bare await(): a signal sent before
            // the main thread starts waiting is then not lost. A wake-up caused by only
            // some of the workers exiting fails the check and waits again; await()
            // releases the lock while waiting, so the remaining workers keep running.
            while (res.size() < batchNum && countDownLatch.getCount() > 0) {
                System.out.println("Main thread blocking");
                full.await();
                System.out.println("Main thread woken up");
            }
            if (countDownLatch.getCount() == 0) {
                System.out.println("All worker threads finished");
                batchInsert(res);
                System.out.println("Insert complete");
                System.out.println("All worker threads finished, main thread exiting");
                System.out.println("Elapsed time " + (System.currentTimeMillis() - sta));
                return;
            }
            System.out.println("Batch is full, inserting");
            batchInsert(res);
            // Empty the list; otherwise the workers would see it as still full and block again
            res.clear();
            System.out.println("Insert complete");
            System.out.println("Waking the worker threads-" + Thread.currentThread().getName());
            notFull.signalAll();
        } catch (InterruptedException e) {
            e.printStackTrace();
        } finally {
            reentrantLock.unlock();
        }
    }
}
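Other approaches
This is really a standard producer-consumer model. If the project already uses message middleware, it is also fairly easy to implement with that.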
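For comparison, the blocking-queue idea considered before Code 3 can also work without polling if each worker puts a sentinel (a "poison pill") on the queue when it finishes. The main thread then just blocks on take() and counts sentinels. This is a minimal sketch of that swapped-in technique, not the approach chosen above. The names QueueBatchDemo, run and putSentinel are hypothetical, and batchInsert here is only a stand-in that records batch sizes instead of touching a database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueBatchDemo {

    // Stand-in for the real batchInsert (assumption): only records the batch size.
    static void batchInsert(List<Integer> batch, List<Integer> batchSizes) {
        batchSizes.add(batch.size());
    }

    // Puts the sentinel even if the worker was interrupted, so the consumer never hangs.
    static void putSentinel(BlockingQueue<Optional<Integer>> queue) {
        boolean interrupted = Thread.interrupted(); // clear the flag so put() can block
        while (true) {
            try {
                queue.put(Optional.empty());
                break;
            } catch (InterruptedException e) {
                interrupted = true;
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt(); // restore the flag for the caller
        }
    }

    // Runs the pipeline and returns the size of every batch that was "inserted".
    public static List<Integer> run(int threadNum, int perThread, int batchNum) {
        // Bounded queue: put() blocks the workers when the consumer falls behind.
        BlockingQueue<Optional<Integer>> queue = new ArrayBlockingQueue<>(batchNum);
        for (int i = 0; i < threadNum; i++) {
            new Thread(() -> {
                try {
                    for (int j = 0; j < perThread; j++) {
                        // doSomething would go here
                        queue.put(Optional.of(j));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    // An empty Optional tells the consumer that this worker is done.
                    putSentinel(queue);
                }
            }).start();
        }

        List<Integer> batchSizes = new ArrayList<>();
        List<Integer> batch = new ArrayList<>(batchNum);
        int finished = 0;
        try {
            while (finished < threadNum) {
                // Blocks until an item arrives; there is no polling interval to tune.
                Optional<Integer> item = queue.take();
                if (!item.isPresent()) {
                    finished++;
                    continue;
                }
                batch.add(item.get());
                if (batch.size() == batchNum) {
                    batchInsert(batch, batchSizes);
                    batch.clear();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("consumer interrupted", e);
        }
        // Flush the last, partially filled batch.
        if (!batch.isEmpty()) {
            batchInsert(batch, batchSizes);
        }
        return batchSizes;
    }

    public static void main(String[] args) {
        // 4 workers x 6000 items in batches of 10000
        System.out.println(run(4, 6000, 10000)); // prints [10000, 10000, 4000]
    }
}
```

Because the queue is FIFO and every worker puts its sentinel after its last item, seeing threadNum sentinels means every item has already been consumed. The trade-off compared with Code 3 is that each item passes through the queue one at a time instead of being appended to a shared list under a lock.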