Symptom
Load on the production server was not particularly high, yet an OOM error occurred with the following stack trace:
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:717)
at org.java_websocket.client.WebSocketClient.connect(WebSocketClient.java:291)
at org.java_websocket.client.WebSocketClient.connectBlocking(WebSocketClient.java:315)
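Analysis
1. Dump the JVM heap
The Leak Suspects report in Eclipse MAT showed that most of the memory was held by threads. The heap was dumped during a quiet period, so overall memory usage was not yet very high.
Opening Details in the report showed that the threads all had names of the form pool-11298-thread-1.
A search of the logs showed that these threads were created while running Alibaba Cloud ASR recognition.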
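The post does not say how the heap dump was taken. One option, sketched below, is to trigger it from inside the JVM through HotSpotDiagnosticMXBean; this assumes a HotSpot-based JVM. The class name HeapDumper and the file name app.hprof are made up for illustration. The resulting .hprof file can then be opened in MAT.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original service.
public class HeapDumper {

    /**
     * Writes a heap dump to `target`. The file must not exist yet.
     * With liveOnly = true, only reachable objects are dumped.
     */
    static Path dumpHeap(Path target, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(target.toString(), liveOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("heapdump");
        Path file = dumpHeap(dir.resolve("app.hprof"), true);
        System.out.println("wrote " + file + " (" + Files.size(file) + " bytes)");
    }
}
```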
2. Thread stack analysis with jstack
Open the saved thread dump and search for the thread pool-11298-thread-1:
"pool-11298-thread-1" #89607 prio=5 os_prio=0 tid=0x00007f95948ae800 nid=0x249c waiting on condition [0x00007f926eff7000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000e295aba8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- None
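A large number of threads were in the "waiting on condition" state, parked in ThreadPoolExecutor.getTask and waiting for work that never arrives.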
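Counting such threads by hand is tedious in a large dump. The sketch below groups the threads in a jstack dump by name pattern and state. It assumes the dump layout shown above: a quoted thread name on the header line, followed by a java.lang.Thread.State line. The class name ThreadDumpSummary is made up for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for summarizing a saved jstack dump.
public class ThreadDumpSummary {
    private static final Pattern HEADER = Pattern.compile("^\"([^\"]+)\"");
    private static final Pattern STATE =
            Pattern.compile("java\\.lang\\.Thread\\.State: (\\w+)");

    /** Counts threads per "name family / state", e.g. "pool-N-thread-N / WAITING". */
    static Map<String, Integer> summarize(String dump) {
        Map<String, Integer> counts = new TreeMap<>();
        String family = null;
        for (String line : dump.split("\\R")) {
            Matcher header = HEADER.matcher(line.trim());
            if (header.find()) {
                // Collapse digits so pool-11298-thread-1 and pool-11299-thread-1
                // fall into the same group.
                family = header.group(1).replaceAll("\\d+", "N");
                continue;
            }
            Matcher state = STATE.matcher(line);
            if (family != null && state.find()) {
                counts.merge(family + " / " + state.group(1), 1, Integer::sum);
                family = null; // only the first State line belongs to this thread
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the saved thread dump
        String dump = new String(Files.readAllBytes(Paths.get(args[0])));
        summarize(dump).forEach((group, n) -> System.out.println(n + "\t" + group));
    }
}
```

A very large count for a single group such as "pool-N-thread-N / WAITING" points to thread pools that are created repeatedly and never shut down.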
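3. Review the source code
The culprit turned out to be the AliRealtimeListener class.
A new instance of this class is created for every recognition. One of its fields is a thread pool, and that field is not declared static final.
As a result, every recognition creates a new thread pool with a fixed size of 5 threads. When many sentences are recognized in a short time, the number of threads spikes, causing short-term memory pressure and the OOM error.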
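The effect can be reproduced in isolation. The following is a minimal sketch, not the actual AliRealtimeListener code: LeakyListener is a made-up stand-in that, like the class described above, holds a 5-thread pool in an instance field and never shuts it down. The threads are marked as daemon only so that the demo JVM can exit.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class PerInstancePoolLeak {

    /** Default "pool-N-thread-M" naming, but daemon so this demo can terminate. */
    static ThreadFactory daemonDefaultFactory() {
        ThreadFactory base = Executors.defaultThreadFactory();
        return r -> {
            Thread t = base.newThread(r);
            t.setDaemon(true);
            return t;
        };
    }

    /** Hypothetical stand-in for the listener: one pool per instance. */
    static class LeakyListener {
        // Instance field: every `new LeakyListener()` creates another pool.
        private final ExecutorService pool =
                Executors.newFixedThreadPool(5, daemonDefaultFactory());

        void recognize(int tasks) throws InterruptedException {
            CountDownLatch done = new CountDownLatch(tasks);
            for (int i = 0; i < tasks; i++) {
                pool.execute(done::countDown);
            }
            done.await();
            // The pool is never shut down, so its workers stay parked in
            // LinkedBlockingQueue.take() after the work is finished.
        }
    }

    /** Counts live threads named like "pool-N-thread-M". */
    static long countDefaultPoolThreads() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().matches("pool-\\d+-thread-\\d+"))
                .count();
    }

    /** Creates `listeners` instances and returns how many pool threads were added. */
    static long leakedThreadsAfter(int listeners) throws InterruptedException {
        long before = countDefaultPoolThreads();
        for (int i = 0; i < listeners; i++) {
            new LeakyListener().recognize(5);
        }
        return countDefaultPoolThreads() - before;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("leaked threads: " + leakedThreadsAfter(20));
    }
}
```

Each listener leaves its five idle workers behind, so the thread count grows linearly with the number of recognitions. Each thread also reserves native stack memory, which is why the failure shows up as "unable to create new native thread" rather than as heap exhaustion.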
Fix
Declare the thread pool static final so that all instances share it, and construct it explicitly with ThreadPoolExecutor:
private static final ExecutorService stopAsrThread = new ThreadPoolExecutor(
10, 20, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(1000),
ThreadUtil.newNamedThreadFactory("asr_stop_", false), new ThreadPoolExecutor.CallerRunsPolicy()
);
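For readers without the ThreadUtil helper on the classpath, the same idea can be sketched with the JDK alone. Everything below is illustrative: SharedListener and namedFactory are made-up names, and the threads are daemon here (the original passes false) only so that the demo exits on its own.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedPoolSketch {

    /** Plain-JDK stand-in for a named thread factory (hypothetical helper). */
    static ThreadFactory namedFactory(String prefix, boolean daemon) {
        AtomicInteger seq = new AtomicInteger(1);
        return r -> {
            Thread t = new Thread(r, prefix + seq.getAndIncrement());
            t.setDaemon(daemon);
            return t;
        };
    }

    /** Hypothetical listener: all instances share one bounded pool. */
    static class SharedListener {
        private static final ThreadPoolExecutor STOP_ASR_POOL = new ThreadPoolExecutor(
                10, 20, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(1000),              // bounded backlog
                namedFactory("asr_stop_", true),              // daemon only for this demo
                new ThreadPoolExecutor.CallerRunsPolicy());   // back-pressure when saturated

        void recognize(int tasks) throws InterruptedException {
            CountDownLatch done = new CountDownLatch(tasks);
            for (int i = 0; i < tasks; i++) {
                STOP_ASR_POOL.execute(done::countDown);
            }
            done.await();
        }
    }

    /** Runs `listeners` recognitions and reports the largest pool size ever reached. */
    static int largestPoolSizeAfter(int listeners) throws InterruptedException {
        for (int i = 0; i < listeners; i++) {
            new SharedListener().recognize(5);
        }
        return SharedListener.STOP_ASR_POOL.getLargestPoolSize();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("largest pool size: " + largestPoolSizeAfter(50));
    }
}
```

However many listener instances are created, the thread count stays within the pool's limits. Note that ThreadPoolExecutor only grows beyond the core size of 10 once the queue of 1000 is full. When all 20 threads are busy and the queue is full, CallerRunsPolicy runs the task on the submitting thread instead of discarding it.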