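Background
A colleague asked for help: his tasks could be submitted but never ran to completion, and every thread was hung at the same place. He asked me to take a look and added that he could not reproduce the problem when running his unit tests.
Troubleshooting
Taking a thread dump
The first step, of course, is to take a thread dump and look at it. The relevant part of the output looked roughly like this: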
"threadPoolTaskExecutor-5" #95 prio=5 os_prio=0 tid=0x00007f2a6018d000 nid=0x7179 runnable [0x00007f2b6083f000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:139)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:155)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:284)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
at com.xxx.task.ImgDownTask.run(ImgDownTask.java:57)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Almost every thread in the pool was stuck at this same position.
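When many threads share the same stack, a quick tally by top frame makes the pattern obvious. The sketch below is only an illustration using the JDK's own `Thread.getAllStackTraces()`, which is an in-process counterpart of an external thread dump; the class name `StackSummary` and the method `countByTopFrame` are made up for this example and are not part of the original investigation.

```java
import java.util.Map;
import java.util.TreeMap;

public class StackSummary {

    /** Counts threads by the class and method at the top of their stack. */
    public static Map<String, Integer> countByTopFrame(Map<Thread, StackTraceElement[]> dump) {
        Map<String, Integer> counts = new TreeMap<>();
        for (StackTraceElement[] frames : dump.values()) {
            if (frames.length == 0) {
                continue; // this thread has no Java frames at the moment
            }
            StackTraceElement top = frames[0];
            String key = top.getClassName() + "." + top.getMethodName();
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Snapshot of all live threads in the current JVM
        countByTopFrame(Thread.getAllStackTraces())
                .forEach((frame, n) -> System.out.println(n + "  " + frame));
    }
}
```

In a situation like the one above, a single entry such as `java.net.SocketInputStream.socketRead0` with a count close to the pool size would stand out immediately.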
Locating the code
The HttpClient dependencies in use were:
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.5.2</version>
</dependency>
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpmime</artifactId>
    <version>4.5.2</version>
</dependency>
His code looked roughly like this:
xxx
CloseableHttpClient client = HttpClients.createDefault();
for (int i = 0; i < number; i++) {
    threadPool.execute(() -> {
        HttpGet httpGet = new HttpGet(url);
        try {
            client.execute(httpGet);
        } catch (IOException e) {
            e.printStackTrace();
        }
    });
}
xxx
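Comparing the code with the stack trace, every thread in the pool is stuck on the line client.execute(httpGet);.
Anyone familiar with HttpClient has probably already spotted the problems in this snippet.
My own humble take:
- The connection held by each HttpGet is never released;
- The CloseableHttpClient is never closed;
- No timeouts are configured.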
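Root cause analysis
Because connections are never released, HttpClient's connection pool runs out. Because no timeouts are configured either, the remaining threads can only wait, with no upper bound, for an HTTP connection to be released. In the end every thread hangs. Note that the dump above shows the threads blocked in socketRead0, that is, waiting for a response on a socket with no read timeout, so the missing socket timeout alone is enough to keep a thread stuck indefinitely.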
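The exhaustion mechanism can be imitated with nothing but the JDK. In the minimal sketch below a `Semaphore` stands in for the connection pool; this is an analogy, not HttpClient's actual implementation, and the class `LeakyPoolDemo` and its `run` method are hypothetical names chosen for the example.

```java
import java.util.concurrent.Semaphore;

public class LeakyPoolDemo {

    /**
     * Issues `requests` fake requests against a pool with `poolSize` slots
     * and returns how many of them managed to obtain a connection.
     */
    public static int run(int poolSize, int requests, boolean release) {
        Semaphore pool = new Semaphore(poolSize);
        int served = 0;
        for (int i = 0; i < requests; i++) {
            // tryAcquire() fails immediately when the pool is empty; a real pool
            // without a connection request timeout would block here instead
            if (!pool.tryAcquire()) {
                continue;
            }
            try {
                served++; // pretend to send the request
            } finally {
                if (release) {
                    pool.release(); // hand the connection back to the pool
                }
            }
        }
        return served;
    }

    public static void main(String[] args) {
        System.out.println("leaking:   " + run(2, 5, false)); // prints "leaking:   2"
        System.out.println("releasing: " + run(2, 5, true));  // prints "releasing: 5"
    }
}
```

With a leak, only as many requests succeed as the pool has slots; every later caller finds the pool empty. Releasing in a `finally` block keeps all five requests flowing through the same two slots.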
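A simple rewrite of the code looks roughly like this:
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.http.client.HttpRequestRetryHandler;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

// Illustrative class name
public class HttpClientDemo {

    private static final AtomicInteger counter = new AtomicInteger(0);
    private static final String url = "http://www.baidu.com/";

    public static void main(String[] args) {
        // Total number of requests
        int number = 100;
        // Number of worker threads
        int concurrent = 10;
        // Thread pool
        ExecutorService threadPool = Executors.newFixedThreadPool(concurrent);
        CountDownLatch latch = new CountDownLatch(number);
        // Never retry a failed request
        HttpRequestRetryHandler myRetryHandler = (exception, executionCount, context) -> false;
        // Apply the timeouts and the retry handler to the client
        CloseableHttpClient client = HttpClients.custom()
                .setDefaultRequestConfig(getRequestConfig())
                .setRetryHandler(myRetryHandler)
                .build();
        for (int i = 0; i < number; i++) {
            threadPool.execute(() -> {
                HttpGet httpGet = new HttpGet(url);
                try {
                    client.execute(httpGet);
                } catch (IOException e) {
                    e.printStackTrace();
                } finally {
                    // Release the connection held by this request
                    httpGet.releaseConnection();
                    System.out.println("Finished " + counter.incrementAndGet());
                    latch.countDown();
                }
            });
        }
        threadPool.shutdown();
        try {
            latch.await();
            System.out.println("All tasks finished");
        } catch (InterruptedException e) {
            e.printStackTrace();
        } finally {
            // Close the client once no more requests will be sent
            try {
                client.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    // Timeout settings, all in milliseconds
    public static RequestConfig getRequestConfig() {
        return RequestConfig.custom()
                .setSocketTimeout(3000)            // max wait for response data
                .setConnectTimeout(3000)           // max wait to establish the connection
                .setConnectionRequestTimeout(3000) // max wait for a connection from the pool
                .build();
    }
}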
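What the socket timeout protects against can be reproduced with plain JDK sockets, since HttpClient's socket timeout corresponds to SO_TIMEOUT on the underlying socket. The following is a minimal sketch under that assumption: a local server accepts the TCP connection but never writes a byte, and the client read gives up only because `setSoTimeout` was called. The class `ReadTimeoutDemo` and the method `readTimesOut` are illustrative names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {

    /** Returns true if a read from a silent server ends with a timeout. */
    public static boolean readTimesOut(int soTimeoutMillis) {
        // The connection is completed through the listen backlog; nothing is ever sent
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = new Socket(silentServer.getInetAddress(), silentServer.getLocalPort())) {
            socket.setSoTimeout(soTimeoutMillis); // 0 would mean "wait forever"
            socket.getInputStream().read();       // blocks in the native socket read
            return false;                         // only reached if data or EOF arrived
        } catch (SocketTimeoutException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

Without the `setSoTimeout` call, the `read()` would stay blocked for as long as the server stays silent, which matches the `socketRead0` frames at the top of the dump.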
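Why couldn't the unit test reproduce it?
This was the colleague's last remaining question.
The reason appears to be simple: when Spring ran his unit test, the thread pool was not really created, so the tasks still executed in single-threaded mode.
So when multi-threading is involved, it is better to start the application and test it there.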
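The difference between the two modes is easy to see in isolation. The sketch below does not claim anything about how Spring wires a test context; it only contrasts a hypothetical inline executor (`Runnable::run`) with a real thread pool, using made-up names `ExecutorComparison` and `runsOnCallerThread`.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class ExecutorComparison {

    /** Submits one task and reports whether it ran on the submitting thread. */
    public static boolean runsOnCallerThread(Executor executor) {
        Thread caller = Thread.currentThread();
        AtomicReference<Thread> worker = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        executor.execute(() -> {
            worker.set(Thread.currentThread());
            done.countDown();
        });
        try {
            // Bounded wait so that a stuck task fails fast instead of hanging
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("task did not run");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return worker.get() == caller;
    }

    public static void main(String[] args) {
        Executor inline = Runnable::run; // runs each task on the calling thread
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println("inline executor on caller thread: " + runsOnCallerThread(inline));
            System.out.println("thread pool on caller thread:     " + runsOnCallerThread(pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

When tasks run inline, requests are issued strictly one after another, so problems that need several requests in flight at once may never show up.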