Note: the project uses HttpClient 3.1.
Test
A typical HttpClient usage example:
MultiThreadedHttpConnectionManager manager = new MultiThreadedHttpConnectionManager();
HttpClient client = new HttpClient(manager);
client.setConnectionTimeout(30000);
client.setTimeout(30000);
GetMethod get = new GetMethod("http://download.jboss.org/jbossas/7.0/jboss-7.0.0.Alpha1/jboss-7.0.0.Alpha1.zip");
try {
client.executeMethod(get); // send the request
String result = get.getResponseBodyAsString(); // read the response body
} catch (Exception e) {
} finally {
get.releaseConnection(); // release the connection
}
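The URL here points to a download of nearly 20 MB, and it quickly becomes clear that the calling thread blocks for a long time. A timeout mechanism is needed.
Analysis
HttpClient 3.1 currently supports only three timeout settings:
connectionTimeout: the timeout for establishing the socket connection. The HttpClient package creates the socket connection on a separate thread, and this timeout bounds that step.
timeoutInMilliseconds: the timeout for reading data from the socket, applied through socket.setSoTimeout(timeout).
httpConnectionTimeout: when MultiThreadedHttpConnectionManager is used, the timeout for obtaining a connection from the connection pool.
Looking at the problem, what we need is a timeout for the whole HttpClient exchange: the sum of the time spent sending the request, parsing the HTTP headers, and reading the response stream.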
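To see why the socket read timeout alone does not bound the whole exchange, here is a minimal JDK-only sketch. It does not use HttpClient; the class PerReadTimeoutDemo, the method readSlowResponse and the local "slow server" are made up for illustration. The server sends one byte at a time with a pause before each one, and the client reads with an SO_TIMEOUT longer than a single pause but shorter than the whole transfer.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

class PerReadTimeoutDemo {

    /**
     * Starts a local server that sends `chunks` bytes, pausing `pauseMillis`
     * before each one, and reads them with SO_TIMEOUT = soTimeoutMillis.
     * Returns the total time the client spent, in milliseconds.
     */
    static long readSlowResponse(final int chunks, final long pauseMillis, int soTimeoutMillis)
            throws Exception {
        final ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
        Thread slowServer = new Thread(new Runnable() {
            @Override
            public void run() {
                try (Socket peer = server.accept()) {
                    OutputStream out = peer.getOutputStream();
                    for (int i = 0; i < chunks; i++) {
                        Thread.sleep(pauseMillis);
                        out.write('x');
                        out.flush();
                    }
                } catch (Exception ignored) {
                    // the demo only looks at the client side
                }
            }
        }, "slow-server");
        slowServer.setDaemon(true);
        slowServer.start();

        long start = System.nanoTime();
        try (Socket socket = new Socket()) {
            // connect timeout: bounds only the connection setup
            socket.connect(
                    new InetSocketAddress(InetAddress.getLoopbackAddress(), server.getLocalPort()),
                    1000);
            // read timeout: bounds each individual read(), not the whole transfer
            socket.setSoTimeout(soTimeoutMillis);
            InputStream in = socket.getInputStream();
            while (in.read() >= 0) {
                // discard the byte; a SocketTimeoutException would propagate from read()
            }
        } finally {
            server.close();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        long elapsed = readSlowResponse(10, 100, 500);
        System.out.println("read finished after " + elapsed + " ms with SO_TIMEOUT = 500 ms");
    }
}
```

Each read() returns within the read timeout, so no SocketTimeoutException is thrown even though the transfer as a whole takes longer than that timeout. A slow 20 MB download behaves the same way.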
The goal is clear. The revised test code:
final MultiThreadedHttpConnectionManager manager = new MultiThreadedHttpConnectionManager();
final HttpClient client = new HttpClient(manager);
client.setConnectionTimeout(30000);
client.setTimeout(30000);
final GetMethod get = new GetMethod(
"http://download.jboss.org/jbossas/7.0/jboss-7.0.0.Alpha1/jboss-7.0.0.Alpha1.zip");
Thread t = new Thread(new Runnable() {
@Override
public void run() {
try {
client.executeMethod(get);
String result = get.getResponseBodyAsString();
} catch (Exception e) {
// ignore
}
}
}, "Timeout guard");
t.setDaemon(true);
t.start();
try {
t.join(5000L); // wait at most 5 s for the worker to finish
} catch (InterruptedException e) {
System.out.println("out finally start");
((MultiThreadedHttpConnectionManager) client.getHttpConnectionManager()).shutdown();
System.out.println("out finally end");
}
if (t.isAlive()) {
System.out.println("out finally start");
((MultiThreadedHttpConnectionManager) client.getHttpConnectionManager()).shutdown();
System.out.println("out finally end");
t.interrupt();
// throw new TimeoutException();
}
System.out.println("done");
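Here Thread.join sets the timeout to 5000 ms, which is a rather old idiom. If you are familiar with the java.util.concurrent package, you can use Future and ThreadPoolExecutor directly for the asynchronous handling, which also pools the corresponding threads.
Example using java.util.concurrent: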
ExecutorService service = Executors.newCachedThreadPool();
Future<String> future = service.submit(new Callable<String>() {
@Override
public String call() throws Exception {
try {
client.executeMethod(get);
return get.getResponseBodyAsString();
} catch (Exception e) {
e.printStackTrace();
} finally {
System.out.println("future finally start");
((MultiThreadedHttpConnectionManager) client.getHttpConnectionManager()).shutdown();
System.out.println("future finally end");
}
return "";
}
});
try {
future.get(5000, TimeUnit.MILLISECONDS);
} catch (Exception e) {
System.out.println("out finally");
e.printStackTrace();
((MultiThreadedHttpConnectionManager) client.getHttpConnectionManager()).shutdown();
System.out.println("out finally end");
}
service.shutdown();
Note: why the connection is not released with get.releaseConnection() here
Look at the implementation of releaseConnection:
public void releaseConnection() {
if (responseStream != null) {
try {
// FYI - this may indirectly invoke responseBodyConsumed.
responseStream.close(); // closes the stream first
} catch (IOException e) {
// the connection may not have been released, let's make sure
ensureConnectionRelease();
}
} else {
// Make sure the connection has been released. If the response
// stream has not been set, this is the only way to release the
// connection.
ensureConnectionRelease();
}
}
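It closes the responseStream first, and that is the problem.
The responseStream is assigned in readResponseBody(HttpConnection conn). For an ordinary HTML page, what comes back is usually a ContentLengthInputStream object.
When its close method is called, ContentLengthInputStream uses ChunkedInputStream.exhaustInputStream to read all the remaining stream data:
public void close() throws IOException {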
if (!closed) {
try {
ChunkedInputStream.exhaustInputStream(this);
} finally {
// close after above so that we don't throw an exception trying
// to read after closed!
closed = true;
}
}
}
The code of ChunkedInputStream.exhaustInputStream:
static void exhaustInputStream(InputStream inStream) throws IOException {
// read and discard the remainder of the message
byte buffer[] = new byte[1024];
while (inStream.read(buffer) >= 0) {
;
}
}
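The effect of this drain-on-close behaviour can be reproduced with plain JDK streams. The sketch below is only an illustration: DrainingInputStream and CountingInputStream are hypothetical stand-ins, not HttpClient classes, and an in-memory byte array plays the role of the 20 MB response body.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class DrainOnCloseDemo {

    /** Counts every byte pulled from the wrapped stream. */
    static class CountingInputStream extends FilterInputStream {
        long bytesRead = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b >= 0) bytesRead++;
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) bytesRead += n;
            return n;
        }
    }

    /** Hypothetical stand-in for a body stream whose close() drains the remainder. */
    static class DrainingInputStream extends FilterInputStream {
        private boolean closed = false;

        DrainingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() throws IOException {
            if (!closed) {
                try {
                    byte[] buffer = new byte[1024];
                    while (in.read(buffer) >= 0) {
                        // read and discard the rest of the body
                    }
                } finally {
                    closed = true;
                }
            }
        }
    }

    /** Reads `wanted` bytes of a `total`-byte body, closes it, and returns the bytes pulled from the source. */
    static long bytesPulledFromSource(int total, int wanted) throws IOException {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[total]));
        InputStream body = new DrainingInputStream(source);
        for (int i = 0; i < wanted; i++) {
            body.read();
        }
        body.close(); // pulls everything that is left
        return source.bytesRead;
    }

    public static void main(String[] args) throws IOException {
        // only 10 bytes are consumed by the caller, yet close() reads the rest
        System.out.println(bytesPulledFromSource(20 * 1024 * 1024, 10)); // prints 20971520
    }
}
```

With a real socket behind the stream, that final close() takes as long as downloading the rest of the body, which is why releasing the connection this way does not help a timeout.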
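Notes:
Code that is not blocked in an interruptible call such as sleep or park does not react to interruption, so the Thread.interrupt() triggered after an ordinary Future timeout has no effect on a thread stuck in a socket read.
The default SimpleHttpConnectionManager does not support this kind of operation, so MultiThreadedHttpConnectionManager.shutdown() is used instead to forcibly close the input and output streams of the socket underlying the HttpConnection.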
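The difference between interrupting the reader and closing its socket can be checked with the JDK alone. This is a rough sketch, assuming an ordinary (platform) thread and a classic blocking java.net.Socket; the class CloseUnblocksReadDemo and its local silent server are invented for the demonstration, and closing the client socket stands in for what shutting down the connection manager achieves.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class CloseUnblocksReadDemo {

    /** Returns {aliveAfterInterrupt, aliveAfterClose} for a thread blocked in a socket read. */
    static boolean[] run() throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
             Socket silentPeer = server.accept()) {

            Thread reader = new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        client.getInputStream().read(); // blocks: the peer never writes
                    } catch (IOException endedByClose) {
                        // closing the socket from another thread ends the read with an exception
                    }
                }
            }, "blocked-reader");
            reader.setDaemon(true);
            reader.start();

            Thread.sleep(200);   // give the reader time to block in read()
            reader.interrupt();  // only sets the interrupt flag
            reader.join(500);
            boolean aliveAfterInterrupt = reader.isAlive();

            client.close();      // closing the socket forces the blocked read to return
            reader.join(2000);
            boolean aliveAfterClose = reader.isAlive();

            return new boolean[] { aliveAfterInterrupt, aliveAfterClose };
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] result = run();
        System.out.println("alive after interrupt: " + result[0]);
        System.out.println("alive after close: " + result[1]);
    }
}
```

Under those assumptions the reader should still be alive after interrupt() and finished after close(), which matches the choice of shutting down the connection manager instead of relying on interruption.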
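Summary
It helps to understand why HttpClient is designed this way: to support socket reuse, keep-alive and so on, it has to make sure that data left over from the previous response cannot affect a new request.
Thread.interrupt() only wakes a thread that is in a sleep or wait state (per the API documentation). Calling it does not automatically raise InterruptedException; typically the code has to check Thread.isInterrupted() itself and then throw. Some of the JDK concurrent classes we use, such as future.cancel, behave the same way.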
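As an illustration of that last point, the following sketch shows a task that never sleeps or waits and therefore has to poll the interrupt flag itself. The class CooperativeInterruptDemo and the method cancelBusyTask are hypothetical names, and the busy loop is just a placeholder for real work.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

class CooperativeInterruptDemo {

    /**
     * Runs a busy loop, cancels it after timeoutMillis, and returns true if the
     * loop noticed the interrupt flag and the worker thread terminated.
     */
    static boolean cancelBusyTask(long timeoutMillis) throws Exception {
        final AtomicBoolean sawInterrupt = new AtomicBoolean(false);
        ExecutorService service = Executors.newSingleThreadExecutor();
        Future<Long> future = service.submit(new Callable<Long>() {
            @Override
            public Long call() throws InterruptedException {
                long counter = 0;
                while (true) {
                    counter++;
                    // no sleep/wait in this loop, so the flag has to be checked explicitly
                    if (Thread.currentThread().isInterrupted()) {
                        sawInterrupt.set(true);
                        throw new InterruptedException("cancelled after " + counter + " iterations");
                    }
                }
            }
        });
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker, which only sets its flag
        }
        service.shutdown();
        boolean terminated = service.awaitTermination(2, TimeUnit.SECONDS);
        return terminated && sawInterrupt.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("task stopped cooperatively: " + cancelBusyTask(200));
    }
}
```

If the isInterrupted() check were removed, future.cancel(true) would still return, but the loop would keep running and the executor would never terminate.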