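This problem cost me nearly 8 hours, from 9 in the morning until 4:30 in the afternoon, with a lunch break in between.
The original program reads a task list, submits the tasks one by one to a thread pool, and lets them run independently.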
ExecutorService executor = Executors.newFixedThreadPool(10);
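That is already multithreaded.
To go even faster, whenever a thread ran into several subtasks, I wanted it to use another thread pool so that the subtasks could run concurrently.
So I wrote it the same way as before.
// Build the thread pool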
ExecutorService executor = Executors.newFixedThreadPool(5);
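A minimal sketch of how the two pools nest, using only the JDK. The class name NestedPools, the methods runTask and runAll, and the placeholder subtask bodies are hypothetical and only illustrate the layout described above (an outer pool of 10, an inner pool of 5 per task); the real program does HTTP work in the subtasks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class NestedPools {
    /** Runs the subtasks of one task on a small inner pool and collects the results. */
    static List<String> runTask(String task, int subtaskCount) throws Exception {
        ExecutorService inner = Executors.newFixedThreadPool(5);
        try {
            List<Callable<String>> subtasks = new ArrayList<>();
            for (int i = 0; i < subtaskCount; i++) {
                final int n = i;
                subtasks.add(() -> task + "/sub" + n); // placeholder for the real work
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every subtask has finished
            for (Future<String> f : inner.invokeAll(subtasks)) {
                results.add(f.get());
            }
            return results;
        } finally {
            inner.shutdown();
        }
    }

    /** Submits every task to the outer pool; each task fans out to its own inner pool. */
    static int runAll(List<String> tasks) throws Exception {
        ExecutorService outer = Executors.newFixedThreadPool(10);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (String t : tasks) {
                futures.add(outer.submit(() -> runTask(t, 3)));
            }
            int total = 0;
            for (Future<List<String>> f : futures) {
                total += f.get().size();
            }
            return total;
        } finally {
            outer.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(Arrays.asList("a", "b", "c")));
    }
}
```

Anything that the subtasks of one task share (here nothing, in the real program an HttpClient) is touched by up to 5 threads at once, which is where the trouble starts.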
This time I made the pool a bit smaller, 5 threads. But it kept failing with baffling errors, for example:
java.io.IOException: Stream closed
java.io.IOException: Attempted read on closed stream.
and so on.
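The stack trace in Eclipse usually pointed at the following line:
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
and that line is library code inside httpclient.executeMethod itself.
But how could this call fail? I have used it countless times, it is wrapped in a catch block, and I check the returned status code as well.
After a lot of googling I finally found a post describing a similar problem. I quote it below: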
---------------------------------------------------------------------------------------------------------------------
java.io.IOException: Stream closed
Hi Pierre-Alain,
> I'm trying to stream a GET response directly to a PUT method but I get a
> "java.io.IOException: Stream closed" exception in order to synchronize 2
> DAV repositories. I would like to avoid to deal with temporary files.
>
> You can found my attempt here : http://pastebin.com/f5cef9454
>
> Can somebody point me where I'm wrong?
You are using a single HttpClient with the default connection manager
to do two things at the same time. That cannot work. The default is
SimpleHttpConnectionManager [1], which always has only one connection.
First, you use that connection to execute the GET and obtain the result.
But when you try to use that same connection for sending the PUT, it
gets closed and re-opened to the new target. Then when it is time to
read from the GET, the stream is detected as closed.
Use two different HttpClient objects for the two requests, or create
one HttpClient object with a MultiThreadedHttpConnectionManager [2].
You might want to study the Threading Guide [3] first.
cheers,
Roland
[1] http://hc.apache.org/httpclient-3.x/apidocs/org/apache/commons/httpclient/SimpleHttpConnectionManager.html
[2] http://hc.apache.org/httpclient-3.x/apidocs/org/apache/commons/httpclient/MultiThreadedHttpConnectionManager.html
[3] http://hc.apache.org/httpclient-3.x/threading.html
-----------------------------------------------------------------------------------------------------
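The failure Roland describes can be reproduced in miniature without any HTTP at all. The sketch below is a toy model, not HttpClient code: ToyStream and SingleConnectionManager are hypothetical stand-ins that only mimic the behaviour "one connection, closed and reused for the next request".

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class SingleConnectionDemo {
    /** Toy response stream that refuses reads once its connection is gone. */
    static final class ToyStream extends InputStream {
        private final InputStream data;
        private boolean closed;

        ToyStream(String body) {
            data = new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public int read() throws IOException {
            if (closed) {
                throw new IOException("Stream closed");
            }
            return data.read();
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Toy manager that owns exactly one connection (assumption: mimics the default setup). */
    static final class SingleConnectionManager {
        private ToyStream current;

        ToyStream execute(String responseBody) {
            if (current != null) {
                current.close(); // the only connection is recycled for the new request
            }
            current = new ToyStream(responseBody);
            return current;
        }
    }

    /** Starts a "GET", then a "PUT" on the same manager, then tries to read the GET body. */
    static String demo() {
        SingleConnectionManager manager = new SingleConnectionManager();
        InputStream getBody = manager.execute("GET body");
        manager.execute("PUT response"); // silently invalidates getBody
        try {
            getBody.read();
            return "read ok";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The second request does not fail; the first one does, later, when it finally tries to read. That matches the confusing stack traces above.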
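So the root cause is that in a multithreaded environment HttpClient cannot be used with its default single-connection setup; it has to be given the multithreaded connection manager, like this: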
MultiThreadedHttpConnectionManager connectionManager =
new MultiThreadedHttpConnectionManager();
HttpClient client = new HttpClient(connectionManager);
The third reference above explains this very clearly, so I quote it here as well.
Source: http://hc.apache.org/httpclient-3.x/threading.html, the official HttpClient site.
Introduction
This document provides an overview of how to use HttpClient safely from within a multi-threaded environment. It is broken down into the following main sections:
Please see the MultiThreadedExample for a concrete example.
MultiThreadedHttpConnectionManager
The main reason for using multiple threads in HttpClient is to allow the execution of multiple methods at once (simultaneously downloading the latest builds of HttpClient and Tomcat, for example). During execution each method uses an instance of an HttpConnection. Since connections can only be safely used from a single thread and method at a time and are a finite resource, we need to ensure that connections are properly allocated to the methods that require them. This job goes to the MultiThreadedHttpConnectionManager.
To get started one must create an instance of the MultiThreadedHttpConnectionManager and give it to an HttpClient. This looks something like:
MultiThreadedHttpConnectionManager connectionManager =
new MultiThreadedHttpConnectionManager();
HttpClient client = new HttpClient(connectionManager);
This instance of HttpClient can now be used to execute multiple methods from multiple threads. Each subsequent call to HttpClient.executeMethod() will go to the connection manager and ask for an instance of HttpConnection. This connection will be checked out to the method and as a result it must also be returned. More on this below in Connection Release.
Options
The MultiThreadedHttpConnectionManager supports the following options:
connectionStaleCheckingEnabled | The connectionStaleCheckingEnabled flag to set on all created connections. This value should be left true except in special circumstances. Consult the HttpConnection docs for more detail.
maxConnectionsPerHost | The maximum number of connections that will be created for any particular HostConfiguration. Defaults to 2.
maxTotalConnections | The maximum number of active connections. Defaults to 20.
In general the connection manager makes an attempt to reuse connections for a particular host while still allowing different connections to be used simultaneously. Connections are reclaimed using a least recently used approach.
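The per-host limit is worth keeping in mind with a pool of 5 worker threads: with the default of 2, at most two requests to the same host should hold a connection at once, and, as far as I understand it, the others wait their turn. The sketch below models that with a plain Semaphore. It is a toy, not the connection manager itself; PerHostLimitDemo and peakConcurrency are hypothetical names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PerHostLimitDemo {
    /** Returns the highest number of workers that held a "connection" at the same time. */
    static int peakConcurrency(int workers, int maxConnectionsPerHost)
            throws InterruptedException {
        Semaphore permits = new Semaphore(maxConnectionsPerHost);
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    permits.acquire(); // wait until a connection is free
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                try {
                    int now = inUse.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    Thread.sleep(20); // pretend to perform a request
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inUse.decrementAndGet();
                    permits.release(); // hand the connection back
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("peak = " + peakConcurrency(5, 2));
    }
}
```

If all subtasks hit one host, raising maxConnectionsPerHost to match the inner pool size is probably needed before the extra threads buy any speed.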
Connection Release
One main side effect of connection management is that connections must be manually released when no longer used. This is due to the fact that HttpClient cannot determine when a method is no longer using its connection. This occurs because a method's response body is not read directly by HttpClient, but by the application using HttpClient. When the response is read it must obviously make use of the method's connection. Thus, a connection cannot be released from a method until the method's response body is read which is after HttpClient finishes executing the method. The application therefore must manually release the connection by calling releaseConnection() on the method after the response body has been read. To safely ensure connection release HttpClient should be used in the following manner:
MultiThreadedHttpConnectionManager connectionManager =
new MultiThreadedHttpConnectionManager();
HttpClient client = new HttpClient(connectionManager);
...
// and then from inside some thread executing a method
GetMethod get = new GetMethod("http://httpcomponents.apache.org/");
try {
client.executeMethod(get);
// print response to stdout
System.out.println(get.getResponseBodyAsString());
} finally {
// be sure the connection is released back to the connection
// manager
get.releaseConnection();
}
Particularly, notice that the connection is released regardless of what the result of executing the method was or whether or not an exception was thrown. For every call to HttpClient.executeMethod there must be a matching call to method.releaseConnection().
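To see why the release belongs in a finally block, here is a toy model of a pool with two connections where every second request fails while reading. ReleaseDemo, ToyPool, handle and starved are hypothetical names, and the pool is a bare Semaphore rather than the real connection manager.

```java
import java.util.concurrent.Semaphore;

public class ReleaseDemo {
    /** Toy connection pool: a fixed number of permits, nothing more. */
    static final class ToyPool {
        private final Semaphore free;

        ToyPool(int size) {
            free = new Semaphore(size);
        }

        boolean checkOut() {
            return free.tryAcquire();
        }

        void release() {
            free.release();
        }

        int available() {
            return free.availablePermits();
        }
    }

    /** Simulated request handling: odd-numbered requests fail. */
    static void handle(int i) {
        if (i % 2 == 1) {
            throw new IllegalStateException("simulated read failure");
        }
    }

    /** Returns how many requests found no free connection. */
    static int starved(ToyPool pool, int requests, boolean useFinally) {
        int starved = 0;
        for (int i = 0; i < requests; i++) {
            if (!pool.checkOut()) {
                starved++;
                continue;
            }
            try {
                handle(i);
                if (!useFinally) {
                    pool.release(); // only reached when handle() succeeds
                }
            } catch (IllegalStateException e) {
                // the error is handled, but without finally the connection leaks
            } finally {
                if (useFinally) {
                    pool.release(); // runs on success and on failure
                }
            }
        }
        return starved;
    }

    public static void main(String[] args) {
        System.out.println("without finally: " + starved(new ToyPool(2), 6, false));
        System.out.println("with finally:    " + starved(new ToyPool(2), 6, true));
    }
}
```

Without finally, each failed request keeps its connection, so after two failures the pool is empty and later requests get nothing. With finally, the pool is back at full size after every request.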
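So why did my earlier multithreaded code never hit this problem?
It turns out the original code created a separate HttpClient object in each thread.
This time, when a worker thread spawned its own child threads, I did not give each child thread its own HttpClient object
but let them all share one. Naturally there was not enough to go around and they got in each other's way: one thread finished reading and closed the connection,
so the others could no longer read from it, and that is where the errors came from.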
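The "one client per thread" arrangement that kept the original code safe can also be written down explicitly with a ThreadLocal. This is a sketch under assumptions: ToyClient is a hypothetical stand-in for any client object that must not be shared, and distinctClients exists only to show that each pool thread gets its own instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerThreadClientDemo {
    /** Stand-in for a client that is not safe to share between threads. */
    static final class ToyClient {
    }

    /** Runs the tasks and returns how many distinct client instances were used. */
    static int distinctClients(int threads, int tasks) throws Exception {
        // Each thread lazily creates its own client the first time it asks for one.
        ThreadLocal<ToyClient> perThread = ThreadLocal.withInitial(ToyClient::new);
        Set<ToyClient> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(() -> {
                    seen.add(perThread.get()); // never the same object as another thread's
                }));
            }
            for (Future<?> f : futures) {
                f.get();
            }
        } finally {
            pool.shutdown();
        }
        return seen.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(distinctClients(3, 12) + " clients for 3 threads");
    }
}
```

Either approach avoids the shared single connection: a client per thread as here, or one shared client backed by MultiThreadedHttpConnectionManager as in the quoted guide.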