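I won't say much about HttpClient itself: it can be downloaded from the Apache website, and documentation for it is plentiful online. For HttpClient-3.1, I think one of the more important parts is configuring the HTTP parameters. In fact, the default parameter configuration is enough for simple applications. Parameter configuration can look messy at first, but once you understand how HttpClient's parameter inheritance works, it becomes clear.
In HttpClient-3.1, the parameter inheritance hierarchy is as follows: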
global--+                            | DefaultHttpParams
        |                            |
      client                         | HttpClient
        |                            |
        +-- connection manager       | HttpConnectionManager
        |     |                      |
        |     +-- connection         | HttpConnection
        |                            |
        +-- host                     | HostConfiguration
              |                      |
              +-- method             | HttpMethod
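As the hierarchy above shows, HttpClient configures HTTP parameters at six levels: global, client, connection manager, connection, host, and method. The following explains what these six levels mean, drawing on the HttpClient-3.1 documentation and source code.
Because there are six levels, the same parameter may end up configured at more than one of them. As the documentation explains, HttpClient resolves a parameter by walking up the hierarchy: if the parameter is not set at the lowest level, the lookup moves to the next level up, and continues until it finds a level where the parameter has been set. The value that takes effect is therefore the one set at the level closest to the bottom of the hierarchy. If no lower level sets it, the global-level value applies.
When you create HttpClient and HttpMethod instances, you can execute the HttpMethod without configuring anything, and the global-level configuration is used. If you do configure them, any value set at a level below global overrides the global one. As the hierarchy shows, each level corresponds to an entity; from highest to lowest these are DefaultHttpParams, HttpClient, HttpConnectionManager, HttpConnection, HostConfiguration, and HttpMethod.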
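The lookup rule described above can be sketched with a small stand-alone model. This is only an illustration of "check the current level, then walk upward", not HttpClient's own implementation: the ParamLevel and ParamLookupDemo classes and their methods are hypothetical names made up for this sketch, and the timeout values are simply the ones used in the example later in this post.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of one configuration level; not HttpClient's own code.
class ParamLevel {
    private final String name;
    private final ParamLevel parent; // next level up, or null at the top
    private final Map<String, Object> values = new HashMap<>();

    ParamLevel(String name, ParamLevel parent) {
        this.name = name;
        this.parent = parent;
    }

    void set(String key, Object value) {
        values.put(key, value);
    }

    // Returns the nearest level (starting here, moving up) that sets the key,
    // or null if no level sets it.
    private ParamLevel findLevel(String key) {
        for (ParamLevel level = this; level != null; level = level.parent) {
            if (level.values.containsKey(key)) {
                return level;
            }
        }
        return null;
    }

    // Value that takes effect when looked up from this level.
    Object get(String key) {
        ParamLevel level = findLevel(key);
        return level == null ? null : level.values.get(key);
    }

    // Name of the level that supplies the value, or null if none does.
    String definedAt(String key) {
        ParamLevel level = findLevel(key);
        return level == null ? null : level.name;
    }
}

class ParamLookupDemo {
    public static void main(String[] args) {
        ParamLevel global = new ParamLevel("global", null);
        ParamLevel client = new ParamLevel("client", global);
        ParamLevel host = new ParamLevel("host", client);
        ParamLevel method = new ParamLevel("method", host);

        String key = "http.socket.timeout";

        // Only the client level sets the value: a lookup from method falls back to it.
        client.set(key, 1);
        System.out.println(method.get(key) + " (set at " + method.definedAt(key) + ")");

        // Once the method level sets its own value, that value wins.
        method.set(key, 10000);
        System.out.println(method.get(key) + " (set at " + method.definedAt(key) + ")");
    }
}
```

In this model a value set lower in the chain hides the same key set higher up, which is the behavior the example program later in this post relies on.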
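For how each level is defined and which parameters can be configured at each one, see the Apache documentation at http://hc.apache.org/httpclient-3.x/preference-api.html. Here is an example.
The example uses the http.socket.timeout parameter, which can be configured on HttpClient, HttpConnectionManager, and HttpMethod. The code is as follows: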
package org.shirdrn.test;

import java.io.IOException;

import org.apache.commons.httpclient.HostConfiguration;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpMethodBase;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.URI;
import org.apache.commons.httpclient.URIException;
import org.apache.commons.httpclient.methods.GetMethod;

public class MyHttpClient {

    private HttpClient client;
    private HttpMethodBase method;
    private String stringData;

    public MyHttpClient() {
        this.client = new HttpClient();
        this.method = new GetMethod();
    }

    public void setParams(String url) throws URIException, NullPointerException {
        // Set the parameter at the HttpClient level (1 ms)
        this.client.getParams().setParameter("http.socket.timeout", 1);
        // Set the parameter at the HttpConnectionManager level (1 ms)
        this.client.getHttpConnectionManager().getParams().setParameter("http.socket.timeout", 1);
        // Set the parameter at the HttpMethod level (10000 ms)
        this.method.getParams().setParameter("http.socket.timeout", 10000);
        // The target host is configured on the client, so GetMethod needs no URI of its own
        HostConfiguration hostConf = new HostConfiguration();
        hostConf.setHost(new URI(url, true));
        this.client.setHostConfiguration(hostConf);
    }

    public void execute() throws HttpException, IOException {
        try {
            int status = this.client.executeMethod(this.method);
            if (status == HttpStatus.SC_OK) {
                // Decode the response body as gb2312
                this.stringData = new String(this.method.getResponseBody(), "gb2312");
            }
        } finally {
            // Always release the connection once the response has been read
            this.method.releaseConnection();
        }
    }

    public void print() {
        System.out.println(this.stringData);
    }

    public static void main(String[] args) {
        MyHttpClient client = new MyHttpClient();
        try {
            client.setParams("http://www.csdn.net");
            client.execute();
            client.print();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
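The program above sets http.socket.timeout at three levels: HttpClient, HttpConnectionManager, and HttpMethod. Running it as-is downloads the page data from the specified site, which shows that the timeout set on HttpMethod (10000 ms) is the one in effect. HttpMethod is therefore the lowest of these three levels.
As a test, comment out the parameter settings for HttpConnectionManager and HttpMethod and run the program again. A socket timeout exception occurs: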
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78)
at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106)
at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116)
at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1973)
at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735)
at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at org.shirdrn.test.MyHttpClient.execute(MyHttpClient.java:35)
at org.shirdrn.test.MyHttpClient.main(MyHttpClient.java:49)
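Because the HttpClient level sets the socket timeout to 1 ms and no lower level overrides it, the program has no realistic chance of reading the data from the specified site before the timeout expires.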
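The HttpClient 3.x documentation describes http.socket.timeout as the socket timeout (SO_TIMEOUT) in milliseconds, that is, how long a read waits for data. As a rough illustration of why 1 ms produces a "Read timed out" error, here is a minimal sketch that uses only plain java.net sockets and no HttpClient classes. The ReadTimeoutDemo class and readTimesOut method are hypothetical names for this sketch, and the local server that never sends data stands in for a remote site that does not answer within the timeout.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ReadTimeoutDemo {

    // Returns true if reading from a silent peer fails with SocketTimeoutException.
    static boolean readTimesOut(int timeoutMillis) {
        // The server socket never calls accept() and never writes; the operating
        // system still completes the TCP handshake, so the client can connect.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = new Socket(server.getInetAddress(), server.getLocalPort())) {
            socket.setSoTimeout(timeoutMillis); // SO_TIMEOUT, in milliseconds
            InputStream in = socket.getInputStream();
            try {
                in.read(); // blocks, because no data ever arrives
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(1));
    }
}
```

The read gives up as soon as the timeout elapses without any data arriving, which is the same condition the stack trace above reports from inside HttpClient's response parsing.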