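I recently used HttpClient to write a crawler. I have a bit of a compulsion about new versions and always reach for the latest release (even though newer is not always nicer to use), so I went with HttpClient 4.3. Like Lucene, HttpClient changes its API a lot from one version to the next, which is a headache. Take something as basic as creating an HttpClient object: it differs between versions.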
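In 4.0 through 4.2 it looks like this (3.X, i.e. Commons HttpClient, simply used new HttpClient()):
HttpClient httpClient = new DefaultHttpClient();
In 4.3 it looks like this:
CloseableHttpClient httpClient = HttpClients.createDefault();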
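What I want to talk about is timeouts. HttpClient has three kinds of timeout settings. I have been busy lately and have not had time to summarize them properly, so I will fill that in later. Here I only cover the simplest, easiest ways to set a timeout.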
This is one way to set timeouts in 3.X:
HttpClient client = new HttpClient();
client.setConnectionTimeout(30000); // connect timeout, in milliseconds
client.setTimeout(30000); // socket (SO_TIMEOUT) timeout, in milliseconds
Another way in 3.X, through the connection manager parameters:
HttpClient httpClient = new HttpClient();
httpClient.getHttpConnectionManager().getParams().setConnectionTimeout(5000); // connect timeout, in milliseconds
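In 4.X before 4.3 (this parameter-based style is deprecated as of 4.3):
HttpClient httpClient = new DefaultHttpClient();
httpClient.getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 2000); // connect timeout, in milliseconds
httpClient.getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2000); // socket timeout: maximum wait for data, in milliseconds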
In 4.3:
CloseableHttpClient httpClient = HttpClients.createDefault();
HttpGet httpGet = new HttpGet("http://www.baidu.com"); // HTTP GET request (POST works the same way)
RequestConfig requestConfig = RequestConfig.custom().setSocketTimeout(2000).setConnectTimeout(2000).build(); // socket (data transfer) and connect timeouts, in milliseconds
httpGet.setConfig(requestConfig);
httpClient.execute(httpGet); // execute the request
BTW, if you do not set a timeout in 4.3 and the server never responds, the client waits for a very long time (more than 24 hours).
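To make the difference between the connect timeout and the socket timeout concrete, here is a minimal sketch that uses only plain java.net sockets rather than HttpClient itself, so it illustrates the underlying idea and is not HttpClient's own API. The class name SilentServerDemo and the method readTimesOut are made up for this illustration. A local server accepts the TCP connection but never sends a byte, which is roughly the "server does not respond" case; with a socket timeout set, the read gives up instead of blocking indefinitely.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class: plain JDK sockets, not Apache HttpClient.
class SilentServerDemo {

    /**
     * Connects to a local server that never writes anything, then tries to
     * read one byte. Returns true if the read gave up with a
     * SocketTimeoutException, false if the read returned.
     */
    static boolean readTimesOut(int connectTimeoutMs, int socketTimeoutMs) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the OS pick a free port. accept() is never called, but the
        // listen backlog still completes the TCP handshake, so connect() succeeds.
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket()) {
            // Connect timeout: upper bound on establishing the TCP connection.
            client.connect(
                new InetSocketAddress(loopback, silentServer.getLocalPort()),
                connectTimeoutMs);
            // Socket timeout (SO_TIMEOUT): upper bound on how long one read may
            // block. A value of 0 would mean "wait forever".
            client.setSoTimeout(socketTimeoutMs);
            try {
                client.getInputStream().read();
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        boolean timedOut = readTimesOut(2000, 500);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("timed out: " + timedOut + ", elapsed ms: " + elapsedMs);
    }
}
```

The two values play the same roles as setConnectTimeout and setSocketTimeout on RequestConfig: the first limits how long connection setup may take, the second limits how long the client waits for data once connected.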
Yesterday I ran into a problem that required setting a timeout on CloseableHttpClient. Here is what I found in the official documentation.
Create a RequestConfig:
RequestConfig defaultRequestConfig = RequestConfig.custom()
.setSocketTimeout(5000)
.setConnectTimeout(5000)
.setConnectionRequestTimeout(5000)
.setStaleConnectionCheckEnabled(true)
.build();
This configuration can be set at the client level, where it serves as the default for all requests:
CloseableHttpClient httpclient = HttpClients.custom()
.setDefaultRequestConfig(defaultRequestConfig)
.build();
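A RequestConfig set on a request replaces the client-level configuration rather than inheriting from it, so when customizing a request you need to copy the client defaults over first: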
HttpGet httpget = new HttpGet("http://www.apache.org/");
RequestConfig requestConfig = RequestConfig.copy(defaultRequestConfig)
.setProxy(new HttpHost("myotherproxy", 8080))
.build();
httpget.setConfig(requestConfig);