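1. Scope of HttpClient
- A client-side HTTP transport library based on HttpCore
- Based on classic (blocking) I/O
- Content agnostic
2. What HttpClient is not
HttpClient is not a browser. It is a client-side HTTP transport library whose purpose is to transmit and receive HTTP messages.
It does not cache content, execute JavaScript embedded in HTML pages, try to guess the content type,
reformat requests or redirect URIs, or provide any other functionality unrelated to HTTP transport.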
HttpClient fundamentals
HttpGet
The following is a simple example of executing a request with HttpClient:
HttpClient httpClient = new DefaultHttpClient();
HttpGet httpGet = new HttpGet("http://www.baidu.com");
HttpResponse response = httpClient.execute(httpGet);
HttpEntity entity = response.getEntity();
if (entity != null) {
    InputStream instream = entity.getContent();
    try {
        // do something useful
    } finally {
        // always close the content stream once it is no longer needed
        instream.close();
    }
}
URIBuilder
HttpClient provides the URIBuilder utility class for creating and modifying request URIs:
URIBuilder builder = new URIBuilder();
builder.setScheme("http").setHost("www.google.com.hk").setPath("/search")
.setParameter("q","httpclient")
.setParameter("aq","f");
URI uri = builder.build();
HttpGet httpGet = new HttpGet(uri);
System.out.println(httpGet.getURI());
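To make the result of the builder calls above concrete, the sketch below assembles the same components with the JDK's own java.net.URI instead of URIBuilder. The class name UriSketch is hypothetical, and the printed form is what java.net.URI produces; URIBuilder is expected to yield an equivalent URI for these simple parameters, but that is an assumption rather than something this sketch verifies.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriSketch {
    // Assemble the URI from scheme, authority, path and query.
    // The multi-argument constructor quotes illegal characters per component.
    static URI searchUri() {
        try {
            return new URI("http", "www.google.com.hk", "/search", "q=httpclient&aq=f", null);
        } catch (URISyntaxException e) {
            // Not expected for these fixed components
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        URI uri = searchUri();
        System.out.println(uri);            // full URI
        System.out.println(uri.getHost());  // host component
        System.out.println(uri.getQuery()); // decoded query string
    }
}
```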
HttpResponse
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,HttpStatus.SC_OK,"OK");
System.out.println(response.getProtocolVersion()+" "+response.getStatusLine().getStatusCode()+" "
+response.getStatusLine());
Header
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
HttpStatus.SC_OK, "OK");
response.addHeader("Set-Cookie", "c1=a; path=/; domain=localhost");
response.addHeader("Set-Cookie",
"c2=b; path=\"/\", c3=c; domain=\"localhost\"");
Header h1 = response.getFirstHeader("Set-Cookie");
System.out.println(h1);
Header h2 = response.getLastHeader("Set-Cookie");
System.out.println(h2);
Header[] hs = response.getHeaders("Set-Cookie");
System.out.println(hs.length);
HeaderIterator
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
HttpStatus.SC_OK, "OK");
response.addHeader("Set-Cookie",
"c1=a; path=/; domain=localhost");
response.addHeader("Set-Cookie",
"c2=b; path=\"/\", c3=c; domain=\"localhost\"");
HeaderIterator it = response.headerIterator("Set-Cookie");
while (it.hasNext()) {
    System.out.println(it.next());
}
HeaderElementIterator
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1,
HttpStatus.SC_OK, "OK");
response.addHeader("Set-Cookie",
"c1=a; path=/; domain=localhost");
response.addHeader("Set-Cookie",
"c2=b; path=\"/\", c3=c; domain=\"localhost\"");
HeaderElementIterator it = new BasicHeaderElementIterator(
response.headerIterator("Set-Cookie"));
while (it.hasNext()) {
HeaderElement elem = it.nextElement();
System.out.println(elem.getName() + " = " + elem.getValue());
NameValuePair[] params = elem.getParameters();
    for (int i = 0; i < params.length; i++) {
        System.out.println(" " + params[i]);
    }
}
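The structure that HeaderElementIterator exposes can be illustrated without the library. The following is a rough sketch, not HttpCore's parser: the class HeaderElementSketch and its parse method are hypothetical, and the splitting is deliberately naive (it does not handle commas or semicolons inside quoted strings). It treats a header value as comma-separated elements, each made of a leading name=value pair followed by semicolon-separated parameters.

```java
import java.util.ArrayList;
import java.util.List;

class HeaderElementSketch {
    // Splits a header value into elements (separated by ',') and each element
    // into its name=value pair plus parameters (separated by ';').
    // Simplification: separators inside quoted strings are not handled.
    static List<List<String>> parse(String headerValue) {
        List<List<String>> elements = new ArrayList<>();
        for (String element : headerValue.split(",")) {
            List<String> parts = new ArrayList<>();
            for (String part : element.split(";")) {
                // Trim whitespace and drop the surrounding quotes
                parts.add(part.trim().replace("\"", ""));
            }
            elements.add(parts);
        }
        return elements;
    }

    public static void main(String[] args) {
        String value = "c2=b; path=\"/\", c3=c; domain=\"localhost\"";
        for (List<String> parts : parse(value)) {
            System.out.println(parts.get(0)); // the element itself
            for (String param : parts.subList(1, parts.size())) {
                System.out.println(" " + param); // its parameters
            }
        }
    }
}
```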
HTTP entity
There are three kinds of HTTP entities, depending on where their content originates:
streamed: The content is received from a stream, or generated on the fly. In particular, this category includes entities being received from HTTP responses. Streamed entities are generally not repeatable.
self-contained: The content is in memory or obtained by means that are independent from a connection or other entity. Self-contained entities are generally repeatable. This type of entities will be mostly used for entity enclosing HTTP requests.
wrapping: The content is obtained from another entity.
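Repeatable entities: an entity is repeatable if its content can be read more than once. This is only possible with self-contained entities, for example StringEntity and ByteArrayEntity.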
StringEntity myEntity = new StringEntity("important message",
ContentType.create("text/plain", "UTF-8"));
System.out.println(myEntity.getContentType());
System.out.println(myEntity.getContentLength());
System.out.println(EntityUtils.toString(myEntity));
System.out.println(EntityUtils.toByteArray(myEntity).length);
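If the corresponding header is not available, httpentity.getContentLength() returns -1 and httpentity.getContentType() returns null.
If the Content-Type header is available, httpentity.getContentType() returns a Header object.
The input stream returned by httpentity.getContent() and the output stream passed to httpentity.writeTo(OutputStream) must be closed after use. EntityUtils.consume(httpentity) can be used to ensure that the entity content is fully consumed and the underlying stream is closed.
Calling httpUriRequest.abort() terminates the current request. The connection cannot be reused, and the resources it holds are released.
entity.setChunked(true) tells HttpClient that chunked encoding is preferred. HttpClient treats this flag as a hint only; it is ignored for protocol versions that do not support chunked encoding, such as HTTP/1.0.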
EntityUtils
EntityUtils provides the following methods for working with an HttpEntity:
static String getContentCharSet(HttpEntity entity)
static byte[] toByteArray(HttpEntity entity)
static String toString(HttpEntity entity, String defaultCharset)
static String toString(HttpEntity entity)
HttpClient provides several useful entity classes:
StringEntity, ByteArrayEntity, FileEntity (self-contained)
InputStreamEntity (streamed)
File file = new File("somefile.txt");
FileEntity entity = new FileEntity(file, ContentType.create("text/plain", "UTF-8"));
HttpPost httppost = new HttpPost("http://localhost/action.do");
httppost.setEntity(entity);
InputStreamEntity is not repeatable; its content can only be read once.
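The difference between repeatable and non-repeatable content can be shown with plain JDK streams. This is only an analogy under stated assumptions: the class RepeatableSketch and its drain method are hypothetical and no HttpClient entity classes are involved. Content held in memory can be re-read by opening a fresh stream each time, while a single stream that has been drained has nothing left to give.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class RepeatableSketch {
    // Reads the stream to the end and returns the number of bytes read.
    // try-with-resources closes the stream even if reading fails.
    static int drain(InputStream in) {
        try (InputStream stream = in) {
            int total = 0;
            byte[] buffer = new byte[1024];
            for (int n = stream.read(buffer); n != -1; n = stream.read(buffer)) {
                total += n;
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] content = "important message".getBytes(StandardCharsets.UTF_8);

        // "Self-contained": the bytes live in memory, so every read gets a fresh stream
        System.out.println(drain(new ByteArrayInputStream(content))); // 17
        System.out.println(drain(new ByteArrayInputStream(content))); // 17

        // "Streamed": one stream instance, already exhausted on the second pass
        InputStream once = new ByteArrayInputStream(content);
        System.out.println(drain(once)); // 17
        System.out.println(drain(once)); // 0
    }
}
```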
UrlEncodedFormEntity
List<NameValuePair> formparams = new ArrayList<NameValuePair>();
formparams.add(new BasicNameValuePair("param1", "value1"));
formparams.add(new BasicNameValuePair("param2", "value2"));
UrlEncodedFormEntity entity = new UrlEncodedFormEntity(formparams, "UTF-8");
HttpPost httppost = new HttpPost("http://localhost/handler.do");
httppost.setEntity(entity);
ResponseHandler
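Use a ResponseHandler to process the server response. Whether the request succeeds or throws an exception, HttpClient automatically takes care of releasing the connection.
HttpClient httpclient = new DefaultHttpClient();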
HttpGet httpget = new HttpGet("http://localhost/");
ResponseHandler<byte[]> handler = new ResponseHandler<byte[]>() {
public byte[] handleResponse(
HttpResponse response) throws ClientProtocolException, IOException {
HttpEntity entity = response.getEntity();
if (entity != null) {
return EntityUtils.toByteArray(entity);
} else {
return null;
}
}
};
byte[] response = httpclient.execute(httpget, handler);
HTTP execution context
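During request execution, HttpClient adds the following attributes to the execution context:
ExecutionContext.HTTP_CONNECTION='http.connection': the HttpConnection instance representing the connection to the target server
ExecutionContext.HTTP_TARGET_HOST='http.target_host': the HttpHost instance representing the connection target
ExecutionContext.HTTP_PROXY_HOST='http.proxy_host': the HttpHost instance representing the proxy server, if one is used
ExecutionContext.HTTP_REQUEST='http.request': the HttpRequest instance
ExecutionContext.HTTP_RESPONSE='http.response': the HttpResponse instance
ExecutionContext.HTTP_REQ_SENT='http.request_sent': a Boolean indicating whether the request has been fully transmitted to the connection target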
DefaultHttpClient httpclient = new DefaultHttpClient();
HttpContext localContext = new BasicHttpContext();
HttpGet httpget = new HttpGet("http://www.google.com/");
HttpResponse response = httpclient.execute(httpget, localContext);
HttpHost target = (HttpHost) localContext.getAttribute(
ExecutionContext.HTTP_TARGET_HOST);
System.out.println("Final target: " + target);
HttpEntity entity = response.getEntity();
EntityUtils.consume(entity);
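Exception handling
HttpClient throws two kinds of exceptions: IOException and HttpException. I/O errors are considered non-fatal and recoverable, whereas HTTP protocol errors are considered fatal and cannot be recovered from.
Idempotency
HttpClient assumes that methods which do not enclose an entity, such as GET and HEAD, are idempotent, and that entity-enclosing methods, such as POST, are not.
Automatic exception recovery
By default HttpClient attempts to recover automatically from I/O exceptions. The default recovery mechanism is limited to the few exceptions that are considered safe.
HttpClient makes no attempt to recover from logical or HTTP protocol errors (HttpException).
HttpClient automatically retries request methods that are assumed to be idempotent.
HttpClient automatically retries methods that fail with a transport exception while the HTTP request is still being transmitted to the target server,
that is, when the request has not yet been fully transmitted to the server.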
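The recovery rules listed above can be condensed into a single decision function. The sketch below is a simplified model under stated assumptions, not HttpClient's actual implementation: RetryDecisionSketch, shouldRetry, and the maxExecutions parameter are hypothetical names, and the check for whether the request was fully transmitted is left out.

```java
import java.io.IOException;

class RetryDecisionSketch {
    // Decides whether a failed request may be retried.
    // failure:        the exception raised by the failed attempt
    // idempotent:     true for methods such as GET and HEAD
    // executionCount: number of attempts made so far (1 after the first failure)
    // maxExecutions:  hypothetical upper bound on attempts
    static boolean shouldRetry(Exception failure, boolean idempotent,
                               int executionCount, int maxExecutions) {
        if (!(failure instanceof IOException)) {
            // Logical and protocol errors are treated as fatal
            return false;
        }
        if (executionCount >= maxExecutions) {
            // Give up once the attempt budget is used
            return false;
        }
        // Only requests assumed to be idempotent are retried
        return idempotent;
    }

    public static void main(String[] args) {
        IOException reset = new IOException("connection reset");
        System.out.println(shouldRetry(reset, true, 1, 3));  // idempotent request, budget left
        System.out.println(shouldRetry(reset, false, 1, 3)); // non-idempotent request
        System.out.println(shouldRetry(new IllegalStateException("protocol error"), true, 1, 3));
    }
}
```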
Request retry handler
To enable a custom exception recovery mechanism, provide an implementation of the HttpRequestRetryHandler interface. For example:
DefaultHttpClient httpclient = new DefaultHttpClient();
HttpRequestRetryHandler myRetryHandler = new HttpRequestRetryHandler() {
public boolean retryRequest(
IOException exception,
int executionCount,
HttpContext context) {
if (executionCount >= 5) {
// Do not retry if over max retry count
return false;
}
if (exception instanceof InterruptedIOException) {
// Timeout
return false;
}
if (exception instanceof UnknownHostException) {
// Unknown host
return false;
}
if (exception instanceof ConnectException) {
// Connection refused
return false;
}
if (exception instanceof SSLException) {
// SSL handshake exception
return false;
}
HttpRequest request = (HttpRequest) context.getAttribute(
ExecutionContext.HTTP_REQUEST);
boolean idempotent = !(request instanceof HttpEntityEnclosingRequest);
if (idempotent) {
// Retry if the request is considered idempotent
return true;
}
return false;
}
};
httpclient.setHttpRequestRetryHandler(myRetryHandler);
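HTTP protocol interceptors
A protocol interceptor is typically used to add one header or a group of related headers to a response message or to a request message. Interceptors
can also manipulate the content entity, for example to compress or decompress it. This is usually done with the decorator pattern, where a wrapper entity decorates the original entity. Several protocol
interceptors can be combined to form one logical unit.
Protocol interceptors can share information, such as processing state, through the HTTP execution context. They can use the context to store processing state for one request or for several consecutive requests.
Usually the order in which interceptors are executed does not matter, as long as they do not depend on a particular state of the execution context. Otherwise they must be added in the order in which they are expected to run.
Protocol interceptor implementations must be thread-safe. Like servlets, protocol interceptors should not use instance variables unless access to those variables is synchronized.
Example: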
DefaultHttpClient httpclient = new DefaultHttpClient();
HttpContext localContext = new BasicHttpContext();
AtomicInteger count = new AtomicInteger(1);
localContext.setAttribute("count", count);
httpclient.addRequestInterceptor(new HttpRequestInterceptor() {
public void process(
final HttpRequest request,
final HttpContext context) throws HttpException, IOException {
AtomicInteger count = (AtomicInteger) context.getAttribute("count");
request.addHeader("Count", Integer.toString(count.getAndIncrement()));
}
});
HttpGet httpget = new HttpGet("http://localhost/");
for (int i = 0; i < 10; i++) {
HttpResponse response = httpclient.execute(httpget, localContext);
HttpEntity entity = response.getEntity();
EntityUtils.consume(entity);
}
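The mechanics behind the example above, interceptors running in the order they were added and sharing state through a context, can be modeled with JDK types alone. This is an illustrative sketch under stated assumptions: InterceptorChainSketch, add, prepare, and demo are hypothetical names, a plain Map stands in for HttpContext, and another Map stands in for the request headers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;

class InterceptorChainSketch {
    // Interceptors receive (headers, context) and run in insertion order
    private final List<BiConsumer<Map<String, String>, Map<String, Object>>> interceptors =
            new ArrayList<>();

    void add(BiConsumer<Map<String, String>, Map<String, Object>> interceptor) {
        interceptors.add(interceptor);
    }

    // Builds the headers for one request by applying every interceptor
    Map<String, String> prepare(Map<String, Object> context) {
        Map<String, String> headers = new HashMap<>();
        for (BiConsumer<Map<String, String>, Map<String, Object>> interceptor : interceptors) {
            interceptor.accept(headers, context);
        }
        return headers;
    }

    // Prepares three consecutive requests that share one context
    static List<String> demo() {
        InterceptorChainSketch chain = new InterceptorChainSketch();
        chain.add((headers, context) -> {
            // State lives in the context, not in the interceptor itself
            AtomicInteger count = (AtomicInteger) context.get("count");
            headers.put("Count", Integer.toString(count.getAndIncrement()));
        });
        Map<String, Object> context = new HashMap<>();
        context.put("count", new AtomicInteger(1));
        List<String> seen = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            seen.add(chain.prepare(context).get("Count"));
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [1, 2, 3]
    }
}
```

Keeping the counter in the context rather than in a field of the interceptor is what lets the interceptor itself stay stateless.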
HttpParams
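HttpParams is similar to the execution context HttpContext in that both are collections of key-value pairs. They differ as follows:
HttpParams is intended to hold simple objects such as integers, doubles, strings, collections, and objects that remain unchanged at runtime.
HttpParams is expected to be used in a "write once - read many" mode.
HttpContext is intended to hold complex objects that are very likely to change during HTTP message processing.
The HttpParams hierarchy
The HttpParams of an HttpClient are global and shared by all requests it executes. The HttpParams of an HttpRequest apply to that request only and selectively override the values set on the HttpClient.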
DefaultHttpClient httpclient = new DefaultHttpClient();
httpclient.getParams().setParameter(CoreProtocolPNames.PROTOCOL_VERSION,
HttpVersion.HTTP_1_0); // Default to HTTP 1.0
httpclient.getParams().setParameter(CoreProtocolPNames.HTTP_CONTENT_CHARSET,
"UTF-8");
HttpGet httpget = new HttpGet("http://www.google.com/");
httpget.getParams().setParameter(CoreProtocolPNames.PROTOCOL_VERSION,
HttpVersion.HTTP_1_1); // Use HTTP 1.1 for this request only
httpget.getParams().setParameter(CoreProtocolPNames.USE_EXPECT_CONTINUE,
Boolean.FALSE);
httpclient.addRequestInterceptor(new HttpRequestInterceptor() {
public void process(
final HttpRequest request,
final HttpContext context) throws HttpException, IOException {
System.out.println(request.getParams().getParameter(
CoreProtocolPNames.PROTOCOL_VERSION));
System.out.println(request.getParams().getParameter(
CoreProtocolPNames.HTTP_CONTENT_CHARSET));
System.out.println(request.getParams().getParameter(
CoreProtocolPNames.USE_EXPECT_CONTINUE));
System.out.println(request.getParams().getParameter(
CoreProtocolPNames.STRICT_TRANSFER_ENCODING));
}
});
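The override behavior in the example above amounts to a two-level lookup: a request-level value wins, otherwise the client-level value is used. The sketch below models that lookup with plain maps. It is a hypothetical illustration, not the HttpParams implementation; the class LayeredParamsSketch and its set and get methods are invented for this purpose, and only the parameter names are taken from the notes.

```java
import java.util.HashMap;
import java.util.Map;

class LayeredParamsSketch {
    private final Map<String, Object> local = new HashMap<>();
    private final LayeredParamsSketch defaults; // parent layer, may be null

    LayeredParamsSketch(LayeredParamsSketch defaults) {
        this.defaults = defaults;
    }

    LayeredParamsSketch set(String name, Object value) {
        local.put(name, value);
        return this;
    }

    // Looks in this layer first, then falls back to the parent layer
    Object get(String name) {
        if (local.containsKey(name)) {
            return local.get(name);
        }
        return defaults != null ? defaults.get(name) : null;
    }

    public static void main(String[] args) {
        LayeredParamsSketch client = new LayeredParamsSketch(null)
                .set("http.protocol.version", "HTTP/1.0")
                .set("http.protocol.content-charset", "UTF-8");
        LayeredParamsSketch request = new LayeredParamsSketch(client)
                .set("http.protocol.version", "HTTP/1.1")
                .set("http.protocol.expect-continue", Boolean.FALSE);

        System.out.println(request.get("http.protocol.version"));                  // overridden per request
        System.out.println(request.get("http.protocol.content-charset"));          // inherited from client
        System.out.println(request.get("http.protocol.expect-continue"));          // set on request only
        System.out.println(request.get("http.protocol.strict-transfer-encoding")); // set nowhere
    }
}
```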
HTTP parameter beans
HttpClient provides the following parameter beans:
AuthParamBean, ClientParamBean, ConnConnectionParamBean, ConnManagerParamBean, ConnRouteParamBean, CookieSpecParamBean,
HttpConnectionParamBean, HttpProtocolParamBean
An example of using HttpProtocolParamBean:
HttpParams params = new BasicHttpParams();
HttpProtocolParamBean paramsBean = new HttpProtocolParamBean(params);
paramsBean.setVersion(HttpVersion.HTTP_1_1);
paramsBean.setContentCharset("UTF-8");
paramsBean.setUseExpectContinue(true);
System.out.println(params.getParameter(
CoreProtocolPNames.PROTOCOL_VERSION));
System.out.println(params.getParameter(
CoreProtocolPNames.HTTP_CONTENT_CHARSET));
System.out.println(params.getParameter(
CoreProtocolPNames.USE_EXPECT_CONTINUE));
System.out.println(params.getParameter(
CoreProtocolPNames.USER_AGENT));
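Overview of HTTP request execution parameters
CoreProtocolPNames.PROTOCOL_VERSION='http.protocol.version':
Defines the HTTP protocol version. The value is of type ProtocolVersion (HttpVersion is a subclass). HTTP/1.1 is used by default.
CoreProtocolPNames.HTTP_ELEMENT_CHARSET='http.protocol.element-charset':
Defines the charset used to encode HTTP protocol elements. Type String. US-ASCII is used by default.
CoreProtocolPNames.HTTP_CONTENT_CHARSET='http.protocol.content-charset':
Defines the default charset used to encode the content body. Type String. ISO-8859-1 is used by default.
CoreProtocolPNames.USER_AGENT='http.useragent':
Type String. Generated automatically by HttpClient by default.
CoreProtocolPNames.STRICT_TRANSFER_ENCODING='http.protocol.strict-transfer-encoding':
Type Boolean. Defines whether responses with an invalid Transfer-Encoding header are rejected.
CoreProtocolPNames.USE_EXPECT_CONTINUE='http.protocol.expect-continue':
Type Boolean. Activates the Expect: 100-continue handshake for entity-enclosing methods.
CoreProtocolPNames.WAIT_FOR_CONTINUE='http.protocol.wait-for-continue':
Type Integer. Defines the maximum time, in milliseconds, to wait for a 100-continue response. By default HttpClient waits 3 seconds.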