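1. Create an HttpClient instance
The main job of HttpClient is to execute HTTP request methods and obtain the response resources. An HttpClient must be instantiated before any request method can be executed. There are five main ways to do so.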
- HttpClient httpClient = HttpClients.custom().build();
- HttpClient httpClient = HttpClientBuilder.create().build();
- HttpClient httpClient = HttpClients.createSystem();
- HttpClient httpClient = HttpClients.createMinimal();
- CloseableHttpClient httpClient = HttpClients.createDefault();
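In HttpClient 4.3 and later, all of these factory methods return a CloseableHttpClient, so the client can be declared in a try-with-resources statement and closed automatically. The following is a minimal sketch under that assumption; the class name ClientFactoryDemo, the timeout values, and the user-agent string are illustrative and do not come from the text.

```java
import java.io.IOException;

import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class ClientFactoryDemo {

    // Builds a client with explicit timeouts (the values are illustrative).
    static CloseableHttpClient newClient() {
        RequestConfig config = RequestConfig.custom()
                .setConnectTimeout(5000)   // milliseconds allowed to establish a connection
                .setSocketTimeout(10000)   // milliseconds allowed to wait for data
                .build();
        return HttpClients.custom()
                .setDefaultRequestConfig(config)
                .setUserAgent("demo-crawler/0.1") // hypothetical user-agent value
                .build();
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources closes the client and releases its connections
        try (CloseableHttpClient httpClient = newClient()) {
            System.out.println("client created: " + httpClient.getClass().getSimpleName());
        }
    }
}
```

HttpClients.custom() is used here because it is the variant that exposes builder options; the other four forms in the list create a client with preset configuration.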
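2. Create an instance of the request method
HttpClient supports the HTTP methods defined in HTTP/1.1: GET, POST, HEAD, PUT, DELETE, OPTIONS, and TRACE. Each method has a corresponding class: HttpGet, HttpPost, HttpHead, HttpPut, HttpDelete, HttpOptions, and HttpTrace. Web crawlers mostly use HttpGet and HttpPost. The HttpClient source code shows that each of these classes can be instantiated in three ways (a no-argument constructor, a constructor that takes a URI, and a constructor that takes a URL string), as the following code shows.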
// First way: no-argument constructor, then set the URI
String personalUrl = "https://searchcustomerexperience.techtarget.com/info/news";
URI uri = new URIBuilder(personalUrl).build(); // build() may throw URISyntaxException
HttpGet getMethod = new HttpGet();
getMethod.setURI(uri);
System.out.println(getMethod);
// Second way: pass a URI to the constructor
HttpGet httpGetUri = new HttpGet(uri);
System.out.println(httpGetUri);
// Third way: pass the URL string to the constructor
HttpGet httpGetStr = new HttpGet(personalUrl);
System.out.println(httpGetStr);
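HttpPost, the other class commonly used in crawlers, is instantiated the same way but usually also carries a request entity. The sketch below builds a form POST without sending it; the class name PostRequestDemo, the URL http://example.com/search, and the parameter names are placeholders assumed for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;

public class PostRequestDemo {

    // Builds (but does not send) a POST request with two form fields.
    static HttpPost buildPost(String url) {
        List<NameValuePair> form = new ArrayList<>();
        form.add(new BasicNameValuePair("q", "httpclient")); // hypothetical field
        form.add(new BasicNameValuePair("page", "1"));       // hypothetical field
        HttpPost post = new HttpPost(url);
        // The form is sent as application/x-www-form-urlencoded, encoded in UTF-8
        post.setEntity(new UrlEncodedFormEntity(form, StandardCharsets.UTF_8));
        return post;
    }

    // Returns the encoded request body as a string, for inspection.
    static String bodyOf(HttpPost post) {
        try {
            return EntityUtils.toString(post.getEntity());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpPost post = buildPost("http://example.com/search");
        System.out.println(post.getMethod());
        System.out.println(post.getURI());
        System.out.println(bodyOf(post));
    }
}
```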
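3. Execute the request
With an HttpClient instance in hand, call execute(HttpUriRequest request) to execute the request; the method returns an HttpResponse. HttpClient offers three ways of doing this as well, shown in the following code.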
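// First way: pre-initialize the response, then overwrite it with the result of execute()
// (httpClient and getMethod are the instances created in steps 1 and 2)
HttpResponse httpResponse = new BasicHttpResponse(HttpVersion.HTTP_1_1,
        HttpStatus.SC_OK, "OK");
httpResponse = httpClient.execute(getMethod);
// Second way: execute with an HttpContext
// (localContext is an HttpContext, for example new BasicHttpContext())
HttpResponse httpResponse = null;
try {
    httpResponse = httpClient.execute(httpGet, localContext);
} catch (IOException e) {
    e.printStackTrace();
}
// Third way: use CloseableHttpClient and CloseableHttpResponse
CloseableHttpClient httpClient = HttpClients.createDefault();
HttpGet httpGet = new HttpGet("https://searchcustomerexperience.techtarget.com/info/news");
CloseableHttpResponse httpResponse = null;
try {
    httpResponse = httpClient.execute(httpGet);
} catch (IOException e) {
    e.printStackTrace();
}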
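Besides these three forms, the HttpClient interface also has an execute overload that takes a ResponseHandler; the client then releases the connection itself once the handler returns. The sketch below is one possible handler, not the approach used in the text. The class name HandlerDemo and the hand-built response (which stands in for a real server reply so the example runs without network access) are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.HttpStatus;
import org.apache.http.HttpVersion;
import org.apache.http.client.HttpResponseException;
import org.apache.http.client.ResponseHandler;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.StringEntity;
import org.apache.http.message.BasicHttpResponse;
import org.apache.http.util.EntityUtils;

public class HandlerDemo {

    // Returns the body for a 200 response and throws for any other status.
    static final ResponseHandler<String> BODY_HANDLER = new ResponseHandler<String>() {
        @Override
        public String handleResponse(HttpResponse response) throws IOException {
            int code = response.getStatusLine().getStatusCode();
            HttpEntity entity = response.getEntity();
            if (code != HttpStatus.SC_OK) {
                EntityUtils.consume(entity); // discard the body before failing
                throw new HttpResponseException(code, response.getStatusLine().getReasonPhrase());
            }
            // UTF-8 is only a fallback for responses that declare no charset
            return entity == null ? "" : EntityUtils.toString(entity, StandardCharsets.UTF_8);
        }
    };

    // Runs the handler against a hand-built 200 response (no network involved).
    static String demoBody() {
        HttpResponse fake = new BasicHttpResponse(HttpVersion.HTTP_1_1, HttpStatus.SC_OK, "OK");
        fake.setEntity(new StringEntity("<html>hello</html>", ContentType.TEXT_HTML));
        try {
            return BODY_HANDLER.handleResponse(fake);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demoBody());
        // With a real client the call would be:
        // String body = httpClient.execute(new HttpGet(url), BODY_HANDLER);
    }
}
```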
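4. Read the response information
Starting from the HttpResponse obtained in step 3, further methods can be called to read the response status code, the response headers, the response entity, and other information, as Program 3-14 shows. The program executes the request with an HttpContext, that is, an HTTP context.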
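// Program 3-14
import java.io.IOException;

import org.apache.http.Header;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.HttpStatus;
import org.apache.http.ProtocolVersion;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.protocol.BasicHttpContext;
import org.apache.http.protocol.HttpContext;
import org.apache.http.util.EntityUtils;

public class HttpclientInit {
    public static void main(String[] args) throws Exception {
        // Initialize the HttpContext
        HttpContext localContext = new BasicHttpContext();
        String url = "https://searchcustomerexperience.techtarget.com/info/news";
        // Initialize the httpClient; try-with-resources closes it on exit
        try (CloseableHttpClient httpClient = HttpClients.custom().build()) {
            HttpGet httpGet = new HttpGet(url);
            // Execute the request and obtain the HttpResponse
            HttpResponse httpResponse;
            try {
                httpResponse = httpClient.execute(httpGet, localContext);
            } catch (IOException e) {
                e.printStackTrace();
                return; // no response to inspect, so stop here
            }
            // Print the response details
            System.out.println("response:" + httpResponse);
            // Status line
            String status = httpResponse.getStatusLine().toString();
            System.out.println("status:" + status);
            // Status code
            int statusCode = httpResponse.getStatusLine().getStatusCode();
            System.out.println("statusCode:" + statusCode);
            // Protocol version
            ProtocolVersion protocolVersion = httpResponse.getProtocolVersion();
            System.out.println("protocolVersion:" + protocolVersion);
            // Reason phrase, for example "OK"
            String phrase = httpResponse.getStatusLine().getReasonPhrase();
            System.out.println("phrase:" + phrase);
            // Response headers
            Header[] headers = httpResponse.getAllHeaders();
            System.out.println("Response headers:");
            for (int i = 0; i < headers.length; i++) {
                System.out.println(headers[i]);
            }
            System.out.println("End of response headers");
            HttpEntity entity = httpResponse.getEntity();
            if (statusCode == HttpStatus.SC_OK) { // status code 200 means the request succeeded
                // Read the entity content. "gbk" is the fallback charset,
                // used only when the response does not declare one.
                String entityString = EntityUtils.toString(entity, "gbk");
                // Print the entity content
                System.out.println(entityString);
            }
            // Release the entity's content stream in either case
            EntityUtils.consume(entity);
        }
    }
}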
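The charset argument of EntityUtils.toString deserves a closer look: it is a default that applies only when the entity's Content-Type declares no charset, and a declared charset takes precedence. The sketch below demonstrates this with locally built entities instead of a real response; the class name CharsetFallbackDemo and the sample text are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

import org.apache.http.HttpEntity;
import org.apache.http.entity.ByteArrayEntity;
import org.apache.http.entity.ContentType;
import org.apache.http.util.EntityUtils;

public class CharsetFallbackDemo {

    // Two Chinese characters, written as escapes so the source file stays ASCII.
    static final String TEXT = "\u4e2d\u6587";
    static final Charset GBK = Charset.forName("GBK");

    // Decodes the entity, using 'fallback' only if the entity declares no charset.
    static String decode(HttpEntity entity, Charset fallback) {
        try {
            return EntityUtils.toString(entity, fallback);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // GBK-encoded bytes with no Content-Type, like a server that omits the charset.
    static HttpEntity undeclaredGbk() {
        return new ByteArrayEntity(TEXT.getBytes(GBK));
    }

    // UTF-8 bytes whose Content-Type declares charset=UTF-8.
    static HttpEntity declaredUtf8() {
        return new ByteArrayEntity(TEXT.getBytes(StandardCharsets.UTF_8),
                ContentType.create("text/html", StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // The fallback matches the real encoding, so the text is recovered
        System.out.println(TEXT.equals(decode(undeclaredGbk(), GBK)));
        // The wrong fallback with no declared charset garbles the text
        System.out.println(TEXT.equals(decode(undeclaredGbk(), StandardCharsets.UTF_8)));
        // The declared charset wins, so the GBK fallback is ignored
        System.out.println(TEXT.equals(decode(declaredUtf8(), GBK)));
    }
}
```

For a crawler this means a hard-coded fallback such as "gbk" matters mainly for sites that send no charset in the Content-Type header.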