HttpClient HelloWorld Implementation
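Earlier we introduced HttpClient, a framework used mainly to send requests to third-party servers, fetch web pages, and pull out the data we need.
So today let's work through a simple example to try it out.
First create a Maven project, then add the httpclient dependency, version 4.5.2: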
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.5.2</version>
</dependency>
package com.open1111.httpclient;

import java.io.IOException;

import org.apache.http.HttpEntity;
import org.apache.http.ParseException;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class HelloWorld {

    public static void main(String[] args) {
        CloseableHttpClient httpClient = HttpClients.createDefault(); // create the HttpClient instance
        HttpGet httpget = new HttpGet("http://www.open1111.com/"); // create the HttpGet instance
        CloseableHttpResponse response = null;
        try {
            response = httpClient.execute(httpget); // execute the GET request
        } catch (ClientProtocolException e) { // HTTP protocol error
            e.printStackTrace();
        } catch (IOException e) { // I/O error
            e.printStackTrace();
        }
        if (response == null) {
            return; // the request failed, so there is nothing to read
        }
        HttpEntity entity = response.getEntity(); // get the response entity
        try {
            // print the page content; utf-8 is used if the response declares no charset
            System.out.println("Page content: " + EntityUtils.toString(entity, "utf-8"));
        } catch (ParseException e) { // parse error
            e.printStackTrace();
        } catch (IOException e) { // I/O error
            e.printStackTrace();
        }
        try {
            response.close(); // close the stream and release system resources
        } catch (IOException e) { // I/O error
            e.printStackTrace();
        }
    }
}
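The video explains this in detail. Running the program prints the HTML source of the site's home page.
To extract specific data from that source we would use Jsoup, which a later lesson covers.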
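The "utf-8" argument passed to EntityUtils.toString is the charset it falls back on when the response does not declare one itself. As a rough illustration of why the charset matters, here is a minimal sketch that uses only the JDK, not HttpClient. The class name CharsetDemo, the decode helper, and the sample text are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDemo {

    // Decode raw response bytes into text with the given charset.
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // Hypothetical response body: four Chinese characters, written as
        // Unicode escapes so the source file stays plain ASCII.
        String original = "\u7f51\u9875\u5185\u5bb9";
        byte[] body = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the matching charset restores the original text.
        System.out.println(original.equals(decode(body, StandardCharsets.UTF_8)));

        // Decoding with a mismatched charset turns every byte into its own
        // character, so the result no longer matches the original.
        System.out.println(original.equals(decode(body, StandardCharsets.ISO_8859_1)));
    }
}
```

If a fetched page prints as unreadable characters, a mismatch of this kind between the page's real encoding and the charset used to decode it is a likely cause.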
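If you are already familiar with these exceptions, we can simplify the code by declaring them with throws instead of catching them, which makes it easier to read: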
package com.open1111.httpclient;

import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class HelloWorld2 {

    public static void main(String[] args) throws Exception {
        CloseableHttpClient httpclient = HttpClients.createDefault(); // create the HttpClient instance
        HttpGet httpget = new HttpGet("http://www.open1111.com/"); // create the HttpGet instance
        CloseableHttpResponse response = httpclient.execute(httpget); // execute the GET request
        HttpEntity entity = response.getEntity(); // get the response entity
        // print the page content; utf-8 is used if the response declares no charset
        System.out.println("Page content: " + EntityUtils.toString(entity, "utf-8"));
        response.close(); // close the stream and release system resources
    }
}
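In real development, though, each catch block usually has to do some business-specific handling for its exception, so the first style is the one to use from here on. If the crawling task is simple, the pages are easy to fetch, and the volume is small, the second style is fine. Choose according to the situation.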
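To make the idea of per-exception handling concrete, here is a minimal sketch that uses only the JDK, so it runs without a network connection. The Fetcher interface, the fetchWithRetry helper, and the retry policy are all hypothetical stand-ins and not part of HttpClient. In this sketch a timeout is treated as worth retrying, while any other I/O error is reported once and the request is abandoned.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

class RetryDemo {

    // Hypothetical stand-in for a call such as httpClient.execute(...).
    interface Fetcher {
        String fetch() throws IOException;
    }

    // Returns the page, or null when the request could not be completed.
    static String fetchWithRetry(Fetcher fetcher, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return fetcher.fetch();
            } catch (SocketTimeoutException e) {
                // Timeouts are often transient, so log and try again.
                System.err.println("Attempt " + attempt + " timed out");
            } catch (IOException e) {
                // Any other I/O failure is treated as permanent here.
                System.err.println("Giving up: " + e.getMessage());
                return null;
            }
        }
        return null; // every attempt timed out
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fake fetcher: times out twice, then succeeds on the third call.
        Fetcher flaky = () -> {
            if (++calls[0] < 3) {
                throw new SocketTimeoutException("simulated timeout");
            }
            return "<html>ok</html>";
        };
        System.out.println(fetchWithRetry(flaky, 5));
    }
}
```

The more specific exception has to be caught first, because SocketTimeoutException is a subclass of IOException. The first HelloWorld listing follows the same order when it catches ClientProtocolException before IOException.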