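When HttpClient sends a request to a server, a successful request normally returns status code 200. Not every request succeeds, though: if the requested URL does not exist, the server returns 404,
and if the server hits an internal error, it returns 500. Some servers also have anti-scraping protection; if you scrape data too frequently, they return 403 and refuse your request.
There are ways around this, of course. The next chapter will cover using proxy IPs to deal with this kind of problem.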
demo:
package com.gcx.demo.HelloWorld2;

import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class App2 {
    public static void main(String[] args) throws Exception {
        // Create the HttpClient instance; try-with-resources closes it even if an exception is thrown
        try (CloseableHttpClient httpClient = HttpClients.createDefault()) {
            HttpGet httpGet = new HttpGet("https://www.baidu.com"); // create the HTTP GET request
            // Set the User-Agent request header
            httpGet.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:50.0) Gecko/20100101 Firefox/50.0");
            // Execute the HTTP GET request; the response is closed automatically as well
            try (CloseableHttpResponse response = httpClient.execute(httpGet)) {
                System.out.println("Status:" + response.getStatusLine().getStatusCode());
                HttpEntity entity = response.getEntity(); // get the response entity
                System.out.println("Content-Type:" + entity.getContentType().getValue());
                // System.out.println("Page content:" + EntityUtils.toString(entity, "utf-8")); // read the page content
            }
        }
    }
}
Output:
Status:200
Content-Type:text/html; charset=utf-8
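
The demo only prints the status code. As a rough sketch of how the codes mentioned above (200, 403, 404, 500) might be told apart before reading the response body, the helper below maps a code to a short description. The class name StatusCodeCheck and the methods isSuccess and describe are hypothetical names made up for illustration; they are not part of HttpClient, and the sketch uses only the JDK.

```java
public class StatusCodeCheck {

    // Hypothetical helper: true for any 2xx status code.
    public static boolean isSuccess(int statusCode) {
        return statusCode >= 200 && statusCode < 300;
    }

    // Hypothetical helper: maps a status code to a short description.
    public static String describe(int statusCode) {
        switch (statusCode) {
            case 200:
                return "OK: the request succeeded";
            case 403:
                return "Forbidden: the server refused the request";
            case 404:
                return "Not Found: the requested URL does not exist";
            case 500:
                return "Internal Server Error: the server failed while handling the request";
            default:
                // Fall back to the general class of the status code
                if (isSuccess(statusCode)) {
                    return "Success";
                }
                if (statusCode >= 400 && statusCode < 500) {
                    return "Client error";
                }
                if (statusCode >= 500 && statusCode < 600) {
                    return "Server error";
                }
                return "Other";
        }
    }

    public static void main(String[] args) {
        int[] samples = {200, 403, 404, 500};
        for (int code : samples) {
            System.out.println(code + " -> " + describe(code));
        }
    }
}
```

In the demo, the value of response.getStatusLine().getStatusCode() could be passed to isSuccess, and the body read only when it returns true.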