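Java offers several ways to send HTTP requests the way a browser does. This article briefly shows how to send GET and POST requests with three tools: the JDK's built-in HttpURLConnection, Apache HttpClient, and Square's OkHttp. Note: bypassing certificate validation for HTTPS requests is out of scope here.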
HttpURLConnection
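The JDK's built-in HTTP tooling is defined in the java.net package. The core class is HttpURLConnection; its HTTPS counterpart, HttpsURLConnection, lives in javax.net.ssl. An example follows: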
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class HttpUrlConnectionTest {
    public static void main(String[] args) {
        HttpUrlConnectionTest testObj = new HttpUrlConnectionTest();
        testObj.doGet();
        testObj.doPost();
    }

    public void doGet() {
        try {
            // {xxxx} is a placeholder for the phone number to look up
            String url = "https://tcc.taobao.com/cc/json/mobile_tel_segment.htm?tel={xxxx}";
            URL obj = new URL(url);
            HttpURLConnection con = (HttpURLConnection) obj.openConnection();
            con.setRequestMethod("GET");
            // Set request headers
            con.setRequestProperty("User-Agent", "Mozilla/5.0");
            // getResponseCode() sends the request and waits for the status line
            int responseCode = con.getResponseCode();
            System.out.println("Response code: " + responseCode);
            parseResult(con.getInputStream(), "GB2312");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public void doPost() {
        try {
            String url = "https://tcc.taobao.com/cc/json/mobile_tel_segment.htm";
            URL obj = new URL(url);
            HttpURLConnection con = (HttpURLConnection) obj.openConnection();
            con.setRequestMethod("POST");
            con.setRequestProperty("User-Agent", "Mozilla/5.0");
            con.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
            String params = "tel={xxxx}";
            // Tell the connection that a request body will be written to the server
            con.setDoOutput(true);
            // try-with-resources closes the stream even if the write fails
            try (OutputStream out = con.getOutputStream()) {
                out.write(params.getBytes(StandardCharsets.UTF_8));
            }
            int responseCode = con.getResponseCode();
            System.out.println("Response code: " + responseCode);
            parseResult(con.getInputStream(), "GB2312");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Reads the response body line by line in the given charset and prints it
    private void parseResult(InputStream inStream, String charSet) {
        try (BufferedReader br = new BufferedReader(new InputStreamReader(inStream, charSet))) {
            String inputLine;
            StringBuilder response = new StringBuilder();
            while ((inputLine = br.readLine()) != null) {
                response.append(inputLine);
            }
            System.out.println(response);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
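The POST example writes a hand-built parameter string to the request body. When parameter values can contain spaces, `&`, `=`, or non-ASCII characters, they need to be URL-encoded first. The following is a minimal sketch of one way to do that with the JDK's URLEncoder (the Charset overload used here requires Java 10 or later). The class name, helper name, and field values are hypothetical and exist only for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class FormEncodingSketch {
    // Joins key/value pairs into application/x-www-form-urlencoded text,
    // URL-encoding every key and value as UTF-8.
    static String encodeForm(Map<String, String> fields) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            String key = URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8);
            String value = URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8);
            joiner.add(key + "=" + value);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Hypothetical field names and values, for illustration only
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "Zhang San");
        fields.put("query", "a&b=c");
        System.out.println(encodeForm(fields)); // name=Zhang+San&query=a%26b%3Dc
    }
}
```

The resulting string can then be written to the connection's output stream as UTF-8 bytes, in place of a hand-written parameter string.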
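Notes:
- 1. con.setRequestProperty() sets a request header, which some requests require. Many sites check the Referer header as an anti-crawling measure, and some also check User-Agent to filter out requests that do not come from a browser.
- 2. HTTP defines many request methods (GET, POST, PUT, DELETE, OPTIONS, and so on); the code above covers only the two most common ones, GET and POST.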
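As a rough illustration of both notes, the sketch below prepares a connection with a Referer header and a method other than GET or POST. The class name, the `prepare` helper, the timeout values, and the example.com URLs are assumptions made for this example. Nothing is sent over the network here, because openConnection() only creates the connection object; the request goes out later, for instance when getResponseCode() is called.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

class RequestSetupSketch {
    // Creates (but does not connect) a connection with the given method
    // and browser-like headers.
    static HttpURLConnection prepare(String url, String method, String referer) throws IOException {
        HttpURLConnection con = (HttpURLConnection) URI.create(url).toURL().openConnection();
        // Accepts GET, POST, HEAD, OPTIONS, PUT, DELETE and TRACE;
        // any other value (PATCH, for example) throws ProtocolException
        con.setRequestMethod(method);
        con.setRequestProperty("User-Agent", "Mozilla/5.0");
        con.setRequestProperty("Referer", referer);
        // Illustrative timeouts in milliseconds, so a slow server cannot block forever
        con.setConnectTimeout(5000);
        con.setReadTimeout(5000);
        return con;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical URLs, for illustration only
        HttpURLConnection con = prepare("https://example.com/api", "DELETE", "https://example.com/");
        System.out.println(con.getRequestMethod());
        System.out.println(con.getRequestProperty("Referer"));
    }
}
```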
HttpClient
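HttpClient started as a subproject of Apache Jakarta Commons and is now maintained under the Apache HttpComponents project. It is an efficient, up-to-date, feature-rich client-side toolkit for the HTTP protocol. Its main features include:
- Implements all HTTP methods (GET, POST, PUT, HEAD, etc.)
- Follows redirects automatically
- Supports HTTPS
- Supports proxy servers
An example, written against the HttpClient 4.x API: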
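import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;

public class HttpClientTest {
    public static void main(String[] args) {
        try {
            HttpClientTest testClient = new HttpClientTest();
            testClient.doGet();
            testClient.doPost();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public void doGet() throws IOException {
        // {xxxx} is a placeholder; substitute a real number first,
        // because HttpGet rejects '{' and '}' as illegal URI characters
        String url = "https://tcc.taobao.com/cc/json/mobile_tel_segment.htm?tel={xxxx}";
        // try-with-resources closes the client and the response, releasing the connection
        try (CloseableHttpClient client = HttpClients.createDefault()) {
            HttpGet request = new HttpGet(url);
            request.addHeader("User-Agent", "Mozilla/5.0");
            try (CloseableHttpResponse response = client.execute(request)) {
                parseResult(response.getEntity().getContent(), "GB2312");
            }
        }
    }

    public void doPost() throws IOException {
        String url = "https://tcc.taobao.com/cc/json/mobile_tel_segment.htm";
        try (CloseableHttpClient client = HttpClients.createDefault()) {
            HttpPost post = new HttpPost(url);
            post.setHeader("User-Agent", "Mozilla/5.0");
            // Form fields are URL-encoded by UrlEncodedFormEntity
            List<NameValuePair> params = new ArrayList<>();
            params.add(new BasicNameValuePair("tel", "{xxxx}"));
            post.setEntity(new UrlEncodedFormEntity(params, StandardCharsets.UTF_8));
            try (CloseableHttpResponse response = client.execute(post)) {
                parseResult(response.getEntity().getContent(), "GB2312");
            }
        }
    }

    // Reads the response body line by line in the given charset and prints it
    private void parseResult(InputStream inStream, String charSet) {
        try (BufferedReader br = new BufferedReader(new InputStreamReader(inStream, charSet))) {
            String inputLine;
            StringBuilder response = new StringBuilder();
            while ((inputLine = br.readLine()) != null) {
                response.append(inputLine);
            }
            System.out.println(response);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}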
OkHttp
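OkHttp, developed by Square, is an HTTP client focused on performance and ease of use. Efficiency is the primary goal of its design and implementation, and one of the main reasons to choose it. OkHttp supports GZIP by default to reduce the size of transferred content, and it provides an HTTP response cache that avoids unnecessary network requests. When the network misbehaves, OkHttp automatically retries the other IP addresses of the same host.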
import okhttp3.FormBody;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;

public class OkHttpTest {
    // One OkHttpClient is shared by both calls so they reuse its connection pool
    private final OkHttpClient client = new OkHttpClient();

    public static void main(String[] args) throws Exception {
        OkHttpTest testClient = new OkHttpTest();
        testClient.doGet();
        testClient.doPost();
    }

    public void doGet() throws Exception {
        String url = "http://wthrcdn.etouch.cn/weather_mini?citykey=101010100";
        Request request = new Request.Builder().url(url).build();
        // Closing the response releases the underlying connection
        try (Response response = client.newCall(request).execute()) {
            System.out.println(response.body().string());
        }
    }

    public void doPost() throws Exception {
        String url = "http://wthrcdn.etouch.cn/weather_mini";
        // FormBody URL-encodes the fields and sets the form Content-Type
        RequestBody requestBody = new FormBody.Builder()
                .add("citykey", "101010100").build();
        Request request = new Request.Builder().url(url).post(requestBody).build();
        try (Response response = client.newCall(request).execute()) {
            System.out.println(response.body().string());
        }
    }
}
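To give a feel for why the GZIP support mentioned in the OkHttp introduction matters, the sketch below compresses a repetitive JSON-like payload using only the JDK's java.util.zip classes. It does not use OkHttp and is not how OkHttp is implemented internally; it only illustrates the size reduction that GZIP can bring to text responses. The class name and the sample payload are made up for this example, and readAllBytes() requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipSketch {
    // Compresses a byte array into GZIP format
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so close before reading the buffer
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(raw);
        }
        return buffer.toByteArray();
    }

    // Restores the original bytes from GZIP data
    static byte[] gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical repetitive payload, similar in shape to a JSON API response
        StringBuilder json = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            json.append("{\"city\":\"Beijing\",\"temp\":25}");
        }
        byte[] raw = json.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println(raw.length + " bytes raw, " + packed.length + " bytes gzipped");
    }
}
```

With OkHttp none of this has to be written by hand: as long as the caller does not set its own Accept-Encoding header, the client requests gzip and decompresses the response body transparently.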