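When parsing Baidu search results, each result URL is now an encrypted redirect link. How can we get the real target URL?
Requesting the link directly with HttpClient does not help: the client follows the redirect automatically and returns the HTML of the target page, so the real URL is never exposed.
Solution: turn off automatic redirect handling, then read the real URL from the Location header of the redirect response.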
In HttpClient 3.x, set:
getMethod.setFollowRedirects(false);
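In HttpClient 4.x, set:
HttpClient httpclient = new DefaultHttpClient();
HttpParams params = httpclient.getParams();
params.setParameter(ClientPNames.HANDLE_REDIRECTS, false);
(DefaultHttpClient and ClientPNames are deprecated since 4.3; from that version on, HttpClientBuilder.disableRedirectHandling() has the same effect.)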
Example code (HttpClient 3.x):
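import org.apache.commons.httpclient.Header;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.methods.GetMethod;

HttpClient http = new HttpClient();
GetMethod getMethod = new GetMethod("https://www.baidu.com/link?url=tlxHtrmlLFYz5m1tLDoyQuhdQvjJy3H2Y_6gRaGdbigSKbUUakq23FBrRTG80HRPkLMUeUEBnJO-3hWzrua2RFOFQcKdBiWJFCvjfbN3-cG");
try {
    getMethod.setRequestHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0");
    // Do not follow the redirect, so the 3xx response itself is returned
    getMethod.setFollowRedirects(false);
    http.executeMethod(getMethod);
    System.out.println(getMethod.getStatusCode());
    // System.out.println(getMethod.getResponseBodyAsString());
    // Print every response header; the real URL is in the Location header
    for (Header h : getMethod.getResponseHeaders()) {
        System.out.println(h.getName() + ":" + h.getValue());
    }
} catch (Exception e) {
    e.printStackTrace();
} finally {
    // Return the connection to the connection manager
    getMethod.releaseConnection();
}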
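
The same idea can be tried without Apache HttpClient on the classpath. The following is only a minimal sketch using the JDK's own java.net.HttpURLConnection, not the method from the code above. The class name RedirectResolver and the method resolveRedirect are hypothetical names made up for this illustration, and it assumes the server answers the link with a 3xx status and a Location header.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper for illustration only; the names are not from any library.
class RedirectResolver {

    /**
     * Requests the link without following redirects and returns the redirect
     * target, or null if the response is not a 3xx with a Location header.
     */
    static String resolveRedirect(String link) throws IOException {
        URI base = URI.create(link);
        HttpURLConnection conn = (HttpURLConnection) base.toURL().openConnection();
        try {
            // JDK counterpart of setFollowRedirects(false) in HttpClient 3.x
            conn.setInstanceFollowRedirects(false);
            conn.setRequestMethod("GET");
            conn.setRequestProperty("User-Agent",
                    "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0");
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);

            int status = conn.getResponseCode();
            String location = conn.getHeaderField("Location");
            if (status < 300 || status >= 400 || location == null) {
                return null; // not a redirect, so there is nothing to resolve
            }
            // Location may be relative; resolve it against the requested URL.
            // URI.resolve throws IllegalArgumentException for a malformed value.
            return base.resolve(location).toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java RedirectResolver <link>");
            return;
        }
        System.out.println(resolveRedirect(args[0]));
    }
}
```

Run it with the encrypted link as the only argument. A printed null means the response was not a redirect, in which case the status code and the other headers are worth inspecting as in the example above.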