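When we covered fetching web pages with HttpClient earlier, we usually took the page source (the content) and stored it in a database.
For details, see the following post:
So what should we do if, besides getting the page source, we also want to save the page locally as an HTML file?
It is actually straightforward. First, let's look at the code that requests the page and reads its content: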
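As a rough sketch of the saving step on its own: once the page source is available as a String, it can be written to a local .html file with the standard java.nio.file API. The class name HtmlSaver and the method saveAsHtml are hypothetical names chosen for illustration, and writing the file as UTF-8 is an assumption. A real page may declare a different charset.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of the original post's code.
final class HtmlSaver {
    private HtmlSaver() {}

    // Writes the given page source to the target file, replacing any existing file.
    // Assumption: the content is saved as UTF-8.
    static void saveAsHtml(String content, Path target) throws IOException {
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            // Create missing parent directories; this is a no-op if they already exist.
            Files.createDirectories(parent);
        }
        Files.write(target, content.getBytes(StandardCharsets.UTF_8));
    }
}
```

A call such as HtmlSaver.saveAsHtml(content, Paths.get("pages", "index.html")) would then store the fetched source next to the database copy. The path shown here is only an example.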
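// Imports needed by this method (HttpClient 4.x):
import java.io.IOException;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.params.CoreConnectionPNames;

private static String getUrlContent(DefaultHttpClient httpPostClient,
        String urlString) throws IOException, ClientProtocolException {
    // Set the timeouts before executing the request so that they apply to it.
    httpPostClient.getParams().setParameter(
            CoreConnectionPNames.CONNECTION_TIMEOUT, 10000); // connection timeout: 10 s
    httpPostClient.getParams().setParameter(
            CoreConnectionPNames.SO_TIMEOUT, 8000); // socket read timeout: 8 s
    HttpGet httpGet = new HttpGet(urlString);
    HttpResponse httpGetResponse = httpPostClient.execute(httpGet); // HttpGet implements HttpUriRequest
    if (httpGetResponse.getStatusLine().getStatusCode() == 200) {
        HttpEntity httpEntity = httpGetResponse.getEntity();
        if (httpEntity.getContentEncoding() != null) {
            if ("gzip".eq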