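I've recently been learning Lucene, which involves writing a crawler. Here is the code I use to fetch a web resource with an HTTP client library: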
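import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.commons.httpclient.DefaultHttpMethodRetryHandler;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.params.HttpMethodParams;

public class Dsfa {
    public static void main(String[] args) {
        HttpClient client = new HttpClient();
        GetMethod getMethod = new GetMethod(
                "http://blog.csdn.net/luo_da/article/details/76135572");
        // Use the default retry handler so recoverable I/O errors are retried
        getMethod.getParams().setParameter(HttpMethodParams.RETRY_HANDLER,
                new DefaultHttpMethodRetryHandler());
        try {
            int statusCode = client.executeMethod(getMethod);
            if (statusCode != HttpStatus.SC_OK) {
                System.out.println("Fetch failed... " + getMethod.getStatusLine());
                return; // do not save an error page; finally still releases the connection
            }
            byte[] responseBody = getMethod.getResponseBody();
            // Write the response body to a local text file;
            // try-with-resources closes the stream even if write() throws
            try (FileOutputStream fileOutputStream = new FileOutputStream(
                    "content.txt")) {
                fileOutputStream.write(responseBody);
            }
        } catch (HttpException e) {
            // Protocol-level failure (HttpException is a subclass of IOException)
            e.printStackTrace();
            System.out.println("Fetch failed, please try again...");
        } catch (IOException e) {
            // Network or file I/O failure
            e.printStackTrace();
        } finally {
            // Always return the connection to the connection manager
            getMethod.releaseConnection();
        }
    }
}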
Fetching network resources this way requires the following JARs:
commons-codec-1.10.jar
commons-httpclient-3.1.jar
commons-logging-1.1.1.jar
Download link for these resources: http://download.csdn.net/detail/luo_da/9912913
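One thing worth noting about the code above: getMethod.getResponseBody() loads the entire response into memory before it is written to content.txt. For larger pages, streaming the body to disk may be a better fit, and Commons HttpClient 3.1 offers getMethod.getResponseBodyAsStream() for that purpose. Below is a minimal sketch of the file-writing half using only the JDK. The class name StreamSaver and the method save are hypothetical names chosen for illustration, and the in-memory stream in main merely stands in for the real response stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper (not part of the original post): copies any
// InputStream to a local file without buffering the whole body in memory.
public class StreamSaver {

    // Copies the stream to the target file, overwriting it if it exists,
    // and returns the number of bytes written. The stream is closed afterwards.
    public static long save(InputStream in, Path target) throws IOException {
        try (InputStream body = in) {
            return Files.copy(body, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: in the crawler this stream would come from
        // getMethod.getResponseBodyAsStream(); a byte array stands in here.
        InputStream fake = new ByteArrayInputStream(
                "hello".getBytes(StandardCharsets.UTF_8));
        long written = save(fake, Paths.get("content.txt"));
        System.out.println("Bytes written: " + written);
    }
}
```

In the crawler, the call would presumably look like StreamSaver.save(getMethod.getResponseBodyAsStream(), Paths.get("content.txt")), placed before releaseConnection() runs, since the stream is only readable while the connection is still held.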