Crawling a static page
Goal: get the title of my own blog page, "yhao的博客- 博客频道 - CSDN.NET".
First, request the page with OkHttp using a GET request:
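// Needs okhttp3.{Call, Callback, Request, Response, ResponseBody}, java.io.IOException, android.util.Log
final String url = "http://blog.csdn.net/yhaolpz?viewmode=contents";
// Request.Builder defaults to GET, so no explicit method is needed.
Request request = new Request.Builder().url(url).build();
// enqueue() runs the call asynchronously; the callbacks fire on a background thread.
mOkHttpClient.newCall(request).enqueue(new Callback() {
    @Override
    public void onFailure(Call call, IOException e) {
        // Pass the exception along so the cause of the failure shows up in the log.
        Log.e(TAG, "onFailure", e);
    }

    @Override
    public void onResponse(Call call, Response response) throws IOException {
        // Closing the body releases the connection even when the status is not 200.
        try (ResponseBody body = response.body()) {
            if (response.code() == 200) {
                // string() reads the whole body into memory and can only be called once.
                String html = body.string();
                Log.d(TAG, "onResponse: " + html);
            }
        }
    }
});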
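Since the goal is the page title, the `html` string obtained in onResponse still has to be searched for the `<title>` element. The following is only a minimal sketch using `java.util.regex` from the standard library; the class name `TitleExtractor`, the method `extractTitle`, and the sample HTML in `main` are hypothetical and not part of the original code. A regular expression is adequate for a single, well-formed title tag, but a real HTML parser (for example jsoup's `Jsoup.parse(html).title()`) is the more robust choice for anything beyond that.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original post's code.
class TitleExtractor {
    // Matches the first <title>...</title> pair, ignoring case and allowing line breaks inside.
    private static final Pattern TITLE = Pattern.compile(
            "<title\\b[^>]*>(.*?)</title\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the trimmed text of the first title element, or null if there is none. */
    static String extractTitle(String html) {
        if (html == null) {
            return null;
        }
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        // Sample input, assumed for this sketch; in the app this would be the html from onResponse.
        String html = "<html><head><title> Sample blog title </title></head><body></body></html>";
        System.out.println(extractTitle(html));
    }
}
```

Inside onResponse this could be called as `String title = TitleExtractor.extractTitle(html);`, with a null check before the result is used.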
The page data received in onResponse looks like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<script type="text/javascript" src="http://c.csdnimg.cn/pubfooter/js/tracking.js"