Connect to a URL (get() throws IOException, so catch it or declare it): Document document = Jsoup.connect("url").timeout(10 * 1000).get();
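A minimal sketch of the connection call with the exception handled, assuming jsoup is on the classpath. The class name JsoupFetchDemo, the fetch helper, and the example.com URL are placeholders for illustration, not part of the original notes.

```java
import java.io.IOException;
import java.util.Optional;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class JsoupFetchDemo {
    // Fetches and parses a page; returns Optional.empty() if the request fails.
    static Optional<Document> fetch(String url) {
        try {
            Document document = Jsoup.connect(url)
                    .timeout(10 * 1000) // timeout in milliseconds (10 seconds)
                    .get();
            return Optional.of(document);
        } catch (IOException e) {
            // Covers timeouts, refused connections, and HTTP error statuses.
            System.err.println("Failed to fetch " + url + ": " + e.getMessage());
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Placeholder URL, chosen only for this example.
        fetch("https://example.com").ifPresent(doc -> System.out.println(doc.title()));
    }
}
```

Returning an Optional is only one way to handle the failure; rethrowing or declaring `throws IOException` on the caller works just as well.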
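Finding elements with select:
Find by id: .select("#id")
Find by class: .select(".class")
Get the inner HTML of the article: .select("selector").html()
Get the text: .select("selector").text()
Get the href: .attr("href")
Get the script tags: .getElementsByTag("script")
Remove whitespace: TextUtils.removeWhiteBlank(content) (TextUtils is not a Jsoup class; it has to come from your own project or another library)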
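The calls in the list above can be tried against an inline HTML string, with no network access needed. This is a sketch only: the HTML snippet and the class name JsoupSelectDemo are made up, and removeWhitespace is a hypothetical stand-in for TextUtils.removeWhiteBlank, whose real implementation is not shown in these notes.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

class JsoupSelectDemo {
    // Made-up HTML used only for illustration.
    static final String HTML =
            "<html><head><script>var x = 1;</script></head><body>"
            + "<div id=\"main\"><p class=\"intro\">Hello <b>world</b></p>"
            + "<a href=\"/next\">next</a></div></body></html>";

    // Hypothetical replacement for TextUtils.removeWhiteBlank:
    // strips every run of whitespace matched by the regex class \s.
    static String removeWhitespace(String s) {
        return s.replaceAll("\\s+", "");
    }

    public static void main(String[] args) {
        Document document = Jsoup.parse(HTML);

        Elements byId = document.select("#main");     // find by id
        Elements byClass = document.select(".intro"); // find by class

        System.out.println(byId.size());              // number of elements with id "main"
        System.out.println(byClass.html());           // inner HTML of the matched element
        System.out.println(byClass.text());           // Hello world
        System.out.println(document.select("#main a").attr("href")); // /next

        // Script tags are ordinary elements; data() returns the script source.
        Elements scripts = document.getElementsByTag("script");
        System.out.println(scripts.first().data());

        System.out.println(removeWhitespace(" a b \n c ")); // abc
    }
}
```

Note that `\s` does not match non-breaking or full-width spaces by default, so text scraped from real pages may need a broader pattern.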
Create an element:
Element a1 = document.createElement("a");
a1.text(i + "");
a1.attr("href", url1);
document.body().appendChild(a1);
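Put together as a complete program, the element-creation steps might look like the sketch below. The loop, the buildPagerLinks helper, and the "?page=" URL scheme are assumptions made for the example; the notes only show the four calls on a1.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class JsoupCreateElementDemo {
    // Builds an empty document and appends one <a> element per page number.
    static Document buildPagerLinks(String baseUrl, int pages) {
        Document document = Jsoup.parse("<html><body></body></html>");
        for (int i = 1; i <= pages; i++) {
            Element a1 = document.createElement("a"); // new, detached <a> element
            a1.text(i + "");                          // link text is the page number
            a1.attr("href", baseUrl + "?page=" + i);  // hypothetical URL scheme
            document.body().appendChild(a1);          // attach it to <body>
        }
        return document;
    }

    public static void main(String[] args) {
        // Placeholder base URL, chosen only for this example.
        Document document = buildPagerLinks("http://example.com/list", 3);
        System.out.println(document.body().html());
    }
}
```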
Fetching page data with a POST request:
private ScriptEngineManager sem = new ScriptEngineManager();
private ScriptEngine engine = sem.getEngineByExtension("js"); // JavaScript engine, used for decoding
private static String postUrl = "http://www.pudong.gov.cn/shpd/specialDepts/SpecialPolicy/default.aspx/GetReportList";
private void fetchPageData(String categoryType, int totalPage, Document curPageDocument) {
    for (int i = 1; i <= totalPage; i++) {
        System.out.println(String.format("\n ===== categoryType: %s, %d vs %d =====", categoryType, i, totalPage));