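Sometimes the content you want to scrape is visible when you view the page source or inspect the page with Firebug, yet Jsoup returns nothing when it parses the page. In that case, one thing to try is fetching the page yourself with a browser-like User-Agent header and handing the downloaded text to Jsoup.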
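// Simulate a browser request.
// Required imports: java.io.BufferedReader, java.io.InputStream,
// java.io.InputStreamReader, java.net.URL, java.net.URLConnection,
// org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.parser.*
// `url` is the address of the page to fetch (a String defined by the caller).
URL url1 = new URL(url);
URLConnection uc = url1.openConnection();
InputStream in = null;
BufferedReader br = null;
String ss = null;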
uc.setRequestProperty("User-Agent",
"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:28.0) Gecko/20100101 Firefox/28.0");
// Open the response stream and read it as UTF-8
in = uc.getInputStream();
br = new BufferedReader(new InputStreamReader(in, "utf-8"));
String temp = "";
StringBuilder sbs = new StringBuilder();
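while ((temp = br.readLine()) != null) {
    // readLine() strips the line terminator, so add it back
    sbs.append(temp).append('\n');
}
br.close(); // also closes the underlying InputStream
ss = sbs.toString();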
// System.out.println(ss);
// Parse the downloaded text with jsoup's XML parser; the empty string is the base URI
Document doc = Jsoup.parse(ss, "", org.jsoup.parser.Parser.xmlParser());
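
The fetch-and-read steps above can also be wrapped in a small helper so that the reader is always closed, even if reading fails. The following is only a sketch under a few assumptions: the class name `BrowserLikeFetcher` and the method names `openWithUserAgent` and `readAll` are made up for illustration and are not part of Jsoup or the JDK. It uses only standard-library calls.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;

// Hypothetical helper class, named here only for illustration.
class BrowserLikeFetcher {

    // Opens a connection object and sets a browser-like User-Agent header.
    // No network traffic happens until getInputStream() is called on the result.
    static URLConnection openWithUserAgent(String pageUrl, String userAgent) throws IOException {
        URLConnection connection = new URL(pageUrl).openConnection();
        connection.setRequestProperty("User-Agent", userAgent);
        return connection;
    }

    // Reads the whole stream into a String, one line at a time.
    // try-with-resources closes the reader (and the wrapped stream) on every path.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }
}
```

With this in place, the page text could be obtained with `BrowserLikeFetcher.readAll(BrowserLikeFetcher.openWithUserAgent(url, userAgent).getInputStream(), StandardCharsets.UTF_8)` and then passed to `Jsoup.parse` as before.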