The <ul> containing the cities is the next sibling of the <div class="state_delimiter">. You can use Element#nextElementSibling() to get it from that div. Here's a kickoff example:

    // Fetch the page, then walk continent -> state -> city.
    Document document = Jsoup.connect("http://www.craigslist.org/about/sites").get();
    Elements countries = document.select("div.colmask");

    for (Element country : countries) {
        System.out.println("Country: " + country.select("h1.continent_header").text());
        Elements states = country.select("div.state_delimiter");

        for (Element state : states) {
            System.out.println("\tState: " + state.text());
            // The <ul> holding this state's cities directly follows its <div>.
            Elements cities = state.nextElementSibling().select("li");

            for (Element city : cities) {
                System.out.println("\t\tCity: " + city.text());
            }
        }
    }

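doc.select("div.state_delimiter,ul") doesn't do what you want. It returns all <div class="state_delimiter"> and <ul> elements in the document as one flat list, without pairing each state with its cities. And when you already have an HTML parser at hand, it makes no sense to parse the markup manually with string functions.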
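
To make the difference concrete, here is a minimal offline sketch that runs both approaches against a small inline HTML string. The markup, the class name SiblingVsCommaSelector, and the helper methods are made up for illustration and only mimic the div/ul pairing described above; the real page may be structured differently.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class SiblingVsCommaSelector {

    // Hypothetical markup: each state <div> is directly followed by its <ul>.
    static final String HTML =
        "<div class=\"colmask\">"
        + "<div class=\"state_delimiter\">Alabama</div>"
        + "<ul><li>auburn</li><li>birmingham</li></ul>"
        + "<div class=\"state_delimiter\">Alaska</div>"
        + "<ul><li>anchorage</li></ul>"
        + "</div>";

    // Comma selector: one flat list mixing every matching <div> and <ul>.
    static int countCommaMatches(Document doc) {
        return doc.select("div.state_delimiter, ul").size();
    }

    // Sibling traversal: only the cities that belong to the named state.
    static List<String> citiesOf(Document doc, String stateName) {
        List<String> cities = new ArrayList<>();
        for (Element state : doc.select("div.state_delimiter")) {
            Element list = state.nextElementSibling();
            // nextElementSibling() returns null when the div is the last child.
            if (list != null && state.text().equals(stateName)) {
                for (Element city : list.select("li")) {
                    cities.add(city.text());
                }
            }
        }
        return cities;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        System.out.println(countCommaMatches(doc));   // 4 (2 divs + 2 uls)
        System.out.println(citiesOf(doc, "Alabama")); // [auburn, birmingham]
        System.out.println(citiesOf(doc, "Alaska"));  // [anchorage]
    }
}
```

The comma selector only tells you that four elements matched; the sibling lookup keeps each state tied to its own list, which is what the nested loop in the example above relies on.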