I have:
text1 | text2 |
text | text |
I want to extract the URL and the text from every row.
I am using:
Document doc = Jsoup.connect(url).get();
for (Element table : doc.select("table.table")) {
    for (Element row : table.select("tr")) {
        Elements tds = row.select("td");
        String text1 = tds.get(0).text();
        String url = row.attr("href");
        System.out.println(text1 + "," + url);
    }
}
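I get the value of text1, but url is null.
How do I get the URL from the td tag?
Solution:
Your row variable is a tr element, not an a tag, so it has no href attribute (jsoup's attr returns an empty string when the attribute is missing).
Try this: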
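// select() returns Elements, so first() is needed to get a single table (null if nothing matches)
Element table = doc.select("table.table").first();
Elements links = table.getElementsByTag("a");
for (Element link : links) {
    String url = link.attr("href");
    String text = link.text();
    System.out.println(text + ", " + url);
}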
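If you would rather keep the row-by-row loop from the question, here is a minimal sketch that looks up the a element inside each row. The original table markup is not shown above, so the sample HTML, the class name RowLinks, the helper extract, and the base URL https://example.com/ are all assumptions made for illustration. It also uses absUrl, which resolves relative links against the base URI.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class RowLinks {

    // Hypothetical helper: returns one "cellText,url" entry per row that contains a link.
    static List<String> extract(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> out = new ArrayList<>();
        for (Element row : doc.select("table.table tr")) {
            Elements tds = row.select("td");
            // The href lives on the <a> inside the row, not on the <tr> itself.
            Element link = row.selectFirst("a[href]");
            if (tds.isEmpty() || link == null) {
                continue; // skip header rows (th only) and rows without a link
            }
            // absUrl turns a relative href such as "/a" into an absolute URL.
            out.add(tds.get(0).text() + "," + link.absUrl("href"));
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed markup; the real page may be structured differently.
        String html = "<table class=\"table\">"
                + "<tr><td><a href=\"/a\">text1</a></td><td>text2</td></tr>"
                + "<tr><td><a href=\"/b\">text</a></td><td>text</td></tr>"
                + "</table>";
        extract(html, "https://example.com/").forEach(System.out::println);
    }
}
```

For a live page, Jsoup.connect(url).get() sets the base URI for you, so absUrl works without passing one explicitly. Skipping rows with no td or no link avoids the IndexOutOfBoundsException that tds.get(0) would throw on a header row.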
Tags: java, jsoup, html-parsing
Source: https://codeday.me/bug/20190723/1512653.html