I. Add the library (the jsoup jar) to the classpath
II. Code
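import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Fetch the page, then print the text and href of every <a> element
Document doc = Jsoup.connect("http://politics.people.com.cn/n1/2016/0728/c1024-28590499.html").get();
Elements links = doc.getElementsByTag("a");
for (int i = 0; i < links.size(); i++) {
    Element link = links.get(i);
    System.out.print(link.text() + "\t");
    System.out.println(link.attr("href"));
}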
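The loop above needs a live connection to the site. As a minimal offline sketch of the same idea, the example below parses a small in-memory page and collects each link's text together with its absolute URL. The class name `LinkLister`, the sample HTML, and the base URL `http://example.com/news/` are assumptions made for illustration only.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class LinkLister {

    // Returns one "text<TAB>absolute-url" entry per <a href> element.
    static List<String> listLinks(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> rows = new ArrayList<>();
        // The CSS selector a[href] skips anchors that have no href attribute.
        for (Element link : doc.select("a[href]")) {
            // absUrl resolves a relative href against the base URI.
            rows.add(link.text() + "\t" + link.absUrl("href"));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical sample page; not taken from the site above.
        String html = "<a href='/a.html'>First</a><a name='x'>No href</a>"
                + "<a href='http://example.org/b'>Second</a>";
        for (String row : listLinks(html, "http://example.com/news/")) {
            System.out.println(row);
        }
    }
}
```

Using `select("a[href]")` with an enhanced for loop is one possible alternative to `getElementsByTag("a")` with an index; it also avoids printing empty href values.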
III. Loading the data
1. From an HTML string
Document doc = Jsoup.parse(html);
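To make the string case concrete, here is a small sketch that parses an HTML string and reads the title and the first paragraph's text. The class name `ParseString` and the sample markup are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ParseString {

    // Parses an HTML string and returns "<title> | <text of the first p>".
    static String summarize(String html) {
        Document doc = Jsoup.parse(html);
        Element firstParagraph = doc.select("p").first();
        // first() returns null when the page has no <p> element.
        String text = (firstParagraph == null) ? "" : firstParagraph.text();
        return doc.title() + " | " + text;
    }

    public static void main(String[] args) {
        // Hypothetical sample markup.
        String html = "<html><head><title>Demo</title></head>"
                + "<body><p>Hello, <b>jsoup</b>!</p></body></html>";
        System.out.println(summarize(html));
    }
}
```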
2. From a URL
Document doc = Jsoup.connect("URL/").get(); // or .post()
String title = doc.title(); // get the page title
Connection conn = Jsoup.connect("URL/");
conn.data(key, value);   // request parameter
conn.cookie(key, value); // set a cookie
conn.timeout(3000);      // set the timeout, in milliseconds
conn.userAgent("");      // set the User-Agent
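The Connection setters each return the connection itself, so they can be chained. The sketch below builds a configured connection without sending anything over the network; the class name `ConnectionSetup`, the parameter and cookie values, and the URL `http://example.com/` are placeholders chosen for illustration.

```java
import org.jsoup.Connection;
import org.jsoup.Jsoup;

public class ConnectionSetup {

    // Builds a configured connection; nothing is sent until get() or post() is called.
    static Connection configure(String url) {
        return Jsoup.connect(url)
                .data("q", "jsoup")            // request parameter (hypothetical)
                .cookie("session", "abc123")   // cookie (hypothetical)
                .timeout(3000)                 // timeout in milliseconds
                .userAgent("Mozilla/5.0");     // User-Agent header
    }

    public static void main(String[] args) {
        Connection conn = configure("http://example.com/");
        // Inspect the pending request instead of executing it.
        System.out.println(conn.request().timeout());
        System.out.println(conn.request().header("User-Agent"));
        // To actually fetch the page: Document doc = conn.get();
    }
}
```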
3. From a file
File file = new File("D:/test.html");
Document doc = Jsoup.parse(file, "UTF-8", "URL");
URL: the base URL (the baseUri argument), used to resolve relative links in the file.
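The effect of the base URL is easiest to see with a relative link. The following sketch writes a small page to a temporary file, parses it with a base URL, and resolves the first link against it. The class name `ParseFile`, the sample markup, and `http://example.com/docs/` are assumptions for illustration, and the code assumes the page contains at least one link.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class ParseFile {

    // Writes html to a temporary file, parses it, and returns the
    // first link's absolute URL. Assumes the page has at least one <a href>.
    static String firstAbsoluteLink(String html, String baseUri) {
        File file = null;
        try {
            file = File.createTempFile("jsoup-demo", ".html");
            Files.write(file.toPath(), html.getBytes(StandardCharsets.UTF_8));
            Document doc = Jsoup.parse(file, "UTF-8", baseUri);
            return doc.select("a[href]").first().absUrl("href");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (file != null) {
                file.delete(); // clean up the temporary file
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical page with a relative link.
        String html = "<a href='page2.html'>Next</a>";
        System.out.println(firstAbsoluteLink(html, "http://example.com/docs/"));
    }
}
```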