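To fetch the winning numbers for the Double Color Ball (双色球) lottery, a web scraper is a convenient approach. For a simple task like this, Python is not the only option: Java has a handy library for it as well, Jsoup.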
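To extract content from a specific place on the page, you need to know which tag holds it. For example, each red ball sits in an 'li' tag with the class ball_red, so we select the elements matching that tag and class and read their text.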
Without further ado, here is the code:
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SpiderZCW {

    public static void main(String[] args) throws IOException {
        String url = "https://kaijiang.500.com/shtml/ssq/";
        SpiderZCW spider = new SpiderZCW();
        // Issue number to look up
        long issue = 21001L;
        String newUrl = url + issue + ".shtml";
        Map<String, Object> map = spider.getNum(newUrl);
        System.out.println("Issue " + issue + ": red balls: " + map.get("red") + ", blue ball: " + map.get("blue"));
    }

    public Map<String, Object> getNum(String url) throws IOException {
        // Open a connection to the target URL
        Connection connect = Jsoup.connect(url);
        // Set the user agent and a 6-second timeout, then send a GET request
        Document document = connect
                .userAgent("Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)")
                .timeout(6000)
                .ignoreContentType(true)
                .get();
        // Red balls first: there are several li.ball_red elements, so read each one in turn
        Elements reds = document.select("li.ball_red");
        List<String> redList = new ArrayList<>();
        for (Element red : reds) {
            redList.add(red.text());
        }
        // Then select the blue ball by its own tag and class
        Elements blues = document.select("li.ball_blue");
        // Guard against an IndexOutOfBoundsException if the element is missing
        String blue = blues.isEmpty() ? "" : blues.first().text();
        Map<String, Object> map = new HashMap<>();
        map.put("red", redList);
        map.put("blue", blue);
        return map;
    }
}

Output:
Issue 21001: red balls: [02, 03, 13, 18, 20, 31], blue ball: 11
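
The program above looks up a single issue and prints whatever text the selectors return. As a rough sketch of how it might be extended, the helper class below builds the page URLs for several issues in a row and checks that a scraped result looks like a real draw before it is used. The class name SsqHelper and both method names are hypothetical and not part of the original code. The sketch assumes the issue numbers in the requested range are consecutive, and it relies on the standard Double Color Ball rules: six distinct red balls from 01 to 33 and one blue ball from 01 to 16. It uses only the Java standard library, so the actual fetching would still go through getNum.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class; not part of the original SpiderZCW code.
public class SsqHelper {

    // Builds the page URLs for `count` issues starting at `firstIssue`.
    // Assumption: issue numbers in this range are consecutive.
    public static List<String> issueUrls(String baseUrl, long firstIssue, int count) {
        List<String> urls = new ArrayList<>();
        for (long issue = firstIssue; issue < firstIssue + count; issue++) {
            urls.add(baseUrl + issue + ".shtml");
        }
        return urls;
    }

    // Returns true if the scraped values look like a real draw:
    // six distinct red balls in 1..33 and one blue ball in 1..16.
    public static boolean isPlausibleDraw(List<String> reds, String blue) {
        if (reds == null || reds.size() != 6 || blue == null) {
            return false;
        }
        try {
            Set<Integer> seen = new HashSet<>();
            for (String red : reds) {
                if (red == null) {
                    return false;
                }
                int n = Integer.parseInt(red.trim());
                // Set.add returns false for a duplicate number
                if (n < 1 || n > 33 || !seen.add(n)) {
                    return false;
                }
            }
            int b = Integer.parseInt(blue.trim());
            return b >= 1 && b <= 16;
        } catch (NumberFormatException e) {
            // Non-numeric text, for example after a page layout change
            return false;
        }
    }

    public static void main(String[] args) {
        String base = "https://kaijiang.500.com/shtml/ssq/";
        // Print the URLs for three issues starting at 21001
        System.out.println(issueUrls(base, 21001L, 3));
        // Check the sample result for issue 21001
        System.out.println(isPlausibleDraw(
                Arrays.asList("02", "03", "13", "18", "20", "31"), "11"));
    }
}
```

Each URL from issueUrls could be passed to getNum in a loop, ideally with a short pause between requests, and any result that fails isPlausibleDraw can be skipped or logged rather than stored.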