Project scenario:
Working through the tutorial "ElasticSearch search in practice: a JD.com search clone".
Video: 【狂神说Java】ElasticSearch搜索实战仿京东搜索 https://www.bilibili.com/video/BV1Nk4y1R7Hf/?p=2&share_source=copy_web&vd_source=7d9d861ee2a548cdf209fc2a99830867
Problem description
The problem encountered in the project: running the main method of the utility class fails with
Exception in thread "main" java.lang.NullPointerException
    at com.example.esjd.utils.HtmlParseUtil.main(HtmlParseUtil.java:28)
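Cause analysis:
I got a little excited after reading Mr. Ji's write-up, so I am recording it here (thanks!).
Open the JD search page: https://search.jd.com/Search?keyword=java
Look up the value of the thor cookie for that page in the browser. Sending it with the request gets past JD's anti-scraping check.
Cause of the error: the tutorial's utility class no longer works as written. Without the cookie, the page JD returns most likely does not contain the J_goodsList element, so document.getElementById("J_goodsList") returns null and the next call on that result throws the NullPointerException. The following code adds the cookie: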
package com.example.esjd.utils;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class HtmlParseUtil {
    public static void main(String[] args) throws IOException {
        // Request to fetch: https://search.jd.com/Search?keyword=java
        // Prerequisite: a working network connection!
        String url = "https://search.jd.com/Search?keyword=java";
        // Parse the page (the Document that Jsoup returns corresponds to the browser's JavaScript document object)
        // Document document = Jsoup.parse(new URL(url), 3000); // no longer works against JD
        // Session cookie copied from the browser; replace the value with the one from your own session
        Map<String, String> cookies = new HashMap<String, String>();
        cookies.put("thor", "6796BFA4D9A2BC020E411D6C34AEDE22AFE22E159F71E1A52235A5ABBC35C2BC56326DE97E7E1A162E796F8312EB60DE2DF5FE2D24E4FA9EBD324B8C8D01AA61D6CD48F5C99E8AD1FD8D8C970CF5A5D65E1EA11EE979B2F3DCB71372E511BF51CEE319BB934159F137FB23DD72D5AD40585AB9C37856A7118240562E0C4E32563D75EF2EA10C2E3A26C68DA752B59420");
        Document document = Jsoup.connect(url).cookies(cookies).get();
        // The lookup methods you would use on document in JavaScript are available here as well
        Element element = document.getElementById("J_goodsList");
        // Get all li elements
        Elements elements = element.getElementsByTag("li");
        for (Element el : elements) {
            String img = el.getElementsByTag("img").eq(0).attr("src");
            String price = el.getElementsByClass("p-price").eq(0).text();
            String title = el.getElementsByClass("p-name").eq(0).text();
            System.out.println("========================");
            System.out.println(img);
            System.out.println(title);
            System.out.println(price);
        }
    }
}
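Since jsoup's getElementById returns null when no element has the given id, a guard on the lookup turns the bare NullPointerException into a message that points at the likely cause. The following is only a minimal sketch: ElementGuard and requireElement are hypothetical names, not part of jsoup or the tutorial project, and the strings in main are stand-ins for real lookup results so the example runs without the network.

```java
// Hypothetical helper, not part of jsoup or the tutorial project.
class ElementGuard {

    // Returns the looked-up element unchanged, or fails with a descriptive
    // message when the lookup returned null.
    static <T> T requireElement(T element, String id) {
        if (element == null) {
            throw new IllegalStateException(
                "No element with id '" + id + "' in the response; "
                + "the site may have returned a login or verification page "
                + "instead of the search results");
        }
        return element;
    }

    public static void main(String[] args) {
        // Stand-in for a lookup that found the element.
        String found = requireElement("<div id=\"J_goodsList\">", "J_goodsList");
        System.out.println("found: " + found);

        // Stand-in for document.getElementById("J_goodsList") returning null.
        try {
            requireElement(null, "J_goodsList");
        } catch (IllegalStateException e) {
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```

In HtmlParseUtil the lookup could then be wrapped as ElementGuard.requireElement(document.getElementById("J_goodsList"), "J_goodsList"), so an expired or missing cookie is reported explicitly.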
**Solution**
Replace this line:
Document document = Jsoup.parse(new URL(url), 3000);
with the following code:
Map<String, String> cookies = new HashMap<String, String>();
cookies.put("thor", "6796BFA4D9A2BC020E411D6C34AEDE22AFE22E159F71E1A52235A5ABBC35C2BC56326DE97E7E1A162E796F8312EB60DE2DF5FE2D24E4FA9EBD324B8C8D01AA61D6CD48F5C99E8AD1FD8D8C970CF5A5D65E1EA11EE979B2F3DCB71372E511BF51CEE319BB934159F137FB23DD72D5AD40585AB9C37856A7118240562E0C4E32563D75EF2EA10C2E3A26C68DA752B59420");
Document document = Jsoup.connect(url).cookies(cookies).get();
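The cookie value is tied to one browser session, so it will probably need to be refreshed from time to time. One possible way to avoid editing the source each time is to copy the whole Cookie request header from the browser and split it into the map that Jsoup's cookies(Map) method accepts. The sketch below assumes a hypothetical helper class CookieHeaderParser and uses placeholder cookie values; it is an illustration, not part of the tutorial.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: splits a raw "Cookie" request header into name/value pairs.
class CookieHeaderParser {

    static Map<String, String> parse(String header) {
        Map<String, String> cookies = new LinkedHashMap<>();
        if (header == null) {
            return cookies;
        }
        for (String pair : header.split(";")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or malformed entries
            }
            String name = pair.substring(0, eq).trim();
            // Everything after the first '=' belongs to the value.
            String value = pair.substring(eq + 1).trim();
            if (!name.isEmpty()) {
                cookies.put(name, value);
            }
        }
        return cookies;
    }

    public static void main(String[] args) {
        // Placeholder header; a real one would be copied from the browser's developer tools.
        String header = "thor=EXAMPLE_VALUE; pin=example_user";
        Map<String, String> cookies = parse(header);
        System.out.println(cookies.keySet());
    }
}
```

The resulting map can be passed to Jsoup.connect(url).cookies(cookies).get() in place of the hand-built HashMap.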
**Output**