软件介绍
JsoupXpath 是一款纯Java开发的使用xpath解析提取html数据的解析器,针对html解析完整实现了W3C XPATH 1.0标准语法,xpath的Lexer和Parser基于Antlr4构建,html的DOM树生成采用Jsoup,故命名为JsoupXpath. 为了在java里也享受xpath的强大与方便但又苦于找不到一款足够好用的xpath解析器,故开发了JsoupXpath。JsoupXpath的实现逻辑清晰,扩展方便, 支持完备的W3C XPATH 1.0标准语法,W3C规范:http://www.w3.org/TR/1999/REC-xpath-19991116 ,JsoupXpath语法描述文件Xpath.g4
快速开始
maven依赖:cn.wanghaomiao
JsoupXpath
${latest-release-version}
示例:String xpath="//div[@id='post_list']/div[./div/div/span[@class='article_view']/a/num()>1000]/div/h3/allText()";
String doc = "...";
JXDocument jxDocument = new JXDocument(doc);
Listrs = jxDocument.sel(xpath);
for (Object o:rs){
if (o instanceof Element){
int index = ((Element) o).siblingIn