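Using XPath in Java to Concatenate Strings
Prerequisite: add the dom4j dependency. XPath calls such as selectNodes also need jaxen on the classpath; if they fail with NoClassDefFoundError: org/jaxen/JaxenException, add the jaxen dependency as well.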
<dependency>
<groupId>dom4j</groupId>
<artifactId>dom4j</artifactId>
<version>1.6.1</version>
</dependency>
1. Implementation
import java.util.List;
import java.util.Random;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.Text;

public String replaceContent(String content) throws Exception {
    // Declare the HTML entities the content may use so the XML parser accepts them.
    String s1 = "<!DOCTYPE html [ <!ENTITY nbsp \"&#160;\"><!ENTITY ldquo \"“\"><!ENTITY rdquo \"”\"> ]>";
    String s2 = "<div>";
    String s3 = "</div>";
    // Wrap the fragment in a single root element and parse the concatenated string as XML.
    Document document = DocumentHelper.parseText(s1 + s2 + content + s3);
    // Select every <p> that is a direct child of the wrapper <div>.
    List<Element> list = document.selectNodes("/div/p");
    for (Element element : list) {
        // Concatenate the text of this <p>, but only if it contains a <strong>.
        StringBuilder stringBuilder = new StringBuilder();
        if (!element.selectNodes(".//strong").isEmpty()) {
            List<Text> textNodes = element.selectNodes(".//text()");
            for (Text textNode : textNodes) {
                // Drop spaces and line breaks so the pieces join into one heading string.
                stringBuilder.append(textNode.getText().replace(" ", "").replace("\n", ""));
            }
        }
        String text = stringBuilder.toString();
        // Add an id attribute to the <p> whose text matches the heading.
        if (text.equals("1.参赛资格")) {
            // Generate the id once so the lookup and the new attribute use the same value.
            //String id = scInnovateIndex.getRandomId();
            String id = String.valueOf(new Random().nextInt(100));
            // Do not add the id if some <p> already carries it.
            if (document.selectNodes("/div//p[@id='" + id + "']").isEmpty()) {
                element.addAttribute("id", id);
            }
            break;
        }
    }
    // Serialize only the wrapper element, then strip its <div> and </div> tags.
    // This avoids a hard-coded offset for the XML declaration and DOCTYPE.
    String replaceText = document.getRootElement().asXML();
    return replaceText.substring(s2.length(), replaceText.length() - s3.length());
}
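The text-joining loop in the method can also be expressed with XPath's own string value: string(.) on an element returns the concatenation of all its descendant text nodes. The following is a minimal sketch of that idea, not the dom4j code from this post. It assumes the JDK's built-in javax.xml.xpath API, and the class name XPathTextSketch, the method findParagraph, the id value section-1, and the ASCII sample text are all made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathTextSketch {

    /** Returns the first /div/p containing a <strong> whose whitespace-free text equals target, or null. */
    static Element findParagraph(String fragment, String target) throws Exception {
        // Wrap the fragment in one root element so it parses as a document.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader("<div>" + fragment + "</div>")));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Only paragraphs that contain at least one <strong> descendant.
        NodeList paragraphs = (NodeList) xpath.evaluate("/div/p[.//strong]", doc, XPathConstants.NODESET);
        for (int i = 0; i < paragraphs.getLength(); i++) {
            Element p = (Element) paragraphs.item(i);
            // string(.) concatenates every descendant text node of the paragraph.
            String text = xpath.evaluate("string(.)", p).replaceAll("\\s+", "");
            if (text.equals(target)) {
                return p;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        String html = "<p><strong>\n<span>1.</span></strong>\n<strong><span>Eligibility</span></strong></p>"
                + "<p><strong>2.Teams</strong></p>";
        Element p = findParagraph(html, "1.Eligibility");
        // Add the id only when the paragraph was found and has none yet.
        if (p != null && !p.hasAttribute("id")) {
            p.setAttribute("id", "section-1");
        }
        System.out.println(p == null ? "not found" : p.getAttribute("id"));
    }
}
```

Checking hasAttribute on the matched element before setting the id keeps the operation idempotent without a second document-wide XPath query.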
2. Test
public static void main(String[] args) throws Exception {
    // The quotes around the font name are written as &quot; so the Java string and the XML attribute both stay valid.
    String str = "<p style=\"text-align:left\"><span style=\"font-size:10.5pt\"><span style=\"font-family:&quot;Times New Roman&quot;\"><strong>\n" +
            "<span style=\"font-size:16.0000pt\"><span style=\"font-family:仿宋_GB2312\"><span style=\"font-family:仿宋_GB2312\">1.</span></span></span></strong>\n" +
            "<strong><span style=\"font-size:16.0000pt\"><span style=\"font-family:仿宋_GB2312\"><span style=\"font-family:仿宋_GB2312\">参赛资格</span></span></span></strong></span></span></p>\n" +
            "\n" +
            "\n" +
            "\n" +
            "<p style=\"text-align:justify\"><span style=\"font-size:10.5pt\"><span style=\"font-family:&quot;Times New Roman&quot;\">\n" +
            "<strong><span style=\"font-size:16.0000pt\"><span style=\"font-family:仿宋_GB2312\"><strong> <span style=\"font-family:仿宋_GB2312\">2.参赛队伍</span></strong></span></span>\n" +
            "</strong></span></span></p>";
    System.out.println(new DemoTest().replaceContent(str));
}
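The value printed by main should be the original fragment only, without the wrapper <div>, the XML declaration, or the DOCTYPE. As an alternative to slicing the serialized string, the children of the wrapper can be serialized one by one. The following is a rough sketch under that assumption, using only the JDK's javax.xml.transform API rather than dom4j; the class name InnerMarkupSketch and the methods innerMarkup and roundTrip are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class InnerMarkupSketch {

    /** Serializes every child of wrapper, leaving out the wrapper element itself. */
    static String innerMarkup(Node wrapper) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Without this property each serialized child would start with an XML declaration.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        for (Node child = wrapper.getFirstChild(); child != null; child = child.getNextSibling()) {
            transformer.transform(new DOMSource(child), new StreamResult(out));
        }
        return out.toString();
    }

    /** Wraps a fragment in <div>, parses it, and returns the markup of the fragment again. */
    static String roundTrip(String fragment) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader("<div>" + fragment + "</div>")));
        return innerMarkup(doc.getDocumentElement());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("<p id=\"7\">1.Eligibility</p><p>2.Teams</p>"));
    }
}
```

Because the wrapper is never serialized, the result does not depend on the length of any prefix or suffix that the serializer adds.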
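Result: in the printed HTML, the first <p> (the "1.参赛资格" heading) carries a new id attribute with a random value from 0 to 99, and the second <p> is unchanged.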