Reference: https://www.cnblogs.com/zhuyuchao/p/7699712.html
System.out.println("Before: "+str2);
// Regular expression for <style> blocks
String regEx_style = "<style[^>]*?>[\\s\\S]*?<\\/style>";
// Regular expression for <script> blocks
String regEx_script = "<script[^>]*?>[\\s\\S]*?<\\/script>";
// Regular expression for any remaining HTML tag
String regEx_html = "<[^>]+>";
// Remove CSS
str2=str2.replaceAll(regEx_style,"");
// Remove JavaScript
str2=str2.replaceAll(regEx_script,"");
// Remove the HTML tags themselves
str2=str2.replaceAll(regEx_html,"");
System.out.println("After: "+str2);
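The three replacements above can be wrapped into one helper. This is only a minimal sketch: the class name HtmlTextExtractor and the method stripHtml are made up for illustration, and the patterns are compiled case-insensitively (an addition, not part of the referenced post) so that <STYLE> and <SCRIPT> blocks are removed as well.

```java
import java.util.regex.Pattern;

final class HtmlTextExtractor {
    // Same idea as the snippets above, precompiled once and made case-insensitive.
    private static final Pattern STYLE =
            Pattern.compile("<style[^>]*?>[\\s\\S]*?</style>", Pattern.CASE_INSENSITIVE);
    private static final Pattern SCRIPT =
            Pattern.compile("<script[^>]*?>[\\s\\S]*?</script>", Pattern.CASE_INSENSITIVE);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Removes style blocks first, then script blocks, then every remaining tag.
    static String stripHtml(String html) {
        String text = STYLE.matcher(html).replaceAll("");
        text = SCRIPT.matcher(text).replaceAll("");
        return TAG.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<STYLE>p{color:red}</STYLE><p>Hello <b>world</b></p><script>alert(1)</script>";
        System.out.println(stripHtml(html)); // prints "Hello world"
    }
}
```

The order matters: if the generic tag pattern ran first, the <style> and <script> tags would disappear but their CSS and JavaScript bodies would be left behind as text.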
However, I wanted to keep certain tags instead of stripping all of them.
Reference: https://blog.csdn.net/qq_34063070/article/details/80049815
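import java.util.regex.Matcher;
import java.util.regex.Pattern;
// Remove every tag except <img>, <p> and <span>, keeping their closing tags too.
// The word boundary stops "p" from also protecting <pre>; [^>] also covers multi-line tags.
String regEx = "(?!</?(img|p|span)\\b[^>]*>)<[^>]+>";
Pattern p_html = Pattern.compile(regEx, Pattern.CASE_INSENSITIVE);
Matcher m_html = p_html.matcher(str);
str = m_html.replaceAll("");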
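The same negative-lookahead idea can be made reusable by passing in the list of tags to keep. The sketch below is an illustration only: the class HtmlTagWhitelist and the method stripExcept are hypothetical names, and it assumes the tag names are plain alphanumeric strings. The pattern uses an optional "/" and a word boundary so that closing tags of kept elements survive and a kept "p" does not also protect <pre>.

```java
import java.util.regex.Pattern;

final class HtmlTagWhitelist {
    // Removes every tag whose name is not listed in keep (matched case-insensitively).
    static String stripExcept(String html, String... keep) {
        if (keep.length == 0) {
            // Nothing to protect: fall back to removing all tags.
            return html.replaceAll("<[^>]+>", "");
        }
        // Builds e.g. "img|p|span"; names are assumed to contain no regex metacharacters.
        String names = String.join("|", keep);
        // The lookahead rejects positions where a kept opening or closing tag starts.
        String regex = "(?!</?(?:" + names + ")\\b[^>]*>)<[^>]+>";
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<div><p class=\"x\">Hi <b>there</b></p><pre>code</pre><IMG src=\"a.png\"></div>";
        // prints: <p class="x">Hi there</p>code<IMG src="a.png">
        System.out.println(stripExcept(html, "img", "p", "span"));
    }
}
```

Like any regex-based approach, this treats the first ">" as the end of a tag, so an attribute value that contains ">" will be cut incorrectly.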