When generating a PDF from HTML with iText, HTML like this:
String content = "<table><tr><td><a>dayna</a></td></tr></table>";
causes the following error:
Exception in thread "main" java.lang.ClassCastException: com.itextpdf.text.html.simpleparser.CellWrapper cannot be cast to com.itextpdf.text.Paragraph
Test method that reproduces it:
private static void testATag() throws Exception {
    Document document = new Document();
    PdfWriter.getInstance(document, new FileOutputStream("D:\\dayna.pdf"));
    document.open();
    // An <a> tag nested inside a table cell triggers the ClassCastException.
    String content = "<table><tr><td><a>dayna</a></td></tr></table>";
    Paragraph p = new Paragraph();
    HashMap<String, Object> map = new HashMap<String, Object>();
    // ImgProvider is a class from the surrounding project (not shown here).
    map.put(HTMLWorker.IMG_PROVIDER, new ImgProvider());
    // Parse the HTML into iText elements and collect them in one paragraph.
    List<Element> list = HTMLWorker.parseToList(new StringReader(content), null, map);
    for (Element e : list) {
        p.add(e);
    }
    document.add(p);
    document.close();
}
Workaround:
The generated PDF does not need clickable links, so I simply removed all the a tags before parsing.
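private static String handleATag(String content) {
    // (?i) matches <a> and <A> alike. Requiring whitespace or '>' right after
    // the tag name keeps tags such as <abbr> or <address> from being removed.
    content = content.replaceAll("(?i)<a(\\s[^>]*)?>", "");
    content = content.replaceAll("(?i)</a\\s*>", "");
    return content;
}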
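One thing to watch with a pattern such as `<a.*?>`: it matches any tag whose name merely starts with "a", so `<abbr>` or `<address>` would lose their opening tag too. The sketch below compares that pattern with a stricter one using only `java.util.regex`. The class and method names (`AnchorStripper`, `stripNaive`, `stripStrict`) are made up for illustration and are not part of iText. It is still a regex-based approximation, so an attribute value that contains `>` would confuse it.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of iText.
final class AnchorStripper {
    // "<a" followed by anything up to the next ">": also hits <abbr>, <address>, ...
    private static final Pattern NAIVE = Pattern.compile("<a.*?>");

    // Opening or closing tag whose name is exactly "a", in either case.
    private static final Pattern STRICT =
            Pattern.compile("</?a(\\s[^>]*)?>", Pattern.CASE_INSENSITIVE);

    static String stripNaive(String html) {
        return NAIVE.matcher(html).replaceAll("");
    }

    static String stripStrict(String html) {
        return STRICT.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<td><abbr>PDF</abbr> <A href=\"x\">dayna</A></td>";
        // Naive: removes <abbr> but leaves the upper-case anchor untouched.
        System.out.println(stripNaive(html));
        // Strict: keeps <abbr>...</abbr> and removes only the anchor tags.
        System.out.println(stripStrict(html));
    }
}
```

The link text itself ("dayna") is kept in both cases; only the tags are dropped, which is what the workaround needs before the content is handed to the parser.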
The reply from Oracle:
1. You're using HTMLWorker. We're abandoning HTMLWorker in favor of XML Worker.
2. Your HTML is wrong.
You have:
<table><tr><p>A p-tag where you would expect a td-tag</p></tr></table>
or
<table><tr><div>A div-tag where you would expect a td-tag</div></tr></table>
or something like that; you get the idea ;-)