Exporting HTML Directly to PDF in Java
There are many open-source options for exporting a PDF directly from HTML in Java. Here I use iText for the conversion.
First, the required jars are:
core-renderer-1.0.jar
core-renderer-R8pre1.jar
core-renderer.jar
iText-2.0.8.jar
jtidy-4aug2000r7-dev.jar
Tidy.jar
iTextAsian.jar
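The Java code is fairly simple. Tidy first converts the HTML to XHTML, and core-renderer (Flying Saucer) can then render that XHTML into various other formats. The conversion to PDF is still done by iText underneath. The code is as follows: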
// In a Struts 1.x action:
else if("Html2Pdf".equalsIgnoreCase(action)){
exportPdfFile("http://localhost:8080/jsp/test.jsp");
return null;
}
// Export a PDF. Added by huangt, 2012-06-01.
// Needs: java.io.*, org.xhtmlrenderer.pdf.ITextRenderer,
// org.xhtmlrenderer.pdf.ITextFontResolver,
// com.lowagie.text.DocumentException, com.lowagie.text.pdf.BaseFont
public File exportPdfFile(String urlStr) throws BaseException {
    // String outputFile = this.fileRoot + "/" +
    // ServiceConstants.DIR_PUBINFO_EXPORT + "/" + getFileName() + ".pdf";
    String outputFile = "d:/test3.pdf";
    // try-with-resources closes the stream even if rendering fails
    try (OutputStream os = new FileOutputStream(outputFile)) {
        ITextRenderer renderer = new ITextRenderer();
        String str = getHtmlFile(urlStr);
        renderer.setDocumentFromString(str);
        ITextFontResolver fontResolver = renderer.getFontResolver();
        fontResolver.addFont("C:/WINDOWS/Fonts/SimSun.ttc", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED); // SimSun, for Chinese text
        fontResolver.addFont("C:/WINDOWS/Fonts/Arial.ttf", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED); // Arial
        renderer.layout();
        renderer.createPDF(os);
        os.flush();
        System.out.println("Conversion succeeded!");
        return new File(outputFile);
    } catch (FileNotFoundException e) {
        // logger.error("File does not exist! " + e.getMessage());
        throw new BaseException(e);
    } catch (DocumentException e) {
        // logger.error("Error while generating the PDF! " + e.getMessage());
        throw new BaseException(e);
    } catch (IOException e) {
        // logger.error("PDF I/O error! " + e.getMessage());
        throw new BaseException(e);
    }
}
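The hard-coded d:/test3.pdf only works on a Windows machine that has a D: drive, and the commented-out lines above suggest the path was meant to be built from a configured root directory. A minimal sketch of that idea using java.nio.file follows. The class name PdfOutputPaths, the method resolvePdfPath, and its parameters are hypothetical names chosen for illustration; they are not part of the original project.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: builds <fileRoot>/<subDir>/<baseName>.pdf in a portable way. */
final class PdfOutputPaths {
    private PdfOutputPaths() {}

    /**
     * Returns the target path for a PDF and creates the export directory if it is missing.
     * The base name must be a plain file name without path separators.
     */
    static Path resolvePdfPath(String fileRoot, String subDir, String baseName) throws IOException {
        if (baseName == null || baseName.isEmpty()) {
            throw new IllegalArgumentException("baseName must not be empty");
        }
        // Reject names that could escape the export directory.
        if (baseName.contains("/") || baseName.contains("\\") || baseName.contains("..")) {
            throw new IllegalArgumentException("baseName must be a plain file name: " + baseName);
        }
        Path dir = Paths.get(fileRoot, subDir);
        Files.createDirectories(dir); // no-op if the directory already exists
        return dir.resolve(baseName + ".pdf");
    }
}
```

Under these assumptions, the literal in exportPdfFile could be replaced with something like PdfOutputPaths.resolvePdfPath(fileRoot, exportDir, getFileName()).toString(), where fileRoot and exportDir stand for whatever configuration values the application already has.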
// Read the page content. Added by huangt, 2012-06-01.
// Needs: java.io.*, java.net.URL, java.net.URLConnection,
// org.w3c.tidy.Tidy, org.w3c.tidy.Configuration,
// org.springframework.context.i18n.LocaleContextHolder
public String getHtmlFile(String urlStr) throws BaseException {
    try {
        // Pass the current locale to the page as a query parameter
        String separator = (urlStr.indexOf("?") != -1) ? "&" : "?";
        urlStr = urlStr + separator + "locale="
                + LocaleContextHolder.getLocale().toString();
        URL url = new URL(urlStr);
        URLConnection uc = url.openConnection();
        ByteArrayOutputStream os2 = new ByteArrayOutputStream();
        InputStream is = uc.getInputStream();
        try {
            Tidy tidy = new Tidy();
            tidy.setXHTML(true); // output XHTML (plain XML is also possible)
            tidy.setCharEncoding(Configuration.UTF8); // set the encoding so Chinese text converts correctly
            tidy.setTidyMark(false); // otherwise Tidy adds a meta tag to the output
            tidy.setXmlPi(true); // emit <?xml version="1.0"?>
            tidy.setIndentContent(true); // indentation; optional, only makes the output easier to read
            tidy.parse(is, os2);
        } finally {
            is.close(); // close the input even if Tidy fails
        }
        // Avoid garbled characters: decode Tidy's output bytes explicitly as UTF-8.
        // Unlike a readLine() loop, this also keeps the line breaks.
        return os2.toString("UTF-8");
    } catch (IOException e) {
        // logger.error("Error while reading the page content: " + e.getMessage());
        throw new BaseException(e);
    }
}
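The locale step in getHtmlFile decides between "?" and "&" by checking whether the URL already has a query string. A small sketch of that step as a reusable helper is shown below. It additionally URL-encodes the name and value and keeps any #fragment at the end, which the original code does not do. The class name UrlParams and the method appendQueryParam are hypothetical names used only for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper for adding one query parameter to a URL string. */
final class UrlParams {
    private UrlParams() {}

    static String appendQueryParam(String url, String name, String value) {
        // Split off a trailing #fragment so the parameter lands in the query part.
        String fragment = "";
        int hash = url.indexOf('#');
        if (hash >= 0) {
            fragment = url.substring(hash);
            url = url.substring(0, hash);
        }
        String separator;
        if (url.indexOf('?') < 0) {
            separator = "?";   // no query string yet
        } else if (url.endsWith("?") || url.endsWith("&")) {
            separator = "";    // query string is open-ended already
        } else {
            separator = "&";   // append to an existing query string
        }
        try {
            return url + separator
                    + URLEncoder.encode(name, "UTF-8") + "="
                    + URLEncoder.encode(value, "UTF-8")
                    + fragment;
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 is not available", e);
        }
    }
}
```

With such a helper, the locale step would read roughly urlStr = UrlParams.appendQueryParam(urlStr, "locale", LocaleContextHolder.getLocale().toString()).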
To resolve the jar dependencies, add the following to the Maven pom.xml:
<!-- PDF export -->
<dependency>
    <groupId>com.lowagie</groupId>
    <artifactId>itext</artifactId>
    <version>2.1.7</version>
</dependency>
<dependency>
    <groupId>org.xhtmlrenderer.flyingsaucer</groupId>
    <artifactId>pdf-renderer</artifactId>
    <version>1.0</version>
</dependency>
<dependency>
    <groupId>jtidy</groupId>
    <artifactId>jtidy</artifactId>
    <version>4aug2000r7-dev</version>
    <type>jar</type>
    <scope>compile</scope>
</dependency>
<dependency>
    <groupId>net.sf.barcode4j</groupId>
    <artifactId>barcode4j-light</artifactId>
    <version>2.0</version>
</dependency>
<dependency>
    <groupId>avalon-framework</groupId>
    <artifactId>avalon-framework-impl</artifactId>
    <version>4.2.0</version>
</dependency>
<!-- pdf -->
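I have also attached a somewhat more complex PDFUtils.java. I did not have time to tidy it up or explain it, so please see the download attachment.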