Java iText Tutorial: Working with PDFs in Java (the PDFBox and iText Frameworks)

豆瓣时间

Article link: https://blog.csdn.net/weixin_35742195/article/details/115084082

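Java has many libraries for working with PDF files; PDFBox and iText are two of them.

PDFBox offers the following features:

Extracting text, including Unicode characters.

Straightforward integration with text search engines such as Jakarta Lucene.

Encrypting and decrypting PDF documents.

Importing and exporting form data in PDF and XFDF formats.

Appending content to an existing PDF document.

Splitting one PDF document into several documents.

Overlaying PDF documents.

Below is a test program that uses PDFBox.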

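import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.URL;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.util.PDFTextStripper;

// This listing targets the PDFBox 1.x API (PDDocument.load(URL) / load(String),
// org.apache.pdfbox.util.PDFTextStripper); later major versions changed these entry points.
public class PdfBoxTest {

    public void getText(String file) throws Exception {
        // Whether to sort the extracted text by its position on the page
        boolean sort = false;
        // Name of the PDF file
        String pdfFile = file;
        // Name of the output text file
        String textFile = null;
        // Character encoding of the output
        String encoding = "UTF-8";
        // First page to extract
        int startPage = 1;
        // Last page to extract
        int endPage = Integer.MAX_VALUE;
        // Writer for the output text file
        Writer output = null;
        // The PDF document held in memory
        PDDocument document = null;
        try {
            try {
                // First try to load the file as a URL
                URL url = new URL(pdfFile);
                document = PDDocument.load(url);
                String fileName = url.getFile();
                if (fileName.length() > 4) {
                    // Name the generated .txt file after the original PDF
                    File outputFile = new File(fileName.substring(0, fileName.length() - 4) + ".txt");
                    textFile = outputFile.getName();
                }
            } catch (Exception e) {
                // Loading as a URL failed, so load from the local file system instead
                document = PDDocument.load(pdfFile);
                if (pdfFile.length() > 4) {
                    textFile = pdfFile.substring(0, pdfFile.length() - 4) + ".txt";
                }
            }
            // Writer that stores the extracted text in textFile
            output = new OutputStreamWriter(new FileOutputStream(textFile), encoding);
            // PDFTextStripper does the actual text extraction
            PDFTextStripper stripper = new PDFTextStripper();
            // Set whether to sort by position
            stripper.setSortByPosition(sort);
            // Set the first page
            stripper.setStartPage(startPage);
            // Set the last page
            stripper.setEndPage(endPage);
            // writeText extracts the text and writes it to the output
            stripper.writeText(document, output);
        } finally {
            if (output != null) {
                output.close();
            }
            if (document != null) {
                document.close();
            }
        }
    }

    /**
     * @param args command-line arguments (unused)
     */
    public static void main(String[] args) {
        PdfBoxTest test = new PdfBoxTest();
        try {
            test.getText("E://test.pdf");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

iText is a project hosted on SourceForge, the well-known open-source site, and is a Java class library for generating PDF documents. With iText you can not only generate PDF or RTF documents but also convert XML and HTML files into PDF files.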

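Two details of the PDFBox program above can be tried out with nothing but the JDK: why a Windows path such as "E://test.pdf" ends up in the file-system branch, and how the .txt output name is derived from the PDF name. The sketch below is only an illustration under stated assumptions. The class `PdfPathSketch` and the helpers `parsesAsUrl` and `textFileNameFor` are hypothetical names that do not appear in the original program, and `textFileNameFor` is a slightly more defensive variant of the original `substring(0, length - 4)` logic, not a copy of it.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Locale;

public class PdfPathSketch {

    // Hypothetical helper: true when the string parses as a URL whose protocol
    // the JVM knows (http, https, file, ...). A Windows path such as
    // E://test.pdf is rejected because "e" is not a registered protocol,
    // which is what sends the original program into its catch branch.
    @SuppressWarnings("deprecation") // new URL(String) is deprecated on recent JDKs
    public static boolean parsesAsUrl(String location) {
        try {
            new URL(location);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    // Hypothetical helper: replace a trailing ".pdf" (in any letter case) with
    // ".txt"; a name without that extension simply gets ".txt" appended.
    public static String textFileNameFor(String pdfName) {
        if (pdfName.toLowerCase(Locale.ROOT).endsWith(".pdf")) {
            return pdfName.substring(0, pdfName.length() - 4) + ".txt";
        }
        return pdfName + ".txt";
    }

    public static void main(String[] args) {
        System.out.println(parsesAsUrl("E://test.pdf"));                    // false
        System.out.println(parsesAsUrl("http://example.com/doc/test.pdf")); // true
        System.out.println(textFileNameFor("E://test.pdf"));                // E://test.txt
        System.out.println(textFileNameFor("report"));                      // report.txt
    }
}
```

Checking the extension before cutting four characters avoids mangling names that do not end in ".pdf", a case the original length check does not cover.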
Below is an example that uses iText to generate a PDF.

import java.io.FileNotFoundException;
import java.io.FileOutputStream;

import com.lowagie.text.Document;
import com.lowagie.text.DocumentException;
import com.lowagie.text.Paragraph;
import com.lowagie.text.pdf.PdfWriter;

public class ITextTest {

    public static void main(String[] args) {
        writePdf();
    }

    public static void writePdf() {
        Document document = new Document();
        try {
            // Attach a writer that streams the document to Helloworld.pdf
            PdfWriter.getInstance(document, new FileOutputStream("Helloworld.pdf"));
            // Content can only be added once the document is open
            document.open();
            document.add(new Paragraph("Hello World"));
        } catch (DocumentException e) {
            e.printStackTrace();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } finally {
            // Close only if open() was reached, i.e. the writer was created successfully
            if (document.isOpen()) {
                document.close();
            }
        }
    }
}

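The default iText font setup does not support Chinese fonts. You need to download the Far East font package iTextAsian.jar; without it, Chinese text cannot be written to the PDF document. With the following code you can use Chinese in a document: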

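// BaseFont.createFont can throw DocumentException and IOException,
// so call it inside a try block or a method that declares them.
BaseFont bfChinese = BaseFont.createFont("STSong-Light", "UniGB-UCS2-H", BaseFont.NOT_EMBEDDED);
com.lowagie.text.Font fontChinese = new com.lowagie.text.Font(bfChinese, 12, com.lowagie.text.Font.NORMAL);
Paragraph paragraph = new Paragraph("你好", fontChinese);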

To sum up, this article used PDFBox and iText to demonstrate simple ways to read a PDF and to generate a PDF, respectively.

Source: oschina

Link: https://my.oschina.net/u/195637/blog/146586
