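I got a requirement: the customer needs to extract one or several passages from a Word document and generate a new document or an image from them (the image gets pasted into another generated report), with formatting and everything else left unchanged. After thinking about it for a long time, I decided the only way to keep the formatting intact is to walk through the document line by line (paragraph by paragraph), delete what is not needed, and keep only the content between the start and end markers. The code is below.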
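First, a stripped-down illustration of that keep-or-delete rule on plain strings, with no Word library involved. This is only a sketch: `MarkerSlice` and `between` are hypothetical names made up for this example, and each string stands in for one paragraph's text.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Spire.Doc): applies the start/end rule to plain strings. */
final class MarkerSlice {
    private MarkerSlice() {}

    /** Returns the lines, or parts of lines, that lie between a start and an end marker. */
    static List<String> between(List<String> lines, String start, String end) {
        List<String> kept = new ArrayList<>();
        boolean inside = false; // true while we are between the two markers
        for (String line : lines) {
            String text = line;
            if (!inside) {
                int s = text.indexOf(start);
                if (s < 0) {
                    continue; // outside the markers: drop the whole line
                }
                // Drop the start marker and everything before it.
                text = text.substring(s + start.length());
                inside = true;
            }
            int e = text.indexOf(end);
            if (e >= 0) {
                // Drop the end marker and everything after it.
                text = text.substring(0, e);
                inside = false;
            }
            kept.add(text);
        }
        return kept;
    }
}
```

For example, calling `MarkerSlice.between` on the lines `"intro"`, `"a [S]b"`, `"c"`, `"d[E] e"`, `"outro"` with markers `"[S]"` and `"[E]"` keeps `"b"`, `"c"` and `"d"`. The real method does the same thing, except that it deletes or trims paragraphs inside the document instead of building a new list.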
pom:
<dependency>
    <groupId>e-iceblue</groupId>
    <artifactId>spire.office.free</artifactId>
    <version>5.3.1</version>
</dependency>
start: the start marker; end: the end marker; fileName: path of the source document
// Imports needed: com.spire.doc.*, com.spire.doc.documents.*,
// java.awt.image.BufferedImage, java.io.File, java.io.IOException, javax.imageio.ImageIO
private Document wordIntercept(String start, String end, String fileName) throws IOException {
    Document doc = new Document();
    doc.loadFromFile(fileName);
    // The document currently has only one section, so only section 0 is processed.
    Section section = doc.getSections().get(0);
    // true while the loop is between a start marker and its end marker
    boolean inside = false;
    // Walk the paragraphs once; tables and other non-paragraph objects are not touched.
    for (int z = 0; z < section.getParagraphs().getCount(); z++) {
        Paragraph paragraph = section.getParagraphs().get(z);
        String text = paragraph.getText();
        if (!inside) {
            int startIdx = text.indexOf(start);
            if (startIdx < 0) {
                // Outside the markers: delete the paragraph and stay on the same index.
                section.getBody().getParagraphs().remove(paragraph);
                z--;
                continue;
            }
            // Drop the start marker and everything before it.
            text = text.substring(startIdx + start.length());
            int endIdx = text.indexOf(end);
            if (endIdx >= 0) {
                // Both markers are in the same paragraph: keep only the text between them.
                paragraph.setText(text.substring(0, endIdx));
            } else {
                paragraph.setText(text);
                inside = true;
            }
        } else {
            int endIdx = text.indexOf(end);
            if (endIdx >= 0) {
                // Drop the end marker and everything after it.
                paragraph.setText(text.substring(0, endIdx));
                inside = false;
            }
            // Otherwise the paragraph lies between the markers and is kept as it is.
        }
    }
    // Render the trimmed document as images, then save it as a new .docx
    BufferedImage[] images = doc.saveToImages(ImageType.Bitmap);
    // "new path and file name" is a placeholder for the real output path
    doc.saveToFile("new path and file name", FileFormat.Docx_2013);
    int i = 0;
    for (BufferedImage image : images) {
        // Save each page as a .png file; fileTempPath is assumed to be a field of the
        // enclosing class that holds the output directory
        File file = new File(fileTempPath, String.format("Img-%d.png", i));
        // File file = new File("D:/桌面/tempImages/" + String.format("Img-%d.png", i));
        ImageIO.write(image, "PNG", file);
        i++;
    }
    return doc;
}
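The image-saving step at the end of the method could also be pulled out into a small helper so that the output directory is created if it is missing and a failed write does not go unnoticed. This is a rough sketch using only the JDK; `PageImageWriter` and `writePngs` are hypothetical names made up here, and the `Img-%d.png` naming simply follows the code above.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

/** Hypothetical helper: writes page images as Img-0.png, Img-1.png, ... into a directory. */
final class PageImageWriter {
    private PageImageWriter() {}

    static List<Path> writePngs(BufferedImage[] images, Path dir) throws IOException {
        // Create the output directory (and any missing parents) if needed.
        Files.createDirectories(dir);
        List<Path> written = new ArrayList<>();
        for (int i = 0; i < images.length; i++) {
            // resolve() joins the path without hard-coding "\\" as the separator.
            Path target = dir.resolve(String.format("Img-%d.png", i));
            // ImageIO.write returns false when no suitable writer is found.
            if (!ImageIO.write(images[i], "PNG", target.toFile())) {
                throw new IOException("No PNG writer available for " + target);
            }
            written.add(target);
        }
        return written;
    }
}
```

Inside `wordIntercept` this would be called roughly as `PageImageWriter.writePngs(images, Paths.get(fileTempPath))`, assuming `fileTempPath` is a `String` holding the output directory.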
Does anyone have a better approach? Please share.