Code first, then the explanation and open questions (this code took me half a day of effort).
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import org.apache.poi.hssf.usermodel.HSSFClientAnchor;
import org.apache.poi.hssf.usermodel.HSSFPatriarch;
import org.apache.poi.hssf.usermodel.HSSFPicture;
import org.apache.poi.hssf.usermodel.HSSFPictureData;
import org.apache.poi.hssf.usermodel.HSSFShape;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.PictureData;
import org.apache.poi.ss.usermodel.WorkbookFactory;

public class ReadPicturesFromExcel {
    // The input workbook and the extracted images share one directory.
    private static final String WORK_DIR = "D:\\Users\\Fancy1_Fan\\桌面\\work\\";

    public static void main(String[] args) throws Exception {
        try (InputStream inp = new FileInputStream(WORK_DIR + "test.xls")) {
            HSSFWorkbook workbook = (HSSFWorkbook) WorkbookFactory.create(inp);
            List<HSSFPictureData> pictures = workbook.getAllPictures();
            HSSFSheet sheet = workbook.getSheetAt(0);
            HSSFPatriarch patriarch = sheet.getDrawingPatriarch();
            if (patriarch == null) {
                return; // the sheet contains no drawings
            }
            int i = 0;
            for (HSSFShape shape : patriarch.getChildren()) {
                if (shape instanceof HSSFPicture) {
                    HSSFPicture pic = (HSSFPicture) shape;
                    // Cast the anchor only after the type check; other shapes are skipped.
                    HSSFClientAnchor anchor = (HSSFClientAnchor) pic.getAnchor();
                    int row = anchor.getRow1();
                    System.out.println(i + "--->" + row + ":" + anchor.getCol1());
                    // pictureIndex is treated as 1-based, the picture list as 0-based.
                    int pictureIndex = pic.getPictureIndex() - 1;
                    HSSFPictureData picData = pictures.get(pictureIndex);
                    System.out.println(i + "--->" + pictureIndex);
                    savePic(row, picData);
                }
                i++;
            }
        }
    }

    // Writes the image as pict<row>.jpg or pict<row>.png; other formats are skipped.
    // Note: two pictures anchored in the same row overwrite each other.
    private static void savePic(int row, PictureData pic) throws Exception {
        String ext = pic.suggestFileExtension();
        String suffix;
        if (ext.equals("jpeg")) {
            suffix = ".jpg";
        } else if (ext.equals("png")) {
            suffix = ".png";
        } else {
            return;
        }
        try (OutputStream out = new FileOutputStream(WORK_DIR + "pict" + row + suffix)) {
            out.write(pic.getData());
        }
    }
}
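Approach:
1. Get all pictures in the workbook ---->
2. Get every shape from the sheet's DrawingPatriarch --->
3. Get each shape's anchor --->
4. Get the picture's pictureIndex (this is the key step) ------->
5. Finally, assume that pictureIndex gives the picture's position in allPictures (counting from 1, hence the -1 in the code), and use it to fetch that picture's data.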
Problem:
This final assumption is not backed by any official documentation, so it still needs more testing. A simple test did work, though!
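Justification for the assumption:
The officially documented flow for adding a picture to an Excel file is:
1. Call the workbook's addPicture, which returns a pictureIndex ------>
2. Then create a ClientAnchor --------->
3. Finally, use that pictureIndex and the anchor to draw the picture on the sheet
So pictureIndex, ClientAnchor and pictureData correspond one-to-one; once these three can be linked, the complete information about a picture in Excel is available.
However, the POI API only lets you get the picture data on its own, or an HSSFPicture that carries the pictureIndex and the anchor; it does not tie them together.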
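To make the missing link concrete, here is a minimal sketch of that join in plain Java. It is a toy model, not POI code: `ShapeRef` and `LocatedPicture` are hypothetical types invented for illustration, and the 1-based numbering is the same assumption the post relies on.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model, not POI code: joins shape positions with picture bytes via a 1-based index. */
class PictureJoinSketch {

    /** Hypothetical stand-in for an HSSFPicture: anchor cell plus 1-based picture index. */
    static final class ShapeRef {
        final int row, col, pictureIndex;
        ShapeRef(int row, int col, int pictureIndex) {
            this.row = row;
            this.col = col;
            this.pictureIndex = pictureIndex;
        }
    }

    /** Hypothetical result type: position and bytes of one placed picture. */
    static final class LocatedPicture {
        final int row, col;
        final byte[] data;
        LocatedPicture(int row, int col, byte[] data) {
            this.row = row;
            this.col = col;
            this.data = data;
        }
    }

    /** Resolves each shape's 1-based index against the 0-based picture list. */
    static List<LocatedPicture> join(List<ShapeRef> shapes, List<byte[]> allPictures) {
        List<LocatedPicture> result = new ArrayList<>();
        for (ShapeRef s : shapes) {
            int listPos = s.pictureIndex - 1;
            if (listPos < 0 || listPos >= allPictures.size()) {
                continue; // skip shapes whose index does not resolve to a picture
            }
            result.add(new LocatedPicture(s.row, s.col, allPictures.get(listPos)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<byte[]> pictures = new ArrayList<>();
        pictures.add(new byte[] {1});
        pictures.add(new byte[] {2, 2});
        List<ShapeRef> shapes = new ArrayList<>();
        shapes.add(new ShapeRef(4, 1, 2)); // drawn at row 4, uses the second picture
        shapes.add(new ShapeRef(9, 0, 1)); // drawn at row 9, uses the first picture
        for (LocatedPicture p : join(shapes, pictures)) {
            System.out.println(p.row + ":" + p.col + " -> " + p.data.length + " bytes");
        }
    }
}
```

The bounds check is a defensive choice in this sketch: since the index-to-list mapping is only an assumption, an index that does not resolve is skipped instead of throwing.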
Reading the source shows that HSSFWorkbook is merely a facade (or adapter) class; the low-level class that does the work is InternalWorkbook:
/**
* this is the reference to the low level Workbook object
*/
private InternalWorkbook workbook;
InternalWorkbook has the following API:
public EscherBSERecord getBSERecord(int pictureIndex) {
return escherBSERecords.get(pictureIndex-1);
}
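This shows that with an InternalWorkbook object and a pictureIndex you can get the picture's data and metadata. But there is no way to get the InternalWorkbook from an HSSFWorkbook object, because the accessor is package-private: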
InternalWorkbook getWorkbook() {
return workbook;
}
However, a closer look at InternalWorkbook shows this field:
private List<EscherBSERecord> escherBSERecords;
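At the lowest level the image data is stored in a List, an ordered collection. Combined with the getBSERecord method, this suggests that pictureIndex is the picture's 1-based position in that List, so the actual list subscript is pictureIndex - 1.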
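The inference can be illustrated with a small stand-alone model. This is a sketch under assumptions, not POI's implementation: `OneBasedIndexSketch`, `add` and `get` are made-up names, and `add` returning the new list size is my assumption about how a 1-based index would be handed out. Only the `pictureIndex - 1` lookup is taken from the excerpt above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a 1-based picture index over a 0-based list; not POI code. */
class OneBasedIndexSketch {
    private final List<String> records = new ArrayList<>();

    /** Appends a record and returns its 1-based index, i.e. the new list size. */
    int add(String record) {
        records.add(record);
        return records.size();
    }

    /** Same shape as the getBSERecord excerpt: subtract one before the list lookup. */
    String get(int pictureIndex) {
        return records.get(pictureIndex - 1);
    }

    public static void main(String[] args) {
        OneBasedIndexSketch store = new OneBasedIndexSketch();
        int first = store.add("logo.png");
        int second = store.add("photo.jpeg");
        System.out.println(first + " -> " + store.get(first));
        System.out.println(second + " -> " + store.get(second));
    }
}
```

If the real workbook hands out indexes this way, the list returned by getAllPictures and the internal record list would line up element for element only as long as both keep insertion order, which is the part that still needs testing.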
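The above is only my personal view. Since I do not have a grasp of POI's overall design, I cannot yet offer a sound explanation for the questions raised here.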