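Preparation:
1. Install the required package. If you install the docx module directly with pip install docx, running the code may fail with: ModuleNotFoundError: No module named 'exceptions'. The docx package on PyPI is an old release written for Python 2, and the exceptions module it imports does not exist in Python 3.
Solution: uninstall the docx package you installed: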
pip uninstall docx
Then install the python-docx module instead:
pip install python-docx
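To confirm which of the two packages is actually installed, a small check like the one below may help. It is only a sketch: the helper name installed_version is made up for this illustration, and it relies on the standard-library importlib.metadata module (Python 3.8+).

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # After the steps above, "docx" should be absent and "python-docx" present.
    for name in ("docx", "python-docx"):
        print(name, "->", installed_version(name))
```

Both packages are imported as docx, so checking the distribution name is more reliable than checking whether import docx succeeds.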
2. A .docx document that contains the images to be extracted.
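A .docx file is a ZIP archive, and embedded images are normally stored under word/media/ inside it. As a quick sanity check that the document really contains images, the following sketch lists those entries using only the standard library. The function name list_embedded_media is hypothetical and is not part of python-docx.

```python
import zipfile


def list_embedded_media(docx_path):
    """Return the names of the files stored under word/media/ in a .docx archive."""
    with zipfile.ZipFile(docx_path) as archive:
        return [
            name
            for name in archive.namelist()
            # Keep real files only; skip a possible directory entry.
            if name.startswith("word/media/") and not name.endswith("/")
        ]


# Example call (the path is the one used in this post):
# print(list_embedded_media("../source/aaaa.docx"))
```

If the returned list is empty, the document has no embedded images, and the extraction code will produce no output files.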
Code:
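from os.path import basename

from docx import Document

# Open the source document (adjust the path to your own file).
doc = Document("../source/aaaa.docx")
print(doc)

# Each inline picture refers to its image part through a relationship ID.
for shape in doc.inline_shapes:
    contentID = shape._inline.graphic.graphicData.pic.blipFill.blip.embed
    contentType = doc.part.related_parts[contentID].content_type
    # Skip anything that is not an image.
    if not contentType.startswith('image'):
        continue
    # Use the file name of the image part as the output file name.
    imgName = basename(doc.part.related_parts[contentID].partname)
    print(imgName)
    # Read the raw image bytes and write them to the current directory.
    imgData = doc.part.related_parts[contentID].blob
    with open(imgName, 'wb') as fp:
        fp.write(imgData)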
Result: