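Replacing an image at a specified position in Word with Python: inserting an image at a specified position in a Word document (using the python-docx package)...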

Latest recommended article published 2024-06-12 10:14:13

weixin_39825259

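First, note that this example is written in Java and works directly on the XML inside the docx file. I skipped POI and docx4j: their documentation felt weak or their APIs awkward, and, the English aside, the code has so few comments that it is hard to follow. So the example is written in plain Java, without those libraries.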

First, you need to know what a docx looks like once it is unzipped. The extracted contents are as follows:

“----------------------------------------------------------------------------”

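The _rels folder contains the following:

Just one file, with the following content:

It identifies the locations of three XML files, that is, which parts the package links to.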
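As a rough illustration of what that relationships file carries, the sketch below parses a relationships part and prints each Id with its Target. The sample XML is an assumption modelled on a typical root _rels/.rels part, not a copy of the file shown in the original post, and the class name RelsReader is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class RelsReader {

    // Assumed sample: three relationships, as a typical root .rels part would declare.
    static final String SAMPLE_RELS =
        "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">"
      + "<Relationship Id=\"rId1\" Type=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument\" Target=\"word/document.xml\"/>"
      + "<Relationship Id=\"rId2\" Type=\"http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties\" Target=\"docProps/core.xml\"/>"
      + "<Relationship Id=\"rId3\" Type=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties\" Target=\"docProps/app.xml\"/>"
      + "</Relationships>";

    /** Returns relationship Id -> Target, in document order. */
    static Map<String, String> readTargets(String relsXml) {
        try {
            NodeList rels = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(relsXml.getBytes(StandardCharsets.UTF_8)))
                .getElementsByTagName("Relationship");
            Map<String, String> targets = new LinkedHashMap<>();
            for (int i = 0; i < rels.getLength(); i++) {
                Element rel = (Element) rels.item(i);
                targets.put(rel.getAttribute("Id"), rel.getAttribute("Target"));
            }
            return targets;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse relationships part", e);
        }
    }

    public static void main(String[] args) {
        readTargets(SAMPLE_RELS)
            .forEach((id, target) -> System.out.println(id + " -> " + target));
    }
}
```

The same method works on word/_rels/document.xml.rels, since both files use the same Relationship element.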

“------------------------------------------------------------------------------”

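The docProps folder contains the following:

app.xml records overall statistics for the document, such as the number of lines, spaces, words and pages.

core.xml is similar; it records the creator, the timestamps, the last modifier and so on.

The word folder is the most important one. It holds the content of the Word document, and there is quite a lot in it.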

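(1) Its _rels folder holds one file, document.xml.rels.

It lists everything the document needs. styles.xml and the other parts below must all be referenced here, as must any other file you use, such as a PNG image.

(2) media holds the images and any other media files that Word supports. To use these files, remember to reference them in document.xml.rels.

(3) theme holds the theme XML for the document (fonts, colors and a pile of other settings).

(4) document.xml holds the actual document content. If a docx were a web page, this would be its HTML file. (It is covered in detail below.)

(5) The remaining XML files cover things such as footnotes and styles. Since this article is only about stamping a seal, I will not cover them; batch processing of docx files rarely involves changing the styles of one particular document.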
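To see this layout from code, the sketch below lists the entries under a folder of a docx package. It is only an illustration: it builds a small zip in memory with typical part names (an assumption, not the exact file list from the original post), and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class DocxPartLister {

    /** Builds an in-memory zip with one empty entry per name (stand-in for a real docx). */
    static byte[] zipOf(String... names) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : names) {
                zip.putNextEntry(new ZipEntry(name));
                zip.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    /** Lists non-directory entries whose name starts with the prefix, e.g. "word/media/". */
    static List<String> partsUnder(byte[] zipBytes, String prefix) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                if (!entry.isDirectory() && entry.getName().startsWith(prefix)) {
                    names.add(entry.getName());
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    public static void main(String[] args) {
        byte[] docx = zipOf("[Content_Types].xml", "_rels/.rels", "docProps/app.xml",
                "word/document.xml", "word/_rels/document.xml.rels", "word/media/image1.png");
        System.out.println(partsUnder(docx, "word/"));
    }
}
```

For a real file, the bytes could come from reading the docx from disk instead of zipOf.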

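Finally, as the name suggests, [Content_Types].xml holds the content-type information for the files in the package. A screenshot is shown below:

Extension is the file extension. If the Word document uses a JPEG image, the content type image/jpeg must be declared here (other formats follow the same pattern). JPEG here means real JPEG data, not a file whose extension was simply renamed. If the format and the extension do not match (a corrupted file), Word either converts the format or changes the extension to match the format. Under normal circumstances, of course, neither the format nor the extension changes.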
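One way to check that an image really is what its extension claims is to look at its leading bytes. The sketch below is an illustration under stated assumptions: it recognises only JPEG and PNG by their well-known signatures, and the class name ImageSniffer is hypothetical rather than part of the original code.

```java
class ImageSniffer {

    /** Returns "jpeg", "png" or "unknown" based on the file's leading signature bytes. */
    static String sniff(byte[] data) {
        if (startsWith(data, 0xFF, 0xD8, 0xFF)) {
            return "jpeg";
        }
        if (startsWith(data, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)) {
            return "png";
        }
        return "unknown";
    }

    /** Content type to declare in [Content_Types].xml for a recognised format. */
    static String contentType(String format) {
        return "image/" + format;
    }

    private static boolean startsWith(byte[] data, int... signature) {
        if (data == null || data.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if ((data[i] & 0xFF) != signature[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] jpegHeader = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};
        System.out.println(contentType(sniff(jpegHeader)));
    }
}
```

A caller could compare the sniffed format with the file extension before adding the image to word/media.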

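Now, back to document.xml.

I made a Word document with the following content (the text and the seal are made up):

Excerpts from its document.xml are shown below:

Just skim the excerpts above; the analysis follows.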

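In this XML, w:body marks the document body and w:p marks a paragraph.

For example:

Each underlined part is one paragraph. Because the top-left corner of the image is on the same line as the "Applicant" text, the image counts as part of that paragraph.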

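So this document.xml (opened in Chrome, which can collapse tags) has the following paragraphs:

That looks about right.

Counting in order, the image should be in the fourth paragraph, that is, inside the fourth w:p element. (That is judged by eye, of course; a program does not find it by looking.)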
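In a program, the paragraph can be found by scanning the w:p elements for one that contains a w:drawing. The sketch below shows the idea on a made-up four-paragraph document; the sample XML and the class name DrawingLocator are assumptions for illustration, and the parser is used without namespace awareness so that prefixed names such as w:p match directly.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DrawingLocator {

    // Assumed sample: four paragraphs, the last one carrying the drawing.
    static final String SAMPLE =
        "<w:document xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\"><w:body>"
      + "<w:p><w:r><w:t>Title</w:t></w:r></w:p>"
      + "<w:p><w:r><w:t>Body</w:t></w:r></w:p>"
      + "<w:p><w:r><w:t>Date</w:t></w:r></w:p>"
      + "<w:p><w:r><w:t>Applicant</w:t></w:r><w:r><w:drawing/></w:r></w:p>"
      + "</w:body></w:document>";

    /** Returns the zero-based index of the first w:p containing a w:drawing, or -1. */
    static int paragraphIndexOfFirstDrawing(String documentXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(documentXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
        NodeList paragraphs = doc.getElementsByTagName("w:p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            Element paragraph = (Element) paragraphs.item(i);
            if (paragraph.getElementsByTagName("w:drawing").getLength() > 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Index 3 is the fourth paragraph.
        System.out.println(paragraphIndexOfFirstDrawing(SAMPLE));
    }
}
```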

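That paragraph looks like this:

It contains a w:drawing element, which holds the information for the seal image.

All we need to do is extract this DOM node and append it to the desired paragraph of the Word document that has no seal yet. Note that the rels file must contain a reference to the image, the Id declared there must match the rId shown above, and every rId in the file must be unique.

The rId7 shown above is already declared in document.xml.rels.

In addition, [Content_Types].xml must declare the image's content type.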
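Since every relationship Id must be unique, one simple approach is to scan the Ids already present and take the next free rIdN. The sketch below illustrates that idea only; the original code instead derives an Id from the image name's hash code, and the class name RelationshipIds is hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;

class RelationshipIds {

    /** Returns "rId" + (highest existing rId number + 1); other Id styles are ignored. */
    static String nextFreeId(Collection<String> existingIds) {
        int max = 0;
        for (String id : existingIds) {
            // Only Ids of the form rId<digits> take part in the numbering.
            if (id.matches("rId\\d{1,9}")) {
                max = Math.max(max, Integer.parseInt(id.substring(3)));
            }
        }
        return "rId" + (max + 1);
    }

    public static void main(String[] args) {
        System.out.println(nextFreeId(Arrays.asList("rId1", "rId7", "custom")));
    }
}
```

The returned Id would be used both for the new Relationship element and for the r:embed attribute of the image's a:blip element, so that the two stay in step.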

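To move the image, look inside the same wrapping element for the horizontal and vertical position elements (in standard DrawingML these are wp:positionH and wp:positionV, each holding a wp:posOffset value).

H is the horizontal offset (the X axis) and V is the vertical offset (the Y axis). Changing these two values changes the position of the image. The position here is relative (it can also be made absolute with respect to the page): it is measured from the first line of this paragraph, including the top padding. If both are set to 0, the image is placed as follows:

Negative values are allowed too. With the vertical value set to -1000000, the result is:

That achieves insertion at a specified position. The image is placed behind the text; in front of the text also works. An inline image cannot be moved freely like this, and adjusting margins instead would not be appropriate, because a seal needs to sit behind the text.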
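The offsets can be changed with ordinary DOM calls. The sketch below assumes the standard DrawingML names wp:positionH, wp:positionV and wp:posOffset and that offsets are expressed in EMUs (360000 per centimetre); the sample anchor XML is a trimmed illustration rather than the markup from the original post, and the class name AnchorOffsets is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class AnchorOffsets {

    static final long EMU_PER_CM = 360_000L;

    // Assumed, heavily trimmed anchor; a real wp:anchor carries more attributes and children.
    static final String SAMPLE_ANCHOR =
        "<wp:anchor xmlns:wp=\"http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing\" behindDoc=\"1\">"
      + "<wp:positionH relativeFrom=\"column\"><wp:posOffset>0</wp:posOffset></wp:positionH>"
      + "<wp:positionV relativeFrom=\"paragraph\"><wp:posOffset>0</wp:posOffset></wp:positionV>"
      + "</wp:anchor>";

    static long cmToEmu(double cm) {
        return Math.round(cm * EMU_PER_CM);
    }

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    /** Sets the offset under the first element named positionTag, if present. */
    static void setOffset(Document doc, String positionTag, long emu) {
        Element offset = firstOffset(doc, positionTag);
        if (offset != null) {
            offset.setTextContent(Long.toString(emu));
        }
    }

    /** Reads the offset under the first element named positionTag, or null if absent. */
    static String getOffset(Document doc, String positionTag) {
        Element offset = firstOffset(doc, positionTag);
        return offset == null ? null : offset.getTextContent();
    }

    private static Element firstOffset(Document doc, String positionTag) {
        NodeList positions = doc.getElementsByTagName(positionTag);
        if (positions.getLength() == 0) {
            return null;
        }
        NodeList offsets = ((Element) positions.item(0)).getElementsByTagName("wp:posOffset");
        return offsets.getLength() == 0 ? null : (Element) offsets.item(0);
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE_ANCHOR);
        setOffset(doc, "wp:positionH", cmToEmu(1.0)); // 1 cm to the right
        setOffset(doc, "wp:positionV", -1000000L);    // the negative value used in the text
        System.out.println(getOffset(doc, "wp:positionH") + ", " + getOffset(doc, "wp:positionV"));
    }
}
```

Under that unit assumption, the -1000000 used in the text corresponds to roughly 2.8 cm upwards.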

Now for the example.

First, write the unzip and zip functions, since a Word file is a zip package.

package com.gmr.io;

import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

import org.apache.commons.io.IOUtils;

import com.gmr.execption.DocxException;
import com.gmr.execption.GmrZipException;

public class DocxFile {

    private ZipFile docxFile;
    private String docxName;
    private File file;
    // Entries to overwrite or add on save: entry name -> new content
    private Map<String, byte[]> updateEntryMap = new HashMap<String, byte[]>();
    // true while docxFile is still open
    boolean tag = true;

    /**
     * Opens a docx file.
     *
     * @param docxName
     *            name and path of the docx file, e.g. web-inf/a.docx
     */
    public DocxFile(String docxName) {
        this.docxName = docxName;
        try {
            file = new File(docxName);
            docxFile = new ZipFile(file);
        } catch (ZipException e) {
            throw new GmrZipException(e);
        } catch (IOException e) {
            throw new DocxException(e);
        }
    }

    /**
     * Returns the content of an entry in the docx package.
     *
     * @param entryName
     *            name of the zip entry
     * @return a stream over the entry, or null if there is no such entry
     */
    public InputStream getEntryInputStream(String entryName) {
        ZipEntry entry = docxFile.getEntry(entryName);
        try {
            return entry == null ? null : docxFile.getInputStream(entry);
        } catch (IOException e) {
            throw new DocxException(e);
        }
    }

    /**
     * Returns the content of an entry in the docx package.
     *
     * @param entry
     *            the zip entry
     * @return a stream over the entry
     */
    public InputStream getEntryInputStream(ZipEntry entry) {
        try {
            return docxFile.getInputStream(entry);
        } catch (IOException e) {
            throw new DocxException(e);
        }
    }

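    /**
     * Returns the zip entries under the given folder.
     *
     * @param directoryName
     *            folder name inside the package, e.g. word/media
     * @return the matching entries
     */
    public List<ZipEntry> getEntries(String directoryName) {
        List<ZipEntry> list = new ArrayList<ZipEntry>();
        Enumeration<? extends ZipEntry> entries = docxFile.entries();
        while (entries.hasMoreElements()) {
            ZipEntry entry = entries.nextElement();
            String name = entry.getName();
            // Match on the folder prefix so that unrelated paths containing the name are skipped
            if (name.startsWith(directoryName + "/")
                    || name.startsWith(directoryName + "\\")) {
                list.add(entry);
            }
        }
        return list;
    }

    /**
     * Registers the new content for an entry; it is written by updateZip().
     *
     * @param entryName
     *            name of the zip entry
     * @param bs
     *            new content of the entry
     */
    public void putUpdateEntry(String entryName, byte[] bs) {
        updateEntryMap.put(entryName, bs);
    }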

    /**
     * Rewrites the current docx file with the registered entries applied
     * (requires a non-null file name, i.e. the object was opened from a path).
     *
     * @throws Exception
     *             if the package cannot be rewritten
     */
    public void updateZip() throws Exception {
        String suffix = "" + System.currentTimeMillis() + docxFile.hashCode()
                + updateEntryMap.hashCode();
        File tFile = new File(docxName + suffix);
        OutputStream out;
        try {
            out = new FileOutputStream(tFile);
        } catch (FileNotFoundException e) {
            throw new DocxException(e);
        }
        ZipOutputStream docxOut = new ZipOutputStream(out);
        Enumeration<? extends ZipEntry> zipEntrys = docxFile.entries();
        try {
            // Existing entries; updated content overwrites the original
            while (zipEntrys.hasMoreElements()) {
                ZipEntry zipEntry = zipEntrys.nextElement();
                docxOut.putNextEntry(new ZipEntry(zipEntry.getName()));
                if (updateEntryMap.containsKey(zipEntry.getName())) {
                    byte[] b = updateEntryMap.get(zipEntry.getName());
                    if (b != null && b.length > 0) {
                        docxOut.write(b);
                    }
                    updateEntryMap.remove(zipEntry.getName());
                } else {
                    InputStream in = docxFile.getInputStream(zipEntry);
                    try {
                        IOUtils.copy(in, docxOut);
                    } finally {
                        in.close();
                    }
                }
            }
            // Entries still in the map are newly added ones
            for (Entry<String, byte[]> entry : updateEntryMap.entrySet()) {
                docxOut.putNextEntry(new ZipEntry(entry.getKey()));
                docxOut.write(entry.getValue());
            }
            docxOut.flush();
        } finally {
            docxOut.close();
            tag = false;
            docxFile.close();
        }
        // Swap the rewritten file into place and fail loudly if that does not work
        if (!this.file.delete() || !tFile.renameTo(new File(docxName))) {
            throw new DocxException(new IOException("Could not replace " + docxName));
        }
    }

    /**
     * Closes the file.
     */
    public void close() {
        try {
            if (tag) {
                docxFile.close();
            }
        } catch (IOException e) {
            throw new DocxException(e);
        }
    }
}
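The last step of updateZip() deletes the original and renames a temporary file over it. As an alternative sketch, not the approach used in this article, java.nio.file.Files.move with REPLACE_EXISTING performs the swap in one call and reports failure through an exception. The class name SafeReplace is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class SafeReplace {

    /** Writes newContent to a temp file next to target, then moves it over target. */
    static void replace(Path target, byte[] newContent) {
        try {
            Path tmp = Files.createTempFile(target.toAbsolutePath().getParent(), "docx", ".tmp");
            try {
                Files.write(tmp, newContent);
                Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            } finally {
                // No-op after a successful move; cleans up if writing or moving failed.
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("demo", ".docx");
        replace(target, new byte[] {1, 2, 3});
        System.out.println(Files.size(target));
        Files.delete(target);
    }
}
```

Keeping the temporary file in the same directory as the target means the move stays on one file system.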


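import java.io.ByteArrayOutputStream;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.apache.commons.io.IOUtils;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

import com.gmr.execption.DocxException;
import com.gmr.io.DocxFile;

/**
 * Copies an image from a template docx into other docx files.
 *
 * @author gmr
 */
public class DocxImageOperator {

    private DocxFile docxTemplate;

    /**
     * Opens the template that the image is copied from.
     *
     * @param templateDocxName
     *            path of the template docx
     */
    public DocxImageOperator(String templateDocxName) {
        docxTemplate = new DocxFile(templateDocxName);
    }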

    public static void main(String[] args) {
        DocxImageOperator docxImageOperator = new DocxImageOperator(
                "C:/Users/gmr/Desktop/xx/a.docx");
        try {
            // a.docx is the template; its image is copied into b.docx
            docxImageOperator
                    .copyImageToDocxFromTemplate("C:/Users/gmr/Desktop/xx/b.docx");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            docxImageOperator.close();
        }
    }

    /**
     * Copies the template image into each of the given documents.
     *
     * @param docxNames
     *            paths of the target docx files
     */
    public void copyImageToDocxFromTemplate(String... docxNames) {
        try {
            for (String docxName : docxNames) {
                copyImageToDocxFromTemplate(docxName);
            }
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new DocxException(e);
        }
    }

    /**
     * Copies the image in the template into one docx file.
     *
     * @param docxName
     *            path of the target docx
     */
    private void copyImageToDocxFromTemplate(String docxName) throws Exception {
        DocxFile docx = new DocxFile(docxName);
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory
                    .newInstance();
            DocumentBuilder documentBuilder = factory.newDocumentBuilder();
            TransformerFactory transformerFactory = TransformerFactory
                    .newInstance();
            Transformer transformer = transformerFactory.newTransformer();
            // Copy the image data (the media entries) into the package.
            // Only one image is really supported; change the code to support more.
            Set<String> imageNames = new HashSet<String>();
            addImage(imageNames, docx);
            // Add the image reference to document.xml.rels.
            // Despite the name imageNames, only one image is expected; with more,
            // the order breaks (the code simply replaces and ignores order).
            String idName = documentRelsXmlCheck(transformer, imageNames,
                    documentBuilder, docx);
            // Update [Content_Types].xml
            contentTypesXmlCheck(transformer, imageNames, documentBuilder, docx);
            // Finally, add the image to the body in document.xml
            documentXmlCheck(transformer, documentBuilder, docx, idName);
            // Save the Word document
            docx.updateZip();
        } finally {
            docx.close();
        }
    }

    /**
     * Copies the template's image markup into the body of the Word document.
     *
     * @param transformer
     * @param documentBuilder
     * @param docx
     * @param idName
     *            relationship id that the copied image must reference
     * @throws Exception
     */
    public void documentXmlCheck(Transformer transformer,
            DocumentBuilder documentBuilder, DocxFile docx, String idName)
            throws Exception {
        Document documentXml = documentBuilder.parse(docxTemplate
                .getEntryInputStream("word/document.xml"));
        // Point the template's first image at the relationship id added to the target
        NodeList blips = documentXml.getElementsByTagName("a:blip");
        if (blips.getLength() > 0) {
            Node node = blips.item(0);
            node.getAttributes().getNamedItem("r:embed").setNodeValue(idName);
        }
        NodeList imageContents = documentXml.getElementsByTagName("w:drawing");
        if (imageContents.getLength() > 0) {
            Node node = null;
            Node rNode = null;
            Node pNode = null;
            int imgLength = imageContents.getLength();
            // Find a drawing that sits in a run (w:r) directly inside a paragraph (w:p)
            for (int i = 0; i < imgLength; i++) {
                node = imageContents.item(i);
                rNode = node.getParentNode();
                pNode = rNode.getParentNode();
                if (rNode.getNodeName().equals("w:r")
                        && pNode.getNodeName().equals("w:p")) {
                    break;
                }
            }
            // Index of that paragraph among all w:p elements of the template
            NodeList pList = documentXml.getElementsByTagName("w:p");
            int pLength = pList.getLength();
            int position = -1;
            for (int i = 0; i < pLength; i++) {
                if (pList.item(i).isSameNode(pNode)) {
                    position = i;
                    break;
                }
            }
            Document toDocumentXml = documentBuilder.parse(docx
                    .getEntryInputStream("word/document.xml"));
            Node n = toDocumentXml.importNode(rNode, true);
            NodeList toNodeList = toDocumentXml.getElementsByTagName("w:p");
            if (position != -1 && toNodeList.getLength() > 0) {
                // Fall back to the last paragraph if the target has fewer paragraphs
                if (position >= toNodeList.getLength()) {
                    position = toNodeList.getLength() - 1;
                }
                toNodeList.item(position).appendChild(n);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try {
                transformer.transform(new DOMSource(toDocumentXml),
                        new StreamResult(out));
                docx.putUpdateEntry("word/document.xml", out.toByteArray());
            } finally {
                out.close();
            }
        }
    }

    /**
     * Checks that the Word document declares the image's extension
     * (and adds the declaration if it is missing).
     *
     * @param transformer
     * @param imageNames
     * @param documentBuilder
     * @param docx
     * @throws Exception
     */
    private void contentTypesXmlCheck(Transformer transformer,
            Set<String> imageNames, DocumentBuilder documentBuilder,
            DocxFile docx) throws Exception {
        if (imageNames.isEmpty()) {
            return; // no image was copied, so there is nothing to declare
        }
        Document contentTypesXml = documentBuilder.parse(docx
                .getEntryInputStream("[Content_Types].xml"));
        NodeList nodeList = contentTypesXml.getElementsByTagName("Default");
        int length = nodeList.getLength();
        boolean tag = false;
        String imgName = imageNames.iterator().next();
        String type = imgName.substring(imgName.lastIndexOf('.') + 1);
        for (int i = 0; i < length; i++) {
            Node node = nodeList.item(i);
            String value = node.getAttributes().getNamedItem("Extension")
                    .getNodeValue();
            if (value.equalsIgnoreCase(type)) {
                tag = true;
                break;
            }
        }
        if (!tag) {
            // The extension is not declared yet, so add it
            Element defaultNode = contentTypesXml.createElement("Default");
            defaultNode.setAttribute("Extension", type);
            // Non-image types are not distinguished here (that would take a lot more code);
            // "jpg" is mapped to image/jpeg
            String subtype = type.equalsIgnoreCase("jpg") ? "jpeg" : type.toLowerCase();
            defaultNode.setAttribute("ContentType", "image/" + subtype);
            contentTypesXml.getElementsByTagName("Types").item(0)
                    .appendChild(defaultNode);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try {
                transformer.transform(new DOMSource(contentTypesXml),
                        new StreamResult(out));
                docx.putUpdateEntry("[Content_Types].xml", out.toByteArray());
            } finally {
                out.close();
            }
        }
    }

    /**
     * Adds a reference to the image in document.xml.rels.
     *
     * @param transformer
     * @param imageNames
     * @param documentBuilder
     * @param docx
     * @return the relationship id of the added image
     * @throws Exception
     */
    private String documentRelsXmlCheck(Transformer transformer,
            Set<String> imageNames, DocumentBuilder documentBuilder,
            DocxFile docx) throws Exception {
        Document documentRelsXml = documentBuilder.parse(docx
                .getEntryInputStream("word/_rels/document.xml.rels"));
        String idName = "";
        for (String imageName : imageNames) {
            // Add a reference to the image (only images are supported for now;
            // other file types are not detected or filtered)
            Element relTextNode = documentRelsXml.createElement("Relationship");
            idName = "gmr" + imageName.hashCode();
            relTextNode.setAttribute("Id", idName);
            relTextNode
                    .setAttribute("Type",
                            "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image");
            relTextNode.setAttribute("Target", imageName);
            documentRelsXml.getElementsByTagName("Relationships").item(0)
                    .appendChild(relTextNode);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            transformer.transform(new DOMSource(documentRelsXml),
                    new StreamResult(out));
            docx.putUpdateEntry("word/_rels/document.xml.rels",
                    out.toByteArray());
        } finally {
            out.close();
        }
        return idName;
    }

    /**
     * Adds the template's image files to the Word package.
     *
     * @param imageNames
     *            receives the image names, relative to the word folder
     * @param docx
     * @throws Exception
     */
    private void addImage(Set<String> imageNames, DocxFile docx)
            throws Exception {
        List<ZipEntry> list = docxTemplate.getEntries("word/media");
        for (ZipEntry zipEntry : list) {
            if (!zipEntry.isDirectory()) {
                // Relationship targets are relative to word/, e.g. media/image1.png
                imageNames.add(zipEntry.getName().replace("word/", "")
                        .replace("word\\", ""));
                docx.putUpdateEntry(zipEntry.getName(),
                        IOUtils.toByteArray(docxTemplate
                                .getEntryInputStream(zipEntry)));
            }
        }
    }

    public void close() {
        docxTemplate.close();
    }
}
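The core of documentXmlCheck is copying a node from one DOM document into another with importNode, because a node cannot be appended directly to a document that does not own it. The sketch below isolates that step on tiny made-up documents; the class name NodeCopy, the sample XML and the clamping of the paragraph index are illustrative assumptions.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class NodeCopy {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    /**
     * Copies the run holding the first w:drawing of source into the paragraph
     * at paragraphIndex of target (clamped to the valid range).
     * Returns false if there is no drawing or no paragraph to copy into.
     */
    static boolean copyDrawingRun(Document source, Document target, int paragraphIndex) {
        NodeList drawings = source.getElementsByTagName("w:drawing");
        NodeList paragraphs = target.getElementsByTagName("w:p");
        if (drawings.getLength() == 0 || paragraphs.getLength() == 0) {
            return false;
        }
        Node run = drawings.item(0).getParentNode();
        int index = Math.min(Math.max(paragraphIndex, 0), paragraphs.getLength() - 1);
        // importNode makes a deep copy owned by target; source is left unchanged.
        paragraphs.item(index).appendChild(target.importNode(run, true));
        return true;
    }

    public static void main(String[] args) {
        Document source = parse("<w:body xmlns:w=\"urn:w\"><w:p><w:r><w:drawing/></w:r></w:p></w:body>");
        Document target = parse("<w:body xmlns:w=\"urn:w\"><w:p/><w:p/></w:body>");
        System.out.println(copyDrawingRun(source, target, 0));
    }
}
```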



————————————————

Original link: https://blog.csdn.net/qwe125698420/article/details/70622289
