Replacing text in a Word document with Python: how to search and replace words/text in a Word document using python-docx...

weixin_39620118

Published 2020-11-28 02:27:32


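The problem with replacing text in a paragraph is that the text may be split across multiple runs, so a simple find-and-replace on each run does not always work.

Because the text to replace can span several runs, you need to search by partial match, identify which runs hold part of the text, and then replace the text in those identified runs.

The function below replaces a string while preserving the original text styling. The procedure is the same whether or not you need to keep the styling, because formatting can cause text to be split into multiple runs even where the text looks visually unstyled.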
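To make the run-splitting problem concrete, here is a minimal sketch that uses plain strings in place of python-docx runs. The helper name `find_run_spans` and the sample run texts are assumptions made for illustration; they are not part of python-docx or of the function in this post. It shows that no single run contains the search text even though the paragraph as a whole does, and it reports which part of each run the match covers.

```python
def find_run_spans(run_texts, search_text):
    """Return [(run_index, start, length), ...] for the first match of
    search_text in the concatenated run texts, or [] if there is no match."""
    if not search_text:
        return []
    full_text = "".join(run_texts)
    match_start = full_text.find(search_text)
    if match_start == -1:
        return []
    match_end = match_start + len(search_text)

    spans = []
    offset = 0  # position of the current run's first character in full_text
    for index, text in enumerate(run_texts):
        run_start, run_end = offset, offset + len(text)
        # Overlap between this run and the matched region, if any.
        lo, hi = max(run_start, match_start), min(run_end, match_end)
        if lo < hi:
            spans.append((index, lo - run_start, hi - lo))
        offset = run_end
    return spans


# Hypothetical run texts: the word "Testing1" straddles three runs.
runs = ["Please run Test", "ing", "1 now"]
print(any("Testing1" in text for text in runs))  # False
print("Testing1" in "".join(runs))               # True
print(find_run_spans(runs, "Testing1"))          # [(0, 11, 4), (1, 0, 3), (2, 0, 1)]
```

Each tuple has the same shape as the entries the function in this post records: the run index, the start of the match within that run, and the number of matched characters.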

Code

import docx


def docx_find_replace_text(doc, search_text, replace_text):
    # Collect body paragraphs plus every paragraph inside table cells.
    paragraphs = list(doc.paragraphs)
    for t in doc.tables:
        for row in t.rows:
            for cell in row.cells:
                for paragraph in cell.paragraphs:
                    paragraphs.append(paragraph)

    for p in paragraphs:
        if search_text not in p.text:
            continue
        inline = p.runs
        # Replace strings and retain the same style.
        # The text to be replaced can be split over several runs so
        # search through, identify which runs need to have text replaced
        # then replace the text in those identified
        started = False
        search_index = 0
        # found_runs is a list of (inline index, index of match, length of match)
        found_runs = list()
        found_all = False
        replace_done = False
        for i in range(len(inline)):
            run_text = inline[i].text

            # case 1: found in single run so short circuit the replace
            if search_text in run_text and not started:
                found_runs.append((i, run_text.find(search_text), len(search_text)))
                inline[i].text = run_text.replace(search_text, str(replace_text))
                replace_done = True
                found_all = True
                break

            if search_text[search_index] not in run_text and not started:
                # keep looking ...
                continue

            # case 2: search for partial text, find first run
            if search_text[search_index] in run_text and run_text[-1] in search_text and not started:
                # check sequence: the tail of this run must be a prefix of search_text
                start_index = run_text.find(search_text[search_index])
                tail = run_text[start_index:]
                if not search_text.startswith(tail):
                    # no match so must be false positive
                    continue
                started = True
                chars_found = len(tail)
                search_index += chars_found
                found_runs.append((i, start_index, chars_found))
                # case 1 already handled a complete match, so more runs are needed
                continue

            # case 2: search for partial text, find subsequent run
            if search_text[search_index] in run_text and started and not found_all:
                # check sequence
                chars_found = 0
                for ch in run_text:
                    # stop at the end of search_text or at the first mismatch
                    if search_index < len(search_text) and ch == search_text[search_index]:
                        search_index += 1
                        chars_found += 1
                    else:
                        break
                found_runs.append((i, 0, chars_found))
                if search_index == len(search_text):
                    found_all = True
                    break

        if found_all and not replace_done:
            for i, (index, start, length) in enumerate(found_runs):
                run_text = inline[index].text
                # Slice by position so only the matched span is touched.
                # The first run receives the replacement; later runs just lose their part.
                new_part = str(replace_text) if i == 0 else ''
                inline[index].text = run_text[:start] + new_part + run_text[start + length:]
        # print(p.text)


# sample usage as per example
doc = docx.Document('find_replace_test_document.docx')
docx_find_replace_text(doc, 'Testing1', 'Test ')
docx_find_replace_text(doc, 'Testing2', 'Test ')
docx_find_replace_text(doc, 'rest', 'TEST')
doc.save('find_replace_test_result.docx')
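As a cross-check on the run-by-run scan, the same replacement can be sketched with character offsets over the concatenated run texts. This is a different technique from the function in this post, shown here on plain strings so it runs without a document; the name `replace_in_run_texts` is hypothetical. The first run that overlaps the match receives the replacement text, and every later overlapping run only loses its matched part, so each run keeps its own formatting.

```python
def replace_in_run_texts(run_texts, search_text, replace_text):
    """Replace the first occurrence of search_text, which may straddle several
    runs, and return a new list of run texts with the same length as the input."""
    full_text = "".join(run_texts)
    match_start = full_text.find(search_text) if search_text else -1
    if match_start == -1:
        return list(run_texts)
    match_end = match_start + len(search_text)

    result = []
    offset = 0        # position of the current run within full_text
    replaced = False  # has the replacement text been written yet?
    for text in run_texts:
        run_start, run_end = offset, offset + len(text)
        lo, hi = max(run_start, match_start), min(run_end, match_end)
        if lo < hi:
            before = text[:lo - run_start]
            after = text[hi - run_start:]
            middle = "" if replaced else replace_text
            replaced = True
            result.append(before + middle + after)
        else:
            result.append(text)
        offset = run_end
    return result


# Hypothetical run texts: "Testing1" straddles three runs.
print(replace_in_run_texts(["Please run Test", "ing", "1 now"], "Testing1", "Test "))
# ['Please run Test ', '', ' now']
```

With python-docx, one way to apply the result would be to build the input from `[run.text for run in paragraph.runs]` and then assign each returned string back to the matching run's `text` attribute.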

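Sample output

Below are a few screenshots showing the source document and the result after replacing the text:

Source document: [screenshot]

Result document: [screenshot]

I hope this helps someone.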
