Replacing text in Word documents with Python: how to search and replace words/text in a Word document using python-docx...

The problem with replacing text inside a run is that the text may be split across multiple runs, so a simple find-and-replace does not always work.
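To make the failure mode concrete, here is a minimal sketch in plain Python. It assumes the run texts of one paragraph have already been pulled out into a list of strings; the list contents and the `naive_replace` helper are hypothetical and are not part of python-docx.

```python
# Hypothetical run texts standing in for [run.text for run in paragraph.runs].
# Word may store the visible sentence "Hello Testing1 world" in pieces like these.
run_texts = ["Hello Tes", "ting", "1 world"]


def naive_replace(texts, search_text, replace_text):
    """Replace inside each run independently, ignoring matches that span runs."""
    return [text.replace(search_text, replace_text) for text in texts]


paragraph_text = "".join(run_texts)
naive_result = naive_replace(run_texts, "Testing1", "Test")

# The paragraph as a whole contains the search text ...
print("Testing1" in paragraph_text)
# ... but no single run does, so the naive pass changes nothing.
print(naive_result == run_texts)
```

The per-run pass never sees the full word, which is why the search has to look at the joined text of the runs.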

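Because the text to replace can be split across several runs, you need to search by partial match, identify which runs hold part of the text, and then replace the text in those runs.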
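One way to sketch that idea is to find the match in the joined text and map its offsets back onto the individual runs. The version below works on a plain list of strings rather than python-docx objects; `replace_across_runs` is a hypothetical helper name, and it replaces only the first match.

```python
def replace_across_runs(run_texts, search_text, replace_text):
    """Return new run texts with the first match of search_text replaced."""
    full = "".join(run_texts)
    start = full.find(search_text)
    if not search_text or start == -1:
        return list(run_texts)
    end = start + len(search_text)

    result = []
    pos = 0
    inserted = False
    for text in run_texts:
        run_start, run_end = pos, pos + len(text)
        pos = run_end
        # Part of the match that falls inside this run, in paragraph offsets.
        lo, hi = max(start, run_start), min(end, run_end)
        if lo >= hi:
            result.append(text)  # this run holds no part of the match
            continue
        head = text[:lo - run_start]
        tail = text[hi - run_start:]
        # The first affected run receives the replacement; later ones only
        # lose their share of the matched characters.
        result.append(head + ("" if inserted else replace_text) + tail)
        inserted = True
    return result


print(replace_across_runs(["Hello Tes", "ting", "1 world"], "Testing1", "Test"))
```

Putting the whole replacement into the first affected run means the new text takes on that run's formatting, which is a design choice rather than the only option.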

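This function replaces a string while keeping the original text style. The procedure is the same whether or not the style needs to be kept, because styling is what can cause text to be split across multiple runs, even when the text looks unstyled.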
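The sketch below illustrates why keeping the style means editing run text in place rather than rewriting the paragraph as a whole. In python-docx, assigning to `paragraph.text` replaces the content with a single run and drops run-level formatting. `FakeRun` is a hypothetical stand-in for a run, with a single `bold` flag, and for brevity the in-place helper only handles a match that sits inside one run.

```python
from dataclasses import dataclass


@dataclass
class FakeRun:
    """Hypothetical stand-in for a python-docx run: text plus one style flag."""
    text: str
    bold: bool = False


def rebuild_as_single_run(runs, search_text, replace_text):
    """Join everything, replace, and return one new unstyled run."""
    joined = "".join(run.text for run in runs)
    return [FakeRun(joined.replace(search_text, replace_text))]


def edit_runs_in_place(runs, search_text, replace_text):
    """Change only each run's text so its style attributes are untouched."""
    for run in runs:
        run.text = run.text.replace(search_text, replace_text)
    return runs


original = [FakeRun("Plain "), FakeRun("Testing1", bold=True), FakeRun(" tail")]
rebuilt = rebuild_as_single_run(original, "Testing1", "Test")
edited = edit_runs_in_place(original, "Testing1", "Test")

print([(run.text, run.bold) for run in rebuilt])
print([(run.text, run.bold) for run in edited])
```

Both results read the same as plain text, but only the in-place version still marks the replaced word as bold.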

Code

import docx


def docx_find_replace_text(doc, search_text, replace_text):
    """Replace search_text in body and table paragraphs, keeping run styles."""
    replace_text = str(replace_text)
    if not search_text:
        return

    # Collect body paragraphs plus the paragraphs inside every table cell.
    paragraphs = list(doc.paragraphs)
    for t in doc.tables:
        for row in t.rows:
            for cell in row.cells:
                for paragraph in cell.paragraphs:
                    paragraphs.append(paragraph)

    for p in paragraphs:
        search_from = 0
        while True:
            inline = p.runs
            # Replace strings and retain the same style.
            # The text to be replaced can be split over several runs, so
            # search the joined run text, identify which runs hold part of
            # the match, then replace the text in those runs only.
            full_text = ''.join(run.text for run in inline)
            match_start = full_text.find(search_text, search_from)
            if match_start == -1:
                break
            match_end = match_start + len(search_text)

            # found_runs is a list of (inline index, index of match, length of match)
            found_runs = list()
            offset = 0
            for i, run in enumerate(inline):
                run_start, run_end = offset, offset + len(run.text)
                offset = run_end
                lo, hi = max(match_start, run_start), min(match_end, run_end)
                if lo < hi:
                    found_runs.append((i, lo - run_start, hi - lo))

            # The first run receives the replacement; later runs only lose
            # their share of the matched characters.
            for n, (index, start, length) in enumerate(found_runs):
                text = inline[index].text
                new_part = replace_text if n == 0 else ''
                inline[index].text = text[:start] + new_part + text[start + length:]

            # Resume after the replacement so it is never matched again.
            search_from = match_start + len(replace_text)


# sample usage as per example
doc = docx.Document('find_replace_test_document.docx')
docx_find_replace_text(doc, 'Testing1', 'Test ')
docx_find_replace_text(doc, 'Testing2', 'Test ')
docx_find_replace_text(doc, 'rest', 'TEST')
doc.save('find_replace_test_result.docx')

Sample output

The screenshots below show the source document and the result after the text has been replaced.

Source document: yTn9N.png

Result document: xyLMZ.png

I hope this helps someone.
