Reading Docx and Doc Documents on Windows and Linux

Latest recommended article published on 2022-04-20 10:39:30

Fighter_Ma


Article tags: linux windows python

Article link: https://blog.csdn.net/weixin_44675308/article/details/122843330

Windows

Docx

Install the python-docx library:

pip install python-docx

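import docx

word = "a.docx"
document = docx.Document(word)
# Print every paragraph; the print must sit inside the loop,
# otherwise only the last paragraph is shown
for paragraph in document.paragraphs:
    text = paragraph.text
    print(text)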
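
If python-docx cannot be installed, the same paragraph text can be pulled out with the standard library alone, because a .docx file is a ZIP archive whose main body is stored in word/document.xml. The sketch below is an alternative to python-docx, not a description of how that library works internally. The function name read_docx_paragraphs is made up for illustration, and the code ignores tabs, line breaks, headers, and footers. Unlike document.paragraphs, it also returns paragraphs that sit inside tables.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def read_docx_paragraphs(path):
    """Return the text of every <w:p> paragraph in a .docx file (stdlib only)."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(W_NS + "p"):
        # A paragraph's text is spread over one or more <w:t> elements
        paragraphs.append("".join(t.text or "" for t in p.iter(W_NS + "t")))
    return paragraphs
```

Calling read_docx_paragraphs("a.docx") returns a list of strings, one per paragraph, which can be printed in a loop just like the python-docx version.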

Doc

Install the win32com library (Windows only):

python -m pip install pypiwin32

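import os

import pythoncom
from win32com import client

# Keep the file path in its own variable (it must not be overwritten by the
# Word application object) and pass Word an absolute path
path = os.path.abspath("a.doc")
pythoncom.CoInitialize()
word = client.Dispatch('Word.Application')
word.Visible = 0  # run in the background, no window
word.DisplayAlerts = 0  # suppress alert dialogs
doc = word.Documents.Open(path)
for para in doc.Paragraphs:
    print(para.Range.Text)
# Save a plain-text copy (2 = wdFormatText); adjust the path for your machine
doc.SaveAs('D:PythonFiles/4paradigm/gdt_flask/file/test.txt', 2)
doc.Close()
word.Quit()
pythoncom.CoUninitialize()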
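
The text returned by Range.Text in the loop above normally still carries Word's own end markers: a trailing carriage return for a paragraph and, for text taken from a table cell, a carriage return followed by the character \x07. A small helper can strip them before further processing. This is a minimal sketch; the name clean_word_text is hypothetical and not part of win32com.

```python
def clean_word_text(text):
    """Remove trailing paragraph marks and table-cell marks from Word text."""
    # Carriage return / newline end a paragraph; chr(7) ends a table cell
    return text.rstrip("\r\n\x07")


# Example with strings shaped like Word's output
samples = ["First paragraph\r", "cell value\r\x07", "\r"]
cleaned = [clean_word_text(s) for s in samples]
non_empty = [s for s in cleaned if s]
```

Inside the COM loop this would be used as print(clean_word_text(para.Range.Text)).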

Linux

Docx

Install the python-docx library:

pip install python-docx

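import docx

word = "a.docx"
document = docx.Document(word)
# Print every paragraph; the print must sit inside the loop,
# otherwise only the last paragraph is shown
for paragraph in document.paragraphs:
    text = paragraph.text
    print(text)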
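
The .docx steps are identical on both systems, so only .doc files need platform-specific handling: win32com on Windows and antiword (covered in the next section) on Linux. That choice could be sketched as a small dispatcher. The function pick_reader and the strings it returns are illustrative only, and treating every non-Windows system like Linux is an assumption.

```python
import os
import platform


def pick_reader(path, system=None):
    """Return the name of the tool this article uses for the given file."""
    system = system or platform.system()
    ext = os.path.splitext(path)[1].lower()
    if ext == ".docx":
        return "python-docx"  # same library on Windows and Linux
    if ext == ".doc":
        # Word automation exists only on Windows; elsewhere fall back to antiword
        return "win32com" if system == "Windows" else "antiword"
    raise ValueError(f"unsupported extension: {ext!r}")
```

Passing system explicitly makes the function easy to try out on one machine for both platforms, for example pick_reader("a.doc", "Windows").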

Doc

Install antiword
Download: http://www.winfield.demon.nl/linux/antiword-0.37.tar.gz

Extract the archive and enter the directory:
tar -zxvf antiword-0.37.tar.gz

cd antiword-0.37

make && make install

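Because the installation ran as root, antiword was placed under /root/ (the binary in /root/bin and its data files in /root/.antiword), so only root can run it. Copy the files to shared locations under /usr so that other users and programs can call it: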

cp /root/bin/*antiword /usr/local/bin/
mkdir /usr/share/antiword
cp -R /root/.antiword/* /usr/share/antiword/
chmod 777 /usr/local/bin/*antiword
chmod 755 /usr/share/antiword/*
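
After copying the files, it is worth confirming from Python that the binary is visible on the PATH of the user who will run the script. The check below is a minimal sketch using the standard library; the helper name antiword_available is made up for illustration.

```python
import shutil


def antiword_available(exe="antiword"):
    """Return True if `exe` can be found on the current user's PATH."""
    return shutil.which(exe) is not None


if not antiword_available():
    print("antiword was not found on PATH; check /usr/local/bin")
```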

"""
    Usage
"""
import subprocess

word = "a.doc"
output = subprocess.check_output(["antiword", word])
# Decode the raw bytes into text
output = output.decode('utf8')
print(output)
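
In a longer-running program, the call above may fail because antiword is missing, the file is not a valid .doc, or the output is not valid UTF-8. One possible way to handle these cases is sketched here; the function name doc_to_text and the decision to replace undecodable bytes are assumptions, not part of antiword.

```python
import subprocess


def doc_to_text(path, exe="antiword"):
    """Run `exe path` and return its standard output decoded as UTF-8."""
    try:
        result = subprocess.run([exe, path], capture_output=True, check=True)
    except FileNotFoundError as err:
        raise RuntimeError(f"{exe} is not installed or not on PATH") from err
    except subprocess.CalledProcessError as err:
        # A non-zero exit status usually means the file could not be read
        detail = err.stderr.decode("utf-8", errors="replace").strip()
        raise RuntimeError(f"{exe} failed on {path}: {detail}") from err
    # errors="replace" keeps going if the output is not valid UTF-8
    return result.stdout.decode("utf-8", errors="replace")
```

With this helper, print(doc_to_text("a.doc")) behaves like the snippet above but raises a RuntimeError with a readable message when something goes wrong.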

Fighter_ma: Weakness and ignorance are not barriers to survival, but arrogance is.