使用python-docx读取doc,docx文档

API:    http://python-docx.readthedocs.io/en/latest/#api-documentation

将doc转为docx:

        from win32com import client as wc

        word = wc.Dispatch("Word.Application")

        doc = word.Documents.Open(路径+名称.doc)

        doc.SaveAs(路径+名称.docx, 12)   12为docx

        doc.Close()

        word.Quit()

读取段落:

        import docx

        docStr = Document(docName)   打开文档

        for paragraph in docStr.paragraphs:

                parStr = paragraph.text

                --》paragraph.style.name == 'Heading 1'  一级标题   

                --》paragraph.paragraph_format.alignment == 1  居中显示

                --》paragraph.style.next_paragraph_style.paragraph_format.alignment == 1  下一段居中显示

                --》paragraph.style.font.color

读取表格:

        numTables = docStr.tables

        for table in numTables:

                #行列个数

                row_count = len(table.rows)

                col_count = len(table.columns)

                for i in range(row_count):

                        row = table.rows[i].cells

                        i行j列内容:row[j].text

           或者:

                    row_count = len(table.rows)
                    col_count = len(table.columns)
                    for i in range(row_count):
                            for j in range(col_count):
                                    print(table.cell(i,j).text)

 

 

评论 6
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值