Adding chart support to python-docx (using the python-pptx library)

Background

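python-docx is a Python library for creating and updating Microsoft Word (.docx) files. It can add text, formatting, images, tables, and other content to a document, which makes it a good fit for applications that generate or modify documents automatically. However, python-docx does not support creating charts. A less-than-ideal workaround is to render the chart with matplotlib, save it as an image, and insert that image into the document. The drawback is that the chart data can no longer be edited inside the document.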

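python-pptx is a Python library for creating and modifying PowerPoint (.pptx) files. It supports generating and adding charts, but it is not designed for Word (.docx) files, so it cannot be used on them directly.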

Solution

With a few modifications, python-docx can call into python-pptx and insert a chart into a .docx document.

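  1. Install the libraries
    First, make sure python-docx and python-pptx are installed. If they are not, install them with pip:
pip install python-docx python-pptx
  2. Modify the python-docx library
    Locate the python-docx source directory, for example "<pip library directory>\Lib\site-packages\docx", and modify the following 7 files: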
  • document.py
    Add the following method to class Document(ElementProxy):
    def add_chart(self, chart_type, x, y, cx, cy, chart_data):
        """
        Add a new chart of *chart_type* in a new paragraph at the end of the
        document, having size (*cx*, *cy*) and depicting *chart_data*.
        *chart_type* is one of the :ref:`XlChartType` enumeration values.
        *chart_data* is a |ChartData| object populated with the categories
        and series values for the chart. *x* and *y* are kept for
        compatibility with the python-pptx signature; the inline shape built
        by the code in oxml/shape.py does not use them. The python-pptx
        chart object is returned, as produced by :meth:`Run.add_chart`.
        """
        run = self.add_paragraph().add_run()
        return run.add_chart(chart_type, x, y, cx, cy, chart_data)
  • section.py
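    Change
from collections import Sequence

to

from collections.abc import Sequence
    (The Sequence alias in collections was removed in Python 3.10, so the old import fails on newer interpreters.)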
  • opc/package.py
    Change
from .packuri import PACKAGE_URI

to

from .packuri import PACKAGE_URI, PackURI

Add the following method to class OpcPackage(object):

    def next_partname(self, tmpl):
        """
        Return a |PackURI| instance representing the next available partname
        matching *tmpl*, which is a printf (%)-style template string
        containing a single replacement item, a '%d' to be used to insert the
        integer portion of the partname. Example: '/word/charts/chart%d.xml'
        """
        # python-pptx templates start with '/ppt'; a .docx keeps its parts under '/word'
        tmpl = tmpl.replace('/ppt', '/word')
        partnames = [part.partname for part in self.iter_parts()]
        # with N partnames in use, one of the first N+1 candidates must be free
        for n in range(1, len(partnames)+2):
            candidate_partname = tmpl % n
            if candidate_partname not in partnames:
                return PackURI(candidate_partname)
        raise Exception('ProgrammingError: ran out of candidate_partnames')
  • oxml/__init__.py
    Change
from .shape import (
    CT_Blip, CT_BlipFillProperties, CT_GraphicalObject,
    CT_GraphicalObjectData, CT_Inline, CT_NonVisualDrawingProps, CT_Picture,
    CT_PictureNonVisual, CT_Point2D, CT_PositiveSize2D, CT_ShapeProperties,
    CT_Transform2D
)

to

from .shape import (
    CT_Blip, CT_BlipFillProperties, CT_GraphicalObject,
    CT_GraphicalObjectData, CT_Inline, CT_NonVisualDrawingProps, CT_Picture,
    CT_PictureNonVisual, CT_Point2D, CT_PositiveSize2D, CT_ShapeProperties,
    CT_Transform2D, CT_Chart
)

Then add the following line:

register_element_cls('c:chart',     CT_Chart)
  • oxml/shape.py
    Change
class CT_GraphicalObjectData(BaseOxmlElement):
    """
    ``<a:graphicData>`` element, container for the XML of a DrawingML object
    """
    pic = ZeroOrOne('pic:pic')
    uri = RequiredAttribute('uri', XsdToken)

to

class CT_GraphicalObjectData(BaseOxmlElement):
    """
    ``<a:graphicData>`` element, container for the XML of a DrawingML object
    """
    pic = ZeroOrOne('pic:pic')
    cChart = ZeroOrOne('c:chart')
    uri = RequiredAttribute('uri', XsdToken)

Add the following methods to class CT_Inline(BaseOxmlElement):

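    @classmethod
    def new_chart_inline(cls, shape_id, rId, x, y, cx, cy):
        """
        Return a new ``<wp:inline>`` element of size (*cx*, *cy*) that refers
        to the chart part through relationship id *rId*. *x* and *y* are not
        used here.
        """
        inline = parse_xml(cls._chart_xml())
        inline.extent.cx = cx
        inline.extent.cy = cy
        # give each chart its own drawing id and name instead of the
        # hard-coded id="1" in the template, as CT_Inline.new does for pictures
        inline.docPr.id = shape_id
        inline.docPr.name = 'Chart %d' % shape_id
        chart = CT_Chart.new(rId)
        inline.graphic.graphicData._insert_cChart(chart)
        return inline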

    @classmethod
    def _chart_xml(cls):
        return (
            '<wp:inline %s>\n'
            '  <wp:extent/>\n'
            '  <wp:effectExtent l="0" t="0" r="0" b="0"/>\n'
            '  <wp:docPr id="1" name="Chart 1"/>\n'
            '  <wp:cNvGraphicFramePr/>\n'
            '  <a:graphic %s>\n'
            '    <a:graphicData uri="http://schemas.openxmlformats.org/drawingml/2006/chart"/>\n'
            '  </a:graphic>\n'
            '</wp:inline>' % (nsdecls('wp', 'a'), nsdecls('a'))
        )

Add the following class:

class CT_Chart(BaseOxmlElement):
    """
    ``<c:chart>`` element, a reference to a DrawingML chart part
    """

    @classmethod
    def new(cls, rId):
        """
        Return a new ``<c:chart>`` element populated with the minimal
        contents required to define a viable chart element, based on the
        values passed as parameters.
        """
        chart = parse_xml(cls._chart_xml(rId))
        chart.id = rId
        return chart

    @classmethod
    def _chart_xml(cls, rId):
        # the r:id attribute links this element to the chart part
        return (
            '<c:chart %s r:id="%s"/>\n' % (nsdecls('c', 'r'), rId)
        )
  • parts/document.py
    The methods below use ChartPart from python-pptx, so add this import at the top of the file (CT_Inline and RT must be imported there as well if your copy does not already do so):
from pptx.parts.chart import ChartPart
    Then add the following methods to class DocumentPart(XmlPart):
    def get_or_add_chart(self, chart_type, x, y, cx, cy, chart_data):
        """
        Return an (rId, chart) 2-tuple for the chart.
        *chart* is a python-pptx chart object; its properties are described
        in the python-pptx documentation.
        """
        chart_part = ChartPart.new(chart_type, chart_data, self.package)
        rId = self.relate_to(chart_part, RT.CHART)
        return rId, chart_part.chart

    def new_chart_inline(self, chart_type, x, y, cx, cy, chart_data):
        """
        Return a 2-tuple of the newly-created `wp:inline` element containing
        the chart, with width *cx* and height *cy*, and the chart object.
        """
        rId, chart = self.get_or_add_chart(chart_type, x, y, cx, cy, chart_data)
        shape_id = self.next_id
        return CT_Inline.new_chart_inline(shape_id, rId, x, y, cx, cy), chart

  • text/run.py
    Add the following method to class Run(Parented):
    def add_chart(self, chart_type, x, y, cx, cy, chart_data):
        """
        Add an inline chart to the end of this run and return the
        python-pptx chart object it contains.
        """
        inline, chart = self.part.new_chart_inline(chart_type, x, y, cx, cy, chart_data)
        self._r.add_drawing(inline)
        return chart
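
The next_partname method added to opc/package.py is what lets python-pptx's chart part find a home inside the Word package. The sketch below restates that allocation logic as a standalone function so it can be tried without either library. It works on a plain list of strings instead of package parts, and the template string is an assumption chosen to resemble the '/ppt/...' templates python-pptx passes in.

```python
def next_partname(existing, tmpl):
    """Return the first partname matching tmpl that is not already in existing."""
    # python-pptx style templates start with '/ppt'; a .docx keeps parts under '/word'
    tmpl = tmpl.replace('/ppt', '/word')
    taken = set(existing)
    # with len(taken) names in use, one of the first len(taken) + 1 candidates is free
    for n in range(1, len(taken) + 2):
        candidate = tmpl % n
        if candidate not in taken:
            return candidate
    raise RuntimeError('ran out of candidate partnames')


# hypothetical package contents, for illustration only
parts = ['/word/document.xml', '/word/charts/chart1.xml']
print(next_partname(parts, '/ppt/charts/chart%d.xml'))  # /word/charts/chart2.xml
print(next_partname([], '/ppt/charts/chart%d.xml'))     # /word/charts/chart1.xml
```

Because the prefix is rewritten before numbering starts, each additional chart gets the next free number under /word rather than colliding with an existing part.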

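If you do not need the latest version of python-docx, you can download an already-modified copy and integrate it into your project.
Modified version based on python-docx 1.1.0: https://github.com/jiangta0/docx_chart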
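
Whether you patch the installed package or drop in the modified copy, it helps to confirm which docx directory Python will actually import. A minimal sketch using only the standard library follows; package_dir is a hypothetical helper name, not part of python-docx or python-pptx.

```python
import importlib.util
from pathlib import Path


def package_dir(name):
    """Return the directory of an importable top-level package, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    # for a regular package, spec.origin is the path of its __init__.py
    return Path(spec.origin).parent


for name in ("docx", "pptx"):
    print(name, "->", package_dir(name) or "not installed")
```

If the printed docx path points into site-packages while you meant to use a project-local copy, the patched methods will not be the ones that run.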

Usage example

For details on working with the chart itself, see the python-pptx documentation.

from docx import Document
from pptx.chart.data import CategoryChartData
from pptx.util import Inches
from pptx.enum.chart import XL_CHART_TYPE

doc = Document()

# categories shown on the axis and one series of values
labels = ["Jan", "Feb", "Mar"]
data = [25, 33, 18]
chart_data = CategoryChartData()
chart_data.categories = labels
chart_data.add_series("Quarterly sales", data)
# cx and cy set the chart size
x, y, cx, cy = Inches(2), Inches(2), Inches(6), Inches(3)
chart = doc.add_chart(XL_CHART_TYPE.COLUMN_CLUSTERED, x, y, cx, cy, chart_data)

doc.save("t.docx")

Figure: the generated chart
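
One way to check the result without opening Word is to look inside the saved file, since a .docx is a ZIP archive. The sketch below lists chart parts by name. It assumes the patched next_partname places them under word/charts/, and chart_parts is a hypothetical helper, not a python-docx function.

```python
import os
import zipfile


def chart_parts(docx_path):
    """List the chart part names stored inside a .docx archive."""
    with zipfile.ZipFile(docx_path) as zf:
        return sorted(
            name for name in zf.namelist()
            if name.startswith("word/charts/") and name.endswith(".xml")
        )


# inspect the file written by the usage example, if it is present
if os.path.exists("t.docx"):
    print(chart_parts("t.docx"))
```

An empty list would suggest the chart part was never added to the package, which usually points at one of the seven edits being missed.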
