自动化报告pptx-python｜如何将pandas的表格写入PPTX（二）

悟乙己

已于 2024-02-05 16:47:39 修改

阅读量813

点赞数

分类专栏：付费-智能写作专栏 LLM大模型 Python︱基础与数据处理文章标签：自动化 python pandas pptx-python

于 2024-02-01 17:04:07 首次发布

本文链接：https://blog.csdn.net/sinat_26917383/article/details/135976950

版权

付费-智能写作专栏同时被 3 个专栏收录

33 篇文章 ¥9.90 ¥99.00

订阅专栏

超级会员免费看

LLM大模型

41 篇文章

订阅专栏

Python︱基础与数据处理

34 篇文章

订阅专栏

本文介绍如何利用python-pptx库将pandas DataFrame的内容高效地插入到PowerPoint（PPTX）文件中，重点研究了PandasToPowerpoint库，并提供了案例1和案例2的详细代码解析，包括设置列的小数位数、四舍五入和表格样式调整。此外，还展示了如何修改已创建PPTX中的表格数据。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

本篇延续：自动化报告的前奏｜使用python-pptx操作PPT（一）
因为在pptx-python中使用table，需要单个cell逐一输入，于是在想有没有pandas可以直接读入的方式，
有两个开源项目有类似的功能：

其中mspandas写的比较复杂，PandasToPowerpoint比较易懂，就以此为本篇主要的研究对象。

其他系列：

1 案例1

核心代码：pd2ppt.py

在函数df_to_table中可以看到：

def df_to_table(slide, df, left=None, top=None, width=None, height=None,
                colnames=None, col_formatters=None, rounding=None,
                name=None, table_style=None):
    left = process_position_parameter(left)
    top = process_position_parameter(top)
    width = process_position_parameter(width)
    height = process_position_parameter(height)

    rows, cols = df.shape
    shp = slide.shapes.add_table(rows+1, cols, left, top, width, height)

    if colnames is None:
        colnames = list(df.columns)

    # Insert the column names
    for col_index, col_name in enumerate(colnames):
        shp.table.cell(0,col_index).text = col_name

    m = df.values

    for row in range(rows):
        for col in range(cols):
            val = m[row, col]

            if col_formatters is None:
                text = str(val)
            else:
                text = _do_formatting(val, col_formatters[col])

            shp.table.cell(row+1, col).text = text

    if name is not None:
        shp.name = name

    if table_style is not None:
        table_graphic = shp._element.graphic.graphicData.tbl
        table_graphic[0][-1].text = getattr(TableStyle, table_style)
    return shp

# 一个例子
    df = pd.DataFrame(
        {'District':['Hampshire', 'Dorset', 'Wiltshire', 'Worcestershire'],
         'Population':[25000, 500000, 735298, 12653],
         'Ratio':[1.56, 7.34, 3.67, 8.23]})

    pres = Presentation()
    blank_slide_layout = pres.slide_layouts[6]
    slide = pres.slides.add_slide(blank_slide_layout)
    table = df_to_table(slide, df, col_formatters=['', ',', '.2'],
		    rounding=['', 3, ''],
   			table_style = 'DarkStyle2Accent5Accent6')
    pres.save(filename)

整个的核心思路遍历pandas每一个元素，然后再赋值到shape.table的每一个元素当中；
其中：

col_formatters，设置每一列的小数点位数，案例中的['', ',', '.2']，第三列需要保留2位小数
rounding，是否四舍五入
table_style，风格改变

其中，下述是表格的风格，其中不同版本的PPT会有不同的风格，代码示例，DarkStyle2Accent5Accent6 ,这里可以看到由三种元素组成(DarkStyle2,Accent5,Accent6)：

        table_graphic = shp._element.graphic.graphicData.tbl
        table_graphic[0][-1].text = getattr(TableStyle, table_style)
# 举例
df_to_powerpoint(
    "output/pd2ppt_test_3.pptx", df, col_formatters=['', ',', '.2'],
    rounding=['', 3, ''],
    table_style = 'DarkStyle2Accent5Accent6'
    
    )

2 案例2

通过pandas创建pptx并保存；
然后重新修改表格里面的数据


from pd2ppt import df_to_powerpoint
import pandas as pd

# pandas 读入
df = pd.DataFrame(
    {'District':['Hampshire', 'Dorset', 'Wiltshire', 'Worcestershire'],
     'Population':[25000, 500000, 735298, 12653],
     'Ratio':[1.56, 7.34, 3.67, 8.23]})

df_to_powerpoint(
    "output/pd2ppt_test_3.pptx", df, col_formatters=['', ',', '.2'],
    rounding=['', 3, ''],
    table_style = 'DarkStyle2Accent5Accent6'
    
    )

table_graphic = shp._element.graphic.graphicData.tbl
table_graphic[0][-1].text = getattr(TableStyle, table_style)

        
        
# 修改表格数据
from pptx import Presentation
prs = Presentation("output/pd2ppt_test.pptx")

for s in prs.slides[0].shapes:
    print(s.name)
shapes = prs.slides[0]
prs.slides[0].shapes[0].table.cell(0,0).text_frame.text = "District-mattttt"

prs.save(r'output/pd2ppt_test_2.pptx')

再来一个所有数据替换的：

from pptx import Presentation
import six
import pandas as pd
def _do_formatting(value, format_str):
    """Format value according to format_str, and deal
    sensibly with format_str if it is missing or invalid.
    """
    if format_str == '':
        if type(value) in six.integer_types:
            format_str = ','
        elif type(value) is float:
            format_str = 'f'
        elif type(value) is str:
            format_str = 's'
    elif format_str[0] == '.':
        if format_str.endswith('R'):
            if type(value) in six.integer_types:
                value = round_to_n(value, int(format_str[1]))
                format_str = ','
        if not format_str.endswith('G'):
            format_str = format_str + "G"
    try:
        value = format(value, format_str)
    except:
        value = format(value, '')

    return value
    
# 整体修改
df = pd.DataFrame(
    {'new1':['a', 'b', 'c', 'd'],
     'new2':[250, 5000, 7398, 153],
     'new3':[1, 7, 3, 8]})

col_formatters=['', ',', '.2']
rounding=['', 3, '']

rows, cols = df.shape
shp = prs.slides[0].shapes[0]#.table
# shp = slide.shapes.add_table(rows+1, cols, left, top, width, height)

colnames = list(df.columns)

# Insert the column names
for col_index, col_name in enumerate(colnames):
    shp.table.cell(0,col_index).text = col_name

m = df.values

for row in range(rows):
    for col in range(cols):
        val = m[row, col]
        # text = str(val)
        if col_formatters is None:
            text = str(val)
        else:
            text = _do_formatting(val, col_formatters[col])

        shp.table.cell(row+1, col).text = text

prs.save(r'output/pd2ppt_test_整体修改.pptx')

其中，_do_formatting代表表格内格式的调整；
逐个通过table.cell进行修改