Counting the Text in a PPT with Python (Including Notes)

zhoudapeng01

Category: Software Design

Article link: https://blog.csdn.net/zhoudapeng01/article/details/84075359

import re
import win32com.client

# EnsureDispatch creates the PowerPoint COM object and makes sure the
# early-binding wrappers exist, so a separate Dispatch call is not needed.
ppt = win32com.client.gencache.EnsureDispatch('PowerPoint.Application')
ppt.Visible = 1
pptSel = ppt.Presentations.Open(r"C:\Users\16254\Desktop\in.pptx")
# Write as UTF-8 so Chinese text is saved regardless of the system code page
f = open(r"C:\Users\16254\Desktop\in.txt", "w", encoding="utf-8")
slide_count = pptSel.Slides.Count
print(slide_count)
noteList = []
contentList = []
# COM collections are 1-based, so iterate from 1 to Count inclusive
for i in range(1, slide_count + 1):
    slide = pptSel.Slides(i)
    shape_count = slide.Shapes.Count
    # In the default notes layout, placeholder 2 of the notes page holds the notes text
    note = slide.NotesPage.Shapes.Placeholders(2).TextFrame.TextRange.Text
    noteList.append(note)
    print(note)
    print(shape_count)
    for j in range(1, shape_count + 1):
        shape = slide.Shapes(j)
        # Only shapes with a text frame carry text; pictures, lines, etc. are skipped
        if shape.HasTextFrame:
            s = shape.TextFrame.TextRange.Text
            contentList.append(s)
            print(s)

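notestr = ''.join(noteList)
contentstr = ''.join(contentList)
f.write(notestr)
f.write(contentstr)
outstr = notestr + contentstr
# str.replace returns a new string, so its result must be used directly;
# outstr keeps its spaces because they are counted separately below
print(len(outstr.replace(' ', '')))
f.close()
ppt.Quit()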

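char = re.findall(r'[a-zA-Z]', outstr)
num = re.findall(r'[0-9]', outstr)
blank = re.findall(r' ', outstr)
# \u4E00-\u9FFF is the range of Chinese characters (CJK Unified Ideographs)
chi = re.findall(r'[\u4E00-\u9FFF]', outstr)
# Everything else: punctuation, line breaks, full-width symbols, and so on
other = len(outstr) - len(char) - len(num) - len(blank) - len(chi)
print("Letters:", len(char), "\nDigits:", len(num), "\nSpaces:", len(blank), "\nChinese:", len(chi), "\nOther:", other)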
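The counting step at the end of the script does not depend on PowerPoint, so it can be tried on any string. The following is a minimal sketch under that assumption; the function name `count_chars` and the sample string are hypothetical and not part of the original script. It uses the same regular expressions and the same "everything else" rule for the last bucket.

```python
import re

def count_chars(text):
    """Hypothetical helper: count letters, digits, spaces, Chinese and other characters."""
    letters = len(re.findall(r'[a-zA-Z]', text))
    digits = len(re.findall(r'[0-9]', text))
    spaces = text.count(' ')
    # \u4E00-\u9FFF is the CJK Unified Ideographs range used by the script
    chinese = len(re.findall(r'[\u4E00-\u9FFF]', text))
    # Whatever is left: punctuation, paragraph breaks such as '\r', and so on
    other = len(text) - letters - digits - spaces - chinese
    return {"letters": letters, "digits": digits, "spaces": spaces,
            "chinese": chinese, "other": other}

# Made-up sample text; '\r' stands in for a paragraph break inside a text frame
sample = "Python 3 统计PPT文字\r备注!"
print(count_chars(sample))
```

Because the five counts always add up to `len(text)`, paragraph breaks and punctuation end up in the "other" bucket rather than being lost.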
