Recognizing Video Subtitles with Python

Latest recommended article published 2024-07-20 16:53:56

一名不想学习的学渣

Tags: computer vision, opencv, deep learning

Article link: https://blog.csdn.net/weixin_44911037/article/details/122923855

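import easyocr
import cv2 as cv
import numpy as np

video_file = cv.VideoCapture(r"f4459201ee68667a36dee475fe96159c.mp4")
video_fps = video_file.get(cv.CAP_PROP_FPS)
print(video_fps)
total_frames = int(video_file.get(cv.CAP_PROP_FRAME_COUNT))
# image_size is (height, width)
image_size = (int(video_file.get(cv.CAP_PROP_FRAME_HEIGHT)), int(video_file.get(cv.CAP_PROP_FRAME_WIDTH)))
frames_height, frames_width = image_size[0], image_size[1]
print(frames_height)
count_frame_start = 0  # index of the frame currently being processed
count_frame_end = 0  # frame at which the previous subtitle change was written out
thresh = 220  # binarization threshold
temporary_frame = []  # holds the previous and the current binarized subtitle strip
reader = easyocr.Reader(["ch_sim", "en"], gpu=False)  # OCR reader for Simplified Chinese and English
"""
cal_video is used to tell whether a subtitle is present and whether it has changed:
   none ----> present   a strip with text differs clearly from an all-zero (empty) strip
   present -> changed   strips showing different subtitles differ far more than strips showing the same one
"""
def cal_video(img, imgo=None):
    if imgo is None:
        imgo = np.zeros_like(img)  # compare against an all-zero image
    # Percentage of pixels that differ. On binarized 0/255 uint8 images,
    # ((img - imgo) ** 2) wraps around to 1 for every differing pixel, so the
    # squared-error formula yields this same percentage; counting is explicit.
    return np.count_nonzero(img != imgo) / img.size * 100
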
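while True:
    success, frames = video_file.read()
    if not success:  # end of the video, or the frame could not be read
        break
    print("Reading frame {}".format(count_frame_start))
    frames_cut = frames[:, :, 0]  # first (blue) channel only; the frame here is (486, 864)
    # Crop the strip near the bottom of the frame where the subtitle appears
    frames_wh_cut = frames_cut[frames_height-75: frames_height-6, :]
    _, frames_threshold = cv.threshold(frames_wh_cut, thresh, 255, cv.THRESH_BINARY)
    temporary_frame.append(frames_threshold)
    if len(temporary_frame) > 2:
        del temporary_frame[0]  # keep only the previous and the current strip
    # A difference above 2 (percent of pixels) is treated as a subtitle change
    if len(temporary_frame) == 2 and cal_video(temporary_frame[1], temporary_frame[0]) > 2:
        print("Subtitle change detected at frame {}".format(count_frame_start))
        result = reader.readtext(frames_wh_cut)
        if len(result) > 0:
            with open("嫦娥奔月.txt", "a", encoding="utf-8") as f:
                # Frame range since the previous recorded change, then the text
                # of the first region recognized in the current frame
                f.write(str(count_frame_end) + "--->" + str(count_frame_start) + "\n")
                f.write(str(result[0][1]) + "\n")
            count_frame_end = count_frame_start
            #cv.imwrite(r"D:\PycharmProjects\pythonProject\feiji\image{}.jpg".format(count_frame_start),frames_wh_cut)
    count_frame_start += 1  # advance exactly once per frame
video_file.release()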
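
The script reads the frame rate into video_fps, but the output file only records frame indices. As a rough sketch of how those indices could be turned into timestamps, the helper below converts a frame index into an SRT-style HH:MM:SS,mmm string. The name frame_to_timestamp and the sample values are hypothetical and not part of the original code, and a constant frame rate is assumed.

```python
def frame_to_timestamp(frame_index: int, fps: float) -> str:
    """Convert a frame index to an SRT-style HH:MM:SS,mmm timestamp.

    Assumes a constant frame rate; a variable-frame-rate video would need
    per-frame timestamps instead.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    total_ms = int(round(frame_index / fps * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return "{:02d}:{:02d}:{:02d},{:03d}".format(hours, minutes, seconds, millis)


if __name__ == "__main__":
    # Hypothetical values: a subtitle spanning frames 50 to 125 of a 25 fps video.
    print(frame_to_timestamp(50, 25.0), "-->", frame_to_timestamp(125, 25.0))
    # prints: 00:00:02,000 --> 00:00:05,000
```

In the script above, the same call could be applied to count_frame_end and count_frame_start with video_fps before writing each line, if timestamps are preferred over frame numbers.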

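When running the program, keep the following points in mind:

1. Change the video path to the video whose subtitles you want to recognize.

2. The cropped subtitle image is binarized, and you can change the binarization threshold yourself (220 in the code).

3. The similarity threshold between adjacent frames can also be changed (2 in the code).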
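
To get a feel for how the two thresholds in points 2 and 3 interact, here is a minimal sketch that runs the same kind of change detection on synthetic strips instead of a real video. The names binarize, diff_percent, detect_changes and min_diff_percent are hypothetical, and NumPy is used in place of cv.threshold so that the example runs without OpenCV; the binarization rule (pixels above the threshold become 255, the rest 0) is the same as cv.THRESH_BINARY.

```python
import numpy as np


def binarize(strip, thresh=220):
    """Pixels brighter than thresh become 255, all others 0."""
    return np.where(strip > thresh, 255, 0).astype(np.uint8)


def diff_percent(a, b):
    """Percentage of pixels that differ between two binarized strips."""
    return np.count_nonzero(a != b) / a.size * 100


def detect_changes(strips, thresh=220, min_diff_percent=2.0):
    """Return indices of strips that differ from their predecessor
    by more than min_diff_percent."""
    changes = []
    previous = None
    for index, strip in enumerate(strips):
        current = binarize(strip, thresh)
        if previous is not None and diff_percent(current, previous) > min_diff_percent:
            changes.append(index)
        previous = current
    return changes


if __name__ == "__main__":
    # Synthetic 10x100 grayscale strips standing in for cropped subtitle regions.
    blank = np.full((10, 100), 30, dtype=np.uint8)
    text_a = blank.copy()
    text_a[2:8, 10:40] = 250   # bright block imitating white subtitle text
    text_b = blank.copy()
    text_b[2:8, 50:90] = 250   # a different "subtitle"
    noisy_a = text_a.copy()
    noisy_a[0, 0:5] = 250      # 5 stray bright pixels, 0.5% of the strip

    strips = [blank, text_a, noisy_a, text_b, text_b]
    print(detect_changes(strips))                        # [1, 3]
    print(detect_changes(strips, min_diff_percent=0.1))  # [1, 2, 3]
```

With the default values, the small amount of noise at index 2 stays below the 2% limit and is ignored. Lowering min_diff_percent makes the detector fire on it, which in the real script would trigger an extra OCR call. Raising thresh too far has the opposite effect: if the subtitle pixels no longer exceed it, every strip binarizes to all zeros and no change is ever detected.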

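Overall, the recognition still has some problems, which you will notice once you have run the code. If anyone manages to improve the results, please let me know. Thanks.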