python系列&deep_study系列:基于Whisper语音识别的实时视频字幕生成 (二): 在线实时字幕

基于Whisper语音识别的实时视频字幕生成 (二): 在线实时字幕




基于Whisper语音识别的实时视频字幕生成 (二): 在线实时字幕

Whistream

基于whisper的在线字幕生成
在这里插入图片描述

1. 安装

pip install whishow
pip install faster-whisper

2. 用法

基于whishow流处理和faster-whisper语音识别的多线程交互进行在线字幕生成

import threading
from whisper_api import WhisperModel
from whishow import STREAM
from whishow import PLAY
import time 

url = "rtmp://mobliestream.c3tv.com:554/live/goodtv.sdp"
url = "test.mp4"
language = "zh"

# init the stream reader, named stm.
stm = STREAM()
stm.init_state(url=url,
               cache_size=10*60,
               video_frame_quality=50)

# init the whisper model, and connect the audio stream of stm
asr = WhisperModel(model_size_or_path=r"tiny", 
                    device='cpu', 
                    compute_type="int8",
                    download_root=r"./models",
                    local_files_only=False)
asr.init_state(Q_audio_asr=stm.Q_audio_asr,
               read_size=16000)

# init the whishow player, and connect the audio/video stream of stm and the asr result
ply = PLAY()
ply.init_state(chunk_size=1,
                video_frame_shift=10,
                audio_fps=stm.AUDIO_FPS,
                video_fps=stm.VIDEO_FPS,
                Q_audio_play=stm.Q_audio_play,
                Q_video_play=stm.Q_video_play,
                asr_results=asr.asr_results)
                

# launch the stm
def process1():
    global stm
    stm.read(is_play=True,
             is_asr=True)

# launch the asr, modify your seeting
def recognition():
    global asr
    asr.transcribe(language=language,
                   task="transcribe",
                   condition_on_previous_text=True)
def process2():
    global asr
    p1 = threading.Thread(target=asr.read_data,args=())
    p2 = threading.Thread(target=recognition,args=())
    p1.start()
    p2.start()
    p1.join()
    p2.join()


# lanuch the player
def process3():
    global ply,stm
    delay = 60
    while stm.at < delay:
         print("wait for asr preprocess ..")
         time.sleep(1)
    ply.run()

# esc for exit
def engine():
        global asr,stm
        import keyboard
        while 1:
            if keyboard.is_pressed('esc'):
                print("exit ..")
                break
            time.sleep(0.1)
        stm.running = False
        asr.running = False
        ply.running = False

if __name__ == "__main__":

    p0 = threading.Thread(target=engine,args=())
    p1 = threading.Thread(target=process1,args=())
    p2 = threading.Thread(target=process2,args=())
    p3 = threading.Thread(target=process3,args=())

    p0.start()
    p1.start()
    p2.start()
    p3.start()

3. 联系我们

605686962@qq.com
coolEphemeroptera@gmail.com







Pika在线

基于Whisper语音识别的实时视频字幕生成 (二): 在线实时字幕

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

坦笑&&life

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值