Creating SRT subtitles with Python

mohana48833985

Category: Web scraping

Article link: https://blog.csdn.net/Caiqiudan/article/details/119183503

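This post converts a subtitle file downloaded from Bilibili into the SRT format, which video players can load alongside a video.
Reference: "Python实现json字幕转换为srt字幕" (converting JSON subtitles to SRT in Python)

Approach: extract the list of dicts from the JSON -> convert to a DataFrame -> convert seconds to hours:minutes:seconds -> write to a file. The code below works on the list of dicts directly and does not build a DataFrame.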

import requests

subtitle_url = 'https://i0.hdslb.com/bfs/subtitle/e837950453ea3e4f6e81a5709449af173d2604dc.json'  # example subtitle URL
subtitle_r = requests.get(subtitle_url)
sub_content = subtitle_r.json()['body']        # list of subtitle entries from the JSON

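def s2hms(x):      # convert seconds to the SRT timestamp format HH:MM:SS,mmm
    total_ms = round(x * 1000)       # integer milliseconds, so rounding can never produce "60,000" seconds
    s, ms = divmod(total_ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return "%02d:%02d:%02d,%03d" % (h, m, s, ms)   # SRT uses a comma before the milliseconds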

with open('字幕文件.srt', 'w', encoding='utf-8') as f:
    # each block: index, "start --> end", subtitle text, blank line
    write_content = [str(n+1)+'\n' + s2hms(i['from'])+' --> '+s2hms(i['to'])+'\n' + i['content']+'\n\n' for n,i in enumerate(sub_content)]
    f.writelines(write_content)
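
If the subtitle JSON has already been saved to disk rather than fetched over HTTP, the same `body` list can be read with the standard `json` module. This is only a minimal sketch: it assumes the saved file has the same layout as the response above, and the function name `load_subtitle_body` is hypothetical.

```python
import json

def load_subtitle_body(path):
    """Return the list of subtitle entries stored under the 'body' key.

    Assumption: the file is UTF-8 JSON with the same top-level layout
    as the HTTP response used in the script above.
    """
    with open(path, encoding='utf-8') as f:
        data = json.load(f)
    return data['body']
```

The result can stand in for `sub_content` in the script above, for example `sub_content = load_subtitle_body('subtitle.json')`, where `subtitle.json` is a placeholder file name.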

sub_content      # inspect the subtitle JSON fetched from Bilibili
>>>[{'from': 0,
  'to': 3.39,    # 3.39 seconds; this needs converting to the form hours:minutes:seconds,milliseconds
  'location': 2,
  'content': "之前你见过神经网络的大概图形 在本视频中\nYou've seen me draw a few pictures of your neural network in this video"},
 {'from': 3.46,
  'to': 6.02,
  'location': 2,
  'content': "我们将讨论 这些图形的具体含义\nwe'll talk about exactly what those pictures means"},
 ... ...
 {'from': 8.77,
  'to': 10.7,
  'location': 2,
  'content': '到底代表什么\nhave been drawing on represent'}]
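
To make the target format concrete, the following sketch turns entries of the shape shown above into SRT text without touching the network or the file system. The names `seconds_to_srt_time` and `entries_to_srt` are hypothetical, and the `content` strings in `sample` are placeholders rather than the real subtitle text. Only the `from`, `to`, and `content` keys are used.

```python
def seconds_to_srt_time(seconds):
    """Format a number of seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)      # work in whole milliseconds
    s, ms = divmod(total_ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return "%02d:%02d:%02d,%03d" % (h, m, s, ms)

def entries_to_srt(entries):
    """Build SRT text from a list of dicts with 'from', 'to' and 'content'."""
    blocks = []
    for index, entry in enumerate(entries, start=1):   # SRT numbering starts at 1
        blocks.append("%d\n%s --> %s\n%s\n" % (
            index,
            seconds_to_srt_time(entry['from']),
            seconds_to_srt_time(entry['to']),
            entry['content'],
        ))
    return "\n".join(blocks)              # blank line between blocks

# Placeholder entries using the timings from the sample above
sample = [
    {'from': 0, 'to': 3.39, 'location': 2, 'content': 'first line'},
    {'from': 3.46, 'to': 6.02, 'location': 2, 'content': 'second line'},
]
print(entries_to_srt(sample))
# 1
# 00:00:00,000 --> 00:00:03,390
# first line
#
# 2
# 00:00:03,460 --> 00:00:06,020
# second line
```

Working in integer milliseconds is a design choice made here so that a value such as 59.9996 rolls over to the next minute instead of being printed as 60 seconds.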