blob类型url的视频下载问题

本文介绍了一种通过浏览器开发者工具获取Twitter视频真实地址的方法,并详细解释了如何从.m3u8文件中提取不同分辨率的视频流链接,进而下载所需的视频片段。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

blob下载问题的详细描述

我想用src url blob:https%3A//www.youtube.com/23aea5c8-9ae2-40dc-9417-e675ea99b386下载视频,但是不知道应该怎么做。

有没有下载这类视频的通用方法?

推荐的解决方法

我在Vimeo中找到了一个使用blob url下载视频的方法(读了这篇文章,我才知道做法)。我正在使用Google Chrome,具体步骤如下:

  1. 打开More Tools(更多工具)→Developer Tools(开发工具)

  2. 检查视频标签中是否有这样的东西:

    <video preload="" src="blob:https://player.vimeo.com/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"></video>
    
  3. 复制iframe标签的src(如果有的话)值,如http://player.vimeo.com/video/XYZ,如果你发现它可以复制,直接跳到第7点,否则按照步骤4,5,6继续操作。

  4. 现在在页面中找到这个字符串https://skyfire.vimeocdn.com/.../master.json?base64_init=1(使用开发视图(Developer View)),应该可以在javascript函数中找到它,像这样:

    (function(e,a){var t={"cdn_url":"https://f.vimeocdn.com","view":1,"request":{"files":{"dash":{"origin":"gcs","url":"https://48skyfiregce-a.akamaihd.net/.../master.json?base64_init=1","cdn":"
    
  5. 复制上面的url字段中的链接到一个新的Chrome选项卡,例如https://48skyfiregce-a.akamaihd.net/.../master.json?base64_init=1,然后使用浏览器打开它,它会打开一个像这样的json文件:

    {
        "clip_id": XYZ,
        "base_url": "../",
        "video": [
                     { ... ... ...
    
  6. 现在用id XYZ组合构造一个URL,如下所示:https://player.vimeo.com/video/XYZ

  7. 用最终的URL替换视频标签内的blob:https://player.vimeo.com/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX(在上一步#6中创建的)。

  8. 现在可以发现,魔术般地,视频标签内的src字段已更改(如果没有,请尝试第7步多次)...

    <video preload="none" src="https://fpdl.vimeocdn.com/vimeo-prod-skyfire-std-us/XX/XXX/X/XXXXXXXX/XXXXXXXXX.mp4?token=abcdefg"></video>
    
  9. 最后,使用新的链接直接下载它,就像这样:https://fpdl.vimeocdn.com/vimeo-prod-skyfire-std-us/XX/XXX/X/XXXXXXXX/XXXXXXXXX.mp4?token=abcdefg

其他的解决思路

这个答案是针对Twitter网址的 -

  1. 右键点击视频,然后点击检查元素 -

你会发现这样的代码

<div id="playerContainer" class="player-container full-screen-enabled" data-config="{&quot;is_360&quot;:false,&quot;duration&quot;:28617,&quot;scribe_widget_origin&quot;:true,&quot;heartbeatEnabled&quot;:true,&quot;video_url&quot;:&quot;https:\/\/video.twimg.com\/ext_tw_video\/844504104512749568\/pu\/pl\/e91Du5N2TZ09ZaW_.m3u8&quot;,&quot;disable_embed&quot;:&quot;0&quot;,&quot;videoInfo&quot;:{&quot;title&quot;:null,&quot;description&quot;:null,&quot;publisher&quot;:{&quot;screen_name&quot;:&quot;MountainButorac&quot;,&quot;name&quot;:&quot;Mountain Butorac&quot;,&quot;profile_image_url&quot;:&quot;https:\/\/pbs.twimg.com\/profile_images\/808318456701521920\/vBvlAASx_normal.jpg&quot;}},&quot;cardUrl&quot;:&quot;https:\/\/t.co\/SdSorop3uN&quot;,&quot;content_type&quot;:&quot;application\/x-mpegURL&quot;,&quot;owner_id&quot;:&quot;14120461&quot;,&quot;looping_enabled&quot;:true,&quot;show_cookie_override_en&quot;:true,&quot;visit_cta_url&quot;:null,&quot;scribe_playlist_url&quot;:&quot;https:\/\/twitter.com\/MountainButorac\/status\/844505243538931714\/video\/1&quot;,&quot;source_type&quot;:&quot;consumer&quot;,&quot;image_src&quot;:&quot;https:\/\/pbs.twimg.com\/ext_tw_video_thumb\/844504104512749568\/pu\/img\/FFt3qkbeOh0RlGfZ.jpg&quot;,&quot;heartbeatIntervalInMs&quot;:5000.0,&quot;use_tfw_live_heartbeat_event_category&quot;:true,&quot;video_loading_timeout&quot;:45000.0,&quot;status&quot;:{&quot;created_at&quot;:&quot;Wed Mar 22 11:05:14 +0000 2017&quot;,&quot;id&quot;:844505243538931714,&quot;id_str&quot;:&quot;844505243538931714&quot;,&quot;text&quot;:&quot;Took my Goddaughter to meet the pope. She stole his hat! https:\/\/t.co\/SdSorop3uN&quot;,&quot;truncated&quot;:false,&quot;entities&quot;:{&quot;hashtags&quot;:[],&quot;symbols&quot;:[],&quot;user_mentions&quot;:[],&quot;urls&quot;:[],&quot;media&quot;:[{&quot;id&quot;:844504104512749568,&quot;id_str&quot;:&quot;844504104512749568&quot;,&quot;indices&quot;:[57,80],&quot;media_url&quot;:&quot;http:\/\/pbs.twimg.com\/ext_tw_video_thumb\/844504104512749568\/pu\/img\/FFt3qkbeOh0RlGfZ.jpg&quot;,&quot;media_url_https&quot;:&quot;https:\/\/pbs.twimg.com\/ext_tw_video_thumb\/844504104512749568\/pu\/img\/FFt3qkbeOh0RlGfZ.jpg&quot;,&quot;url&quot;:&quot;https:\/\/t.co\/SdSorop3uN&quot;,&quot;display_url&quot;:&quot;pic.twitter.com\/SdSorop3uN&quot;,&quot;expanded_url&quot;:&quot;https:\/\/twitter.com\/MountainButorac\/status\/844505243538931714\/video\/1&quot;,&quot;type&quot;:&quot;photo&quot;,&quot;sizes&quot;:{&quot;small&quot;:{&quot;w&quot;:340,&quot;h&quot;:604,&quot;resize&quot;:&quot;fit&quot;},&quot;thumb&quot;:{&quot;w&quot;:150,&quot;h&quot;:150,&quot;resize&quot;:&quot;crop&quot;},&quot;large&quot;:{&quot;w&quot;:576,&quot;h&quot;:1024,&quot;resize&quot;:&quot;fit&quot;},&quot;medium&quot;:{&quot;w&quot;:576,&quot;h&quot;:1024,&quot;resize&quot;:&quot;fit&quot;}}}]},&quot;source&quot;:&quot;\u003ca href=\&quot;http:\/\/twitter.com\/download\/iphone\&quot; rel=\&quot;nofollow\&quot;\u003eTwitter for iPhone\u003c\/a\u003e&quot;,&quot;in_reply_to_status_id&quot;:null,&quot;in_reply_to_status_id_str&quot;:null,&quot;in_reply_to_user_id&quot;:null,&quot;in_reply_to_user_id_str&quot;:null,&quot;in_reply_to_screen_name&quot;:null,&quot;geo&quot;:null,&quot;coordinates&quot;:null,&quot;place&quot;:null,&quot;contributors&quot;:null,&quot;retweet_count&quot;:0,&quot;favorite_count&quot;:0,&quot;favorited&quot;:false,&quot;retweeted&quot;:false,&quot;possibly_sensitive&quot;:false,&quot;lang&quot;:&quot;en&quot;},&quot;show_cookie_override_all&quot;:true,&quot;video_session_enabled&quot;:false,&quot;media_id&quot;:&quot;844504104512749568&quot;,&quot;view_counts&quot;:null,&quot;statusTimestamp&quot;:{&quot;local&quot;:&quot;4:05 AM - 22 Mar 2017&quot;},&quot;media_type&quot;:1,&quot;user&quot;:{&quot;screen_name&quot;:&quot;MountainButorac&quot;,&quot;name&quot;:&quot;Mountain Butorac&quot;,&quot;profile_image_url&quot;:&quot;https:\/\/pbs.twimg.com\/profile_images\/808318456701521920\/vBvlAASx_bigger.jpg&quot;},&quot;watch_now_cta_url&quot;:null,&quot;tweet_id&quot;:&quot;844505243538931714&quot;}" data-source-type="consumer">

复制上面的代码,并粘贴到记事本++(Notepad++)中,然后用"替换所有的&quot;,用/替换所有和\/。 (使用CTRL + H)

你会得到如下的内容

{
    "is_360": false,
    "duration": 28617,
    "scribe_widget_origin": true,
    "heartbeatEnabled": true,
    "video_url": "https://video.twimg.com/ext_tw_video/844504104512749568/pu/pl/e91Du5N2TZ09ZaW_.m3u8",

    "disable_embed": "0",
    "videoInfo": {
        "title": null,
        "description": null,
        "publisher": {
            "screen_name": "MountainButorac",
            "name": "Mountain Butorac",
            "profile_image_url": "https://pbs.twimg.com/profile_images/808318456701521920/vBvlAASx_normal.jpg"
        }
    },
    "cardUrl": "https://t.co/SdSorop3uN",
    "content_type": "application/x-mpegURL",
    "owner_id": "14120461",
    "looping_enabled": true,
    "show_cookie_override_en": true,
    "visit_cta_url": null,
    "scribe_playlist_url": "https://twitter.com/MountainButorac/status/844505243538931714/video/1",
    "source_type": "consumer",
    "image_src": "https://pbs.twimg.com/ext_tw_video_thumb/844504104512749568/pu/img/FFt3qkbeOh0RlGfZ.jpg",
    "heartbeatIntervalInMs": 5000.0,
    "use_tfw_live_heartbeat_event_category": true,
    "video_loading_timeout": 45000.0,
    "status": {
        "created_at": "Wed Mar 22 11:05:14 +0000 2017",
        "id": 844505243538931714,
        "id_str": "844505243538931714",
        "text": "Took my Goddaughter to meet the pope. She stole his hat! https://t.co/SdSorop3uN",
        "truncated": false,
        "entities": {
            "hashtags": [],
            "symbols": [],
            "user_mentions": [],
            "urls": [],
            "media": [{
                "id": 844504104512749568,
                "id_str": "844504104512749568",
                "indices": [57, 80],
                "media_url": "http://pbs.twimg.com/ext_tw_video_thumb/844504104512749568/pu/img/FFt3qkbeOh0RlGfZ.jpg",
                "media_url_https": "https://pbs.twimg.com/ext_tw_video_thumb/844504104512749568/pu/img/FFt3qkbeOh0RlGfZ.jpg",
                "url": "https://t.co/SdSorop3uN",
                "display_url": "pic.twitter.com/SdSorop3uN",
                "expanded_url": "https://twitter.com/MountainButorac/status/844505243538931714/video/1",
                "type": "photo",
                "sizes": {
                    "small": {
                        "w": 340,
                        "h": 604,
                        "resize": "fit"
                    },
                    "thumb": {
                        "w": 150,
                        "h": 150,
                        "resize": "crop"
                    },
                    "large": {
                        "w": 576,
                        "h": 1024,
                        "resize": "fit"
                    },
                    "medium": {
                        "w": 576,
                        "h": 1024,
                        "resize": "fit"
                    }
                }
            }]
        },
        "source": "\u003ca href=\"http://twitter.com/download/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c/a\u003e",
        "in_reply_to_status_id": null,
        "in_reply_to_status_id_str": null,
        "in_reply_to_user_id": null,
        "in_reply_to_user_id_str": null,
        "in_reply_to_screen_name": null,
        "geo": null,
        "coordinates": null,
        "place": null,
        "contributors": null,
        "retweet_count": 0,
        "favorite_count": 0,
        "favorited": false,
        "retweeted": false,
        "possibly_sensitive": false,
        "lang": "en"
    },
    "show_cookie_override_all": true,
    "video_session_enabled": false,
    "media_id": "844504104512749568",
    "view_counts": null,
    "statusTimestamp": {
        "local": "4:05 AM - 22 Mar 2017"
    },
    "media_type": 1,
    "user": {
        "screen_name": "MountainButorac",
        "name": "Mountain Butorac",
        "profile_image_url": "https://pbs.twimg.com/profile_images/808318456701521920/vBvlAASx_bigger.jpg"
    },
    "watch_now_cta_url": null,
    "tweet_id": "844505243538931714"
}

从上面的JSON格式,可以看到video_url的值

https://video.twimg.com/ext_tw_video/844504104512749568/pu/pl/e91Du5N2TZ09ZaW_.m3u8

这里的问题是,在2016年8月1日之后,Twitter不再使用.mp4视频,而是转换为新的HLS,自适应流格式,带有.m3u8文件扩展名。

.m3u8文件基本上只是一个文本文的封装,它们非常小(300-500字节)。当您使用文本编辑器打开它们时,它们包含指向不同视频大小的链接

  1. 在记事本++(Notepad++)中打开文件m3u8,它会包含这样的代码

EXTM3U EXT-X-INDEPENDENT-SEGMENTS EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=256000,RESOLUTION=180x320,CODECS="mp4a.40.2,avc1.42001f"/ext_tw_video/844504104512749568/pu/pl/180x320/_Z42SY5zwMlLdFYx.m3u8 EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=832000,RESOLUTION=360x640,CODECS="mp4a.40.2,avc1.42001f"/ext_tw_video/844504104512749568/pu/pl/360x640/-Phfjbbx2yinirLi.m3u8

  1. 根据您需要的分辨率从上面复制对应的链接。重复相同的步骤,直到有.ts文件。下载.ts文件(视频文件)。

### 如何使用 Python 实现 Blob 类型视频下载 Blob 是一种二进制数据对象,通常用于表示多媒体资源(如图片、音视频)。由于其特殊性,直接通过 URL 访问可能不可行。以下是实现 Blob 视频下载的方法: #### 方法概述 为了从浏览器或网络中下载 Blob 类型视频文件,可以按照以下思路操作: 1. **分析目标页面**:利用浏览器开发者工具 (F12),找到 `window.__playinfo__` 或类似的 JavaScript 变量,提取实际的媒体流地址。 2. **处理加密链接**:如果视频源是 m3u8 格式的分片播放列表,则需要进一步抓取 ts 文件片段并拼接成完整的 MP4 文件[^3]。 3. **模拟请求头**:某些网站会验证用户的登录状态或者设置防盗链机制,因此需附加必要的 Cookies 和 Headers 来伪装合法访问行为。 --- #### 关键技术点详解 ##### 一、获取真实媒体流地址 对于 Bilibili 这样的平台来说,可以通过查看网页加载过程中的 XHR 请求来定位到真正的 mp4 或 m3u8 地址。这些信息往往隐藏于 JSON 数据结构之中,例如 `data.dash.video.base_url` 字段即指向高清画质下的视频部分[^1]。 ```python import json from urllib.parse import unquote def extract_media_urls(html_content): start_marker = 'window.__playinfo__=' end_marker = '</script>' play_info_start = html_content.find(start_marker) + len(start_marker) play_info_end = html_content.find(end_marker, play_info_start) raw_data = html_content[play_info_start:play_info_end].strip() decoded_json = json.loads(unquote(raw_data)) dash_videos = decoded_json['data']['dash']['video'] highest_quality_stream = max(dash_videos, key=lambda v: int(v.get('id', 0))) return highest_quality_stream['base_url'] # 返回最高分辨率的视频URL ``` 上述代码展示了如何解析 HTML 文档内的嵌入式 JSON 对象,并筛选出最优质量级别的视频流路径[^2]。 ##### 二、解决 blob 加密问题 当遇到无法直接复制粘贴的有效 URL 形态时,很可能是因为采用了动态生成的技术方案——比如 base64 编码后的字符串形式表达的实际 URI Scheme (`blob:http`) 。此时应考虑如下策略之一: - 如果存在对应的解密算法逻辑公开披露,则可以直接调用相应的库函数完成转换; - 否则就需要借助第三方插件录制整个回放周期内产生的临时缓存记录作为最终产物导出。 ##### 三、实施批量下载任务 一旦明确了所有的子资源构成要素之后,就可以着手编写自动化脚本来逐一拉取它们的内容了。这里推荐采用多线程并发模式提升效率的同时也要注意控制速率以免触发服务器防护措施。 ```python import os import threading import requests class Downloader(threading.Thread): def __init__(self, url_list, output_dir='./downloads'): super().__init__() self.url_list = url_list self.output_dir = output_dir if not os.path.exists(output_dir): os.makedirs(output_dir) def run(self): session = requests.Session() for idx, segment_url in enumerate(self.url_list): filename = f'{idx}.ts' filepath = os.path.join(self.output_dir, filename) with open(filepath, 'wb') as fp: resp = session.get(segment_url, stream=True) total_length = int(resp.headers.get('content-length')) downloaded_size = 0 for chunk in resp.iter_content(chunk_size=1024*1024): if chunk: written_bytes = fp.write(chunk) downloaded_size += written_bytes print(f'Segment {filename} completed ({downloaded_size}/{total_length}) bytes.') if __name__ == '__main__': segments_to_fetch = [...] # List of TS URLs extracted earlier. worker_threads = [] thread_count = min(len(segments_to_fetch), 5) # Limit to five simultaneous downloads at most. per_thread_load = len(segments_to_fetch)//thread_count for i in range(thread_count): lower_bound = i * per_thread_load upper_bound = None if i==thread_count-1 else (i+1)*per_thread_load current_segment_set = segments_to_fetch[lower_bound : upper_bound] new_worker = Downloader(current_segment_set) new_worker.start() worker_threads.append(new_worker) for t in worker_threads: t.join() print("All threads finished successfully.") ``` 最后一步便是运用 FFmpeg 工具将分离出来的音频轨道与视觉画面重新组合起来形成标准格式封装容器文件: ```bash ffmpeg -i input_audio.aac -i input_video.mp4 -c copy final_output.mp4 ``` --- ###
评论 4
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值