Scraping Xigua Video

1. The code:

import requests
import re
import base64
cookies = {
    'ixigua-a-s': '1',
    'support_webp': 'true',
    'support_avif': 'true',
    'csrf_session_id': '542c9ed3d4b8082b5614c08d38f2228e',
    'tt_scid': '1ZIk7kNEzHF3S4CaGA-VLF2yrvPU5EQN3bknqjbdhE0g-rt2juJeKfFPOXNHfm9T57f2',
    'ttwid': '1%7C4qTMsUte1tqc2ZJ6VNNdb17urCVPrqXd2fGM5fiURyQ%7C1710557599%7C9e23db30856aa04f2a8d1824e0105ff78f241116cdc7c388374b4887e7cb1447',
    'msToken': 'ikUZAnEclU0KLXo4RzdUtC0kDermfBdz0ViJvDG-GVLLoXiFWjO6orWC7yVQVZk5a83rlB1NmX4pf3a8UZyeDcykEpfOIXwiZ3JiU4egihf5_RnoqTM2afOR8hq8bA==',
}

headers = {
    'authority': 'www.ixigua.com',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'accept-language': 'zh-CN,zh;q=0.9',
    'cache-control': 'max-age=0',
    # 'cookie': 'ixigua-a-s=1; support_webp=true; support_avif=true; csrf_session_id=542c9ed3d4b8082b5614c08d38f2228e; tt_scid=1ZIk7kNEzHF3S4CaGA-VLF2yrvPU5EQN3bknqjbdhE0g-rt2juJeKfFPOXNHfm9T57f2; ttwid=1%7C4qTMsUte1tqc2ZJ6VNNdb17urCVPrqXd2fGM5fiURyQ%7C1710557599%7C9e23db30856aa04f2a8d1824e0105ff78f241116cdc7c388374b4887e7cb1447; msToken=wY6AEm-fCAfHkwWZibSAD4xX1El4Utx1BBI486-_lw5iRSYbQcA0cOdmjQ07Vv8cd-ODj73BZQloBeQMZvJMrxu12urfpOmlSg_GWh9Vux64Xnt_gqBL; msToken=ikUZAnEclU0KLXo4RzdUtC0kDermfBdz0ViJvDG-GVLLoXiFWjO6orWC7yVQVZk5a83rlB1NmX4pf3a8UZyeDcykEpfOIXwiZ3JiU4egihf5_RnoqTM2afOR8hq8bA==',
    'referer': 'https://www.ixigua.com/channel/vlog',
    'sec-ch-ua': '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'sec-fetch-dest': 'document',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-site': 'same-origin',
    'sec-fetch-user': '?1',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
}

params = {
    'logTag': '13591a4aab066c6198cc',
}
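url = 'https://www.ixigua.com/7346431802373046836'
# Fetch the video page; the stream URLs are embedded in its HTML source.
response = requests.get(url=url, params=params, cookies=cookies, headers=headers).text
# Each "main_url" value in the source is a base64-encoded stream URL.
main_url_list = re.findall('"main_url":"(.*?)"', response)
for main_url in main_url_list:
    # The page source escapes "/" as \u002F; restore it before decoding.
    main_url = base64.b64decode(main_url.replace(r'\u002F', '/')).decode('utf-8')
    print(main_url)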

Running this code prints the URL of each video stream found on the page.

2. Analysis

Open any video page, open the browser's developer tools, go to the Network tab, filter by XHR, and reload the page.

 

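The requests whose names start with media- are the likely candidates, because new ones keep appearing as the video plays. Copy the URL of one of them and open it in a browser tab: it is the video itself.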

 

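Copy a distinctive part of that URL and search for it to find where the link comes from. It turns out to be in the page source, in the main_url field.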
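
The same lookup can be done in code once the page source is in a string. The following is a minimal sketch: the sample_html fragment and its layout are invented for illustration, and only the "main_url" key is taken from the real page.

```python
import re

# Invented stand-in for the page source. The real page is far larger and
# its surrounding structure may differ; only the "main_url" key is real.
sample_html = (
    '<script>{"video_1":{"main_url":"QUJD","size":1},'
    '"video_2":{"main_url":"REVG","size":2}}</script>'
)

# Non-greedy capture of everything between the quotes after "main_url":
MAIN_URL_PATTERN = re.compile(r'"main_url":"(.*?)"')

found = MAIN_URL_PATTERN.findall(sample_html)
print(found)  # ['QUJD', 'REVG']
```

The non-greedy (.*?) stops at the first closing quote, so each value is captured separately even when several appear on one line.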

 

 

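The value is encoded rather than stored as plain text. It looks like base64, since values of this kind often end in an equals sign (padding). There is a trap, though: the string contains \u002F, which is not part of the base64 alphabet. It is the escaped form of "/" used in the page source, so it does not belong in the data as written.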

aHR0cHM6Ly92OS1wLXhnLXdlYi1wYy5peGlndWEuY29tLzcxZTZhYmQyMmQ2ZjFjNDljZjVhYTQ3Y2Y2NmNjZTE3LzY1ZjdiMjZiL3ZpZGVvL3Rvcy9jbi90b3MtY24tdmUtMDAyNi9vc0JidEFRZ0NXQThnREdlbkVwZlc5YnF0am5EQUJOSXhoTmx1by8\u002FYT0xNzY4JmNoPTAmY3I9NyZkcj0wJmVyPTAmY2Q9MCU3QzAlN0MwJTdDMSZjdj0xJmJyPTU1OCZidD01NTgmY3M9NCZkcz0zJmVpZD0xMDI1JmZ0PVhoUWtlQks2eHhvdUtMLmc1UHYxMmxkVTZwdDJHYmttamNrd0ZfeU5iMk4xMk56N1QmbWltZV90eXBlPXZpZGVvX21wNCZxcz0wJnJjPVpUWTFNelk0TXpvNk5EdzRaV1k3TzBCcE0zbDFkRGc2Wm1vNWNUTXpOR1F6TTBCaUxpMDJOR0UxWHpNeFlpMWdMbDVpWVNOc2JHaHBjalF3Wlc1Z0xTMWtMaTl6Y3clM0QlM0QmYnRhZz1lMDAwMzAwMDAmZHlfcT0xNzEwNzI3NzE0JmZlYXR1cmVfaWQ9N2Q5Yjc5N2UwZTYyMDlmYzZkNjgzZWUzYThhN2Q5MDYmbD0yMDI0MDMxODEwMDgzNEY2N0ExMTg3RTVEQjQ0N0MwNkYz 
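
To see why the escape is a trap, the sketch below reproduces it on a small made-up URL (example.com is a placeholder, not a real stream address). By default, Python's base64.b64decode silently discards characters outside the base64 alphabet, so the backslash is dropped and the leftover u002F is decoded as if it were data.

```python
import base64
import binascii

plain = "https://example.com/vv/?a=1"  # made-up URL, for illustration only
encoded = base64.b64encode(plain.encode("ascii")).decode("ascii")

# Simulate the page source, where "/" is written as the escape \u002F.
escaped = encoded.replace("/", r"\u002F")

# Lenient decoding (the default) does not fail, it just yields wrong bytes.
garbled = base64.b64decode(escaped)
print(garbled)  # b'https://example.com/vv/.\xd3M\x85a=1'

# Strict decoding rejects the backslash outright.
try:
    base64.b64decode(escaped, validate=True)
    strict_ok = True
except binascii.Error:
    strict_ok = False
print(strict_ok)  # False
```

The stray bytes .\xd3M\x85 land exactly where the ? of the query string should be, which is why the damage is easy to miss at first glance.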

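How to handle it:

main_url = base64.b64decode(main_url.replace(r'\u002F', '/')).decode('utf-8')

This puts the real "/" back in place of the escape before decoding, so the URL comes out intact.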
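
Wrapped in a small helper, the escape can be undone before the value is decoded. This is a sketch, not a definitive implementation: decode_main_url is a hypothetical name, the sample value encodes a made-up example.com URL, and the code assumes \u002F is the only escape that occurs in these values.

```python
import base64


def decode_main_url(raw: str) -> str:
    """Decode one "main_url" value as it appears in the page source."""
    # Undo the source-level escape so the string is plain base64 again.
    cleaned = raw.replace(r"\u002F", "/")
    # validate=True makes any other unexpected character raise an error
    # instead of being silently dropped.
    return base64.b64decode(cleaned, validate=True).decode("utf-8")


sample = r"aHR0cHM6Ly9leGFtcGxlLmNvbS92di8\u002FYT0x"  # made-up value
decoded = decode_main_url(sample)
print(decoded)  # https://example.com/vv/?a=1
```

Restoring the slash first works wherever the escape falls in the string and however many times it occurs, whereas patching the decoded bytes depends on the exact garbage a particular value happens to produce.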
