How to Scrape AJAX Dynamically Loaded Data with a Python Crawler

Latest recommended article published 2025-03-17 19:34:28

你别教我打游戏

Categories: Python, Crawlers

Article link: https://blog.csdn.net/qq_44846324/article/details/115388058

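This article shows how to use the browser's Inspect Element tool to find the API address behind dynamically loaded comments, then walks through fetching and parsing the LiveRe API response with the requests library to extract each user's comment.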

Target page: Hello world!
Here we scrape the comments on that page, which are loaded dynamically.

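The most important step is finding the real address the comments come from. Dynamically loaded comments are not part of the page's HTML source; the browser fetches them with a separate request after the page loads. Open the browser's Inspect Element tool, reload the page, and look through the network requests for the one whose response contains the comments.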
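Once a candidate request has been found, it can help to break its URL into an endpoint and query parameters to see which values control paging. The snippet below is a minimal sketch using only the standard library; the URL is the one used by the script in this article with `offset` set to 1, and the variable names are illustrative.

```python
from urllib.parse import urlsplit, parse_qsl

# URL as copied from the browser's network panel (offset=1 chosen for illustration).
captured_url = (
    "https://api-zero.livere.com/v1/comments/list"
    "?callback=jQuery112407330668384607038_1617262726311&limit=10"
    "&offset=1"
    "&repSeq=4272904&requestPath=%2Fv1%2Fcomments%2Flist"
    "&consumerSeq=1020&livereSeq=28583&smartloginSeq=5154"
)

parts = urlsplit(captured_url)
endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"
# parse_qsl also decodes percent-escapes such as %2F.
params = dict(parse_qsl(parts.query))

print(endpoint)
for key, value in params.items():
    print(f"{key} = {value}")
```

Seen this way, `limit` and `offset` stand out as the likely paging parameters, while `callback` suggests the response is JSONP rather than plain JSON.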

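import json

import requests

# getHeader is the author's own helper module (not shown here); it presumably
# returns a dict of request headers, such as a randomly chosen User-Agent.
from util.randomHeaders import getHeader

# The real comment API address found in the browser, split around the offset value.
link1 = "https://api-zero.livere.com/v1/comments/list?callback=jQuery112407330668384607038_1617262726311&limit=10" \
        "&offset="
link2 = "&repSeq=4272904&requestPath=%2Fv1%2Fcomments%2Flist&consumerSeq=1020&livereSeq=28583&smartloginSeq=5154"

for i in range(1, 31):
    # Move to the next batch of comments by splicing a new offset into the URL
    link = link1 + str(i) + link2
    r = requests.get(link, headers=getHeader(), timeout=10)
    s = r.text
    # The response is JSONP, i.e. callback({...});, so keep only the text from
    # the first '{' to the last '}' before parsing it as JSON
    json_data = json.loads(s[s.find('{'):s.rfind('}') + 1])
    response_code = json_data['resultCode']
    if response_code == 200:
        comment_list = json_data['results']['parents']
        for cmt in comment_list:
            user = cmt['name']
            comment_content = cmt['content']
            print("User %s : %s" % (user, comment_content))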
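The JSONP unwrapping step can be tried on its own without touching the network. The following is a rough sketch rather than the author's code: `unwrap_jsonp` is a hypothetical helper, and `sample` is a made-up response body shaped only like the fields the script above reads (`resultCode`, `results`, `parents`, `name`, `content`).

```python
import json


def unwrap_jsonp(text):
    """Return the JSON object embedded in a JSONP response such as cb({...});"""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in response")
    return json.loads(text[start:end + 1])


# Hypothetical response body, not real API output.
sample = (
    'jQuery123_456({"resultCode": 200, "results": {"parents": '
    '[{"name": "alice", "content": "Nice post"}]}});'
)

data = unwrap_jsonp(sample)
comments = [(c["name"], c["content"]) for c in data["results"]["parents"]]
print(comments)
```

Slicing up to the last `}` instead of dropping a fixed number of trailing characters keeps the parse working if the server adds or omits a trailing semicolon or newline.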