Alibaba Cloud Crawler Project Course Notes [3]: Scraping Tencent Video Comments in Practice

Latest recommended article published on 2021-07-26 15:59:34

hazelnut_x


Category: python. Tags: python

Article link: https://blog.csdn.net/hazelnut_x/article/details/108588002


import urllib.request
import re

# ID (cursor) of the current page of comments
cid = '6710538280024647270'

# Compile both patterns once instead of on every iteration
pat_content = re.compile('"content":"(.*?)"', re.S)
pat_last = re.compile('"last":"(.*?)"', re.S)

for i in range(0, 10):
    url = "https://video.coral.qq.com/varticle/5885307195/comment/v2?callback=_varticle5885307195commentv2&orinum=10&oriorder=o&pageflag=1&cursor=" + cid + "&scorecursor=0&orirepnum=2&reporder=o&reppageflag=1&source=132&_=1600077359219"

    print("Comments on page " + str(i + 1))
    data = urllib.request.urlopen(url).read().decode('utf-8', 'ignore')
    for item in pat_content.findall(data):
        print(item)
        print("-------")

    # The ID of the next page is stored in the "last" field of the response
    last = pat_last.findall(data)
    if not last:
        break  # no next-page ID in the response, so stop paging
    cid = last[0]
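
The script above pulls the comments and the next-page ID out of the response with regular expressions. Because the request passes a callback parameter, the response is JSON wrapped in a function call, and a pattern like "content":"(.*?)" stops at the first quote it meets (including an escaped one) and leaves \uXXXX escapes undecoded. A minimal sketch of an alternative is to strip the wrapper and parse the body with the json module. The helper names parse_jsonp and find_values, the sample response, and its oriCommList field are made up for illustration; the real response layout is not shown in these notes.

```python
import json
import re


def parse_jsonp(text):
    """Strip a JSONP wrapper such as callback({...}) and parse the JSON inside."""
    match = re.search(r'^[^(]*\((.*)\)\s*;?\s*$', text, re.S)
    payload = match.group(1) if match else text
    return json.loads(payload)


def find_values(node, key):
    """Recursively collect every value stored under `key` in parsed JSON."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            found.extend(find_values(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_values(item, key))
    return found


# Hypothetical sample response, invented for illustration only
sample = ('_varticle5885307195commentv2({"data":{"last":"123","oriCommList":'
          '[{"content":"say \\"hi\\""},{"content":"\\u597d"}]}})')

parsed = parse_jsonp(sample)
# Keep only string values, in case "content" also appears with another type
comments = [c for c in find_values(parsed, "content") if isinstance(c, str)]
cursors = find_values(parsed, "last")

for c in comments:
    print(c)
print("next cursor:", cursors[0] if cursors else None)
```

Searching by key rather than by a fixed path avoids assuming where the fields sit in the response. In the paging loop, parse_jsonp(data) would take the place of the two findall calls, and the first "last" value would become the next cid.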

Notes on other sections

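Alibaba Cloud Crawler Project Course Notes [1]: Regular Expressions and XPath Expressions
Alibaba Cloud Crawler Project Course Notes [2]: The Urllib Module and Scraping Qiushibaike in Practice
Alibaba Cloud Crawler Project Course Notes [4]: The Requests Module and a Yunqi Community Blog Post Crawler in Practice
Alibaba Cloud Crawler Project Course Notes [5]: The Scrapy Module and a Dangdang Crawler in Practice
Alibaba Cloud Crawler Project Course Notes [6 - 8]: Job Listings, Taobao Product Information, and Zhihu Crawlers in Practice
Alibaba Cloud Crawler Project Course Notes [9 & 10]: Common Anti-Scraping Strategies and Countermeasures, Scraping Tencent Comics in Practice, and Distributed Crawlers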
