Douyu live streaming: https://www.douyu.com/directory/all
After opening this URL, the goal is to scrape every room that is currently streaming live.
1. Import modules
import requests
from lxml import etree
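To show what the lxml import is for, here is a minimal sketch of turning an HTML string into a tree and querying it with XPath. The markup, the `rooms` id and the variable names are made up for illustration. They are assumptions, not Douyu's actual page structure, so the XPath expressions would need adjusting against the real page.

```python
from lxml import etree

# Hypothetical markup for illustration only; the real Douyu page differs.
sample_html = """
<ul id="rooms">
  <li><a href="/1001" title="Room A">Room A</a></li>
  <li><a href="/1002" title="Room B">Room B</a></li>
</ul>
"""

# etree.HTML parses an HTML string (even an imperfect one) into an element tree.
tree = etree.HTML(sample_html)

# Each XPath query returns a list of the matching attribute values.
titles = tree.xpath('//ul[@id="rooms"]/li/a/@title')
links = tree.xpath('//ul[@id="rooms"]/li/a/@href')

# Pair each room title with its relative link.
rooms = list(zip(titles, links))
print(rooms)
```

The same pattern should apply to the downloaded page: pass the decoded HTML to `etree.HTML` and replace the XPath expressions with ones that match the room list found when inspecting the page.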
2. Network request
url = 'https://www.douyu.com/directory/all'
# Send a browser-like User-Agent so the request looks like a normal browser visit
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.109 Safari/537.36'
}
content = requests.get(url=url, headers=headers).content.decode('utf-8')
with open('templates\\douyu.html', 'w', enco