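Suppose we have a scraping task: fetch the titles of a platform's past articles, run a word-frequency analysis on the title text, and finally generate a word cloud from the results.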
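Before looking at the scraper itself, the word-frequency step can be sketched on its own. This is a minimal sketch, not the code used later in this post: the function name `count_title_words`, the `min_length` parameter, and the sample titles are all made up for illustration, and the regex split is only a stand-in tokenizer (real Chinese titles would need a proper word segmenter). It returns a list of `(word, count)` pairs, which is roughly the shape a word-cloud chart expects as input.

```python
import re
from collections import Counter


def count_title_words(titles, min_length=2):
    """Return (word, count) pairs sorted by descending frequency.

    Hypothetical helper for illustration only.
    """
    counter = Counter()
    for title in titles:
        # Assumption: a "word" is a run of word characters. This works for
        # space-separated text; Chinese titles need a real segmenter instead.
        words = re.findall(r"\w+", title.lower())
        # Skip very short tokens, which are usually noise in a word cloud.
        counter.update(w for w in words if len(w) >= min_length)
    return counter.most_common()


# Made-up sample data standing in for scraped article titles.
sample_titles = [
    "Python crawler basics",
    "Python word cloud tutorial",
    "Crawler tips",
]
word_counts = count_title_words(sample_titles)
print(word_counts[:3])
```

Keeping the counting step separate from the request code makes it easy to test with a few hard-coded titles before pointing it at real scraped data.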
The platform is shown below:
The code is as follows:
import requests
from lxml import etree
import re
import json
from pyecharts import options as opts
from pyecharts.charts import WordCloud
# Send the request and return the parsed HTML tree
def demo(url):
    # Add a request header so the request looks like it comes from a browser
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 '
                      '(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
    }
    response = requests.get(url, headers=headers).text
    # print(response)
    html = etree.HTML(response)
    return html
# Take the response and extract the data
def load