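This is a small example of using Python to automatically fetch Bilibili danmaku (bullet comments) and generate a word cloud from them.
1. Approach
- Use requests to fetch the Bilibili page content
- Use BeautifulSoup (bs4) to parse the page content and extract the danmaku
- Save the danmaku to a local txt file
- Read the txt file and generate a word cloud with wordcloud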
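The last three steps might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used in this post: it assumes the danmaku arrive as XML with one `<d p="...">text</d>` element per comment, and `SAMPLE_XML`, `extract_danmaku`, `save_danmaku`, `load_danmaku`, and `make_wordcloud` are hypothetical names introduced only for this example. It extracts the text with `re` instead of BeautifulSoup so the first part runs with the standard library alone.

```python
import re
from pathlib import Path

# Hypothetical sample shaped like a danmaku XML document
# (assumption: each comment is one <d p="..."> element).
SAMPLE_XML = (
    '<i>'
    '<d p="1.0,1,25,16777215,0,0,0,0">hello</d>'
    '<d p="2.5,1,25,16777215,0,0,0,0">nice video</d>'
    '</i>'
)


def extract_danmaku(xml_text):
    """Return the text content of every <d ...>...</d> element."""
    return re.findall(r'<d\b[^>]*>(.*?)</d>', xml_text, flags=re.S)


def save_danmaku(comments, path):
    """Write one comment per line to a UTF-8 text file."""
    Path(path).write_text('\n'.join(comments), encoding='utf-8')


def load_danmaku(path):
    """Read the saved comments back as a single string."""
    return Path(path).read_text(encoding='utf-8')


def make_wordcloud(text, out_png, font_path=None):
    """Segment the text with jieba and render a word cloud image.

    jieba and wordcloud are imported here so the helpers above work
    without them. For Chinese text, font_path should point to a font
    that contains CJK glyphs (the path is machine-specific).
    """
    import jieba
    import wordcloud

    words = ' '.join(jieba.lcut(text))
    cloud = wordcloud.WordCloud(
        font_path=font_path,
        width=800,
        height=600,
        background_color='white',
    )
    cloud.generate(words)
    cloud.to_file(out_png)
```

A typical call sequence would be `save_danmaku(extract_danmaku(xml_text), 'danmaku.txt')` followed by `make_wordcloud(load_danmaku('danmaku.txt'), 'cloud.png', font_path=...)`, where the file names are placeholders.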
2. Import the libraries
import requests
from bs4 import BeautifulSoup
import re
import jieba
import wordcloud
3. Fetch the danmaku for a video from its Bilibili av number
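def cid_from_av(av):
    # Build the video page URL; the id passed in is appended after the 'bv' path prefix.
    url = 'http://www.bilibili.com/video/bv' + str(av)
    # Send a browser-like User-Agent so the request looks like a normal page visit.
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.106 Safari/537.36'}
    response = requests.get(url=url, headers=headers)
    response.encoding = 'utf-8'
    html = response.text
    try:
        # Parse the page with the lxml parser and read the <meta name="title"> tag.
        soup = BeautifulSoup(html, 'lxml')
        title = soup.select('meta[name="title"]')[0][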