Python: scraping Bilibili's top 100 video information and saving it to Excel and MySQL

遗落丶

Published 2020-10-01 09:12:59

Tags: python

Article link: https://blog.csdn.net/qq_45946788/article/details/108893148

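This post demonstrates how to use Python to scrape information about the top 100 videos on Bilibili: each video's rank, title, uploader (UP主), view count, comment count, overall score, and video URL. The scraped data is saved to an Excel spreadsheet and then imported into a MySQL database with the pymysql library.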
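
Before the full script, here is a minimal sketch of the regex extraction step on its own. It assumes a simplified, hypothetical HTML fragment (`SAMPLE_HTML`) rather than the real Bilibili page, whose markup is different and may change. Only the `rank="(.*?)"` pattern comes from the script; `find_title`, `parse_items`, and the sample markup are illustrative names that do not appear in the original.

```python
import re

# Hypothetical markup for illustration only; the real ranking page differs.
SAMPLE_HTML = '''
<li rank="1"><a class="title" href="https://example.com/video/1">First video</a></li>
<li rank="2"><a class="title" href="https://example.com/video/2">Second video</a></li>
'''

# Same rank pattern as in the script: non-greedy capture of the attribute value.
find_rank = re.compile(r'rank="(.*?)"')
# Assumed pattern for the hypothetical markup above: captures (url, title) pairs.
find_title = re.compile(r'<a class="title" href="(.*?)">(.*?)</a>')


def parse_items(html):
    """Return one dict per video found in the given HTML string."""
    ranks = find_rank.findall(html)
    links = find_title.findall(html)
    # zip pairs the n-th rank with the n-th (url, title) tuple.
    return [
        {'rank': int(rank), 'url': url, 'name': name}
        for rank, (url, name) in zip(ranks, links)
    ]


items = parse_items(SAMPLE_HTML)
for item in items:
    print(item['rank'], item['name'], item['url'])
```

Building a list of dictionaries like this keeps the parsing step separate from storage, so the same records can later be written to a spreadsheet row by row or inserted into a database table.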

'''Scrape information about the top 100 videos on Bilibili'''
import re
import urllib.error
import urllib.request
import xlwt
import time
import pymysql
from bs4 import BeautifulSoup
# Get the video information as a list of dictionaries
def getlist():
    url = 'https://www.bilibili.com/ranking/all/0/0/3'
    # Send a browser-like User-Agent header
    head = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:80.0) Gecko/20100101 Firefox/80.0'}

    # Build the request
    request = urllib.request.Request(url, headers=head)
    try:
        # Open the request
        response = urllib.request.urlopen(request)
        # Read the response body and decode it as UTF-8 HTML
        html = response.read().decode('utf-8')
        # print(html)
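    except urllib.error.URLError as e:
        if hasattr(e, "code"):
            print(e.code)
        if hasattr(e, 'reason'):
            print(e.reason)
        # Fall back to empty HTML so the parsing below finds nothing instead of raising NameError
        html = ''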

    # Compile regular expressions for the HTML fragments to extract

    findtop = re.compile(r'rank="(.*?)"')  # Video rank
    findname = re.compile(r'<img alt="(.*?)" src=