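Python: Web Scraping with the BeautifulSoup Module (Parsing and Extracting Data)
Installing the BeautifulSoup module
Mac: open the Terminal app, type pip3 install BeautifulSoup4, and press Enter.
Windows: open the Command Prompt (cmd), type pip install BeautifulSoup4, and press Enter.
Parsing data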
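# Example walkthrough
import requests  # import the requests library
from bs4 import BeautifulSoup  # import the BeautifulSoup class from the bs4 package
# Fetch the web page with requests.get()
res = requests.get('https://localprod.pandateacher.com/python-manuscript/crawler-html/spider-men5.0.html')
html = res.text  # the response body as a text string (not parsed yet)
soup = BeautifulSoup(html, 'html.parser')  # parse the HTML string into a BeautifulSoup object
print(type(soup))  # check the type: <class 'bs4.BeautifulSoup'>, the object used later to extract data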
print(" ")
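To try the parsing step without a network connection, the same BeautifulSoup call can be run on an HTML string written by hand. The snippet below is only a minimal sketch: SAMPLE_HTML and make_soup are made-up names for illustration, and the markup is a stand-in for res.text, not the content of the page requested above.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, standing in for res.text from a real request
SAMPLE_HTML = """
<html>
  <head><title>Sample page</title></head>
  <body><p class="intro">Hello, BeautifulSoup</p></body>
</html>
"""


def make_soup(html):
    """Parse an HTML string with Python's built-in 'html.parser' (illustrative helper)."""
    return BeautifulSoup(html, 'html.parser')


soup = make_soup(SAMPLE_HTML)
print(type(soup))  # the parsed document object
print(soup.title.string)  # text inside the <title> tag
print(soup.find('p', class_='intro').get_text())  # text of the first <p class="intro">
```

Because the input is a plain string, this is a quick way to check that the module is installed and to see what a parsed object looks like before pointing the code at a real page.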