Parsing Web Page Content with BeautifulSoup
1. Install the required packages:
pip install beautifulsoup4
pip install lxml
pip install requests
2. Import the packages:
from bs4 import BeautifulSoup
import requests
3. Fetch and parse the page:
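# Send a browser-like User-Agent; the default requests User-Agent may be rejected by the site.
douban_headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36',
}
url = "https://movie.douban.com/top250?start=0&filter="
response = requests.get(url, headers=douban_headers)
response.raise_for_status()  # stop early if the request failed
content = response.text
# Parse the page's HTML; 'lxml' can be passed instead of 'html.parser' once lxml is installed.
soup = BeautifulSoup(content, 'html.parser')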
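
Once the soup object exists, the next step is usually to pull specific elements out of it. The sketch below shows one way this might look, using find_all and find with a class_ filter. It runs against a small inline HTML sample rather than the live page, so it needs no network access. The markup, the class names (item, title, rating_num), and the helper name extract_movies are all assumptions made for illustration; inspect the real page source before relying on them.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page structure may differ.
SAMPLE_HTML = """
<ol class="grid_view">
  <li><div class="item">
    <span class="title">Movie A</span>
    <span class="rating_num">9.7</span>
  </div></li>
  <li><div class="item">
    <span class="title">Movie B</span>
    <span class="rating_num">9.6</span>
  </div></li>
</ol>
"""


def extract_movies(html):
    """Return a list of (title, rating) tuples found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    movies = []
    # find_all returns every <div> whose class attribute includes "item".
    for item in soup.find_all("div", class_="item"):
        title = item.find("span", class_="title")
        rating = item.find("span", class_="rating_num")
        # Skip entries that are missing either field instead of raising.
        if title is None or rating is None:
            continue
        movies.append((title.get_text(strip=True), float(rating.get_text(strip=True))))
    return movies


movies = extract_movies(SAMPLE_HTML)
for title, rating in movies:
    print(title, rating)
```

To try the same helper on the fetched page, pass content (the response text from step 3) in place of SAMPLE_HTML, after adjusting the tag and class names to match the actual markup.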