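A girl I know suggested we catch a movie sometime soon and asked which films are about to be released, so she had me look them up on Douban. I figured this was my chance to show off: I could scrape the list with a Python crawler. After a fair amount of debugging I finally got it working. The code is below, and you are welcome to use it too.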
# -*- coding: utf-8 -*-
__author__ = 'ouyangmin'
__time__ = '2021/2/14 23:22'
import requests
from bs4 import BeautifulSoup
# Request the page
url = "https://movie.douban.com/cinema/later/shenzhen/"
# Headers that make the request look like it comes from a browser
fake_headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36'
}
response = requests.get(url, headers=fake_headers)
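# Save the page to a local file
# encoding='utf-8' is set explicitly so the write does not depend on the OS default encoding
with open('douban.html', 'w', encoding='utf-8') as file_obj:
    file_obj.write(response.content.decode('utf-8'))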
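A note on the save step: in text mode, `open()` falls back to the platform's default encoding when none is given. On some Windows setups that default is not UTF-8, and writing Chinese text can then fail or produce a garbled file. The following is a minimal sketch, assuming the page body is UTF-8; `save_html` is a hypothetical helper name and not part of the original script.

```python
from pathlib import Path


def save_html(html_text, path="douban.html"):
    """Write html_text to path as UTF-8 and return the Path (hypothetical helper)."""
    target = Path(path)
    # An explicit encoding keeps the result independent of the OS locale.
    target.write_text(html_text, encoding="utf-8")
    return target
```

In the script this could be called as `save_html(response.content.decode('utf-8'))`.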
# Parse the page
# Create the BeautifulSoup object from the page text decoded as UTF-8, using the lxml parser
soup = BeautifulSoup(response.content.decode('utf-8'), 'lxml')
all_movies = soup.find('div', id="showing-soon")
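One thing to watch for: `soup.find` returns `None` when no matching element exists, which can happen if Douban serves a different page to a blocked request or changes its layout. Below is a minimal sketch of a guard, assuming the list still lives in a `div` with `id="showing-soon"` as in the script. `find_showing_soon` is a hypothetical helper, and the built-in `html.parser` is used only so the sketch runs without `lxml`.

```python
from bs4 import BeautifulSoup


def find_showing_soon(html_text):
    """Return the div with id="showing-soon", or raise if it is missing (hypothetical helper)."""
    soup = BeautifulSoup(html_text, "html.parser")
    container = soup.find("div", id="showing-soon")
    if container is None:
        # Fail early with a readable message instead of an AttributeError later on.
        raise ValueError('no <div id="showing-soon"> found in the page')
    return container
```

Failing early like this makes it clear that the page itself is the problem, rather than the code that walks through the movie entries.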
for ea