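Scrape Douban's list of movies now showing, including the title (title), release time (time), country (country), director (director), actors (actors), and score (score).
The scraped content looks like this: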
import requests
from bs4 import BeautifulSoup
import re
import pandas as pd
import urllib.request
url = "https://movie.douban.com/"  # starting URL
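# Assumption: Douban may reject urllib's default User-Agent, so a browser-like one is sent.
r = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})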
response = urllib.request.urlopen(r)
data = response.read()  # raw bytes of the returned page
data = data.decode('utf-8')  # decode the bytes into text
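The hard-coded 'utf-8' above works as long as the page really is UTF-8 encoded. As a minimal sketch, assuming you would rather follow the charset declared in the response's Content-Type header, a small helper could look like the following. The name `decode_body` and its parameters are hypothetical and not part of the original script; it uses only the standard library.

```python
from email.message import Message


def decode_body(raw: bytes, content_type: str = "", default: str = "utf-8") -> str:
    """Decode response bytes using the charset named in a Content-Type header.

    Hypothetical helper: falls back to `default` when no charset is declared,
    when the declared charset is unknown, or when decoding with it fails.
    """
    charset = None
    if content_type:
        msg = Message()
        msg["Content-Type"] = content_type
        # Returns the charset parameter in lowercase, or None if absent.
        charset = msg.get_content_charset()
    try:
        return raw.decode(charset or default)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or undecodable bytes: decode leniently instead.
        return raw.decode(default, errors="replace")
```

With the response object from the script above, it could be called as `decode_body(response.read(), response.headers.get("Content-Type", ""))`. The lenient fallback trades strictness for robustness: bytes that cannot be decoded become replacement characters rather than raising an exception.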
soup = BeautifulSoup(data,"ht