Web Scraper: Douban Movies

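import json
import re

import requests
from fake_useragent import UserAgent

ua = UserAgent()
# Douban chart endpoint; start=0 and limit=10 request the first 10 entries.
url = 'https://movie.douban.com/j/chart/top_list?type=11&interval_id=100%3A90&action=&start=0&limit=10'
# Send a randomly chosen browser User-Agent instead of the default requests one.
headers = {'User-Agent': ua.random}
# response = requests.get(url, headers=headers).json()
# Keep the raw text, because the regex extraction further down works on a string.
response = requests.get(url, headers=headers).text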
# lis = []
# for i in response:
#     dic = {}
#     title = i['title']
#     actors = i['actors']
#     cover_url = i['cover_url']
#     dic['title']=title
#     dic['actors']=actors
#     dic['url']=cover_url
#     lis.append(dic)
# with open('douban.json','w',encoding='utf-8') as f:
#     json.dump(lis,f,ensure_ascii=False)

# with open('douban1.json', 'w', encoding='utf-8') as f:
#     json.dump(response, f, ensure_ascii=False)

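# Each movie is a flat object, so a non-greedy match between braces isolates one record.
record_pattern = re.compile(r'{.*?}')
# Capture the raw contents of the "actors" array.
actors_pattern = re.compile(r'"actors":\[(.*?)\]')
# Capture the movie id, assuming it is a quoted string as the original two-letter-key pattern did.
id_pattern = re.compile(r'"id":"(.*?)"')
# The patterns are compiled once here rather than on every loop iteration.
for record in record_pattern.findall(response):
    actors_list = actors_pattern.findall(record)
    # print(actors_list)
    # Named movie_ids so the built-in id() is not shadowed.
    movie_ids = id_pattern.findall(record)
    print(movie_ids)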
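
The regex loop works on the raw text, but the commented-out code earlier suggests the endpoint returns JSON, so the same fields can probably be read more robustly with `json.loads`. Below is a minimal sketch of that idea, not the definitive way to do it. The `SAMPLE` string and the `extract_movies` helper are hypothetical, and the sample values are placeholders rather than real Douban data. Only the field names (`id`, `title`, `actors`, `cover_url`) are taken from the script above.

```python
import json

# Hypothetical payload shaped like the response the script expects.
# The values are placeholders, not real data from Douban.
SAMPLE = (
    '[{"id":"1","title":"Movie A","actors":["X","Y"],'
    '"cover_url":"https://example.com/a.jpg"},'
    '{"id":"2","title":"Movie B","actors":[],'
    '"cover_url":"https://example.com/b.jpg"}]'
)


def extract_movies(raw_text):
    """Parse the raw JSON text and keep only the fields of interest."""
    movies = []
    for item in json.loads(raw_text):
        movies.append({
            'id': item.get('id'),
            'title': item.get('title'),
            # Fall back to an empty list if a record has no "actors" key.
            'actors': item.get('actors', []),
            'url': item.get('cover_url'),
        })
    return movies


if __name__ == '__main__':
    for movie in extract_movies(SAMPLE):
        print(movie['id'], movie['title'], movie['actors'])
```

With the live request, `extract_movies(response)` could replace the regex loop. Unlike the `{.*?}` pattern, this would keep working if a record ever contained a nested object.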

 
