A simple crawler that scrapes the Douban Movie Top 250 list.
The code is as follows:
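import requests
from pyquery import PyQuery as pq

# Douban may reject the default requests User-Agent, so send a browser-like one.
headers = {'User-Agent': 'Mozilla/5.0'}

# 250 entries at 25 per page: start = 0, 25, ..., 225
for url in ['https://movie.douban.com/top250?start={}'.format(page) for page in range(0, 250, 25)]:
    html = requests.get(url, headers=headers).text
    for item in pq(html)('.item').items():
        num = item.find('.pic em').text()          # rank on the list
        title = item.find('.title').text()         # already a str in Python 3, no decoding needed
        img = item.find('.pic img').attr('src')    # poster image URL
        rating = item.find('.rating_num').text()   # average rating
        print(num, title, rating, img)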
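The `start` query parameter is an offset into the list, and each page holds 25 entries. Below is a minimal sketch of the pagination on its own, with no network access; `BASE_URL` and `page_urls` are hypothetical names introduced here for illustration. It shows that covering all 250 entries requires the range to stop at 250, since stopping at 225 yields only nine pages and misses the last one.

```python
BASE_URL = 'https://movie.douban.com/top250?start={}'


def page_urls(total=250, page_size=25):
    """Return one URL per result page, covering `total` entries."""
    # range() excludes its stop value, so the last offset is total - page_size.
    return [BASE_URL.format(offset) for offset in range(0, total, page_size)]


urls = page_urls()
print(len(urls))   # 10
print(urls[0])     # https://movie.douban.com/top250?start=0
print(urls[-1])    # https://movie.douban.com/top250?start=225

# Stopping at 225 drops the final page (entries 226 to 250).
print(len(page_urls(total=225)))   # 9
```

Building the URL list in a small function like this makes it easy to check the page count before sending any requests.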
Python study notes: a simple crawler