"""Scrape event images from douban.com's Wuhan weekly listing (pages 1-10).

Each listing page is fetched, image URLs are extracted from the
``data-lazy="..."`` attributes, and every image is saved into DOWN_PATH
as ``download_img.<original-file-name>``.
"""
import os
import re
import urllib.request

# Target directory for downloaded images.
DOWN_PATH = r'C:\liujwFiles\NON_IBM_Files\PycharmProjects\pa_chong_files'
# Listing URL template; %s receives the pagination query suffix.
LIST_URL = r'https://www.douban.com/location/wuhan/events/week-all%s'
# Image URLs live in data-lazy attributes; compile once, reuse per page.
IMG_RE = re.compile(r'data-lazy="(.*?)"')
# Number of listing pages to scrape.
NUM_PAGES = 10


def page_suffix(page):
    """Return the URL query suffix for a 1-based listing page number.

    Page 1 has no query string; later pages use a ``?start=`` offset.
    NOTE(review): offset = page * 10 follows the original author's comment
    ("pages after the first page: page * 10") — confirm against the site's
    actual pagination before relying on it.
    """
    if page == 1:
        return ''
    return '?start=%i' % (page * 10)


def image_name(url):
    """Return the file name of an image URL: last path segment, query stripped."""
    return url.split('/')[-1].split('?')[0]


def download_page_images(page):
    """Fetch one listing page and download all images found on it.

    Returns (found, failed): how many image URLs the page contained and
    how many of them failed to download.
    """
    html = urllib.request.urlopen(LIST_URL % page_suffix(page)).read()
    # Decode leniently; we only need the ASCII data-lazy attributes.
    url_list = IMG_RE.findall(html.decode('utf-8', 'replace'))
    print("Begin to download files.........there are %i files in this page....." % len(url_list))
    failed = 0
    for url in url_list:
        try:
            target = os.path.join(DOWN_PATH, 'download_img.' + image_name(url))
            urllib.request.urlretrieve(url, target)
        except IOError:
            # Best-effort: report the failure and keep going with the rest.
            print("File %s download failed......." % url.split('/')[-1])
            failed += 1
    return len(url_list), failed


def main():
    """Scrape all NUM_PAGES listing pages and print a summary.

    Fixes two bugs in the original script: the loop variable was multiplied
    by 10 *before* the first-page check (so page 1 was never fetched without
    an offset), and the final total was computed as last-page-count * 100
    instead of an accumulated sum.
    """
    total_files = 0
    error_count = 0
    for page in range(1, NUM_PAGES + 1):
        print("Downloading current page: ", page)
        found, failed = download_page_images(page)
        total_files += found
        error_count += failed
    print("Download complete! %i pages, %i files in total, %i files download failed!"
          % (NUM_PAGES, total_files, error_count))


if __name__ == '__main__':
    main()
Python 爬虫抓取图片(分页)
最新推荐文章于 2023-07-07 12:24:11 发布