Python Web Scraping: Downloading Girl Pictures

Latest recommended article published on 2024-05-02 17:43:47

545851354

Category: python3. Tags: Python3, web scraping, girls

Article link: https://blog.csdn.net/u025318883/article/details/79544819

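I have had some free time lately and, perhaps out of boredom, ended up writing the program below.

The goal is not to cause trouble but to make reasonable use of available resources.

Folks, look after your health and enjoy in moderation.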

Environment: Python 3.6

Third-party libraries

requests, BeautifulSoup4

The Anaconda distribution is recommended.
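
Before running the script, it may help to confirm that both libraries can actually be imported. The sketch below is not part of the original program, and `missing_modules` is a hypothetical helper name. Note that BeautifulSoup4 is installed as `beautifulsoup4` but imported as `bs4`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == '__main__':
    # 'requests' and 'bs4' are the import names the script relies on
    missing = missing_modules(['requests', 'bs4'])
    if missing:
        print('Missing modules:', ', '.join(missing))
    else:
        print('All required modules are available.')
```

If anything is reported missing, install it with pip or conda before going further.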

The full code follows.

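# coding=utf-8
# Author: 听风
import os

import requests
from bs4 import BeautifulSoup

# These request headers get past the site's hotlink protection
HEADERS = {
    'User-Agent': 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
    'Referer': 'http://www.mzitu.com'
}
# Download directory; change it to suit your machine
SAVE_DIR = 'E:\\download\\meizi\\'


def imgurl(url):
    res = requests.get(url)
    soup = BeautifulSoup(res.text, 'html.parser')
    # Total number of pages in this album (one image per page)
    page = int(soup.select('.pagenavi span')[-2].text)
    # URL of the first image, expected to end in '01.jpg'
    a = soup.select('.main-image a')[0]
    src = a.select('img')[0].get('src')
    # The three characters just before '01.jpg' serve as the album id
    meiziid = src[-9:-6]
    print('Starting download of album: {}'.format(meiziid))
    # Create the download directory if it does not exist yet
    os.makedirs(SAVE_DIR, exist_ok=True)
    for i in range(1, page + 1):
        num = '%02d' % i
        # Image URLs differ only in the two-digit number before '.jpg'
        img = src.replace('01.jpg', num + '.jpg')
        response = requests.get(img, headers=HEADERS)
        with open(os.path.join(SAVE_DIR, meiziid + num + '.jpg'), 'wb') as f:
            f.write(response.content)
        print('===> %s done' % (meiziid + num))
    print('Album %s has been downloaded. Enjoy!\n' % meiziid)


def imgpage(page=''):
    res = requests.get('http://www.mzitu.com/page/' + page)
    soup = BeautifulSoup(res.text, 'html.parser')
    href = soup.select('#pins a')
    # Deduplicate the album links
    links = set(i.get('href') for i in href)
    # Download each album in turn
    for link in links:
        imgurl(link)


if __name__ == '__main__':
    result = input('Which listing page do you want to download? ')
    imgpage(result)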

# python version : 3.6
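
Two steps in the script can be tried out without touching the network: deriving every image URL from the first one, and removing duplicate album links. The sketch below is one possible way to isolate them, not the original author's code; `build_image_urls` and `unique_in_order` are hypothetical helper names, and the example.com URLs are placeholders. It cuts the '01.jpg' suffix off the end instead of calling `str.replace`, which would also rewrite any earlier occurrence of '01.jpg' in the URL. It also keeps links in first-seen order, which a plain `set` does not; this relies on dicts preserving insertion order, which is guaranteed from Python 3.7 and is an implementation detail of CPython 3.6.

```python
def build_image_urls(first_src, page_count):
    """Return the URL of every image in an album, given the first image URL.

    Assumes, as the script does, that the first image ends in '01.jpg' and
    that the others differ only in the two-digit number.
    """
    suffix = '01.jpg'
    if not first_src.endswith(suffix):
        raise ValueError('unexpected image URL: {}'.format(first_src))
    base = first_src[:-len(suffix)]
    return ['{}{:02d}.jpg'.format(base, n) for n in range(1, page_count + 1)]


def unique_in_order(links):
    """Drop duplicate and empty links while keeping first-seen order."""
    return list(dict.fromkeys(link for link in links if link))


if __name__ == '__main__':
    # example.com URLs are placeholders, not real image addresses
    for url in build_image_urls('http://example.com/2018/03/15a01.jpg', 3):
        print(url)
    print(unique_in_order(['/a', '/b', '/a', None, '/c']))
```

Keeping these steps as small pure functions makes it easy to check the URL pattern by hand before any request is sent.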