爬虫，爬取任意百度图片下载到本地

最新推荐文章于 2024-07-07 09:46:06 发布

啦啦啦涂涂

最新推荐文章于 2024-07-07 09:46:06 发布

阅读量598

点赞数

分类专栏： python爬虫文章标签：爬虫百度图片通用

本文链接：https://blog.csdn.net/weixin_42358470/article/details/83861767

版权

python爬虫专栏收录该内容

10 篇文章 9 订阅

订阅专栏

之前写的爬取百度斗图的图片，后来又爬取了蜡笔小新的图片

后来找到规律，干脆写为了通用的爬虫，可以直接输入任意想要下载的图片，进行下载

代码如下：（使用requests编写，可直接运行）额，没办法插入代码，只能把代码直接复制下来了

import json
import os
import requests
from urllib.parse import quote

pn = 30
# url='https://image.baidu.com/search/acjson?tn=resultjson_com&ipn=rj&ct=201326592&is=&fp=result&queryWord=%E7%BE%8E%E5%A5%B3&word=%E7%BE%8E%E5%A5%B3&cg=girl&pn=120&rn=30'
# url = 'https://image.baidu.com/search/acjson?tn=resultjson_com&ipn=rj&ct=201326592&fp=result&queryWord=%E5%AD%99%E5%85%81%E7%8F%A0&word=%E5%AD%99%E5%85%81%E7%8F%A0&pn={}&rn=30&gsm=5a&1537704741066='.format(pn)
# url='http://image.baidu.com/search/acjson?tn=resultjson_com&ipn=rj&ct=201326592&is=&fp=result&queryWord=%E5%AD%99%E5%85%81%E7%8F%A0&cl=2&lm=-1&ie=utf-8&oe=utf-8&adpicid=&st=&z=&ic=&word=%E5%AD%99%E5%85%81%E7%8F%A0&s=&se=&tab=&width=&height=&face=&istype=&qc=&nc=&fr=&expermode=&pn=90&rn=30&gsm=5a&1537704741066='
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36"
}

if not os.path.exists("./images/"):
    os.makedirs("./images/")
path = "./images/"
search_word = input('输入要搜索的图片名')
wd = quote(search_word)

for pn in range(11, 30):
    pn = (pn - 1) * 30
    # new_url = 'https://image.baidu.com/search/acjson?tn=resultjson_com&ipn=rj&ct=201326592&fp=result&queryWord=%E6%96%97%E5%9B%BE%E8%A1%A8%E6%83%85%E5%8C%85&word=%E6%96%97%E5%9B%BE%E8%A1%A8%E6%83%85%E5%8C%85&pn={}&rn=30&1537025889309='.format(pn)
    # new_url = 'https://image.baidu.com/search/acjson?tn=resultjson_com&ipn=rj&ct=201326592&is=&fp=result&queryWord=%E7%BE%8E%E5%A5%B3&word=%E7%BE%8E%E5%A5%B3&cg=girl&pn={}&rn=30'.format(pn)
    new_url = 'https://image.baidu.com/search/acjson?tn=resultjson_com&ipn=rj&ct=201326592&fp=result&word={}&pn={}&rn=30&gsm=5a&rn=30'.format(
        wd, pn)
    print("*************", new_url)
    try:
        response = requests.get(new_url, headers=headers, verify=False)
        content = response.text.encode('utf-8')
        # print(content)
        c = json.loads(content, encoding='utf-8')
        # print(c['data'])
        # print(c['data'][0]['thumbURL'])
        for item in c['data']:
            # print(item)
            print('-----------------')
            print(item.get('middleURL'))
            if item.get('middleURL', ''):
                # print(type(item['middleURL']))
                name = item['middleURL'].split('/')[-1]
                # print(name)
                res = requests.get(item['middleURL'])
                con = res.content
                # print(con)
                with open(path + str(name), "wb") as f:
                    # 注意写入的内容应该是byte类型
                    f.write(con)
            else:
                print('出错了')
    except:
        print('又出错了')

啦啦啦涂涂

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
爬虫，爬取任意百度图片下载到本地

之前写的爬取百度斗图的图片，后来又爬取了蜡笔小新的图片后来找到规律，干脆写为了通用的爬虫，可以直接输入任意想要下载的图片，进行下载代码如下：（使用requests编写，可直接运行）额，没办法插入代码，只能把代码直接复制下来了import jsonimport osimport requestsfrom urllib.parse import quotepn = 30# u...
复制链接

扫一扫

专栏目录