The 《追火车的猫》 ("The Cat Chasing the Train") memes are popular, so let's scrape them with a Python crawler to use as stickers

倒吃甘蔗

Published 2020-03-20 19:58:31


Categories: crawler, python

Article link: https://blog.csdn.net/qq_38190111/article/details/104997346


Memes like this one. I happened to come across a web page that collects them:

https://mp.weixin.qq.com/s?__biz=MzA5MTY0NTYyOQ

Let's scrape it.

# -*- coding:utf-8 -*-
import os
import time

import requests
from bs4 import BeautifulSoup
from selenium import webdriver

url = input("Enter the URL of the page to scrape: ")
FILENAME = input("Enter the name of the output folder (ASCII characters only!): ")
browser = webdriver.Chrome()
browser.get(url)
time.sleep(5)
# At this stage, scroll the page by hand
# while cnt>0:
#     go_scroll(num,browser)
#     cnt = cnt-1
#     time.sleep(0.5)
html = browser.page_source
soup = BeautifulSoup(html, 'lxml')
# Collect every <img> tag with the class "__bg_gif"
images = soup.find_all('img', {'class': '__bg_gif'})

#figures = soup.find_all('imgitem')
#print(figures)
root = r'D:\crawl_download'
if not os.path.exists(root):
    os.mkdir(root)
    print("root created!" + root)
src = ""
path2 = os.path.join(root, FILENAME)
if not os.path.exists(path2):
    os.mkdir(path2)
    print("Images will be saved to " + path2)
cnt = 1
for item in images:
    try:
        # The image URL is read from the data-src attribute
        src = item['data-src']
        pic = requests.get(src).content
        print(src)
        cnt = cnt + 1
        # A slice of the URL serves as the file name
        with open(os.path.join(path2, src[-50:-30]) + '.gif', 'wb') as f:
            f.write(pic)
    except Exception as e:
        # Print the actual error and move on to the next image
        print(repr(e))
        continue
    print('download successful')
browser.close()
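
The commented-out loop in the script calls a `go_scroll(num, browser)` helper that the post never shows. Here is a minimal sketch of what it might look like, assuming it scrolls the page down by a fixed number of pixels through Selenium's `execute_script`. The body of `go_scroll`, the `scroll_page` wrapper, and the default values are guesses for illustration, not the author's code.

```python
import time


def go_scroll(step, browser):
    """Scroll the page down by `step` pixels (hypothetical helper)."""
    browser.execute_script("window.scrollBy(0, arguments[0]);", step)


def scroll_page(browser, step=800, times=20, pause=0.5):
    """Call go_scroll `times` times, pausing after each step.

    Returns the total number of pixels requested.
    """
    for _ in range(times):
        go_scroll(step, browser)
        time.sleep(pause)  # same 0.5 s pause as the commented-out loop
    return step * times
```

If used, `scroll_page(browser)` would go right before `html = browser.page_source`, replacing the manual scrolling. Whether the target page needs scrolling at all depends on how it loads its images.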

Running the script

Result:

That's it.
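
One fragile spot in the script is the file name, which is the URL slice `src[-50:-30]`. That slice can contain characters such as `/` or `?` that are not valid in file names, and it is empty for short URLs. A possible alternative, sketched here under the assumption that every image should still be saved as `.gif`, is to combine the script's otherwise unused counter with a short hash of the URL. The function name `gif_filename` is made up for this example.

```python
import hashlib


def gif_filename(src, index):
    """Build a file name such as '003_1a2b3c4d.gif' from an image URL.

    `index` is a running counter (like `cnt` in the script); the hash
    keeps names distinct and free of path characters.
    """
    digest = hashlib.sha256(src.encode("utf-8")).hexdigest()[:8]
    return f"{index:03d}_{digest}.gif"
```

In the download loop this would replace `src[-50:-30] + '.gif'` with `gif_filename(src, cnt)`.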
