Scraping Images from Baidu, Bing, Sogou, and Google with Python

Latest recommended article published on 2022-07-04 12:31:37

u014295536


Category: Beginner notes. Tags: Python, web scraping, image scraping, Baidu, Bing

Article link: https://blog.csdn.net/u014295536/article/details/100658999


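The underlying principle is the same for all four sites. Only the page structure differs from site to site, so each script is a lightly modified copy of the others.

A single program that handles every site would probably be better, but I needed this in a hurry and never went back to optimize it, so I am sharing it as is.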
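As a rough illustration of the "same principle, different structure" idea, the per-site difference can be reduced to a lookup table of extraction patterns. This is only a sketch: the names `SITE_PATTERNS` and `extract_image_urls` are hypothetical, the sample string is made up, and the only pattern shown is the one the Baidu script in this post uses. Patterns for the other engines are not given here and would have to be worked out by inspecting their responses.

```python
import re

# Only the Baidu pattern comes from this post. Entries for other engines
# would be added after inspecting what their result pages actually return.
SITE_PATTERNS = {
    "baidu": r'"objURL":"(.*?)",',
}


def extract_image_urls(site, page_text):
    """Return every image URL that the site's pattern finds in page_text."""
    pattern = SITE_PATTERNS[site]
    return re.findall(pattern, page_text, re.S)


# A made-up fragment shaped like the JSON the Baidu pattern expects.
sample = (
    '{"objURL":"http://example.com/a.jpg","x":1},'
    '{"objURL":"http://example.com/b.png","y":2}'
)
print(extract_image_urls("baidu", sample))
```

With a table like this, the paging and download logic could stay shared while each site contributes only its own pattern.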

Baidu

import re
import requests
from urllib import error
from bs4 import BeautifulSoup
import os

num = 0
numPicture = 0
file = ''
List = []  # per-page lists of image URLs collected by Find()


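def Find(url):
    """Count the images available for a query and collect their URLs page by page."""
    global List
    # print('Counting the total number of images, please wait.....')
    t = 0  # result offset appended to the URL, advanced by 60 per request
    s = 0  # running total of image URLs found
    while t < 2000:
        Url = url + str(t)
        try:
            Result = requests.get(Url, timeout=7)
        except requests.RequestException:
            # Skip this page on a network error and move on to the next offset.
            t = t + 60
            continue
        result = Result.text
        # Use a regular expression to pull the image URLs out of the response.
        pic_url = re.findall('"objURL":"(.*?)",', result, re.S)
        if len(pic_url) == 0:
            break
        s += len(pic_url)
        List.append(pic_url)  # one list of URLs per page
        t = t + 60
    return s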


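def recommend(url):
    """Collect the link texts found inside the page's 'topRS' div."""
    Re = []
    try:
        html = requests.get(url, timeout=7)
    except requests.RequestException:
        # requests raises its own exception types, not urllib's HTTPError.
        return
    else:
        html.encoding = 'utf-8'
        bsObj = BeautifulSoup(html.text, 'html.parser')
        div = bsObj.find('div', id='topRS')
        if div is not None:
            listA = div.find_all('a')
            for i in listA:
                if i is not None:
                    # The source is cut off after "R" at this point. Appending
                    # the link text to Re and returning it is an assumed completion.
                    Re.append(i.get_text())
        return Re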
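The rest of the script is not available here, so how the collected URLs are saved is unknown. `Find` leaves `List` as a list of per-page URL lists, and one plausible next step is to flatten it and write each image to a folder. The sketch below is an assumption, not the author's code: the helper names `flatten_unique` and `download_all`, the `.jpg` file extension, and the choice of the standard library's `urllib.request` for downloading are all hypothetical.

```python
import http.client
import os
import urllib.request
from itertools import chain


def flatten_unique(pages, limit=None):
    """Flatten per-page URL lists into one list, dropping duplicates in order."""
    seen = set()
    urls = []
    for url in chain.from_iterable(pages):
        if limit is not None and len(urls) >= limit:
            break
        if url in seen:
            continue
        seen.add(url)
        urls.append(url)
    return urls


def download_all(urls, folder, timeout=7):
    """Save each URL as folder/<index>.jpg and return how many were saved."""
    os.makedirs(folder, exist_ok=True)
    saved = 0
    for index, url in enumerate(urls):
        # The .jpg extension is an assumption; real results may be other formats.
        target = os.path.join(folder, f"{index}.jpg")
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                data = response.read()
        except (OSError, ValueError, http.client.HTTPException):
            # Covers URLError/HTTPError, timeouts, and malformed URLs.
            continue
        with open(target, "wb") as handle:
            handle.write(data)
        saved += 1
    return saved
```

A call such as `download_all(flatten_unique(List, limit=100), 'images')` would then save at most 100 distinct images, skipping any link that fails.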
