How to save an image locally using Python whose URL address I already know?

I know the URL of an image on the Internet.

How can I download this image using Python, without actually opening the URL in a browser and saving the file manually?

10 Answers:

===============>>#1 Votes: 268 (Accepted)

Python 2

If you just want to save it to a file, here is a more straightforward way:

import urllib
urllib.urlretrieve("http://www.digimouth.com/news/media/2011/09/google-logo.jpg", "local-filename.jpg")

The second argument is the local path where the file should be saved.

Python 3

As SergO suggested, the code below should work with Python 3:

import urllib.request
urllib.request.urlretrieve("http://www.digimouth.com/news/media/2011/09/google-logo.jpg", "local-filename.jpg")
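A minimal sketch, assuming a writable pictures/ directory already exists (the directory name is only an illustration): the second argument can also be a full path rather than a bare filename.

import os
import urllib.request

url = "http://www.digimouth.com/news/media/2011/09/google-logo.jpg"
# build the destination path inside an existing directory
target = os.path.join("pictures", "google-logo.jpg")
urllib.request.urlretrieve(url, target)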

===============>>#2 Votes: 26

import urllib

resource = urllib.urlopen("http://www.digimouth.com/news/media/2011/09/google-logo.jpg")
output = open("file01.jpg", "wb")
output.write(resource.read())
output.close()

file01.jpg will contain your image.
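A minor variation on the same idea, not from the original answer: using a with block closes the file automatically, even if the write fails partway through.

import urllib

resource = urllib.urlopen("http://www.digimouth.com/news/media/2011/09/google-logo.jpg")
with open("file01.jpg", "wb") as output:
    output.write(resource.read())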

===============>>#3 Votes: 16

I wrote a script that does exactly this, and it is available on my GitHub for your use.

I use BeautifulSoup so I can parse any website for its images. If you will be doing a lot of web scraping (or intend to use my tool), I suggest you run sudo pip install BeautifulSoup. Information on BeautifulSoup is available here.

For convenience, here is my code:

from bs4 import BeautifulSoup
from urllib2 import urlopen
import urllib

# use this image scraper from the location that
# you want to save scraped images to

def make_soup(url):
    html = urlopen(url).read()
    return BeautifulSoup(html)

def get_images(url):
    soup = make_soup(url)
    # this makes a list of bs4 element tags
    images = [img for img in soup.findAll('img')]
    print (str(len(images)) + " images found.")
    print 'Downloading images to current working directory.'
    # compile our unicode list of image links
    image_links = [each.get('src') for each in images]
    for each in image_links:
        filename = each.split('/')[-1]
        urllib.urlretrieve(each, filename)
    return image_links

# a standard call looks like this
# get_images('http://www.wookmark.com')

===============>>#4 Votes: 6

A solution that works with both Python 2 and Python 3:

try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2

url = "http://www.digimouth.com/news/media/2011/09/google-logo.jpg"
urlretrieve(url, "local-filename.jpg")

Or, if the additional dependency on requests is acceptable and it is an http(s) URL:

def load_requests(source_url, sink_path):
    """
    Load a file from an URL (e.g. http).

    Parameters
    ----------
    source_url : str
        Where to load the file from.
    sink_path : str
        Where the loaded file is stored.
    """
    import requests
    r = requests.get(source_url, stream=True)
    if r.status_code == 200:
        with open(sink_path, 'wb') as f:
            for chunk in r:
                f.write(chunk)
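A minimal usage sketch for load_requests (the URL and destination filename below are just examples, not from the original answer):

load_requests("https://apod.nasa.gov/apod/image/1701/potw1636aN159_HST_2048.jpg",
              "potw1636a.jpg")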

===============>>#5 Votes: 5

I made a script expanding on Yup.'s script. I fixed a few things. It will now bypass 403: Forbidden problems. It doesn't crash when an image fails to be retrieved. It tries to avoid corrupted previews. It gets the correct absolute URLs. It prints more information. It can be run with an argument from the command line.

# getem.py
# python2 script to download all images in a given url
# use: python getem.py http://url.where.images.are

from bs4 import BeautifulSoup
import urllib2
import shutil
import requests
from urlparse import urljoin
import sys
import time

def make_soup(url):
    req = urllib2.Request(url, headers={'User-Agent' : "Magic Browser"})
    html = urllib2.urlopen(req)
    return BeautifulSoup(html, 'html.parser')

def get_images(url):
    soup = make_soup(url)
    images = [img for img in soup.findAll('img')]
    print (str(len(images)) + " images found.")
    print 'Downloading images to current working directory.'
    image_links = [each.get('src') for each in images]
    for each in image_links:
        try:
            filename = each.strip().split('/')[-1].strip()
            src = urljoin(url, each)
            print 'Getting: ' + filename
            response = requests.get(src, stream=True)
            # delay to avoid corrupted previews
            time.sleep(1)
            with open(filename, 'wb') as out_file:
                shutil.copyfileobj(response.raw, out_file)
        except:
            print '  An error occurred. Continuing.'
    print 'Done.'

if __name__ == '__main__':
    url = sys.argv[1]
    get_images(url)

===============>>#6 Votes: 5

Python 3

from urllib.error import HTTPError
from urllib.request import urlretrieve

try:
    urlretrieve(image_url, image_local_path)
except FileNotFoundError as err:
    print(err)   # something wrong with local path
except HTTPError as err:
    print(err)   # something wrong with url
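The names image_url and image_local_path are not defined in the answer itself; plausible values might look like this (examples only, not part of the original answer):

image_url = "http://www.digimouth.com/news/media/2011/09/google-logo.jpg"  # example source URL
image_local_path = "local-filename.jpg"                                    # example destination path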

===============>>#7 Votes: 5

This can be done with requests. Load the page and dump the binary content to a file.

import os
import requests

url = 'https://apod.nasa.gov/apod/image/1701/potw1636aN159_HST_2048.jpg'
page = requests.get(url)

f_ext = os.path.splitext(url)[-1]
f_name = 'img{}'.format(f_ext)
with open(f_name, 'wb') as f:
    f.write(page.content)
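A small variation, not part of the original answer, that raises an exception on HTTP errors instead of silently saving an error page to disk:

import os
import requests

url = 'https://apod.nasa.gov/apod/image/1701/potw1636aN159_HST_2048.jpg'
page = requests.get(url)
page.raise_for_status()  # raises requests.HTTPError for 4xx/5xx responses

f_name = 'img{}'.format(os.path.splitext(url)[-1])
with open(f_name, 'wb') as f:
    f.write(page.content)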

===============>>#8 Votes: 2

Version for Python 3

I adapted @madprops' code for Python 3:

# getem.py
# python3 script to download all images in a given url
# use: python getem.py http://url.where.images.are

from bs4 import BeautifulSoup
import urllib.request
import shutil
import requests
from urllib.parse import urljoin
import sys
import time

def make_soup(url):
    req = urllib.request.Request(url, headers={'User-Agent' : "Magic Browser"})
    html = urllib.request.urlopen(req)
    return BeautifulSoup(html, 'html.parser')

def get_images(url):
    soup = make_soup(url)
    images = [img for img in soup.findAll('img')]
    print(str(len(images)) + " images found.")
    print('Downloading images to current working directory.')
    image_links = [each.get('src') for each in images]
    for each in image_links:
        try:
            filename = each.strip().split('/')[-1].strip()
            src = urljoin(url, each)
            print('Getting: ' + filename)
            response = requests.get(src, stream=True)
            # delay to avoid corrupted previews
            time.sleep(1)
            with open(filename, 'wb') as out_file:
                shutil.copyfileobj(response.raw, out_file)
        except:
            print('  An error occurred. Continuing.')
    print('Done.')

if __name__ == '__main__':
    get_images('http://www.wookmark.com')
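The Python 3 port hard-codes the example URL even though it still imports sys; a minimal sketch (an assumption, not part of the original answer) of how the command-line behaviour of the Python 2 version could be restored:

if __name__ == '__main__':
    # expects: python getem.py http://url.where.images.are
    get_images(sys.argv[1])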

===============>>#9 Votes: 1

This is a very short answer:

import urllib
urllib.urlretrieve("http://photogallery.sandesh.com/Picture.aspx?AlubumId=422040", "Abc.jpg")

===============>>#10 Votes: -1

import requests

img_data = requests.get('https://apod.nasa.gov/apod/image/1701/potw1636aN159_HST_2048.jpg').content
with open('file_name.jpg', 'wb') as handler:
    handler.write(img_data)
