Scrape Images with wget

The desire to download all images or video on a page has been around since the beginning of the internet. Twenty years ago I would accomplish this task with a Python script I downloaded. I then moved on to browser extensions, and later started using a PhearJS Node.js JavaScript utility to scrape images. All of these solutions are nice, but I wanted to know how I could accomplish this task from the command line.

To scrape images (or files with any specific extensions) from the command line, you can use wget:


wget -nd -H -p -A jpg,jpeg,png,gif -e robots=off http://boards.4chan.org/sp/


The command above downloads images across hosts (i.e. from a CDN or another subdomain) into the directory from which the command is run. You'll see the downloaded media as they come down:


Reusing existing connection to s.4cdn.org:80.
HTTP request sent, awaiting response... 200 OK
Length: 1505 (1.5K) [image/jpeg]
Saving to: '1490571194319s.jpg'

1490571194319s.jpg 100%[=====================>] 1.47K --.-KB/s in 0s

2017-03-26 18:33:26 (205 MB/s) - '1490571194319s.jpg' saved [1505/1505]

FINISHED --2017-03-26 18:33:26--
Total wall clock time: 2.7s
Downloaded: 66 files, 412K in 0.2s (2.10 MB/s)
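
The short flags in the wget command above can be hard to remember, so here is a minimal sketch of the same call spelled out with GNU wget's long options and wrapped in a small shell function. The function name `scrape_images`, its two arguments, and the `--directory-prefix` option (which saves files somewhere other than the current directory) are my own additions for illustration and are not part of the original command.

```shell
#!/bin/sh
# scrape_images URL [DIR]
# Hypothetical helper (not from the original post): downloads the images
# referenced by URL into DIR, which defaults to the current directory.
scrape_images() {
    url="$1"
    dest="${2:-.}"

    # --no-directories   (-nd) save every file flat, without host/path folders
    # --span-hosts       (-H)  follow requisites hosted on other domains (CDNs)
    # --page-requisites  (-p)  fetch the files the page needs to render
    # --accept           (-A)  keep only files with these extensions
    # --execute          (-e)  run a .wgetrc-style command; here, ignore robots.txt
    # --directory-prefix (-P)  assumption: added so files land in DIR
    wget --no-directories \
         --span-hosts \
         --page-requisites \
         --accept jpg,jpeg,png,gif \
         --execute robots=off \
         --directory-prefix "$dest" \
         "$url"
}
```

Calling `scrape_images http://boards.4chan.org/sp/ ./images` should behave like the one-liner, except that the files are written to `./images`. Changing the `--accept` list (for example to `webm,mp4`) targets other file types in the same way.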


Everyone loves cURL, which is another awesome resource, but don't forget about wget, which is arguably easier to use!

Translated from: https://davidwalsh.name/scrape-images-wget
