urlgrabber: A high-level cross-protocol url-grabber

urlgrabber is a pure Python package that drastically simplifies
the fetching of files. It is designed to be used in programs
that need common (but not necessarily simple) URL-fetching
features. It is extremely simple to drop into an existing
program and provides a clean interface to protocol-independent
file access. Best of all, urlgrabber takes care of all those
pesky file-fetching details, and lets you focus on whatever it
is that your program is written to do!

urlgrabber came into existence as the part of yum that
downloads rpms and header files, but it quickly became clear
that this is a general problem that many applications must deal
with.

Features

Using urlgrabber, data can be fetched in three basic ways:

  urlgrab(url)   copy the file to the local filesystem
  urlopen(url)   open the remote file and return a file object
  urlread(url)   return the contents of the file as a string
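
The three access styles can be illustrated with a minimal sketch. To stay runnable without urlgrabber installed, it uses only the Python 3 standard library, so the sketch_* helpers below are hypothetical stand-ins for illustration, not urlgrabber's own implementation, and they omit every extra feature (retries, progress meters, throttling). Note also that the read helper here returns bytes.

```python
import os
import shutil
import tempfile
from pathlib import Path
from urllib.request import urlopen as _stdlib_urlopen


def sketch_urlopen(url):
    """Open the URL and return a file-like object (the urlopen style)."""
    return _stdlib_urlopen(url)


def sketch_urlread(url):
    """Return the whole body of the URL in memory (the urlread style)."""
    with _stdlib_urlopen(url) as response:
        return response.read()


def sketch_urlgrab(url, filename):
    """Copy the URL's contents to a local file (the urlgrab style)."""
    with _stdlib_urlopen(url) as response, open(filename, "wb") as out:
        shutil.copyfileobj(response, out)
    return filename


# Demonstrate on a file:// URL so the example needs no network access.
workdir = tempfile.mkdtemp()
source = Path(workdir) / "source.txt"
source.write_bytes(b"hello from the urlgrabber sketch\n")
url = source.as_uri()

copied_path = sketch_urlgrab(url, os.path.join(workdir, "copy.txt"))
with sketch_urlopen(url) as handle:
    first_line = handle.readline()
contents = sketch_urlread(url)
```

The same three calls would work unchanged for an http:// or ftp:// URL, which is the kind of protocol independence described above.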

When using these functions (or methods), urlgrabber supports the
following features:

  • identical behavior for http://, ftp://, and file:// urls
  • http keepalive - faster downloads of many files by using
    only a single connection
  • byte ranges - fetch only a portion of the file
  • reget - for a urlgrab, resume a partial download
  • progress meters - the ability to report download progress
    automatically, even when using urlopen!
  • throttling - restrict bandwidth usage
  • batched downloads using threads - download multiple files
    simultaneously (feature still in progress)
  • retries - automatically retry a download if it fails. Both
    the number of retries and the failure types that trigger a
    retry are configurable
  • authenticated server access for http and ftp
  • proxy support - support for authenticated http and ftp
    proxies
  • mirror groups - treat a list of mirrors as a single
    source, automatically switching mirrors if there is a
    failure
  • broad support - Unix and Windows, Python 2.3 - 2.5 (it
    currently works on and is tested against 2.2 as well, but
    that support will be dropped if it becomes difficult to keep)
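
To make the retries and mirror groups items more concrete, here is a rough sketch of the general idea: try each mirror in order, retry a failed fetch a few times, then move on to the next mirror. The function fetch_from_mirrors, its parameters, and the example.* hostnames are all assumptions made for illustration. This is not urlgrabber's mirror-group code, and the fetcher is passed in as a plain callable so the sketch runs without a network.

```python
def fetch_from_mirrors(mirrors, relative_path, fetch, retries_per_mirror=2):
    """Try each mirror in order, retrying a failed fetch before moving on.

    `fetch` is any callable that takes a full URL and either returns the
    data or raises OSError. Returns (url_that_worked, data).
    """
    errors = []
    for base in mirrors:
        url = base.rstrip("/") + "/" + relative_path.lstrip("/")
        for attempt in range(1, retries_per_mirror + 1):
            try:
                return url, fetch(url)
            except OSError as exc:
                # Remember the failure, then retry or fall through
                # to the next mirror once the retries are used up.
                errors.append((url, attempt, str(exc)))
    raise OSError("all mirrors failed: %r" % (errors,))


# A fake fetcher (hypothetical) that simulates one dead mirror.
attempted_urls = []


def fake_fetch(url):
    attempted_urls.append(url)
    if url.startswith("http://mirror-a.example/"):
        raise OSError("connection refused")
    return b"payload"


used_url, data = fetch_from_mirrors(
    ["http://mirror-a.example/pub", "http://mirror-b.example/pub"],
    "repo/file.rpm",
    fake_fetch,
)
```

In this run the first mirror is tried twice and fails both times, after which the second mirror supplies the data. The caller only sees a single successful result, which is the sense in which a list of mirrors behaves like one source.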

Not sure if urlgrabber is the tool for you? Check out our comparison of the major options.

Documentation, Examples, and Help

There are many sources of urlgrabber-related assistance and
information:

  • The urlgrabber package documentation, built from the
    __doc__ strings using pydoc
  • The examples page
  • Browsable urlgrabber git repo
  • The yum-devel
    mailing list. For now, urlgrabber is piggy-backing on this
    list. If it becomes necessary, we will get our own list.
    When posting to this list, please indicate that it is a
    urlgrabber-related post by beginning the subject with
    [UG].

Posted on 2012-04-08 09:45 by lexus

Reposted from: https://www.cnblogs.com/lexus/archive/2012/04/08/2437325.html