urllib and urllib2 in Python

felix_yujing

Article link: https://blog.csdn.net/felix_yujing/article/details/50981895

Python 2 has two modules, urllib and urllib2, but urllib2 is not a complete replacement for urllib.

The official documentation describes them as follows:

urllib (https://docs.python.org/2/library/urllib.html)

open arbitrary resources by URL. This module provides a high-level interface for fetching data across the World Wide Web. It can only open URLs for reading, and no seek operations are available.

urllib2 (https://docs.python.org/2/library/urllib2.html)

extensible library for opening URLs. The urllib2 module defines functions and classes which help in opening URLs (mostly HTTP) in a complex world - basic and digest authentication, redirections, cookies and more.

These descriptions alone do not make the difference clear. In short, each module has features the other lacks, so urllib and urllib2 are usually used together.

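- urllib provides the urlencode function, which builds URL query strings; urllib2 has no such function.

- urllib2.urlopen accepts a Request object, which lets you set the headers of a URL request, whereas urllib.urlopen takes only a URL string. This makes it possible to disguise a request, for example by changing its User-agent.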
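The two points above can be tried out without sending anything over the network. The following is only a minimal sketch: the parameter values and the `demo-client/0.1` User-agent string are made up for illustration, and the guarded import is there so the same snippet also runs on Python 3, where these names live in `urllib.parse` and `urllib.request`.

```python
try:
    from urllib import urlencode            # Python 2
    from urllib2 import Request
except ImportError:
    from urllib.parse import urlencode      # Python 3
    from urllib.request import Request

# Hypothetical parameters; a list of pairs keeps the output order fixed
params = [('name1', 'value1'), ('q', 'hello world')]

# urlencode escapes the values and joins the pairs with '&'
querystring = urlencode(params)
print(querystring)  # name1=value1&q=hello+world

# Build (but do not send) a request that carries a custom header
request = Request('http://httpbin.org/get?' + querystring,
                  headers={'User-agent': 'demo-client/0.1'})
print(request.get_header('User-agent'))  # demo-client/0.1
print(request.get_full_url())
```

Nothing is transmitted until the Request object is passed to `urlopen`, so this is a convenient way to check the query string and headers first.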

Here is an example (adapted from the Python Cookbook):

# coding: utf-8

import urllib
import urllib2

url = 'http://httpbin.org/post'

# Query parameters
parms = {
    'name1': 'value1',
    'name2': 'value2'
}
# Custom headers
headers = {
    'User-agent': 'felix /shanghai',
    'Spam': 'Eggs'
}

# Encode the query parameters
querystring = urllib.urlencode(parms)

# Send a POST request with urllib and read the response
# (urlopen returns a file-like response object)
request1 = urllib.urlopen(url, querystring.encode('ascii'))
response1 = request1.read()

# Build a request with modified headers, send it with urllib2, and read the response
request2 = urllib2.Request(url, querystring.encode('ascii'), headers=headers)
response2 = urllib2.urlopen(request2).read()

print response1
print '============================='
print response2
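
Both calls in the example send a POST because a data argument is supplied. The sketch below shows this on Request objects that are built but never sent, so it needs no network access. It reuses the URL and the `Spam: Eggs` header from the example; the guarded import is an addition so that the snippet also runs on Python 3.

```python
try:
    from urllib import urlencode            # Python 2
    from urllib2 import Request
except ImportError:
    from urllib.parse import urlencode      # Python 3
    from urllib.request import Request

url = 'http://httpbin.org/post'
body = urlencode([('name1', 'value1'), ('name2', 'value2')]).encode('ascii')

# Without data the request defaults to GET; with data it becomes POST
get_request = Request(url)
post_request = Request(url, body, headers={'Spam': 'Eggs'})

print(get_request.get_method())    # GET
print(post_request.get_method())   # POST
print(post_request.header_items())
```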