httplib2

weixin_33698043

Original link: http://blog.51cto.com/magicpwn/1731455

Simple Retrieval

import httplib2  
h = httplib2.Http(".cache")  
resp, content = h.request("http://example.org/","GET")

Authentication

import httplib2  
h = httplib2.Http(".cache")  
h.add_credentials('name', 'password')  
resp, content = h.request("https://example.org/chap/2",  # SSL + Basic authentication
    "PUT", body="This is text",
    headers={'content-type': 'text/plain'})
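
For context, HTTP Basic authentication sends the credentials as a base64-encoded name:password pair in the Authorization header. The sketch below builds that header value by hand with the standard library only, to show what travels over the wire. basic_auth_header is a hypothetical helper written for illustration; it is not part of httplib2, and this is not a description of how httplib2 implements it internally.

```python
import base64


def basic_auth_header(name, password):
    """Return an Authorization header value for HTTP Basic authentication.

    Hypothetical helper for illustration; not an httplib2 function.
    """
    # The credentials are joined with a colon and then base64-encoded.
    token = base64.b64encode(f"{name}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


print(basic_auth_header("name", "password"))
```

Base64 is an encoding, not encryption, which is why the example above pairs Basic authentication with an https URL.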

Cache-Control

import httplib2  
h = httplib2.Http(".cache")  
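# The response is cached; the next request reuses the cached copy instead of sending a new request.
# How long the cached copy stays valid is determined by the web server's configuration.
resp, content = h.request("http://bitworking.org/")
...
# Setting no-cache bypasses the cache for this one request and sends a fresh request instead.
resp, content = h.request("http://bitworking.org/",
    headers={'cache-control': 'no-cache'})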
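
The server controls the cache lifetime through response headers such as Cache-Control. As a rough sketch of what those directives look like once split apart, the helper below parses a Cache-Control value into a dict. parse_cache_control is a hypothetical name used here for illustration, the sample header value is made up, and the parsing is deliberately simplified (it does not handle quoted values).

```python
def parse_cache_control(header_value):
    """Split a Cache-Control header value into a dict of directives.

    Hypothetical, simplified helper; quoted values are not handled.
    """
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        if "=" in part:
            # Directives such as max-age=3600 carry a value.
            name, value = part.split("=", 1)
            directives[name.strip().lower()] = value.strip()
        else:
            # Directives such as no-cache are simple flags.
            directives[part.lower()] = True
    return directives


server_policy = parse_cache_control("public, max-age=3600")
print(server_policy.get("max-age"))
```

A response carrying max-age=3600 may be served from the cache for up to an hour, whereas sending no-cache on the request, as above, forces a fresh round trip.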

Forms

>>> from httplib2 import Http  
>>> from urllib.parse import urlencode
>>> h = Http()  
>>> data = dict(name="Joe", comment="A test comment")  
>>> resp, content = h.request("http://bitworking.org/news/223/MeetAres", "POST", urlencode(data))  
>>> resp
{'status': '200', 'transfer-encoding': 'chunked', 'vary': 'Accept-Encoding,User-Agent',
 'server': 'Apache', 'connection': 'close', 'date': 'Tue, 31 Jul 2007 15:29:52 GMT',
 'content-type': 'text/html'}
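
To see what is actually sent as the POST body, it can help to print the encoded form data on its own. The sketch below assumes Python 3, where urlencode lives in urllib.parse, and reuses the same form fields. The headers dict is an addition: servers generally expect this content type for form posts, and the Cookies example uses the same header.

```python
from urllib.parse import urlencode

# The same form fields as in the example above.
data = dict(name="Joe", comment="A test comment")

# urlencode joins the fields with "&" and encodes spaces as "+".
body = urlencode(data)
print(body)  # name=Joe&comment=A+test+comment

# Content type that form-handling servers usually expect for such a body.
headers = {"Content-type": "application/x-www-form-urlencoded"}
```

Both body and headers can then be passed to h.request(..., "POST", body=body, headers=headers).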

Cookies

import urllib.parse
import httplib2

http = httplib2.Http()

url = 'http://www.example.com/login'
body = {'USERNAME': 'foo', 'PASSWORD': 'bar'}
headers = {'Content-type': 'application/x-www-form-urlencoded'}
response, content = http.request(url, 'POST', headers=headers, body=urllib.parse.urlencode(body))

# Put the cookie we received into the request headers, ready for the next request
headers = {'Cookie': response['set-cookie']}

url = 'http://www.example.com/home'
# This request no longer needs to carry the username and password
response, content = http.request(url, 'GET', headers=headers)
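
A Set-Cookie value usually carries attributes such as Path or HttpOnly in addition to the name=value pair, and only the pair belongs in the Cookie request header. The sketch below, which assumes Python 3's http.cookies module, strips those attributes. cookie_header_from_set_cookie is a hypothetical helper and the sample header value is invented; it handles a single Set-Cookie value, not several merged into one string.

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_value):
    """Build a Cookie header value from one Set-Cookie header value.

    Hypothetical helper for illustration; not part of httplib2.
    """
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    # Keep only name=value; attributes like Path or HttpOnly are not sent back.
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


print(cookie_header_from_set_cookie("sessionid=abc123; Path=/; HttpOnly"))
```

The result could be used in place of the raw response['set-cookie'] value when building the headers for the follow-up request.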

Proxies

import httplib2  
import socks      
httplib2.debuglevel = 4
h = httplib2.Http(proxy_info=httplib2.ProxyInfo(socks.PROXY_TYPE_HTTP, 'localhost', 8000))
r, c = h.request("...")

======================================================================================

Below are my own experiments with the module's features:

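Constructor of the Http object:
__init__(self, cache=None, timeout=None, proxy_info=None, ca_certs=None, disable_ssl_certificate_validation=False)
|
| proxy_info:
| A ProxyInfo instance.
|
| cache:
| Where the cache is stored: either a string or an object that supports the file cache interface.
|
| timeout:
| The timeout; when not given, Python's default socket timeout is used.
|
| ca_certs:
| Path to a file containing the root CA certificates used to validate SSL servers; by default the certificates bundled with httplib2 are used.
|
| disable_ssl_certificate_validation:
| Determines whether SSL certificate validation is performed; set it to True to skip validation.
|
| add_certificate(self, key, cert, domain)
| Adds a key file and a certificate file for SSL authentication.
|
| add_credentials(self, name, password, domain='')
| Adds a username and password.
|
| clear_credentials(self)
| Removes all stored usernames and passwords, which suggests that more than one set of credentials can be stored.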
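Http.request(self, uri, method='GET', body=None, headers=None, redirections=5, connection_type=None)
Description:
Performs a single HTTP request.
uri:
The resource locator, a string beginning with 'http' or 'https'; it must be an absolute address.
method:
All HTTP request methods are supported, e.g. GET, POST, DELETE, etc.
body:
The data sent with the request; for form data, a string encoded with urlencode.
headers:
The request headers, as a dictionary.
redirections:
The maximum number of consecutive automatic redirects; the default is 5.
Returns:
A (response, content) tuple. response is an httplib2.Response object, and content is the response body, such as the page source, as a string (bytes on Python 3).
httplib2.Response object
It is essentially a dictionary holding all the response headers, because the class itself inherits from dict.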
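
Because the response behaves like a dictionary, ordinary dict methods such as get() work on it. The sketch below summarizes a (response, content) pair that way. summarize_response is a hypothetical helper, and a plain dict stands in for httplib2.Response so that the snippet runs without network access.

```python
def summarize_response(response, content):
    """Return a short summary of a (response, content) pair.

    Hypothetical helper; works with any dict-like response object.
    """
    # Header names are looked up like ordinary dictionary keys.
    status = response.get("status", "?")
    content_type = response.get("content-type", "unknown")
    return f"{status} {content_type} ({len(content)} bytes)"


# A plain dict stands in for httplib2.Response here (assumption for illustration).
fake_response = {"status": "200", "content-type": "text/html"}
print(summarize_response(fake_response, b"<html></html>"))
```

With a real request, the same function could be called as summarize_response(*h.request(uri)).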

===========================================================================================

import httplib2
 
# First we access an https URL, using an Http() that skips certificate validation: SSL certificate validation is switched off at construction time with disable_ssl_certificate_validation=True

h = httplib2.Http(disable_ssl_certificate_validation=True)
d,c = h.request('https://ebank.xxxxx.com/pweb/test.do?actionType=1')
#header
print(d)
#content
print(c)
 
 
# Accessing a plain http page works the same way as above
h = httplib2.Http()
d,c = h.request("http://www.xxxx.com/")
#header
print(d)
#content
print(c)
 
# Of course, SSL certificate authentication is also possible
#h = httplib2.Http(proxy_info = httplib2.ProxyInfo(socks.PROXY_TYPE_SOCKS5, self.px_url, self.proxy_port))
#h.add_certificate(self.certificate.ikeyfile, self.certificate.certfile, self.url)
#resp, content = h.request("https://"+self.url+":"+str(self.remote_port)+self.path+query)
 
 
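# With a .cache directory
h2 = httplib2.Http('.cache')
resp2, content2 = h2.request('http://www.baidu.com/')
print(resp2)
print(content2)
# The ".cache" directory now contains a file for the page just requested, "www.baidu.com,,f03f5717616221de41881be555473a02". It is the cache file for baidu.com; opening it in a text editor shows that it holds the content along with the HTTP response headers.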
 
 
# With .cache plus username/password authentication over SSL, more or less a combination of the two cases above
h3 = httplib2.Http(".cache")
h3.add_credentials('name', 'password')
resp3, content3 = h3.request("https://www.google.com",
    "GET", headers={'content-type': 'text/plain'})
print(resp3)
print(content3)