Note: this answer is outdated. Newer versions of requests support getting the request content directly, as AntonioHerraizS's answer documents.
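It is not possible to get the true raw content of a request out of requests, because it only deals with higher-level objects such as headers and method types. requests uses urllib3 to send requests, but urllib3 does not deal with raw data either; it uses httplib. Here is a representative stack trace of a request: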
-> r= requests.get("http://google.com")
/usr/local/lib/python2.7/dist-packages/requests/api.py(55)get()
-> return request('get', url, **kwargs)
/usr/local/lib/python2.7/dist-packages/requests/api.py(44)request()
-> return session.request(method=method, url=url, **kwargs)
/usr/local/lib/python2.7/dist-packages/requests/sessions.py(382)request()
-> resp = self.send(prep, **send_kwargs)
/usr/local/lib/python2.7/dist-packages/requests/sessions.py(485)send()
-> r = adapter.send(request, **kwargs)
/usr/local/lib/python2.7/dist-packages/requests/adapters.py(324)send()
-> timeout=timeout
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py(478)urlopen()
-> body=body, headers=headers)
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py(285)_make_request()
-> conn.request(method, url, **httplib_request_kw)
/usr/lib/python2.7/httplib.py(958)request()
-> self._send_request(method, url, body, headers)
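Inside the httplib machinery, we can see that HTTPConnection._send_request indirectly uses HTTPConnection._send_output, which finally builds the raw request and the body (if there is one) and uses HTTPConnection.send to send them separately. send is what finally reaches the socket.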
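To make that call chain concrete, here is a minimal sketch for Python 3, where httplib was renamed http.client. It is not part of the original answer. It swaps in a hypothetical RecordingSocket so that nothing goes over the network, and it assumes that pre-setting conn.sock makes send() skip connect(), which is an implementation detail that may differ between versions. The host example.invalid and the path /submit are placeholders.

```python
import http.client


class RecordingSocket:
    """Hypothetical stand-in for a real socket that records every write."""

    def __init__(self):
        self.chunks = []

    def sendall(self, data):
        # HTTPConnection.send() hands the bytes to sock.sendall().
        self.chunks.append(bytes(data))

    def close(self):
        pass


def record_request_chunks():
    # "example.invalid" is a placeholder; nothing is resolved or connected.
    conn = http.client.HTTPConnection("example.invalid")
    fake_socket = RecordingSocket()
    # Assumption: with conn.sock already set, send() does not call connect().
    conn.sock = fake_socket
    conn.request("POST", "/submit", body=b"a=1")
    return fake_socket.chunks


if __name__ == "__main__":
    for number, chunk in enumerate(record_request_chunks(), start=1):
        print(number, repr(chunk))
```

On the CPython 3 releases I am aware of, the request line and headers arrive in one chunk and the body in a second one, which matches the "sends them separately" behaviour described above. Treat that as an observation about current internals, not a documented guarantee.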
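Since there are no hooks for doing what you want, as a last resort you can monkey patch httplib to get the content. It is a fragile solution, and you may need to adapt it if httplib changes. If you intend to distribute software that uses this solution, you may want to consider packaging httplib instead of relying on the system's copy, which is easy because it is a pure Python module.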
Alas, without further ado, the solution:
# Python 2 code: httplib was renamed http.client in Python 3.
import requests
import httplib

def patch_send():
    # Keep a reference to the original method so the wrapper can delegate to it.
    old_send = httplib.HTTPConnection.send
    def new_send(self, data):
        print data
        return old_send(self, data)  # return is not necessary, but never hurts, in case the library is changed
    httplib.HTTPConnection.send = new_send

patch_send()
requests.get("http://www.python.org")
which yields the output:
GET / HTTP/1.1
Host: www.python.org
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/2.1.0 CPython/2.7.3 Linux/3.2.0-23-generic-pae
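For Python 3, a rough port of the same idea could look like the sketch below. It is an adaptation, not the original code: it patches http.client.HTTPConnection.send, collects the data in a list instead of printing it, and restores the original method afterwards so the patch does not leak into the rest of the program. The names capture_sent_data, QuietHandler and demo are made up for this example, and the demo talks to a throwaway local server through http.client instead of requests so that it only needs the standard library.

```python
import contextlib
import http.client
import http.server
import threading


@contextlib.contextmanager
def capture_sent_data():
    """Temporarily wrap HTTPConnection.send and collect what is passed to it."""
    captured = []
    original_send = http.client.HTTPConnection.send

    def send(self, data):
        captured.append(data)
        return original_send(self, data)

    http.client.HTTPConnection.send = send
    try:
        yield captured
    finally:
        # Always undo the monkey patch, even if the request raised.
        http.client.HTTPConnection.send = original_send


class QuietHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical local test server: answers every GET with an empty 200."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo output clean


def demo():
    server = http.server.HTTPServer(("127.0.0.1", 0), QuietHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        with capture_sent_data() as captured:
            conn = http.client.HTTPConnection("127.0.0.1", port)
            conn.request("GET", "/")
            conn.getresponse().read()
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return captured


if __name__ == "__main__":
    for data in demo():
        print(data.decode("iso-8859-1"))
```

As far as I know, urllib3 still builds its connections on http.client in Python 3, so wrapping a requests.get call in capture_sent_data() should capture the same kind of data, but that is an assumption about library internals and it is exactly as fragile as the original patch.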