Original link: Python crawler - urllib + cookie + json + POST
```python
import urllib.request
from http.cookiejar import CookieJar
import json

url = 'http://www.baidu.com'
req_dict = {'k': 'v'}

# A CookieJar-backed opener keeps cookies across requests
cj = CookieJar()
handler = urllib.request.HTTPCookieProcessor(cj)
opener = urllib.request.build_opener(handler)

# Serialize the dict to JSON, then encode to bytes for the POST body
req_json = json.dumps(req_dict)
req_post = req_json.encode('utf-8')

headers = {}
# headers['Content-Type'] = 'application/json'
req = urllib.request.Request(url=url, data=req_post, headers=headers)

# urllib.request.install_opener(opener)
# res = urllib.request.urlopen(req)
# or:
res = opener.open(req)
res = res.read().decode('utf-8')
print(res)
```
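A `Request` object can be inspected offline before it is ever sent. As a minimal sketch (no network needed, payload mirrors the example above): supplying `data` makes urllib switch the HTTP method to POST, and header names are normalized by `Request`.

```python
import urllib.request
import json

# Same payload shape as the example above
payload = json.dumps({'k': 'v'}).encode('utf-8')
req = urllib.request.Request(
    url='http://www.baidu.com',
    data=payload,
    headers={'Content-Type': 'application/json'},
)

# Supplying data switches the method from GET to POST
print(req.get_method())  # POST

# Request normalizes header names via str.capitalize(),
# so 'Content-Type' is stored and looked up as 'Content-type'
print(req.get_header('Content-type'))  # application/json
print(req.data)
```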
To send JSON, just uncomment the line `headers['Content-Type'] = 'application/json'`.
Related notes:
- By default, `urllib.request.urlopen()` does not carry cookie information.
- `urlopen()` delegates to a pre-built global `OpenerDirector` instance; its parameters are `(url, data, timeout)`.
- `build_opener(*handlers)` creates an `OpenerDirector` instance and registers the given handler instances on it.
- `urllib.request.install_opener()` sets the global opener (so the plain `urlopen()` picks it up); you can also call `opener.open()` directly instead of the global `urlopen()`.
- If the cookie content is known and fixed, you can add it to the headers directly and send the request:

```python
headers["Cookie"] = "xxxxx"
req = urllib.request.Request(url=url, headers=headers, data=req_post)
```
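As a minimal offline sketch of that last point (the cookie value and URL are placeholders), a fixed cookie simply rides along as an ordinary request header:

```python
import urllib.request

headers = {'Cookie': 'sessionid=xxxxx'}  # placeholder cookie string
req = urllib.request.Request(url='http://www.baidu.com', headers=headers)

# The cookie is stored on the Request like any other header;
# with no data, the method stays GET
print(req.get_header('Cookie'))  # sessionid=xxxxx
print(req.get_method())          # GET
```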
- Some differences between Python 3 and Python 2.7:
  In Python 3.x, the urllib and urllib2 libraries were merged into a single urllib package:
  `urllib2.urlopen` became `urllib.request.urlopen`
  `urllib2.Request` became `urllib.request.Request`
  `urllib.urlencode()` became `urllib.parse.urlencode` (in Python 2, `urlencode` lived in `urllib`, not `urllib2`)
  `cookielib` became `http.cookiejar`
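For form-encoded (non-JSON) POST bodies, the renamed `urllib.parse.urlencode` is used like this (a quick sketch with made-up form fields):

```python
import urllib.parse
import urllib.request

form = {'k': 'v', 'page': 1}
# urlencode produces a str; encode to bytes for use as a POST body
body = urllib.parse.urlencode(form).encode('utf-8')
print(body)  # b'k=v&page=1'

req = urllib.request.Request('http://www.baidu.com', data=body)
```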