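I have hit this pitfall before and ran into it again today, so I am writing it down here.
While writing a crawler, encoding the POST data raised the following error: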
POST data should be bytes, an iterable of bytes, or a file object. It cannot be of type str.
Test code:
# -*- coding:utf-8 -*-
import urllib.parse
import urllib.request


def movieSpider():
    """
    Simulate an Ajax request.
    """
    url = "https://movie.douban.com/j/chart/top_list?"
    header = {"User-Agent": "Opera/9.80 (Windows NT 6.1; U; en) Presto/2.7.62 Version/11.01"}
    formData = {
        "type": "11",
        "interval_id": "100:90",
        "action": "",
        "start": "0",
        "limit": "20"
    }
    # urlencode() returns a str; passing it as data is what triggers the TypeError
    data = urllib.parse.urlencode(formData)
    request = urllib.request.Request(url, data=data, headers=header)
    print(url)
    print(data)
    print(urllib.request.urlopen(request).read())


if __name__ == "__main__":
    movieSpider()
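The failure can be reproduced in isolation. The sketch below is a minimal reproduction, not part of the original script: the URL `http://127.0.0.1:9/` is a hypothetical placeholder, and it relies on the assumption (true in the CPython versions I have checked) that `urllib.request` validates the body type while preparing the request, before any connection is opened.

```python
import urllib.parse
import urllib.request

# Hypothetical local URL, used only as a placeholder. The type check on the
# request body runs while the request is prepared, so the TypeError is
# expected before any connection attempt.
request = urllib.request.Request(
    "http://127.0.0.1:9/",
    data=urllib.parse.urlencode({"start": "0", "limit": "20"}),  # str, not bytes
)

message = None
try:
    urllib.request.urlopen(request, timeout=1)
except TypeError as exc:
    message = str(exc)
    print(message)
```

Note that constructing the `Request` object with a `str` body does not fail by itself; the error only appears once the request is passed to `urlopen()`.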
From the official documentation for urlencode(): https://docs.python.org/3/library/urllib.parse.html
Convert a mapping object or a sequence of two-element tuples, which may contain str or bytes objects, to a percent-encoded ASCII text string. If the resultant string is to be used as a data for POST operation with the urlopen() function, then it should be encoded to bytes, otherwise it would result in a TypeError.
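The str-versus-bytes distinction described in that passage can be checked without sending anything. This is a small sketch under stated assumptions: `https://example.com/` is a placeholder URL rather than the endpoint used in this post, and the form fields are a subset of the ones in the test code.

```python
import urllib.parse
import urllib.request

form_data = {"type": "11", "interval_id": "100:90", "start": "0", "limit": "20"}

# urlencode() produces a percent-encoded str ...
encoded = urllib.parse.urlencode(form_data)
# ... which must be converted to bytes before it can be used as a POST body.
body = encoded.encode("utf-8")

print(type(encoded).__name__, encoded)
print(type(body).__name__)

# Placeholder URL; the Request is only constructed here, never sent.
request = urllib.request.Request("https://example.com/", data=body)
print(request.get_method())  # supplying data makes the request a POST
```

Because the output of `urlencode()` is plain ASCII, the choice of `"utf-8"` versus `"ascii"` for `.encode()` gives the same bytes here.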
The corrected code:
# -*- coding:utf-8 -*-
import urllib.parse
import urllib.request


def movieSpider():
    """
    Simulate an Ajax request.
    """
    url = "https://movie.douban.com/j/chart/top_list?"
    header = {"User-Agent": "Opera/9.80 (Windows NT 6.1; U; en) Presto/2.7.62 Version/11.01"}
    formData = {
        "type": "11",
        "interval_id": "100:90",
        "action": "",
        "start": "0",
        "limit": "20"
    }
    # Convert the str returned by urlencode() to bytes
    data = urllib.parse.urlencode(formData).encode("utf-8")
    request = urllib.request.Request(url, data=data, headers=header)
    # Decode the response bytes back to str before printing
    print(urllib.request.urlopen(request).read().decode("utf-8"))


if __name__ == "__main__":
    movieSpider()
Summary:
Read the official documentation more and you will fall into fewer pitfalls.