python3 urllib.request program crashes: Python 3 is driving me crazy, a problem with fetching web pages

Latest recommended article published on 2023-03-22 14:33:40

weixin_39633171

Tags: python3 urllib.request program crash

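Python 3 is driving me crazy: a problem with fetching web pages

To keep up with the times I installed Python 3.3, only to find that most material online is written for 2.7. No problem, I thought, I can work it out myself. But the page-fetching program I wrote fails as soon as I run it. I found several similar Python 3 examples online and they all fail with the same error. Matching environments and versions for these scripting languages is wearing me out. Could anyone with similar experience take a look?

Code: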

import urllib.parse
import urllib.request

url='http://www.xxx.com'
user_agent='Mozilla/4.0 (compatible; MSIE5.5; Windows NT)'
values={'name':'Michael Foord',
        'location':'Northampton',
        'language':'Python'}
headers={ 'User-Agent' : user_agent}
data=urllib.parse.urlencode(values)
req=urllib.request.Request(url, data, headers)
response=urllib.request.urlopen(req)
the_page=response.read()
print (the_page)

Running it immediately gives:

  File "E:/work/url.py", line 13, in <module>
    response=urllib.request.urlopen(req)
  File "C:\Python33\lib\urllib\request.py", line 160, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python33\lib\urllib\request.py", line 471, in open
    req = meth(req)
  File "C:\Python33\lib\urllib\request.py", line 1183, in do_request_
    raise TypeError(msg)
TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str.

I tried several other people's code and none of it works. Has anyone run into something similar? Is some version in my environment wrong?

------Solution--------------------

data should be a buffer in the standard application/x-www-form-urlencoded format. The urllib.parse.urlencode() function takes a mapping or sequence of 2-tuples and returns a string in this format. It should be encoded to bytes before being used as the data parameter. The charset parameter in Content-Type header may be used to specify the encoding. If charset parameter is not sent with the Content-Type header, the server following the HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1 encoding. It is advisable to use charset parameter with encoding used in Content-Type header with the Request.
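Applied to the code in the question, the passage above suggests encoding the result of urlencode() before handing it to Request. The following is only a sketch: the UTF-8 choice and the explicit Content-Type header are assumptions added here, not something the original poster wrote, and the URL is still the placeholder from the question, so the actual send is left commented out.

```python
import urllib.parse
import urllib.request

url = 'http://www.xxx.com'  # placeholder URL from the question
user_agent = 'Mozilla/4.0 (compatible; MSIE5.5; Windows NT)'
values = {'name': 'Michael Foord',
          'location': 'Northampton',
          'language': 'Python'}
headers = {
    'User-Agent': user_agent,
    # Assumption: state the charset explicitly so the server need not guess.
    'Content-Type': 'application/x-www-form-urlencoded; charset=utf-8',
}

# urlencode() returns a str; encode it to bytes before using it as POST data.
data = urllib.parse.urlencode(values).encode('utf-8')
req = urllib.request.Request(url, data, headers)

# Left commented out because the URL above is only a placeholder.
# response = urllib.request.urlopen(req)
# the_page = response.read()          # bytes
# print(the_page.decode('utf-8'))     # decode before printing as text
```

With data as bytes, the TypeError from do_request_ should no longer be raised; whether the server accepts the form is a separate matter.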

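------Solution--------------------

About the TypeError

In Python 2.7 a str is a byte string (ASCII by default), while in 3.3 a str is Unicode text (with UTF-8 as the default source encoding). Both are called string types, but sockets, which urllib and similar modules are built on, work with bytes, so in Python 3 a str has to be encoded before it is sent.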
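To make the str versus bytes distinction concrete, here is a small sketch for Python 3. The sample text is made up for illustration; it only shows that encoding turns text into bytes and that decoding reverses it.

```python
text = 'language=中文'             # str: Unicode text
payload = text.encode('utf-8')    # bytes: the form a socket can send

print(type(text).__name__, len(text))        # str 11
print(type(payload).__name__, len(payload))  # bytes 15

# Decoding with the same encoding recovers the original text.
roundtrip = payload.decode('utf-8')
print(roundtrip == text)                     # True
```

The lengths differ because each of the two Chinese characters takes three bytes in UTF-8.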

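A timeout is a network problem and usually not a bug in the program; it is best to catch the exception and handle it.

It can also be triggered by some other error (for example, sending malformed content), so fix the other problems first and then test again.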
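Catching network failures could look roughly like the sketch below. The helper name fetch and the 10-second default are assumptions made for illustration; the exception classes are the ones urllib and socket actually raise.

```python
import socket
import urllib.error
import urllib.request


def fetch(url, data=None, timeout=10):
    """Return the response body as bytes, or None if the request fails.

    `fetch` is a hypothetical helper; `data`, if given, must already be bytes.
    """
    try:
        with urllib.request.urlopen(url, data, timeout) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # HTTPError is a subclass of URLError, so it has to be caught first.
        print('server returned an error status:', err.code)
    except urllib.error.URLError as err:
        # Covers DNS failures, refused connections and connect timeouts.
        print('could not reach the server:', err.reason)
    except socket.timeout:
        # A timeout while reading the body is raised directly.
        print('request timed out')
    return None
```

A caller would then check the result, for example `page = fetch(url, data)` followed by `if page is not None: print(page.decode('utf-8'))`.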

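------Solution--------------------

req=urllib.request.Request(url, data, headers)
# A Request object has no read() method; inspect the payload through req.data
print(req.data)

What does this print? Does it still raise an error?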
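One way to see what a Request holds before sending it is to look at its attributes. This is just a sketch with a shortened form and the placeholder URL from the question; it builds one request from a str and one from bytes and sends neither.

```python
import urllib.parse
import urllib.request

url = 'http://www.xxx.com'  # placeholder URL from the question
encoded = urllib.parse.urlencode({'language': 'Python'})

# Building a Request from a str does not fail; the TypeError only
# appears later, when urlopen() prepares the POST body.
bad = urllib.request.Request(url, encoded)
good = urllib.request.Request(url, encoded.encode('utf-8'))

for req in (bad, good):
    print(req.get_method(), type(req.data).__name__, req.data)
# POST str language=Python
# POST bytes b'language=Python'
```

If the type printed is str, the data still needs .encode() before the request is opened.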
