Python urllib crawler

Jasmine松茸

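Tags: python urllib crawler

Copyright notice: This is an original article by the blogger, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/weixin_33016287/article/details/113964516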


Import the packages

import urllib.request
import urllib.parse
import urllib.error  # used below when catching URLError

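Sending a GET request

# urlopen sends a GET request when no data is passed
response = urllib.request.urlopen("http://httpbin.org/get")
# read() returns bytes, so decode them to get text
print(response.read().decode('utf-8'))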
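A GET request often carries query parameters. The sketch below shows one way to build such a URL with urllib.parse.urlencode and to read the response inside a with block, so that the connection is closed afterwards. The helper names build_url and fetch_text and the sample parameters are illustrative assumptions, not part of the original post.

```python
import urllib.parse
import urllib.request


def build_url(base, params):
    """Append URL-encoded query parameters to base (hypothetical helper)."""
    return base + "?" + urllib.parse.urlencode(params)


def fetch_text(url, encoding="utf-8"):
    """Send a GET request and return the decoded body (hypothetical helper)."""
    # The with block closes the connection even if read() raises.
    with urllib.request.urlopen(url) as response:
        return response.read().decode(encoding)


# Building the URL needs no network access.
url = build_url("http://httpbin.org/get", {"q": "python urllib", "page": 1})
print(url)  # http://httpbin.org/get?q=python+urllib&page=1

# Fetching does need network access:
# print(fetch_text(url))
```

urlencode escapes spaces as + and joins the pairs with &, so the parameters do not have to be escaped by hand.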

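GET with a timeout (a fallback: handle the failure so that the program keeps running)

try:
    # timeout is in seconds; 0.01 is small enough that the request almost always times out
    response = urllib.request.urlopen("http://httpbin.org/get", timeout=0.01)
    print(response.read().decode('utf-8'))
except urllib.error.URLError as e:
    # a connection timeout arrives wrapped in URLError; e.reason holds the cause
    print("time out")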
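URLError covers more than timeouts, and a timeout that occurs while the body is being read is raised directly rather than wrapped in URLError. The sketch below separates these cases, assuming the goal is simply to return None on any failure. The helper name fetch_or_none and the default timeout of 5 seconds are illustrative choices.

```python
import socket
import urllib.error
import urllib.request


def fetch_or_none(url, timeout=5.0):
    """Return the decoded body, or None if the request fails (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 500.
        # HTTPError is a subclass of URLError, so it must be caught first.
        print("HTTP error:", exc.code)
    except urllib.error.URLError as exc:
        # No usable answer: DNS failure, refused connection, connect timeout, ...
        print("request failed:", exc.reason)
    except socket.timeout:
        # A timeout while reading the body is raised directly, not wrapped.
        print("time out while reading")
    return None
```

A caller can then test the result with `if body is None:` and continue with the next URL instead of crashing.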

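Sending a POST request

# the form data must be URL-encoded and then converted to bytes
data = bytes(urllib.parse.urlencode({"hello":"world"}), encoding="utf-8")
# passing data makes urlopen send a POST instead of a GET
response = urllib.request.urlopen("http://httpbin.org/post", data=data)
print(response.read().decode('utf-8'))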
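To make the encoding step visible, the sketch below prints the bytes that are actually sent as the request body and parses the JSON reply. The helper names encode_form and post_form are hypothetical, and post_form assumes the server answers with JSON, as httpbin.org does.

```python
import json
import urllib.parse
import urllib.request


def encode_form(fields):
    """URL-encode a dict and convert it to bytes, as urlopen's data argument requires."""
    return urllib.parse.urlencode(fields).encode("utf-8")


def post_form(url, fields, timeout=10.0):
    """POST form fields and parse a JSON reply (hypothetical helper)."""
    # Passing data switches urlopen from GET to POST.
    with urllib.request.urlopen(url, data=encode_form(fields), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


print(encode_form({"hello": "world"}))  # b'hello=world'

# Sending the request needs network access:
# reply = post_form("http://httpbin.org/post", {"hello": "world"})
# print(reply)
```

Calling .encode("utf-8") on the urlencode result is equivalent to wrapping it in bytes(..., encoding="utf-8").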

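POST disguised as a browser, to work around HTTP 418

Some sites answer requests that carry urllib's default User-Agent with status 418. Sending a browser's User-Agent instead avoids this.

1. Getting the User-Agent

Open any website (Baidu, for example), press F12 (or right-click and choose Inspect), then click Network.

Reload the page (or press F5) and click one of the listed requests, for example iconfont-9f2f4dde78.woff2.

Copy the User-Agent value from its request headers.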

2. Code

url = "http://httpbin.org/post"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"
}
data = bytes(urllib.parse.urlencode({'name':'lucy'}), encoding="utf-8")
# a Request object bundles the URL, body, headers and HTTP method
req = urllib.request.Request(url=url, data=data, headers=headers, method="POST")
response = urllib.request.urlopen(req)
print(response.read().decode('utf-8'))
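A Request object can be inspected before it is sent, which is a quick way to confirm that the custom User-Agent is attached. The sketch below also shows one possible way to return an error status such as 418 to the caller instead of letting the exception propagate. The names BROWSER_UA, build_post and send are hypothetical helpers introduced for illustration.

```python
import urllib.error
import urllib.parse
import urllib.request

# The User-Agent string copied from the browser in step 1.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"
)


def build_post(url, fields, user_agent=BROWSER_UA):
    """Build a POST request that carries a browser User-Agent (hypothetical helper)."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        url=url, data=data, headers={"User-Agent": user_agent}, method="POST"
    )


def send(req, timeout=10.0):
    """Return (status, body); an error status is returned rather than raised."""
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        # HTTPError still carries the status code and the response body.
        return exc.code, exc.read().decode("utf-8", errors="replace")


req = build_post("http://httpbin.org/post", {"name": "lucy"})
print(req.get_method())              # POST
print(req.get_header("User-agent"))  # Request stores header names capitalized
print(req.data)                      # b'name=lucy'

# Sending the request needs network access:
# status, body = send(req)
```

Note that Request normalizes header names with str.capitalize(), so the header is looked up as "User-agent" even though it was supplied as "User-Agent".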
