Python Network Requests: Implementing HTTP Requests in Python

Latest recommended article published on 2024-06-14 22:00:57

weixin_39791653

Tags: Python network requests

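I. urllib: urllib is a module in Python's standard library.

A complete request/response cycle: urllib.request provides a basic function, urlopen, which sends a request to the given URL and returns the response data.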

import urllib.request

response = urllib.request.urlopen('http://www.zhihu.com')
html = response.read()  # raw bytes, not yet decoded
print(html)
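
The urlopen call above returns raw bytes. A slightly fuller sketch follows, assuming you also want decoded text, a timeout, and a custom User-Agent. The helper name fetch_text and its default values are illustrative and not part of the original.

```python
import urllib.request


def fetch_text(url, timeout=10.0, user_agent="Mozilla/5.0"):
    """Fetch a URL with urllib and return the body decoded to str.

    fetch_text and its defaults are hypothetical names chosen for this sketch.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    # The context manager closes the connection even if read() fails.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw = response.read()
        # Use the charset declared in Content-Type; fall back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
    return raw.decode(charset, errors="replace")


# Example call (needs network access):
# print(fetch_text("http://www.zhihu.com")[:200])
```

Decoding with errors="replace" is a design choice for crawlers: a single bad byte should not abort the whole fetch.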

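II. httplib/urllib: httplib (renamed http.client in Python 3) is a low-level module that exposes every step of building an HTTP request, but it offers few features and is rarely used in Python crawler development.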
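
To illustrate what "every step" looks like, here is a minimal sketch using http.client, the Python 3 name for httplib. The function name low_level_get and its parameters are hypothetical.

```python
import http.client


def low_level_get(host, path="/", port=80, timeout=10.0):
    """Issue one GET with http.client, spelling out each step.

    low_level_get is a hypothetical helper written for this sketch.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # Step 1: send the request line and headers (the socket connects here).
        conn.request("GET", path, headers={"Accept": "*/*"})
        # Step 2: read the status line and the response headers.
        response = conn.getresponse()
        # Step 3: read the body as bytes.
        body = response.read()
        return response.status, response.reason, body
    finally:
        # Step 4: close the socket.
        conn.close()


# Example call (needs network access):
# print(low_level_get("www.zhihu.com")[:2])
```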

III. Requests: the most common approach in Python crawler development. Requests is a third-party library and must be installed separately:

pip3 install requests

1. A complete request/response cycle

(1) GET:

import requests

r = requests.get('http://www.baidu.com')
print(r.text)

(2) POST:

import requests

r = requests.post('http://www.baidu.com', data={'key': 'value'})
print(r.text)
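
To see what the data argument actually puts on the wire, one option is to prepare the request without sending it. A small sketch; the URL http://example.org/submit is a placeholder, not an endpoint from the original.

```python
import requests

# Build the request object but do not send it.
request = requests.Request(
    "POST",
    "http://example.org/submit",  # placeholder URL for illustration
    data={"key": "value"},
)
prepared = request.prepare()

# A dict passed as data= is form-encoded into the request body.
print(prepared.method)
print(prepared.headers["Content-Type"])
print(prepared.body)
```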

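2. Response and encoding:

import requests

r = requests.get('http://www.baidu.com')
print('content----->' + str(r.content))
print('text----->' + r.text)
print('encoding----->' + str(r.encoding))  # may be None if no encoding could be guessed
r.encoding = 'utf-8'  # override the guess; r.text is decoded with this from now on
print('text----->' + r.text)

r.content: the response body as bytes

r.text: the response body as text, decoded using r.encoding

r.encoding: the page encoding that Requests guesses from the HTTP headers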
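
Why overriding r.encoding can matter: when a text/* response declares no charset in its Content-Type header, Requests falls back to ISO-8859-1, which garbles UTF-8 pages. The sketch below reproduces that offline with requests.utils.get_encoding_from_headers; the sample string is made up for illustration.

```python
from requests.utils import get_encoding_from_headers

raw = "中文网页".encode("utf-8")  # bytes as a server might send them

# No charset in the header, so the guess is the HTTP default for text/*.
guessed = get_encoding_from_headers({"content-type": "text/html"})
garbled = raw.decode(guessed)  # roughly what r.text looks like with the guess
fixed = raw.decode("utf-8")    # what r.text looks like after r.encoding = 'utf-8'

print(guessed)
print(garbled == fixed)
print(fixed)
```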

3. Request headers: pass a headers argument to requests.get.

import requests

user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = {'User-Agent': user_agent}
r = requests.get('http://www.baidu.com', headers=headers)
print('content----->' + str(r.content))
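
A short offline sketch of what the headers argument changes. It prepares two requests through a Session without sending them and compares the User-Agent each would carry. The variable names are illustrative.

```python
import requests
from requests.utils import default_user_agent

session = requests.Session()
custom = {"User-Agent": "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)"}

# Without a headers argument, the library's own User-Agent is used.
plain = session.prepare_request(requests.Request("GET", "http://www.baidu.com"))
# With one, the per-request value overrides the default.
spoofed = session.prepare_request(
    requests.Request("GET", "http://www.baidu.com", headers=custom)
)

print(plain.headers["User-Agent"])
print(spoofed.headers["User-Agent"])
```

Sites that block crawlers often look at this header first, which is why a browser-like value is commonly supplied.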

4. Status code and response headers

import requests

user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = {'User-Agent': user_agent}
r = requests.get('http://www.baidu.com', headers=headers)
if r.status_code == requests.codes.ok:
    print(r.status_code)  # status code
    print(r.headers)  # response headers
    print(r.headers.get('content-type'))  # read a single header field (recommended)
else:
    r.raise_for_status()

Note: raise_for_status() deliberately raises an exception. It raises requests.exceptions.HTTPError when the status code is 4XX or 5XX, and returns None when the status code is 200.
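
The behavior of raise_for_status() can be checked without a server. In this sketch the Response objects are built by hand and only their status code is set; status_summary is a hypothetical helper.

```python
import requests


def status_summary(response):
    """Return 'ok' or a short error string for a requests.Response.

    status_summary is a hypothetical helper for this sketch.
    """
    try:
        response.raise_for_status()  # returns None when there is no error
    except requests.exceptions.HTTPError as exc:
        return "failed: {}".format(exc)
    return "ok"


# Offline demonstration: construct responses instead of contacting a server.
ok = requests.Response()
ok.status_code = 200
missing = requests.Response()
missing.status_code = 404

print(status_summary(ok))
print(status_summary(missing))
```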

5. Cookie handling:

(1) If the response contains cookies:

import requests

user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = {'User-Agent': user_agent}
r = requests.get('http://www.baidu.com', headers=headers)
for cookie in r.cookies.keys():
    print(cookie + ':' + r.cookies.get(cookie))

(2) Sending custom cookies:

import requests

user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = {'User-Agent': user_agent}
cookies = dict(name='maomi', age='3')
r = requests.get('http://www.baidu.com', headers=headers, cookies=cookies)
print(r.text)
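
To confirm how a cookies dict is sent, the request can be prepared without sending it and the resulting Cookie header inspected. A minimal sketch using the same cookie values:

```python
import requests

cookies = dict(name="maomi", age="3")
prepared = requests.Request(
    "GET", "http://www.baidu.com", cookies=cookies
).prepare()

# The dict is serialized into a single Cookie request header.
print(prepared.headers["Cookie"])
```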

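(3) Automatic cookie handling: when you do not care about the cookie values and only want the program to send them along on every request.

import requests

loginUrl = 'http://www.sina.com/login'
s = requests.Session()
# First visit the login page as a guest; the server assigns a cookie.
r = s.get(loginUrl, allow_redirects=True)
datas = {'name': 'maomi', 'passwd': 'maomi'}
# POST the credentials to the login URL; once verified, the guest session becomes a member session.
r = s.post(loginUrl, data=datas, allow_redirects=True)
print(r.text)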
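
The mechanism behind this is the Session's cookie jar. The offline sketch below stores a cookie in the jar by hand, standing in for one a server would set, and shows that later requests prepared through the same Session carry it. The cookie name, its value, and the /profile path are made up.

```python
import requests

session = requests.Session()
# Stand-in for a cookie a server would set via Set-Cookie on the first visit.
session.cookies.set("sessionid", "abc123")  # made-up name and value

# Any later request prepared through the same session carries it.
follow_up = session.prepare_request(
    requests.Request("GET", "http://www.sina.com/profile")  # placeholder path
)
print(follow_up.headers.get("Cookie"))

# A request built outside the session does not.
standalone = requests.Request("GET", "http://www.sina.com/profile").prepare()
print(standalone.headers.get("Cookie"))
```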

6. Redirects and history: handling redirects only requires setting allow_redirects. True allows redirects; False disables them. When redirects are allowed, r.history lists the intermediate responses.
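
A minimal sketch of both settings. The helper names redirect_chain and first_hop are hypothetical; any URL that redirects can be passed in.

```python
import requests


def redirect_chain(url, timeout=10.0):
    """Follow redirects and return [(status_code, url), ...] in order.

    redirect_chain is a hypothetical helper for this sketch.
    """
    r = requests.get(url, allow_redirects=True, timeout=timeout)
    # r.history holds the intermediate responses, oldest first;
    # r itself is the final response.
    hops = [(resp.status_code, resp.url) for resp in r.history]
    hops.append((r.status_code, r.url))
    return hops


def first_hop(url, timeout=10.0):
    """Return the status code and Location header without following redirects."""
    r = requests.get(url, allow_redirects=False, timeout=timeout)
    return r.status_code, r.headers.get("Location")


# Example calls (need network access):
# print(redirect_chain("http://github.com"))
# print(first_hop("http://github.com"))
```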

7. Timeouts: set with the timeout parameter, in seconds.
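
A sketch of a request with a timeout, assuming a timed-out request should simply yield None. The helper name fetch_or_none and the default values are illustrative.

```python
import requests


def fetch_or_none(url, timeout=(3.05, 10)):
    """Return the response text, or None if the request timed out.

    timeout may be one number or a (connect, read) tuple, in seconds.
    fetch_or_none and the default values are hypothetical.
    """
    try:
        r = requests.get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Covers both ConnectTimeout and ReadTimeout.
        return None
    return r.text


# Example call (needs network access):
# print(fetch_or_none("http://www.baidu.com", timeout=2))
```

Without a timeout, Requests waits indefinitely for a server that never answers, so crawlers generally set one on every call.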

8. Proxies: any request method accepts a proxies argument, which configures a proxy for that single request.

import requests

proxies = {
    "http": "http://10.10.1.10:3318",
    "https": "http://10.10.1.10:1080",
}
requests.get("http://example.org", proxies=proxies)
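
The keys of the proxies dict are matched against the scheme of the target URL. The offline sketch below shows that lookup with requests.utils.select_proxy, a utility function assumed to be available in the installed Requests version; the proxy addresses are placeholders.

```python
from requests.utils import select_proxy

proxies = {
    "http": "http://10.10.1.10:3318",   # placeholder proxy addresses
    "https": "http://10.10.1.10:1080",
}

# The proxy is chosen by the scheme of the target URL.
print(select_proxy("http://example.org", proxies))
print(select_proxy("https://example.org", proxies))
# No entry for this scheme, so no proxy is selected.
print(select_proxy("ftp://example.org", proxies))
```

To apply the same proxies to every request rather than one, they can instead be stored on a Session via session.proxies.update(proxies).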
