The Requests library (GET requests, POST requests, response attributes, file upload, simulated login with cookies, exception handling)

1. The Requests library


  1. Overview: Requests is an HTTP library written in Python, built on top of urllib and released under the Apache2 License. It is more convenient than urllib, saves a great deal of work, and fully meets everyday HTTP needs.

  2. Requests supports HTTP keep-alive and connection pooling, cookie-based session persistence, file uploads, automatic decoding of response content, internationalized URLs, and automatic encoding of POST data.

  3. import requests
    r = requests.get("https://api.github.com/events")
    print(r) # <Response [200]>
    print(type(r)) # <class 'requests.models.Response'>
    print(r.status_code) # 200

  4. r = requests.post("http://httpbin.org/post",data = {'key':'value'}) # send an HTTP POST request
    r = requests.delete('http://httpbin.org/delete') # send an HTTP DELETE request
    r = requests.head('http://httpbin.org/get') # send an HTTP HEAD request
    r = requests.options('http://httpbin.org/get') # send an HTTP OPTIONS request

  5. response = requests.get("https://api.github.com/events")
    print(response) # <Response [200]>
    print(response.text) # JSON-formatted string

2. Making requests

GET request with parameters
import requests

data = {
	'name':'leadingme',
	'age':18
}

response = requests.get('http://httpbin.org/get',params=data)

Parsing JSON

import requests

response = requests.get('http://httpbin.org/get')
print(response.text)
print(response.json())
print(type(response.json())) # response.text is a str, response.json() is a parsed dict
Fetching binary content
import requests

response = requests.get('http://github.com/favicon.ico')
print(response.text)     # decoded text (garbled for binary data)
print(response.content)  # raw bytes

with open('favicon.ico','wb') as f:
	f.write(response.content)  # the with block closes the file automatically
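For large files, reading the whole body into memory via `response.content` can be wasteful; Requests can also stream the download. A minimal sketch, reusing the favicon URL above (the network call is guarded so the snippet degrades gracefully offline):

```python
import requests

try:
    # stream=True defers downloading the body until we iterate over it
    response = requests.get('http://github.com/favicon.ico',
                            stream=True, timeout=5)
    with open('favicon.ico', 'wb') as f:
        for chunk in response.iter_content(chunk_size=1024):  # 1 KB at a time
            f.write(chunk)
except requests.RequestException as e:
    print('download failed:', e)
```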
Adding headers
import requests

headers = {
	'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'
}
response = requests.get('http://www.zhihu.com/explore',headers=headers)
print(response.text)
Basic POST request
import requests

data = {
	'name':'leadingme',
	'age':18
}
headers = {
	'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'
}
response = requests.post('http://httpbin.org/post',data=data,headers=headers)
print(response.text)


3. Disabling redirects

response = requests.post('http://httpbin.org/post',data=data,headers=headers, allow_redirects=False)
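The effect of `allow_redirects` can be seen in `status_code` and `response.history`, which records any intermediate redirect responses. A sketch using httpbin.org's redirect endpoint (guarded so it degrades gracefully offline):

```python
import requests

try:
    # Follow redirects (the default): the 302 hop is recorded in .history
    followed = requests.get('http://httpbin.org/redirect/1', timeout=5)
    print(followed.status_code, followed.history)

    # Disable redirects: we stop at the 302 itself and .history stays empty
    stopped = requests.get('http://httpbin.org/redirect/1',
                           timeout=5, allow_redirects=False)
    print(stopped.status_code, stopped.history)
except requests.RequestException as e:
    print('request failed:', e)
```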

4. Responses

Response attributes
import requests

response = requests.get('http://www.jianshu.com')
print(type(response.status_code),response.status_code)
print(type(response.headers),response.headers)
print(type(response.cookies),response.cookies)
print(type(response.url),response.url)
print(type(response.history),response.history)

5. Checking status codes

  1. Common status codes and their meanings:
    301 Moved Permanently: permanent redirect to a new URL
    302 Found: temporary redirect to a new URL
    304 Not Modified: the requested resource has not changed
    400 Bad Request: malformed request
    401 Unauthorized: the request lacks authorization
    403 Forbidden: access is forbidden
    404 Not Found: the page was not found
    500 Internal Server Error: the server encountered an internal error
    501 Not Implemented: the server does not support the functionality required by the request
	import requests
	
	response = requests.get('http://www.jianshu.com')
	if not response.status_code == 200:
		exit() 
	else:
		print("Request Successfully!")
		
		
	response = requests.get('http://www.jianshu.com')
	exit() if not response.status_code == requests.codes.not_found else print('404 Not Found')
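Instead of comparing status codes by hand, `Response.raise_for_status()` raises `requests.HTTPError` for any 4xx/5xx response. A sketch against httpbin's status endpoint (guarded so it degrades gracefully offline):

```python
import requests

try:
    response = requests.get('http://httpbin.org/status/404', timeout=5)
    response.raise_for_status()  # raises HTTPError because the status is 404
    print('Request Successfully!')
except requests.HTTPError as e:
    print('HTTP error:', e)
except requests.RequestException as e:
    print('request failed:', e)
```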

6. File upload

	import requests
	#  the principle is the same as passing data
	files = {'file':open('favicon.ico','rb')}
	response = requests.post('http://httpbin.org/post',files=files)
	print(response.text)

7. Getting cookies

import requests

response = requests.get('http://www.baidu.com')
print(response.cookies)
for key,value in response.cookies.items():
	print(key +'=' + value)

8. Session persistence

import requests

s = requests.Session()   # create a session
s.get('http://httpbin.org/cookies/set/number/123456789')  # set a cookie
response = s.get('http://httpbin.org/cookies')   # read the cookies back
print(response.text)
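By contrast, two independent `requests.get` calls do not share cookies — each module-level call uses a fresh cookie jar — which is exactly why the `Session` object is needed. A sketch (guarded so it degrades gracefully offline):

```python
import requests

try:
    # Without a Session, the cookie set by the first call is not sent back
    requests.get('http://httpbin.org/cookies/set/number/123456789', timeout=5)
    response = requests.get('http://httpbin.org/cookies', timeout=5)
    print(response.text)  # the cookies dict comes back empty
except requests.RequestException as e:
    print('request failed:', e)
```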

9. SSL certificate verification

  1. An SSL certificate secures the information exchanged between the client browser and the web server by establishing an SSL channel between them; the server certificate also lets users verify that the site they are visiting is genuine.

    import requests
    from requests.packages import urllib3
    #  disable certificate verification and suppress the resulting warning
    urllib3.disable_warnings()
    response = requests.get('https://www.12306.cn',verify=False)
    print(response.status_code)
    

10. Proxy settings

  1. A complete proxied request works as follows: the client first connects to the proxy server, then, according to the proxy protocol in use, asks it to connect to the target server or to fetch a specified resource (e.g. a file) from it. In the latter case the proxy may cache the resource locally; if a requested resource is already in the proxy's cache, the proxy returns the cached copy directly instead of contacting the target server.

  2. In a crawler you can set proxies (via the `proxies` parameter) to disguise your own IP address, switching IPs continuously while crawling; since the server then sees accesses from different regions, it is less likely to block you.

    import requests
    
    proxies={
    	"https": "https://47.100.104.247:8080", 
    	"http": "http://36.248.10.47:8080", 
    }
    response = requests.get('http://www.taobao.com',proxies=proxies)
    print(response.status_code)
    

11. Timeout settings

import requests

try:
	response = requests.get('http://httpbin.org/get',timeout=0.1)
	print(response.status_code)
except requests.Timeout:
	print('TimeOut!')

12. Authentication

import requests

r = requests.get('http://120.27.34.24:9001',auth=('user','123'))
print(r.status_code)
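The `auth=('user','123')` tuple is shorthand for HTTP Basic authentication; the explicit form uses `HTTPBasicAuth` (other schemes such as `HTTPDigestAuth` follow the same pattern). A sketch against httpbin's basic-auth endpoint (guarded so it degrades gracefully offline):

```python
import requests
from requests.auth import HTTPBasicAuth

try:
    # Equivalent to auth=('user', '123')
    r = requests.get('http://httpbin.org/basic-auth/user/123',
                     auth=HTTPBasicAuth('user', '123'), timeout=5)
    print(r.status_code)  # 200 when the credentials match
except requests.RequestException as e:
    print('request failed:', e)
```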

13. Exception classes

  1. requests.ConnectionError: network connection error, e.g. DNS lookup failure or refused connection
  2. requests.HTTPError: HTTP error
  3. requests.URLRequired: a valid URL is required
  4. requests.TooManyRedirects: the maximum number of redirects was exceeded
  5. requests.ConnectTimeout: timed out while connecting to the remote server
  6. requests.Timeout: the request timed out
  7. requests.SSLError: SSL error

14. Exception handling

import requests
from requests.exceptions import Timeout, HTTPError, RequestException

try:
	response = requests.get('http://httpbin.org/get',timeout=0.5)
except Timeout:
	print('TimeOut!')
except HTTPError:
	print('HttpError!')
except RequestException:
	print('Error!')
else:
	print('Request Successfully!')