[Repost] The Requests Library Explained

Installation: pip3 install requests

A First Example

    import requests

    response = requests.get('https://www.baidu.com/')  # request the URL and fetch its page source
    print(type(response))        # type of the response object
    print(response.status_code)  # HTTP status code
    print(type(response.text))   # type of the decoded page source
    print(response.text)         # the page source itself
    print(response.cookies)      # cookies set by the server

Output:

    ①: <class 'requests.models.Response'>
    ②: 200
    ③: <class 'str'>
    ④: <!DOCTYPE html>feedback>æè§åé¦</a>&nbsp;京ICPè¯030173å·&nbsp; <img src=//www.baidu.com/img/gs.gif> </p> </div> </div> </div> </body> </html>
    ⑤: <RequestsCookieJar[<Cookie BDORZ=27315 for .baidu.com/>]>
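The garbled characters in output ④ are typical of UTF-8 bytes decoded as ISO-8859-1, which Requests falls back to when a text response declares no charset. The following is a minimal offline sketch of that effect using only the standard library. The sample string 意见反馈 ("feedback") is an assumption about what the page contained; the original output does not confirm it.

```python
# Sample text is an assumption: a plausible UTF-8 link label from the page.
original = "意见反馈"

# What a server would send: UTF-8 encoded bytes.
raw_bytes = original.encode("utf-8")

# Decoding those bytes with the wrong codec produces mojibake.
garbled = raw_bytes.decode("iso-8859-1")

# Undoing the wrong decode and applying the right one recovers the text.
repaired = garbled.encode("iso-8859-1").decode("utf-8")

print(repr(garbled))
print(repaired)
```

With a real response, setting response.encoding = 'utf-8' before reading response.text would likely be the corresponding fix, assuming the page really is UTF-8.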

Other Request Methods

    import requests

    requests.post('http://httpbin.org/post')
    requests.put('http://httpbin.org/put')
    requests.delete('http://httpbin.org/delete')
    requests.head('http://httpbin.org/get')
    requests.options('http://httpbin.org/get')

Requests

Basic GET Requests

Basic Usage

    import requests

    response = requests.get('http://httpbin.org/get')  # send a GET request to this URL
    print(response.text)                               # print the response body

Output:

    {
      "args": {},
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Connection": "close",
        "Host": "httpbin.org",
        "User-Agent": "python-requests/2.19.1"
      },
      "origin": "222.210.218.46",
      "url": "http://httpbin.org/get"
    }

GET Requests with Parameters

    import requests

    response = requests.get('http://httpbin.org/get?name=germey&age=22')
    print(response.text)

Output:

    {
      "args": {
        "age": "22",
        "name": "germey"
      },
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Connection": "close",
        "Host": "httpbin.org",
        "User-Agent": "python-requests/2.19.1"
      },
      "origin": "222.210.218.46",
      "url": "http://httpbin.org/get?name=germey&age=22"
    }

Passing Parameters Another Way

    import requests

    data = {
        'name': 'germey',
        'age': 22
    }
    response = requests.get('http://httpbin.org/get', params=data)  # pass the parameters as a dict
    print(response.text)

Output:

    {
      "args": {
        "age": "22",
        "name": "germey"
      },
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Connection": "close",
        "Host": "httpbin.org",
        "User-Agent": "python-requests/2.19.1"
      },
      "origin": "222.210.218.46",
      "url": "http://httpbin.org/get?name=germey&age=22"
    }

PS: the output is identical to that of the first approach.
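To see why both styles end up with the same URL, here is a rough sketch of the query string that params=data produces. It is built with the standard library's urllib.parse rather than Requests itself, and the variable names are illustrative.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base_url = "http://httpbin.org/get"
data = {"name": "germey", "age": 22}

# Encode the dict as a query string; non-string values are converted with str().
query = urlencode(data)
full_url = base_url + "?" + query
print(full_url)

# Parsing the URL back yields the values that httpbin echoes under "args".
args = {key: values[0] for key, values in parse_qs(urlsplit(full_url).query).items()}
print(args)
```

Note that the integer 22 comes back as the string "22", which matches the "args" section of the output above.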

Parsing JSON

    import requests
    import json

    response = requests.get('http://httpbin.org/get')
    print(type(response.text))
    print(response.json())            # parse the JSON body into a Python object (a dict here)
    print(json.loads(response.text))  # same result as the line above
    print(type(response.json()))

Output:

    ①: <class 'str'>
    ②: {'args': {}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Connection': 'close', 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.19.1'}, 'origin': '222.210.218.46', 'url': 'http://httpbin.org/get'}
    ③: {'args': {}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Connection': 'close', 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.19.1'}, 'origin': '222.210.218.46', 'url': 'http://httpbin.org/get'}
    ④: <class 'dict'>
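response.json() is essentially a shortcut for running the body text through a JSON decoder. The sketch below uses a hard-coded string (a shortened, hypothetical stand-in for response.text) so that it runs without a network connection.

```python
import json

# Hypothetical stand-in for response.text: a JSON document held in a str.
body_text = '{"args": {}, "url": "http://httpbin.org/get"}'

parsed = json.loads(body_text)  # JSON object -> Python dict
print(type(body_text), type(parsed))
print(parsed["url"])

# Malformed JSON raises an exception, so guard the parse when the body may not be JSON.
try:
    json.loads("<html>not json</html>")
except json.JSONDecodeError as exc:
    print("not JSON:", exc.msg)
```

The guard matters in practice because an error page is usually HTML, not JSON.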

Fetching Binary Data

    import requests  # a common way to download (fetch) images and videos

    response = requests.get('https://github.com/favicon.ico')  # pass in the image URL
    print(type(response.text), type(response.content))
    print(response.text)
    print(response.content)  # the raw binary content of the image

Output:

    ①: <class 'str'> <class 'bytes'>
    ②: �������O L������ ��� (unreadable, because binary data was decoded as text)
    ③: b'\x00\x00\x01\x00\x02\x00\x10\x10\......

 
 
    import requests  # save the downloaded image to disk

    response = requests.get('https://github.com/favicon.ico')
    with open('favicon.ico', 'wb') as f:  # first argument: file name; second: binary write mode
        f.write(response.content)
    # no explicit f.close() is needed: the with block closes the file
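A with block closes the file when it exits, so no separate close call is needed. The offline sketch below shows this. The stand-in data (the first bytes of the icon shown in the earlier output) and the temporary directory are assumptions chosen for illustration.

```python
import os
import tempfile

# Stand-in for response.content: the first bytes of the icon shown earlier.
content = b"\x00\x00\x01\x00\x02\x00\x10\x10"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "favicon.ico")
    with open(path, "wb") as f:  # "wb": write raw bytes, no text encoding applied
        written = f.write(content)
    file_closed = f.closed  # True: the with block has already closed the file

    with open(path, "rb") as f:
        round_trip = f.read()

print(written, file_closed, round_trip == content)
```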

Adding Headers

    import requests

    response = requests.get('https://www.zhihu.com/explore')
    print(response.text)

Output:

    <html>
    <head><title>400 Bad Request</title></head>
    <body bgcolor="white">
    <center><h1>400 Bad Request</h1></center>
    <hr><center>openresty</center>
    </body>
    </html>

PS: the site judges the request headers (the browser identification) to be invalid and refuses access.

 
 
    import requests

    headers = {
        'User-Agent': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; SE 2.X MetaSr 1.0; SE 2.X MetaSr 1.0; .NET CLR 2.0.50727; SE 2.X MetaSr 1.0)'
    }
    response = requests.get('https://www.zhihu.com/explore', headers=headers)  # send a browser User-Agent
    print(response.text)

Output:

    <!DOCTYPE html>
    <html lang="zh-CN" dropEffect="none" class="no-js no-auth ">
    <head>
    <meta charset="utf-8" />
    <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" />
    <meta name="renderer" content="webkit" />
    <meta http-equiv="X-ZA-Response-Id" content="cecbafbb28655862d786c4985980a895">
    ......

PS: with a browser User-Agent header added, the page loads normally.

Basic POST Requests

    import requests

    data = {'name': 'germey', 'age': '22'}
    response = requests.post('http://httpbin.org/post', data=data)
    print(response.text)

Output:

    {
      "args": {},
      "data": "",
      "files": {},
      "form": {
        "age": "22",
        "name": "germey"
      },
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Connection": "close",
        "Content-Length": "18",
        "Content-Type": "application/x-www-form-urlencoded",
        "Host": "httpbin.org",
        "User-Agent": "python-requests/2.19.1"
      },
      "json": null,
      "origin": "222.210.218.46",
      "url": "http://httpbin.org/post"
    }

PS: passing a dict as data is a very convenient way to send form fields in a POST request.

 
 
    import requests

    data = {'name': 'germey', 'age': '22'}
    headers = {
        'User-Agent': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; SE 2.X MetaSr 1.0; SE 2.X MetaSr 1.0; .NET CLR 2.0.50727; SE 2.X MetaSr 1.0)'
    }
    response = requests.post('http://httpbin.org/post', data=data, headers=headers)
    print(response.json())

Output:

    {'args': {}, 'data': '', 'files': {}, 'form': {'age': '22', 'name': 'germey'}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Connection': 'close', 'Content-Length': '18', 'Content-Type': 'application/x-www-form-urlencoded', 'Host': 'httpbin.org', 'User-Agent': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; SE 2.X MetaSr 1.0; SE 2.X MetaSr 1.0; .NET CLR 2.0.50727; SE 2.X MetaSr 1.0)'}, 'json': None, 'origin': '222.210.218.46', 'url': 'http://httpbin.org/post'}
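The Content-Length of 18 and the form-encoded Content-Type in the output follow from how the dict is serialized. Here is a sketch of that request body using the standard library. Requests appears to do the equivalent internally for a simple dict, but this is an illustration and not its actual code.

```python
from urllib.parse import urlencode

data = {"name": "germey", "age": "22"}

# A dict passed as data= is sent as application/x-www-form-urlencoded.
body = urlencode(data).encode("ascii")
print(body)       # the bytes that make up the request body
print(len(body))  # matches the Content-Length header echoed by httpbin
```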

Responses

Response Attributes

    import requests

    response = requests.get('http://www.jianshu.com')
    print(type(response.status_code), response.status_code)  # status code and its type
    print(type(response.headers), response.headers)
    print(type(response.cookies), response.cookies)
    print(type(response.url), response.url)
    print(type(response.history), response.history)          # redirect history

Output:

    ①: <class 'int'> 403
    ②: <class 'requests.structures.CaseInsensitiveDict'> {'Date': 'Wed, 12 Sep 2018 12:11:31 GMT', 'Server': 'Tengine', 'Content-Type': 'text/html', 'Transfer-Encoding': 'chunked', 'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload', 'Content-Encoding': 'gzip', 'X-Via': '1.1 shandianxin27:8 (Cdn Cache Server V2.0), 1.1 ndianxin72:8 (Cdn Cache Server V2.0)', 'Connection': 'keep-alive'}
    ③: <class 'requests.cookies.RequestsCookieJar'> <RequestsCookieJar[]>
    ④: <class 'str'> https://www.jianshu.com/
    ⑤: <class 'list'> [<Response [301]>]
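The 403 in output ① and the 301 in the history list are standard HTTP status codes. As a quick reference, the standard library's http.HTTPStatus maps codes to their reason phrases. The helper name describe is hypothetical.

```python
from http import HTTPStatus

def describe(status_code: int) -> str:
    """Return 'code phrase' for a standard HTTP status code."""
    status = HTTPStatus(status_code)
    return f"{status.value} {status.phrase}"

for code in (200, 301, 403):
    print(describe(code))
```

In Requests, calling response.raise_for_status() turns 4xx and 5xx codes such as 403 into an HTTPError exception.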

Advanced Usage

File Upload

    import requests  # file uploads use POST

    # the key 'files' is a field name you can choose freely; open the file in binary mode
    with open('favicon.ico', 'rb') as f:
        response = requests.post('http://httpbin.org/post', files={'files': f})
    print(response.text)

Getting Cookies

    import requests

    response = requests.get('https://www.baidu.com')
    print(response.cookies)
    for key, value in response.cookies.items():
        print(key + '=' + value)

Output:

    <RequestsCookieJar[<Cookie BDORZ=27315 for .baidu.com/>]>
    BDORZ=27315

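Session Persistence (Simulating Login)

    import requests

    requests.get('http://httpbin.org/cookies/set/number/123456789')  # ask the server to set a cookie
    response = requests.get('http://httpbin.org/cookies')            # ask which cookies the server received
    print(response.text)

Output:

    {
      "cookies": {}
    }

PS: this program does not work as intended. The two requests.get calls are independent of each other, so the cookie set by the first request is not sent with the second.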

 
 
    import requests

    s = requests.Session()
    s.get('http://httpbin.org/cookies/set/number/123456789')  # ask the server to set a cookie
    response = s.get('http://httpbin.org/cookies')            # the same session sends the cookie back
    print(response.text)

Output:

    {
      "cookies": {
        "number": "123456789"
      }
    }

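Certificate Verification

    import requests

    response = requests.get('https://www.12306.cn')
    print(response.status_code)

PS: the site's certificate is invalid, so the request fails (Requests raises an SSLError) and the page cannot be accessed.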

 
 
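    import requests
    from requests.packages import urllib3  # this import and the next line suppress the red warning

    urllib3.disable_warnings()
    response = requests.get('https://www.12306.cn', verify=False)  # verify=False skips certificate verification
    print(response.status_code)

PS: with verify=False alone, the page can be accessed but a red warning (InsecureRequestWarning) is printed. Calling urllib3.disable_warnings() suppresses it.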

 
 
    import requests

    response = requests.get('https://www.12306.cn', cert=('/path/server.crt', '/path/key'))
    print(response.status_code)

PS: you can also specify a certificate manually (the certificate paths above are made up).

Proxy Settings

    import requests

    proxies = {
        'http': 'http://127.0.0.1:9743',
        'https': 'http://127.0.0.1:9743',
    }
    response = requests.get('https://www.taobao.com', proxies=proxies)
    print(response.status_code)

 
 
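    import requests

    proxies = {
        'http': 'http://user:password@127.0.0.1:19743/',
        # an 'https' entry is needed for the proxy to apply to the https:// URL below
        'https': 'http://user:password@127.0.0.1:19743/',
    }
    response = requests.get('https://www.taobao.com', proxies=proxies)
    print(response.status_code)

PS: use this form when the proxy requires a username and password.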

If your proxy is a SOCKS proxy rather than an HTTP or HTTPS one, first install SOCKS support with pip3 install 'requests[socks]', then configure the proxy as follows:

    import requests

    proxies = {
        'http': 'socks5://127.0.0.1:9742',
        'https': 'socks5://127.0.0.1:9742',
    }
    response = requests.get('https://www.taobao.com', proxies=proxies)
    print(response.status_code)

Timeout Settings

    import requests

    response = requests.get('https://www.taobao.com', timeout=1)  # raises an exception if no response arrives within 1 second
    print(response.status_code)

Authentication Settings

    import requests
    from requests.auth import HTTPBasicAuth

    r = requests.get('http://120.27.34.24:9001', auth=HTTPBasicAuth('user', '123'))
    print(r.status_code)

PS: use this setting for sites that require a login via HTTP Basic authentication. HTTPBasicAuth takes the username and password.

 
 
    import requests

    r = requests.get('http://120.27.34.24:9001', auth=('user', '123'))
    print(r.status_code)

PS: this is another way to write it; the result is the same.
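Both forms send the same Authorization header. Here is a sketch of how HTTP Basic authentication encodes the credentials, using only the standard library. The function name basic_auth_header is hypothetical, and the username and password are the sample values from the example above.

```python
import base64

def basic_auth_header(username: str, password: str) -> str:
    """Build the Authorization header value for HTTP Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")

header_value = basic_auth_header("user", "123")
print(header_value)

# Base64 is an encoding, not encryption: anyone can reverse it.
decoded = base64.b64decode(header_value.split(" ", 1)[1]).decode("utf-8")
print(decoded)
```

Because the credentials are trivially recoverable, Basic authentication is best used over HTTPS.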

Exception Handling

    import requests
    from requests.exceptions import ReadTimeout, HTTPError, ConnectionError, RequestException

    try:
        response = requests.get('http://httpbin.org/get', timeout=0.5)
        response.raise_for_status()  # raise HTTPError for 4xx/5xx responses
        print(response.status_code)
    except ReadTimeout:
        print('Timeout')
    except HTTPError:
        print('Http error')
    except ConnectionError:  # network unreachable
        print('Connection error')
    except RequestException:
        print('Error')
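The except clauses are ordered from most specific to most general, because Python runs the first clause that matches. The sketch below illustrates this with stand-in classes that mirror part of the requests.exceptions hierarchy. They are defined locally for illustration only (so the code runs without Requests installed), and the helper classify is hypothetical.

```python
# Stand-in classes mirroring part of the requests.exceptions hierarchy (illustration only).
class RequestException(Exception): ...
class Timeout(RequestException): ...
class ReadTimeout(Timeout): ...
class ConnectionError_(RequestException): ...

def classify(exc: Exception) -> str:
    """Report which except clause handles the given exception."""
    try:
        raise exc
    except ReadTimeout:
        return "Timeout"
    except ConnectionError_:
        return "Connection error"
    except RequestException:
        return "Error"

print(classify(ReadTimeout()))
print(classify(ConnectionError_()))
print(classify(Timeout()))  # no specific clause matches, so the base-class clause handles it
```

If the RequestException clause came first, it would catch everything and the more specific clauses would never run.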

 
