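The requests module
Preface:
Usually we use Python to write web applications or web APIs that are deployed on a server: clients send requests, and we, as the server, respond with data.
But we can also turn the tables and use Python's requests module to simulate browser behavior, sending requests to other sites and having them respond with data to us.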
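I. Introduction to the requests module
requests can simulate the requests a browser makes. Compared with urllib, which we used before, its API is more convenient (under the hood it is a wrapper around urllib3).
Note: after requests downloads a page's content, it does not execute any JavaScript. For content that the page loads through scripts, we have to analyze the target site ourselves and issue additional requests.
Official documentation: http://docs.python-requests.org/en/master/
1. Installing the requests module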
pip3 install requests
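2. Request methods supported by the requests module
The most commonly used are requests.get() and requests.post(). Before studying requests in earnest, it helps to get familiar with the HTTP protocol first: http://www.cnblogs.com/linhaifeng/p/6266327.html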
>>> import requests
>>> r = requests.get('https://api.github.com/events')
>>> r = requests.post('http://httpbin.org/post', data={'key': 'value'})
>>> r = requests.put('http://httpbin.org/put', data={'key': 'value'})
>>> r = requests.delete('http://httpbin.org/delete')
>>> r = requests.head('http://httpbin.org/get')
>>> r = requests.options('http://httpbin.org/get')
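To see what such a call would put on the wire without contacting httpbin.org, a request can be built and prepared locally. The following is a minimal sketch that assumes the requests.Request / prepare() API; it only inspects the prepared request and sends nothing.

```python
import requests

# Build the same POST as above, but only prepare it; no network traffic occurs.
req = requests.Request('POST', 'http://httpbin.org/post', data={'key': 'value'})
prepared = req.prepare()

print(prepared.method)                   # POST
print(prepared.url)                      # http://httpbin.org/post
print(prepared.headers['Content-Type'])  # application/x-www-form-urlencoded
print(prepared.body)                     # key=value
```

A dict passed as data is form-encoded into the request body, which is why the Content-Type header is set automatically.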
II. Sending GET requests with requests
1. Basic GET request
import requests
response = requests.get('http://dig.chouti.com/')
print(response.text)
Checking the response encoding
response.encoding: shows the encoding requests assumes by default for the returned page data
import requests

url = 'https://www.baidu.com/'
response = requests.get(
    url=url,
    headers={
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36'
    })
print(response.encoding)     # show the page encoding
response.encoding = 'utf-8'  # set the page encoding
print(response.status_code)
with open('a.html', 'w', encoding='utf-8') as f:
    f.write(response.text)
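Why setting the encoding matters: response.text is the raw bytes of response.content decoded with response.encoding. The sketch below uses plain bytes as a stand-in for response.content, so it runs without a network connection; the sample string is an assumption chosen for illustration, and ISO-8859-1 is used as the wrong guess because requests falls back to it for text responses that declare no charset.

```python
# Bytes as a server might send them; stands in for response.content.
raw = '百度一下'.encode('utf-8')

# Decoding with the wrong encoding produces garbled text ...
garbled = raw.decode('iso-8859-1')
# ... while decoding with the right one recovers the original.
correct = raw.decode('utf-8')

print(garbled == correct)  # False
print(correct)             # 百度一下
```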
2. GET requests with parameters
URL encoding
# URL with parameters, URL-encoded by hand
from urllib.parse import urlencode
import requests

k = input('Enter a keyword: ').strip()
res = urlencode({'wd': k}, encoding='utf-8')  # URL-encode the query string
response = requests.get(
    'https://www.baidu.com/s?%s' % res,
    headers={
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36'
    },
    # params={'wd': k}
)
with open('a.html', 'w', encoding='utf-8') as f:
    f.write(response.text)
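What urlencode actually produces can be checked in isolation. A small sketch using only the standard library; the keywords are example values, not taken from the original.

```python
from urllib.parse import urlencode

# Non-ASCII characters are percent-encoded from their UTF-8 bytes.
query = urlencode({'wd': '中文'}, encoding='utf-8')
print(query)  # wd=%E4%B8%AD%E6%96%87

# Spaces become '+' in a query string.
spaced = urlencode({'wd': 'python requests'})
print(spaced)  # wd=python+requests

print('https://www.baidu.com/s?%s' % query)
```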
headers: setting the request headers
# res is the URL-encoded query string built in the previous example
response = requests.get(
    'https://www.baidu.com/s?%s' % res,
    headers={
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36'
    },
)
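One reason to set User-Agent explicitly is that requests otherwise identifies itself as python-requests, which some sites treat differently from a browser. The sketch below compares the default and a custom header by preparing requests through a Session without sending them; the shortened UA string is a placeholder.

```python
import requests

UA = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)'  # placeholder browser UA
URL = 'https://www.baidu.com/s'

session = requests.Session()

# prepare_request merges the session's default headers with ours; nothing is sent.
default_req = session.prepare_request(requests.Request('GET', URL))
custom_req = session.prepare_request(
    requests.Request('GET', URL, headers={'User-Agent': UA}))

print(default_req.headers['User-Agent'])  # python-requests/<installed version>
print(custom_req.headers['User-Agent'])   # the placeholder UA above
```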
params: setting the request parameters (the URL query string is encoded automatically)
import requests

k = input('Enter a keyword: ').strip()
# res = urlencode({'wd': k}, encoding='utf-8')  # manual URL encoding is no longer needed
response = requests.get(
    'https://www.baidu.com/s?',
    headers={
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36'
    },
    params={'wd': k}
)
with open('a.html', 'w', encoding='utf-8') as f:
    f.write(response.text)
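To confirm that params produces the same query string as manual encoding, the final URL can be inspected on a prepared request. A minimal sketch, assuming the requests.Request / prepare() API and an example keyword; no request is sent.

```python
import requests

# Prepare (but do not send) a GET request with a non-ASCII parameter.
prepared = requests.Request(
    'GET', 'https://www.baidu.com/s', params={'wd': '中文'}).prepare()

print(prepared.url)  # https://www.baidu.com/s?wd=%E4%B8%AD%E6%96%87
```

After a real request, the same value is available as response.url.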
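cookies: sending cookie information with the request
# k is the keyword read in the previous example.
# Note that the keyword argument is lowercase: cookies, not Cookies.
response = requests.get(
    'https://www.baidu.com/s?',
    headers={
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36'
    },
    params={'wd': k},
    cookies={'user_session': 'wGMHFJKgDcmRIVvcA14_Wrt_3xaUyJNsBnPbYzEL6L0bHcfc'},
)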
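The cookies dict ends up as a Cookie request header. The sketch below shows this on a prepared request that is never sent; the cookie value abc123 is a made-up placeholder.

```python
import requests

# Prepare (but do not send) a request carrying one cookie.
prepared = requests.Request(
    'GET', 'https://www.baidu.com/s',
    cookies={'user_session': 'abc123'}).prepare()

print(prepared.headers['Cookie'])  # user_session=abc123
```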
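allow_redirects=False stops requests from following the Location header of a redirect response; the default is True, meaning redirects are followed.
With allow_redirects=False you stay on the current request and can read the headers of this particular response, including the Location address the redirect points to. Otherwise requests follows the redirect, and what you get back is the response for the page after the redirect.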
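The difference can be tried out locally. The sketch below starts a throwaway HTTP server on 127.0.0.1 whose /old path redirects to /new; the server, its paths, and the page body are all hypothetical stand-ins for a real site, and the session ignores proxy settings from the environment so the request stays on the local machine.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /old answers 302 and points to /new."""

    def do_GET(self):
        if self.path == '/old':
            self.send_response(302)
            self.send_header('Location', '/new')
            self.send_header('Content-Length', '0')
            self.end_headers()
        else:
            body = b'final page'
            self.send_response(200)
            self.send_header('Content-Type', 'text/plain; charset=utf-8')
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


# Port 0 lets the operating system pick a free port.
server = HTTPServer(('127.0.0.1', 0), RedirectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = 'http://127.0.0.1:%d' % server.server_address[1]

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for this local demo

# Stay on the redirect response and read its Location header.
stay = session.get(base + '/old', allow_redirects=False)
print(stay.status_code, stay.headers['Location'])

# Default behaviour: follow the redirect; the 302 is kept in .history.
follow = session.get(base + '/old')
print(follow.status_code, follow.text, [r.status_code for r in follow.history])

server.shutdown()
server.server_close()
```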
r3=session.get('https://passport.lagou.com/grantServiceTicket/grant.html', headers={ &