The requests Module, Part 02
I. Sending Requests with Headers
1. How to view a page's source:
■ Right-click > View Page Source
or
■ Right-click > Inspect
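2. How to view the response content for a given URL:
i. Right-click > Inspect
ii. Click the Network tab
iii. Check Preserve log
iv. Refresh the page
v. In the Name column, find the entry whose URL matches the browser's address bar and open its Response tab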
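3. How to send a request with headers
requests.get(url, headers=headers)
● The headers parameter accepts the request headers as a dictionary
● Each header field name is a key, and that field's value is the corresponding value
Example: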
import requests

url = 'http://www.baidu.com'

# Request without custom headers
response = requests.get(url)
print(len(response.content.decode()))
print(response.content.decode())

# Same request with a browser User-Agent header; compare the two lengths
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36'
}
response1 = requests.get(url, headers=headers)
print(len(response1.content.decode()))
print(response1.content.decode())
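When more than one header is needed, typing the dictionary by hand gets tedious. The sketch below is one possible way to turn header text copied from the browser's Network panel into the dictionary form that the headers parameter expects. The helper name headers_from_text and the sample text are hypothetical and are not part of requests.

```python
def headers_from_text(raw_text):
    """Convert 'Name: value' lines into a dict usable as the headers argument."""
    headers = {}
    for line in raw_text.splitlines():
        line = line.strip()
        # Skip blank lines, HTTP/2 pseudo-headers such as ':authority',
        # and lines that do not contain a colon at all
        if not line or line.startswith(':') or ':' not in line:
            continue
        # Split on the first colon only; values such as URLs contain colons too
        name, value = line.split(':', 1)
        headers[name.strip()] = value.strip()
    return headers


# Hypothetical sample, shaped like text copied from the browser
raw = """
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64)
Referer: https://www.baidu.com/
Accept-Language: zh-CN,zh;q=0.9
"""

headers = headers_from_text(raw)
print(headers)
```

The resulting dictionary can then be passed as requests.get(url, headers=headers).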
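II. Sending Requests with Parameters
1. Query strings
When searching on Baidu, you will often see a ? in the URL. Everything after the question mark is the request parameters, also known as the query string.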
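To make the structure of a query string concrete, the following sketch splits a URL with the standard library's urllib.parse and prints the part after the question mark. The pn parameter is added here purely for illustration, so that the example has two name=value pairs.

```python
from urllib.parse import urlsplit, parse_qs

# pn is an extra parameter added only for illustration
url = 'https://www.baidu.com/s?wd=python&pn=10'

parts = urlsplit(url)
print(parts.path)    # the path that comes before the question mark
print(parts.query)   # everything after the question mark

# Each name maps to a list, because a name may appear more than once
params = parse_qs(parts.query)
print(params)
```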
2. Carrying parameters in the URL
Send the request directly to a URL that already contains the parameters.
Example:
import requests

# The query string is written directly into the URL
url = 'https://www.baidu.com/s?wd=python'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36'
}
response = requests.get(url, headers=headers)

# Save the raw response bytes to a file
with open('baidu.html', 'wb') as f:
    f.write(response.content)
3. Carrying a parameter dictionary via params
i. Build a dictionary of request parameters
ii. Pass that dictionary as params when sending the request
Example:
import requests

url = 'https://www.baidu.com/s?'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36'
}
# Request parameters as a dictionary
data = {'wd': 'python'}
response = requests.get(url, headers=headers, params=data)
# response.url is the final URL, including the encoded query string
print(response.url)

with open('baidu1.html', 'wb') as f:
    f.write(response.content)
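One practical benefit of a parameter dictionary is that special characters get URL-encoded automatically. The sketch below makes that encoding step visible using the standard library's urlencode rather than requests itself, so the exact URL that requests reports in response.url may differ in small details. The search term is a made-up example containing a space and non-ASCII characters.

```python
from urllib.parse import urlencode, parse_qs

base_url = 'https://www.baidu.com/s'
# Made-up search term with a space and non-ASCII characters
data = {'wd': 'python 爬虫'}

# The space becomes '+', and non-ASCII characters become %XX escapes of their UTF-8 bytes
query = urlencode(data)
full_url = base_url + '?' + query
print(full_url)

# Decoding the query string gives back the original value
print(parse_qs(query))
```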
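III. Carrying Cookies in the headers Parameter
Websites often use the Cookie field of the request headers to maintain a user's session state, so we can add a Cookie to the headers parameter to simulate a request from a normal logged-in user.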
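1. Capturing and analyzing the GitHub login
i. Open the browser, right-click > Inspect, click the Network tab, and check Preserve log
ii. Visit the GitHub login URL https://github.com/login
iii. Enter the username and password and sign in, then visit a URL that only returns the correct content when logged in, for example by clicking Your profile in the upper-right corner to visit https://github.com/USER_NAME
iv. Once the URL is settled, identify the User-Agent and Cookie among the request headers needed to send that request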
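2. Completing the code
● Copy the User-Agent and Cookie from the browser
● The header field names and values in the headers parameter must match those shown in the browser
● The value of the Cookie key in the headers dictionary is a string
Example: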
import requests

url = 'https://github.com/RoninLJY'
# user-agent and cookie are copied from the browser's request headers
headers = {
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36',
'cookie': '_device_id=98c7e8f7c9fe00d90b09c0a7d1226664; _octo=GH1.1.1732856261.1618282978; has_recent_activity=1; tz=Asia%2FShanghai; user_session=RA_-59LYZnyts3zfK3Wnv98OZivABOMFII0PukUgGObTgdx6; __Host-user_session_same_site=RA_-59LYZnyts3zfK3Wnv98OZivABOMFII0PukUgGObTgdx6; tz=Asia%2FShanghai; color_mode=%7B%22color_mode%22%3A%22light%22%2C%22light_theme%22%3A%7B%22name%22%3A%22light%22%2C%22color_mode%22%3A%22light%22%7D%2C%22dark_theme%22%3A%7B%22name%22%3A%22dark%22%2C%22color_mode%22%3A%22dark%22%7D%7D; logged_in=yes; dotcom_user=RoninLJY; _gh_sess=UUxPqI8qELe%2B2iiSRIaSRwGavfapI2bARcRJrdwtxAoPv14WJN6G0yXyieyfMxsdqVRvtCCuVSn5%2B8j7q4oIxZkLTFbd9sA%2FgHwakfXT1Xpiyup4ngs8Psrqik1Li8mQaf8M1QiA5F0Xzbk6JGsaPCpCE7p5rNEW66bSY3H%2FhN62Fomf2puv7X7Bio1Xq5c%2BKaznCE%2FEMnkTXXQ1S4tXTCOL1NaqRFbQaT54Aqrf3pPiAB6X4ZhHqh%2Fg2bpvXIvxFCiP376ZviVVyIfxNUOqBqdEn4H6gBxkUo1zHKPGxX2LPVe6wBLhcvX8%2FAPimzGgi7PDuTLQ6yvsXbTF8FPfcRNx%2BKjxUWssZnMhxt5%2Fa8oOJEm3GzGW%2FwA7JRxa2Qp%2FiNGpER2PvdNULE%2FK22OF2yt8X7mk5cm2enkP8NCjk889vuRDUi2yf2IQBvfN1bz998FqCGGOpB86J00aFs%2Fb7iulZgjxlabQvaIWLrKSVlKmYbluYTCXcmNMZHVfUGxMtB%2F%2F2xxMClosyg0zDDp7WQ4vQKU%3D--edOm0scRhG8CSIdS--yf1HCbGPJtJ5dDP%2BoGDBEA%3D%3D'
}
response = requests.get(url, headers=headers)

# Save the page so it can be opened in a browser and checked for the logged-in view
with open("github_with_cookies_.html", "wb") as f:
    f.write(response.content)
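Because the Cookie value in headers is a single string, individual cookies kept as separate name/value pairs have to be joined with a semicolon and a space first. A minimal sketch of that step follows; the helper name cookie_header_from_dict is hypothetical, and the values are placeholders rather than real session data.

```python
def cookie_header_from_dict(cookies):
    """Join name/value pairs into a single Cookie header string."""
    return '; '.join(f'{name}={value}' for name, value in cookies.items())


# Placeholder values, not real session data
cookies = {'logged_in': 'yes', 'tz': 'Asia%2FShanghai'}

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
    'cookie': cookie_header_from_dict(cookies),
}
print(headers['cookie'])  # logged_in=yes; tz=Asia%2FShanghai
```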
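3. Using the cookies parameter
3.1 Form of the cookies parameter: a dictionary
cookies = {"cookie name": "cookie value"}
○ The dictionary corresponds to the Cookie string in the request headers, where each name=value pair is separated by a semicolon and a space
○ The left side of each equals sign is a cookie's name, which becomes a key of the cookies dictionary
○ The right side of each equals sign becomes the corresponding value in the cookies dictionary
3.2 How to use the cookies parameter
response = requests.get(url, cookies=cookies)
Example: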
import requests
url = 'https://github.com/RoninLJY'
headers={
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36'
}
temp='_device_id=98c7e8f7c9fe00d90b09c0a7d1226664; _octo=GH1.1.1732856261.1618282978; has_recent_activity=1; tz=Asia%2FShanghai; user_session=RA_-59LYZnyts3zfK3Wnv98OZivABOMFII0PukUgGObTgdx6; __Host-user_session_same_site=RA_-59LYZnyts3zfK3Wnv98OZivABOMFII0PukUgGObTgdx6; tz=Asia%2FShanghai; color_mode=%7B%22color_mode%22%3A%22light%22%2C%22light_theme%22%3A%7B%22name%22%3A%22light%22%2C%22color_mode%22%3A%22light%22%7D%2C%22dark_theme%22%3A%7B%22name%22%3A%22dark%22%2C%22color_mode%22%3A%22dark%22%7D%7D; logged_in=yes; dotcom_user=RoninLJY; _gh_sess=UUxPqI8qELe%2B2iiSRIaSRwGavfapI2bARcRJrdwtxAoPv14WJN6G0yXyieyfMxsdqVRvtCCuVSn5%2B8j7q4oIxZkLTFbd9sA%2FgHwakfXT1Xpiyup4ngs8Psrqik1Li8mQaf8M1QiA5F0Xzbk6JGsaPCpCE7p5rNEW66bSY3H%2FhN62Fomf2puv7X7Bio1Xq5c%2BKaznCE%2FEMnkTXXQ1S4tXTCOL1NaqRFbQaT54Aqrf3pPiAB6X4ZhHqh%2Fg2bpvXIvxFCiP376ZviVVyIfxNUOqBqdEn4H6gBxkUo1zHKPGxX2LPVe6wBLhcvX8%2FAPimzGgi7PDuTLQ6yvsXbTF8FPfcRNx%2BKjxUWssZnMhxt5%2Fa8oOJEm3GzGW%2FwA7JRxa2Qp%2FiNGpER2PvdNULE%2FK22OF2yt8X7mk5cm2enkP8NCjk889vuRDUi2yf2IQBvfN1bz998FqCGGOpB86J00aFs%2Fb7iulZgjxlabQvaIWLrKSVlKmYbluYTCXcmNMZHVfUGxMtB%2F%2F2xxMClosyg0zDDp7WQ4vQKU%3D--edOm0scRhG8CSIdS--yf1HCbGPJtJ5dDP%2BoGDBEA%3D%3D'
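# Split the Cookie string into "name=value" pairs
cookie_list = temp.split('; ')
cookies = {}
for cookie in cookie_list:
    # Split on the first '=' only, since a value may itself contain '='
    name, value = cookie.split('=', 1)
    cookies[name] = value
print(cookies)

# Pass the dictionary through the cookies keyword argument
response = requests.get(url, headers=headers, cookies=cookies)
print(response.status_code)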
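3.3 Converting a cookie string into the dictionary required by the cookies parameter:
cookies_dict = {cookie.split('=', 1)[0]: cookie.split('=', 1)[1] for cookie in cookies_str.split('; ')}
Here cookies_str is the Cookie string copied from the browser.
Example: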
import requests
url = 'https://github.com/RoninLJY'
headers={
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36'
}
temp='_device_id=98c7e8f7c9fe00d90b09c0a7d1226664; _octo=GH1.1.1732856261.1618282978; has_recent_activity=1; tz=Asia%2FShanghai; user_session=RA_-59LYZnyts3zfK3Wnv98OZivABOMFII0PukUgGObTgdx6; __Host-user_session_same_site=RA_-59LYZnyts3zfK3Wnv98OZivABOMFII0PukUgGObTgdx6; tz=Asia%2FShanghai; color_mode=%7B%22color_mode%22%3A%22light%22%2C%22light_theme%22%3A%7B%22name%22%3A%22light%22%2C%22color_mode%22%3A%22light%22%7D%2C%22dark_theme%22%3A%7B%22name%22%3A%22dark%22%2C%22color_mode%22%3A%22dark%22%7D%7D; logged_in=yes; dotcom_user=RoninLJY; _gh_sess=UUxPqI8qELe%2B2iiSRIaSRwGavfapI2bARcRJrdwtxAoPv14WJN6G0yXyieyfMxsdqVRvtCCuVSn5%2B8j7q4oIxZkLTFbd9sA%2FgHwakfXT1Xpiyup4ngs8Psrqik1Li8mQaf8M1QiA5F0Xzbk6JGsaPCpCE7p5rNEW66bSY3H%2FhN62Fomf2puv7X7Bio1Xq5c%2BKaznCE%2FEMnkTXXQ1S4tXTCOL1NaqRFbQaT54Aqrf3pPiAB6X4ZhHqh%2Fg2bpvXIvxFCiP376ZviVVyIfxNUOqBqdEn4H6gBxkUo1zHKPGxX2LPVe6wBLhcvX8%2FAPimzGgi7PDuTLQ6yvsXbTF8FPfcRNx%2BKjxUWssZnMhxt5%2Fa8oOJEm3GzGW%2FwA7JRxa2Qp%2FiNGpER2PvdNULE%2FK22OF2yt8X7mk5cm2enkP8NCjk889vuRDUi2yf2IQBvfN1bz998FqCGGOpB86J00aFs%2Fb7iulZgjxlabQvaIWLrKSVlKmYbluYTCXcmNMZHVfUGxMtB%2F%2F2xxMClosyg0zDDp7WQ4vQKU%3D--edOm0scRhG8CSIdS--yf1HCbGPJtJ5dDP%2BoGDBEA%3D%3D'
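cookie_list = temp.split('; ')
# Dict comprehension; each pair is split on the first '=' only, since a value may contain '='
cookies = {cookie.split('=', 1)[0]: cookie.split('=', 1)[1] for cookie in cookie_list}
# Equivalent loop form:
# cookies = {}
# for cookie in cookie_list:
#     cookies[cookie.split('=', 1)[0]] = cookie.split('=', 1)[1]
print(cookies)

# Pass the dictionary through the cookies parameter
response = requests.get(url, headers=headers, cookies=cookies)
with open("github_with_cookies3_.html", "wb") as f:
    f.write(response.content)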
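One edge case is worth keeping in mind when splitting cookie pairs: a cookie value can itself contain an equals sign (for example base64 padding), so only the first '=' should be treated as the separator. The sketch below wraps the conversion in a reusable function; the name cookies_from_string and the sample string are hypothetical.

```python
def cookies_from_string(cookie_string):
    """Parse 'a=1; b=2' into {'a': '1', 'b': '2'}, splitting on the first '=' only."""
    cookies = {}
    for pair in cookie_string.split(';'):
        pair = pair.strip()
        if not pair:
            continue  # ignore empty pieces, e.g. after a trailing semicolon
        name, _, value = pair.partition('=')
        cookies[name] = value
    return cookies


# Hypothetical cookie string; the token value ends with '=' padding
sample = 'logged_in=yes; token=YWJj=; tz=Asia%2FShanghai'
print(cookies_from_string(sample))

# Splitting on every '=' and taking the last piece would lose the token value
print('token=YWJj='.split('=')[-1] == '')
```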
3.4 Note: cookies usually have an expiration time; once a cookie expires, it has to be obtained again
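The lifetime of a cookie is normally announced by the server in the Set-Cookie response header, for instance through a Max-Age or Expires attribute. As a rough illustration, the sketch below parses a made-up Set-Cookie value with the standard library's http.cookies.SimpleCookie and reads the lifetime back; the header value is an assumption for demonstration, not something captured from GitHub.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie header value, as a server might send it
set_cookie = 'session=abc123; Max-Age=3600; Path=/'

parsed = SimpleCookie()
parsed.load(set_cookie)

morsel = parsed['session']
print(morsel.value)             # the cookie value itself
print(int(morsel['max-age']))   # lifetime in seconds granted by the server
print(morsel['path'])           # path the cookie applies to
```

A cookie string pasted from the browser carries none of these attributes, which is why an expired one shows up only as a failed or logged-out response.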
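4. Converting a CookieJar object to a cookies dictionary
The response object obtained with requests has a cookies attribute. Its value is a CookieJar-type object that holds the cookies the remote server set locally.
- Conversion method
cookies_dict = requests.utils.dict_from_cookiejar(response.cookies)
- Here response.cookies returns the CookieJar-type object
- The requests.utils.dict_from_cookiejar function returns a cookies dictionary
Example: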
import requests

url = 'http://www.baidu.com'
response = requests.get(url)
# response.cookies is a CookieJar-type object
print(response.cookies)

# CookieJar -> dict
dict_cookies = requests.utils.dict_from_cookiejar(response.cookies)
print(dict_cookies)

# dict -> CookieJar
jar_cookies = requests.utils.cookiejar_from_dict(dict_cookies)
print(jar_cookies)
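To show what is kept and what is dropped by such a conversion, here is a sketch of the idea using only the standard library's http.cookiejar, without any network request. It is not the code of requests itself: make_cookie and jar_to_dict are hypothetical helpers, and the cookie names, values, and domain are placeholders.

```python
from http.cookiejar import Cookie, CookieJar


def make_cookie(name, value, domain):
    """Build a standard-library Cookie; most fields are filler for this demo."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path='/', path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def jar_to_dict(jar):
    """Keep only each cookie's name and value, as a plain dictionary."""
    return {cookie.name: cookie.value for cookie in jar}


jar = CookieJar()
jar.set_cookie(make_cookie('BAIDUID', 'placeholder-id', 'example.com'))
jar.set_cookie(make_cookie('lang', 'zh', 'example.com'))

print(jar_to_dict(jar))

# The jar itself still knows the domain and path of every cookie
for cookie in jar:
    print(cookie.name, cookie.domain, cookie.path)
```

The dictionary form is convenient for the cookies parameter, but attributes such as domain, path, and expiry live only in the jar.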