requests Key Points (1)

Latest recommended article published 2023-12-18 14:21:28

小虫子啊

Categories: web scraping, python

Article link: https://blog.csdn.net/m0_47090638/article/details/111473005

1. Install the requests module:

pip install requests

2. Basic usage flow

import requests

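# 1. Set the URL to request (the scheme, e.g. https://, is required)
url = 'https://www.sogou.com'
# 2. UA spoofing: imitate a browser by sending a custom User-Agent request header.
#    The value can be copied from any request in the Network panel of the browser's developer tools.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0'
}
# 3. Query parameters can also be attached to the URL; store them in a dict
params = {
    'name': 'lihaha'
}
# 4. Send the request; response receives the response data
response = requests.get(url=url, params=params, headers=headers)
# 5. Persist the response data to a file
with open('./so', 'w', encoding='utf-8') as f:
    f.write(response.text)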
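
To see what steps 1 to 4 would actually send, the request can be prepared without being sent. This is a minimal sketch, assuming the same URL and parameters as above and a shortened placeholder User-Agent string; nothing here touches the network.

```python
import requests

url = 'https://www.sogou.com'
headers = {'User-Agent': 'Mozilla/5.0'}  # shortened placeholder UA string (assumption)
params = {'name': 'lihaha'}

# Build the request object and prepare it; no network traffic happens here.
prepared = requests.Request('GET', url, params=params, headers=headers).prepare()

print(prepared.method)                 # HTTP method
print(prepared.url)                    # URL with the query string appended
print(prepared.headers['User-Agent'])  # the custom request header
```

The params dict ends up in the query string of the URL, and the headers dict is sent as request headers, which is what requests.get does with the same arguments before sending.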

POST request (data is sent in the request body instead of the URL)
response = requests.post(url=url,data=params,headers=headers)
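
The difference from GET can be checked offline by preparing a POST request and inspecting it. A small sketch, assuming the same URL and dict as above:

```python
import requests

url = 'https://www.sogou.com'
params = {'name': 'lihaha'}

# Prepare (but do not send) a POST request with form data.
prepared = requests.Request('POST', url, data=params).prepare()

print(prepared.url)                      # the URL carries no query string
print(prepared.body)                     # the dict is form-encoded into the body
print(prepared.headers['Content-Type'])  # content type set by requests for form data
```

With data=, the dict is form-encoded into the body. To send a JSON body instead, requests also accepts a json= argument.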

3. Response attributes

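Attribute	Description
response.text	Response body as str (e.g. the page source)
response.content	Response body as bytes
response.status_code	HTTP status code of the response
response.headers	Response headers
response.request	The request that produced this response
response.encoding	Character encoding currently used to decode response.text
response.encoding = 'utf-8'	Set the character encoding
response.json()	requests' built-in JSON decoder; the body must actually be JSON, otherwise an exception is raised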
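
Because response.json() raises when the body is not JSON, it is often wrapped in a try/except. The sketch below uses a hypothetical helper, parse_json_or_none, and builds Response objects by hand so it runs offline; setting _content directly is a testing shortcut and an assumption of this example, not normal usage.

```python
import requests

def parse_json_or_none(response):
    """Return the decoded JSON body, or None if the body is not valid JSON."""
    try:
        return response.json()
    except ValueError:
        # JSON decoding errors raised by response.json() are ValueError subclasses.
        return None

# Hand-built responses (testing shortcut, no network access).
ok = requests.Response()
ok.status_code = 200
ok._content = b'{"name": "lihaha"}'

bad = requests.Response()
bad.status_code = 200
bad._content = b'<html></html>'

print(ok.status_code, parse_json_or_none(ok))
print(bad.status_code, parse_json_or_none(bad))
```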

4. Proxies

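# Using proxy IPs is a common way to get around anti-scraping measures.
# The dict keys are the scheme of the target URL; the values are the proxy addresses.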

proxies = {
    "http": "https://175.44.148.176:9000",
    "https": "https://183.129.207.86:14002"
}
response = requests.get(url=url, proxies=proxies)

# If the proxy requires a username and password, put them in the proxy URL
proxies = {
    "http": "http://user:pass@10.10.1.10:3128/",
}
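
When credentials contain characters such as '@' or ':', they need to be percent-encoded before going into the proxy URL. The sketch below is one way to build the dict; build_proxies is a hypothetical helper, the address and credentials are placeholders, and the proxy is assumed to be reachable over plain HTTP.

```python
from urllib.parse import quote

def build_proxies(host, port, user=None, password=None):
    """Return a proxies dict usable with requests (hypothetical helper)."""
    auth = ''
    if user is not None and password is not None:
        # Percent-encode credentials so '@' or ':' cannot break the URL.
        auth = f"{quote(user, safe='')}:{quote(password, safe='')}@"
    # Assumption: the proxy itself is reached over plain HTTP.
    proxy_url = f"http://{auth}{host}:{port}"
    # Route both http:// and https:// target URLs through the same proxy.
    return {"http": proxy_url, "https": proxy_url}

proxies = build_proxies("10.10.1.10", 3128, user="user", password="p@ss")
print(proxies)
```

The resulting dict can be passed as requests.get(url=url, proxies=proxies).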

5. Decoding

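Sometimes the real encoding of an HTML page differs from the encoding requests infers from the response headers, and response.text comes out garbled.
# Decode with the encoding detected from the page content itself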
response = requests.get('https://www.csdn.net/')
response.encoding = response.apparent_encoding
print(response.text)
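
The effect of response.encoding on response.text can be reproduced offline. This is a minimal sketch, assuming a page body encoded as GBK; the Response is built by hand and _content is set directly, which is a testing shortcut rather than normal usage.

```python
import requests

# Hand-built response (testing shortcut, no network access).
response = requests.Response()
response.status_code = 200
response._content = '你好，世界'.encode('gbk')  # body bytes in GBK (assumed)

response.encoding = 'ISO-8859-1'  # a wrong guess for this body
garbled = response.text

response.encoding = 'gbk'         # the encoding the bytes were written in
fixed = response.text

print(repr(garbled))
print(fixed)
```

response.text is decoded again each time it is read, so changing response.encoding is enough to fix the output. response.apparent_encoding is a guess made from the bytes and can be wrong on short or mixed content, so setting the encoding explicitly is safer when it is known.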
