Python Web Scraping: Understanding and Using the requests and urllib Libraries (Part 2)

harry5508

Categories: Web scraping, python. Tags: web scraping, requests, using the requests library

Article link: https://blog.csdn.net/harry5508/article/details/87621518

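The previous post covered the basics of the urllib library; this one covers the basics of the requests library. For scraping I strongly recommend requests, and you will see why by the end.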

Before we start, here is a very good HTTP testing site: http://httpbin.org. It offers a thorough set of endpoints for debugging and testing requests.

Using the requests library in detail

Python does not ship with the requests library, so after installing Python you need to install it yourself:

pip install requests
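To confirm the install worked before going further, a quick check like the one below can help. It is only a sketch: `requests_installed` is a name made up for this example, and the lookup uses the standard library's `importlib.util.find_spec`, so the script runs whether or not requests is present.

```python
import importlib.util

# find_spec returns None when the package cannot be found on the import path.
requests_installed = importlib.util.find_spec("requests") is not None

if requests_installed:
    import requests
    print("requests version:", requests.__version__)
else:
    print("requests is not installed; run: pip install requests")
```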

I. Getting acquainted

Request Baidu and print some basic information:

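import requests
# Open the page and get the response
response = requests.get("https://www.baidu.com")
# Print the type of the response object
print('response:', type(response))
# Print the status code
print('status_code:', response.status_code)
# Print the type of the response body when read as text (str)
print(type(response.text))
# Print the response body as a string
print('text:', response.text)
# Print the cookies
print('cookie:', response.cookies)
# Print the raw response body (bytes)
print('binary content:', response.content)
# Decode the raw bytes as UTF-8
print('content:', response.content.decode("utf-8"))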

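PS: response.text returns the page content as a string, but because requests has to guess the encoding, the text often comes out garbled. As an alternative, response.content returns the raw bytes, which you can decode yourself with the decode method.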

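There is another way to avoid garbled output with response.text: after sending the request and before reading the content, set the response.encoding attribute, for example: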

import requests
response = requests.get("http://www.baidu.com")
# Set the encoding used to decode the response body to UTF-8
response.encoding = "utf-8"
print(response.text)
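The garbling itself is easy to reproduce offline. The sketch below uses only built-in string methods and a made-up sample string (`page_text` is not fetched from Baidu). It assumes the wrong codec is ISO-8859-1, which, as far as I know, is what requests falls back to for text responses whose headers do not declare a charset.

```python
# Sample text standing in for a page body (hypothetical, not fetched from Baidu).
page_text = "百度一下，你就知道"

# What travels over the network is bytes, here encoded as UTF-8.
raw_bytes = page_text.encode("utf-8")

# Decoding with the wrong codec does not raise an error; it silently
# produces garbled characters, one per byte.
garbled = raw_bytes.decode("iso-8859-1")

# Decoding with the codec the server actually used restores the text.
restored = raw_bytes.decode("utf-8")

print(garbled)
print(restored)
```

Setting response.encoding before reading response.text plays the same role as picking the right codec in the last step.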

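II. Making requests with requests

requests supports the various HTTP request methods, for example:

GET: requests the specified page and returns the entity body.

HEAD: requests only the headers of the page.

POST: asks the server to accept the enclosed document as a new subordinate of the resource identified by the URI.

PUT: replaces the content of the specified document with the data sent from the client.

DELETE: asks the server to delete the specified page.

OPTIONS: lets the client ask which communication options (such as the supported methods) the server offers.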
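To make the difference between some of these methods visible without relying on an external site, here is a rough sketch that uses only the standard library. `DemoHandler`, `PAGE` and `call` are names invented for this example, and the tiny local server is an assumption of the demo, not something from requests.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"hello"

class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers GET, HEAD and OPTIONS for any path."""

    def _send_headers(self, body, extra_headers=()):
        self.send_response(200)
        for name, value in extra_headers:
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()

    def do_GET(self):
        self._send_headers(PAGE)
        self.wfile.write(PAGE)       # headers plus the entity body

    def do_HEAD(self):
        self._send_headers(PAGE)     # same headers as GET, but no body

    def do_OPTIONS(self):
        self._send_headers(b"", [("Allow", "GET, HEAD, OPTIONS")])

    def log_message(self, format, *args):
        pass                         # keep the demo output quiet

def call(port, method):
    """Send one request and return (status, Allow header, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(method, "/")
    resp = conn.getresponse()
    result = (resp.status, resp.getheader("Allow"), resp.read())
    conn.close()
    return result

# Port 0 lets the operating system pick a free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    results = {m: call(server.server_port, m) for m in ("GET", "HEAD", "OPTIONS")}
finally:
    server.shutdown()
    server.server_close()

for method, (status, allow, body) in results.items():
    print(method, status, allow, body)
```

With requests, the matching calls are requests.get, requests.head, requests.post, requests.put, requests.delete and requests.options, each taking the URL as the first argument.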

1. The basic GET request

1) A GET request with parameters:

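import requests
# Put the parameters in a dict and pass it via params; params accepts a dict or a list of tuples
data = {
    "name": "hanson",
    "age": 24
}
# Send a GET request and get the response
response = requests.get("http://httpbin.org/get", params=data)
# Print the final URL
print(response.url)
# Print the response body
print(response.text)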

The output is:

url: http://httpbin.org/get?name=hanson&age=24
text: {
  "args": {
    "age": "24",
    "name": "hanson"
  },
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.18.4"
  },
  "origin": "124.74.47.82, 124.74.47.82",
  "url": "https://httpbin.org/get?name=hanson&age=24"
}
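The query string in that URL can be reproduced offline with the standard library. This is only a sketch of the idea: as far as I know requests does something similar internally, but the names `query`, `url`, `encoded` and `decoded` are made up for this example, and the second dict is a hypothetical parameter added to show escaping.

```python
from urllib.parse import parse_qs, urlencode

# The same parameters as in the example above.
data = {"name": "hanson", "age": 24}

# urlencode turns the dict into key=value pairs joined by "&".
query = urlencode(data)
url = "http://httpbin.org/get" + "?" + query
print(url)

# Spaces and non-ASCII characters are escaped so they are safe inside a URL.
encoded = urlencode({"q": "python 爬虫"})
print(encoded)

# parse_qs reverses the escaping; each key maps to a list of values.
decoded = parse_qs(encoded)
print(decoded)
```

This is why passing a dict through params is safer than gluing the query string together by hand: the escaping is handled for you.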

2) Parsing a JSON response

Example:

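import requests
import json
# Send a GET request
response = requests.get("http://httpbin.org/get")
# Type of response.text (str)
print(type(response.text))
# Parse the JSON response directly (into a dict)
print(response.json())
# Or take the response text and parse it with the json module (into a dict)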
# json.loads is assumed here, since the source text is cut off mid-line
print(json.loads(response.text))
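The parsing step can be tried without any network access. The sketch below uses a sample string modeled on the httpbin.org output shown earlier (`sample_text` is hand-written, not a live response) and shows that the text is a str while the parsed result is a dict.

```python
import json

# Sample body modeled on the httpbin.org output shown earlier (not fetched live).
sample_text = (
    '{"args": {"age": "24", "name": "hanson"}, '
    '"url": "https://httpbin.org/get?name=hanson&age=24"}'
)

# The body arrives as text; json.loads turns it into Python objects.
parsed = json.loads(sample_text)
print(type(sample_text))
print(type(parsed))
print(parsed["args"]["name"])

# Text that is not valid JSON raises an error instead of returning a dict.
try:
    json.loads("<html>not json</html>")
except json.JSONDecodeError as exc:
    print("not valid JSON:", exc)
```

response.json() fails in a similar way when the body is not JSON, so it is worth wrapping it in a try block when scraping pages whose content type is uncertain.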
