Python Web Scraping: Basic Methods of the Requests Library


Mr. Wanderer

Category: Python Web Scraping

Article link: https://blog.csdn.net/mr_wanderer/article/details/114966147

Contents

Installing Requests
Basic methods of the Requests library
The get method
The post method
The request method

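Installing Requests

Requests is an HTTP library written in Python on top of urllib3 and released under the Apache 2.0 open-source license. Simple crawlers are usually written with Requests.

Install Requests with pip: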

pip install requests
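To confirm that the installation worked, one quick check is to import the library and print its version. This is a minimal sketch; the version string you see depends on your own installation.

```python
import requests

# __version__ is a plain string; its exact value depends on the installed release
print(requests.__version__)
```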

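Basic methods of the Requests library

Method	Description
requests.request()	Builds and sends a request; the base method underlying all the methods below
requests.get()	Fetches a page with an HTTP GET request; the main method for retrieving HTML
requests.head()	Fetches only the response headers (HTTP HEAD)
requests.post()	Submits data with an HTTP POST request
requests.put()	Submits a full replacement of a resource with an HTTP PUT request
requests.patch()	Submits a partial modification with an HTTP PATCH request
requests.delete()	Submits a deletion with an HTTP DELETE request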

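The get method

r = requests.get(url)

Attributes of the returned Response object:

r.status_code: the HTTP status code; 200 means success, while 4xx and 5xx codes indicate an error
r.encoding: the encoding taken from the response headers, used to decode r.text
r.apparent_encoding: a fallback encoding inferred from the response body itself
r.text: the response body decoded to a string (for a web page, the HTML)

Example: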

import requests
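# Search keyword '爬虫' ("web crawler"); Requests percent-encodes the non-ASCII text
url = 'http://baidu.com/s?wd=' + '爬虫'
response = requests.get(url)
print(response.status_code)  # 200 if the request succeeded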
print(response.text)
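The attributes listed above are often combined into a small helper that checks the status code and corrects the encoding before returning the text. The following is a minimal sketch, not the article's own code: the name `fetch_text` and the 30-second default timeout are illustrative assumptions.

```python
import requests


def fetch_text(url, timeout=30):
    """Return the decoded body of `url`, or None if the request fails.

    fetch_text and the 30-second default are illustrative assumptions.
    """
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()              # raise HTTPError on 4xx/5xx status codes
        r.encoding = r.apparent_encoding  # decode r.text with the encoding detected from the body
        return r.text
    except requests.RequestException:     # base class of the exceptions Requests raises
        return None


# A URL without a scheme is rejected before any network traffic, so this prints None
print(fetch_text('not-a-valid-url'))
```

Returning None keeps the sketch short; a real crawler would probably log the exception or re-raise it.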

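The post method

r = requests.post(url, data=body_dict)  # body_dict: a dict holding the request body

Attributes of the returned Response object (the same as for get):

r.status_code: the HTTP status code; 200 means success, while 4xx and 5xx codes indicate an error
r.encoding: the encoding taken from the response headers, used to decode r.text
r.apparent_encoding: a fallback encoding inferred from the response body itself
r.text: the response body decoded to a string (for a web page, the HTML)

Example: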

import requests
url = 'https://fanyi.baidu.com'
data = {'from': 'zh',
        'to': 'en',
        'query': '人生苦短，我用python'  # "Life is short, I use Python"
        }
response = requests.post(url, data=data)
print(response.status_code)  # 200
print(response.encoding)  # utf-8
print(response.apparent_encoding)  # utf-8
print(response.text)
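To see what a dict passed as data actually becomes on the wire, the request can be built without being sent. This sketch uses `requests.Request(...).prepare()` for that purpose; the URL `http://example.com/translate` is a hypothetical placeholder, not an endpoint from the article.

```python
import requests

# Hypothetical endpoint; prepare() builds the request without sending it
req = requests.Request('POST', 'http://example.com/translate',
                       data={'from': 'zh', 'to': 'en'})
prepared = req.prepare()

print(prepared.method)                   # POST
print(prepared.headers['Content-Type'])  # application/x-www-form-urlencoded
print(prepared.body)                     # from=zh&to=en
```

The dict is form-encoded, which is the same format a browser uses when submitting an HTML form.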

The request method

requests.request(method,url,**kwargs)

method: the HTTP method to use, one of seven:

r = requests.request('GET',url,**kwargs)
r = requests.request('HEAD',url,**kwargs)
r = requests.request('POST',url,**kwargs)
r = requests.request('PUT',url,**kwargs)
r = requests.request('PATCH',url,**kwargs)
r = requests.request('DELETE',url,**kwargs)
r = requests.request('OPTIONS',url,**kwargs)
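Because requests.request() is the base method, helpers such as requests.get() and requests.post() are convenience wrappers that pass a fixed method name to it. The sketch below builds one request per method without sending anything; the URL is a hypothetical placeholder.

```python
import requests

URL = 'http://example.com/resource'  # hypothetical URL; nothing is sent

for name in ['GET', 'HEAD', 'POST', 'PUT', 'PATCH', 'DELETE', 'OPTIONS']:
    prepared = requests.Request(name, URL).prepare()
    print(prepared.method, prepared.url)

# Method names are upper-cased during preparation, so 'delete' and 'DELETE' are equivalent
print(requests.Request('delete', URL).prepare().method)  # DELETE
```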

**kwargs: optional arguments that control the request:

params: a dict or byte sequence, appended to the URL as query-string parameters

import requests
kv = {'key1':'value1', 'key2':'value2'}
r = requests.request('GET','http://python123.io/ws',params=kv)
print(r.url)  # e.g. https://python123.io/ws?key1=value1&key2=value2 (the final URL, after any redirect)
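params also takes care of URL encoding, which matters for non-ASCII keywords such as the one in the get example. A small sketch, assuming a hypothetical URL and again preparing the request without sending it:

```python
import requests
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL; the keyword is the one used in the get example
prepared = requests.Request('GET', 'http://example.com/s',
                            params={'wd': '爬虫'}).prepare()
print(prepared.url)  # the keyword appears percent-encoded as UTF-8

# Decoding the query string recovers the original value
query = parse_qs(urlsplit(prepared.url).query)
print(query['wd'])
```

Passing a dict this way is usually safer than concatenating strings into the URL by hand.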

json: JSON-serializable data sent as the request body; commonly used with POST to submit data to a server

kv = {'key1':'value1'}
r = requests.request('POST','http://python123.io/ws',json=kv)
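The difference from data is that json serializes the dict to a JSON document and sets the matching content type. A minimal sketch with a hypothetical URL, built but not sent:

```python
import json
import requests

kv = {'key1': 'value1'}
prepared = requests.Request('POST', 'http://example.com/ws', json=kv).prepare()

print(prepared.headers['Content-Type'])  # application/json
print(json.loads(prepared.body) == kv)   # True
```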

headers: a dict of HTTP request headers

hd = {'user-agent':'Chrome/10'}
r = requests.request('POST','http://python123.io/ws',headers=hd)
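Custom headers are mostly used to replace the default User-Agent, which announces that the client is a Python script. This sketch prints the default and then shows the override on a prepared (unsent) request; the URL is a hypothetical placeholder.

```python
import requests

# Without a custom header, Requests identifies itself as python-requests/<version>
print(requests.utils.default_headers()['User-Agent'])

hd = {'user-agent': 'Chrome/10'}
prepared = requests.Request('GET', 'http://example.com/', headers=hd).prepare()

# Header lookup is case-insensitive
print(prepared.headers['User-Agent'])  # Chrome/10
```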

files: a dict, used to upload files

with open('data.xls', 'rb') as f:  # the with block closes the file after the upload
    fs = {'file': f}
    r = requests.request('POST','http://python123.io/ws',files=fs)
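Files are sent as a multipart/form-data body. The sketch below shows this without touching the disk or the network: the in-memory `fake_file` and the upload URL are illustrative assumptions standing in for a real file and server.

```python
import io
import requests

# An in-memory stand-in for open('data.xls', 'rb'); the bytes are placeholder content
fake_file = io.BytesIO(b'placeholder spreadsheet bytes')
fs = {'file': ('data.xls', fake_file)}  # (filename, file object) tuple

prepared = requests.Request('POST', 'http://example.com/upload', files=fs).prepare()

print(prepared.headers['Content-Type'].startswith('multipart/form-data'))  # True
print(b'filename="data.xls"' in prepared.body)                             # True
```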

timeout: the timeout in seconds; if the server does not respond in time, a Timeout exception is raised

r = requests.request('GET','http://python123.io/ws',timeout = 10)
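Since an expired timeout raises an exception, the call is normally wrapped in try/except. The following is a sketch under stated assumptions: the helper name `get_or_none` and its default values are illustrative, not part of Requests. It also uses the tuple form of timeout, which sets the connect and read limits separately.

```python
import requests


def get_or_none(url, connect_timeout=3, read_timeout=10):
    """Return the Response, or None if the server is too slow.

    get_or_none and both default values are illustrative assumptions.
    """
    try:
        # A (connect, read) tuple sets the two timeouts separately
        return requests.get(url, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.Timeout:
        # Covers both ConnectTimeout and ReadTimeout
        return None
```

Other failures, such as a refused connection, are deliberately not caught here so that they stay visible.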

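proxies: a dict that routes requests through proxy servers (which hides the crawler's own address from the target site); the proxy URL can include login credentials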

pxs = {
    'http':'http://user:pass@10.10.10.1:1234',
    'https':'https://10.10.10.1:4321'
}
r = requests.request('GET','http://python123.io/ws',proxies = pxs)
