# -*- coding:utf-8 -*-
import json

import lxml.html
import requests
from fake_useragent import UserAgent
# Request configuration: custom headers, a cookie jar, and proxy settings.
headers = {"referer": ""}
cookie = {"Cookie": ""}
proxies = {
    "http": "http://username:password@host",
    "https": "http://username:password@host",
}
# Rotate the user-agent string via fake_useragent.
headers["user-agent"] = UserAgent().random
url = ""
# Send a GET request.
response = requests.get(
    url=url,
    headers=headers,
    cookies=cookie,
    proxies=proxies,
    timeout=10,
)
# Send a POST request.
# Check the "Content-Type" field in the captured request headers: it specifies
# the format of the request parameters, so the headers must carry a matching
# Content-Type entry.
# POST parameters come in four forms: form-data, x-www-form-urlencoded, raw, binary.
headers = {
    "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A",
    "referer": "",
    # This value is taken from the captured request headers.
    "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
}
form_data = {}
response = requests.post(
    url=url,
    data=form_data,
    headers=headers,
    cookies=cookie,
    proxies=proxies,
    timeout=10,
)
# HTML response body: parse into an lxml element tree for XPath/CSS queries.
result = lxml.html.fromstring(response.text)
# JSON response body: requests' built-in decoder.
parsedJson = response.json()
# Equivalent to the line above, decoding the raw text with the stdlib json
# module (requires `import json` at the top of the file — it was missing,
# which made this line raise NameError).
parseJson = json.loads(response.text)
# Scraper residue from the source blog page (kept as comments so the file stays valid Python):
# Article title: [爬虫]requests发请求进行数据采集 ("Web scraping: collecting data by sending requests with `requests`")
# 最新推荐文章于 2023-05-12 14:57:38 发布 (latest recommended article published 2023-05-12 14:57:38)