Adding request headers when accessing a website with Python: how to set request headers in a Python crawler

Latest recommended article published on 2023-07-06 12:42:06

Byte DIY


Article link: https://blog.csdn.net/weixin_33903769/article/details/112904874


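This article walks through setting HTTP request headers in Python with requests, Selenium + Chrome, Selenium + PhantomJS, the Scrapy framework, and the aiohttp library, with example code for each, so that requests look like they come from a browser and are less likely to be identified as a crawler.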


Original title: How do you set request headers in a Python crawler?

1. Setting request headers with requests:

import requests

url = "http://www.targetweb.com"
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Referer': 'http://www.baidu.com/',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4882.400 QQBrowser/9.7.13059.400',
}

# Pass the dict through the headers keyword argument
res = requests.get(url, headers=headers)

# Downloading an image needs the response as a byte stream; request it like this
# (headers must be passed by keyword, because it follows stream=True):
# res = requests.get(url, stream=True, headers=headers)
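When several requests share the same headers, one option is to set them once on a `requests.Session`. The sketch below is only an illustration, not part of the original article: it reuses the article's placeholder URL and User-Agent, and it calls `prepare_request` so the merged headers can be inspected without sending anything over the network.

```python
import requests

URL = "http://www.targetweb.com"  # placeholder host from the article
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/53.0.2785.104 Safari/537.36 "
    "Core/1.53.4882.400 QQBrowser/9.7.13059.400"
)

# Headers set on the session are sent with every request made through it.
session = requests.Session()
session.headers.update({
    "User-Agent": USER_AGENT,
    "Referer": "http://www.baidu.com/",
})

# Headers given on a single request are merged with the session headers
# and win when both define the same name.
request = requests.Request("GET", URL, headers={"Accept": "image/webp,*/*;q=0.8"})
prepared = session.prepare_request(request)  # builds the request, sends nothing

for name, value in prepared.headers.items():
    print(f"{name}: {value}")
```

To actually send the request, `session.send(prepared)` or simply `session.get(URL)` can be used in place of the inspection loop.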

2. Setting request headers with Selenium + Chrome:

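from selenium import webdriver

options = webdriver.ChromeOptions()
# Use Chinese as the browser language
options.add_argument('--lang=zh-CN')
# Override the User-Agent; the value is not wrapped in extra quotes, since those may end up in the header itself
options.add_argument('--user-agent=Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4882.400 QQBrowser/9.7.13059.400')

# The chrome_options= keyword is deprecated; current Selenium versions expect options=
browser = webdriver.Chrome(options=options)
url = "http://www.targetweb.com"
browser.get(url)
browser.quit()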

3. Setting request headers with Selenium + PhantomJS:

# Note: webdriver.PhantomJS is deprecated and has been removed from Selenium 4,
# so this example needs an older Selenium release with PhantomJS installed.
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

# Copy the default PhantomJS capabilities and override the User-Agent
des_cap = dict(DesiredCapabilities.PHANTOMJS)
des_cap["phantomjs.page.settings.userAgent"] = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4882.400 QQBrowser/9.7.13059.400"
)

browser = webdriver.PhantomJS(desired_capabilities=des_cap)
url = "http://www.targetweb.com"
browser.get(url)
browser.quit()

4. Setting request headers in the Scrapy crawler framework:

Add the following to settings.py:

DEFAULT_REQUEST_HEADERS = {
    'accept': 'image/webp,*/*;q=0.8',
    'accept-language': 'zh-CN,zh;q=0.8',
    'referer': 'https://www.baidu.com/',
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4882.400 QQBrowser/9.7.13059.400',
}

5. Setting request headers with aiohttp for asynchronous Python:

import asyncio

import aiohttp

url = "http://www.targetweb.com"
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
    'Referer': 'http://www.baidu.com/',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4882.400 QQBrowser/9.7.13059.400',
}

async def main():
    # "async with" is only valid inside a coroutine, so the request lives in main()
    async with aiohttp.ClientSession(headers=headers) as session:
        async with session.get(url) as resp:
            print(resp.status)
            print(await resp.text())

asyncio.run(main())
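Headers passed to `aiohttp.ClientSession` act as defaults for the whole session, and an individual request can add or override them. The following is a minimal sketch under that assumption; `fetch_status` and `main` are illustrative names, the URL is the article's placeholder, and nothing is sent until `main()` is actually run.

```python
import asyncio

import aiohttp

URL = "http://www.targetweb.com"  # placeholder host from the article
DEFAULT_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/53.0.2785.104 Safari/537.36 "
        "Core/1.53.4882.400 QQBrowser/9.7.13059.400"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
}


async def fetch_status(session, url, extra_headers=None):
    """Send a GET request and return its status code.

    extra_headers is merged with the session defaults for this request only.
    """
    async with session.get(url, headers=extra_headers) as resp:
        return resp.status


async def main():
    # Session-level headers are sent with every request made through the session.
    async with aiohttp.ClientSession(headers=DEFAULT_HEADERS) as session:
        status = await fetch_status(session, URL, {"Referer": "http://www.baidu.com/"})
        print(status)
```

Running `asyncio.run(main())` performs the request; the `Referer` header is then sent only for that one call, while the `User-Agent` and `Accept` defaults apply to every request in the session.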

Original article: https://www.py.cn/spider/guide/18537.html
