西北乱跑娃 --- Five Anti-Anti-Scraping Techniques for requests Crawlers

西北乱跑娃

Last modified 2022-09-14 22:15:32

Categories: python, crawler, python技术前沿. Tags: crawler, ssl

First published 2019-07-30 16:24:01

Article link: https://blog.csdn.net/human_soul/article/details/97793669

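Most readers will already be familiar with web crawlers. This post covers five ways to get past anti-scraping measures.

1. SSL certificate verification error

Error: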

requests.exceptions.SSLError: ("bad handshake: Error([('SSL routines', 
'tls_process_server_certificate', 'certificate verify failed')],)",)

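Solution: pass verify=False so that requests skips certificate verification.

import requests

url = "https://www.baidu.com/"
# verify=False turns off SSL certificate verification for this request
response = requests.get(url, verify=False).content.decode()     # decode() defaults to UTF-8
print(response)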
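With verification off, requests still emits an `InsecureRequestWarning` on every call, and the connection is no longer protected against a forged certificate. The sketch below is one way to handle both cases; the helper name `make_session` and its `ca_bundle` parameter are assumptions for illustration and are not part of the original post.

```python
import requests
import urllib3


def make_session(ca_bundle=None):
    """Return a requests.Session configured for certificate handling.

    ca_bundle is a hypothetical parameter: a path to a CA bundle file.
    If it is given, verification stays on and uses that file.
    If it is None, verification is turned off, as in the example above.
    """
    session = requests.Session()
    if ca_bundle is not None:
        # Keep verification on, but trust the certificates in this file.
        session.verify = ca_bundle
    else:
        # Skip verification and silence the warning requests would print.
        urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
        session.verify = False
    return session
```

Calling `make_session().get(url, timeout=10)` should then behave like the `verify=False` request above, without the repeated warning. Where the site's certificate chain is available as a file, passing its path keeps verification enabled, which is usually the safer choice.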

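2. Too many requests from a single User-Agent

Note: some sites count how many requests a single User-Agent makes to the server per unit of time.
Solution: send a randomly chosen User-Agent with each request.

pip install fake_useragent      # install the UA library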

from fake_useragent import UserAgent
import requests

url = 'https://www.baidu.com/'
# .random returns a randomly chosen browser User-Agent string
ua = UserAgent().random
header = {
    'User-Agent': ua
}
response = requests.get(url, headers=header, verify=False).content.decode()
print(response)
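If fake_useragent is not installed or cannot load its data, the same idea can be approximated with a small hand-maintained pool. This is only a sketch: the `USER_AGENTS` list and the `random_headers` helper are assumptions introduced here, and the strings are sample values that should be replaced with ones suited to the target site.

```python
import random

# Hypothetical pool of User-Agent strings (sample values, not from the original post).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


def random_headers():
    """Build a headers dict whose User-Agent is picked at random from the pool."""
    return {"User-Agent": random.choice(USER_AGENTS)}
```

The returned dict can be passed straight to `requests.get(url, headers=random_headers())`, so each request may present a different User-Agent.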
