Python Web Scraping, an Example from Prof. Song Tian's MOOC: IP Address Lookup, Full Code with a Small Improvement

꯭哈꯭哈

Published 2021-08-11 23:19:58


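Time has passed since Prof. Song Tian wrote his code in 2017, and the example site used in the course, www.ip138.com, now has anti-scraping measures. To keep the example working, the original code needs a spoofed User-Agent.

PS: Many sites today expect a browser-like User-Agent, and sometimes a cookie as well. Both can be spoofed by putting them in the headers dict that is passed with the request.

Original code (simplified to make it easier to follow):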
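To see what the headers dict actually changes, here is a minimal sketch that builds the request without sending it and prints what would go out. The Cookie value is a made-up placeholder (not something ip138 is known to need), and passing the IP through params instead of writing it into the URL is just my own preference here.

```python
import requests

# Illustrative header values. The Cookie is a hypothetical placeholder;
# if a site really needs one, copy it from your own browser session.
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Cookie": "sessionid=PLACEHOLDER",
}

# Build the request but do not send it, so the outgoing
# URL and headers can be inspected without touching the network.
prepared = requests.Request(
    "GET",
    "https://m.ip138.com/iplookup.asp",
    params={"ip": "202.204.80.112"},
    headers=headers,
).prepare()

print(prepared.url)
print(prepared.headers["User-Agent"])
print(prepared.headers["Cookie"])
```

Without a custom User-Agent, requests identifies itself as python-requests/<version>, which is easy for a site to filter out.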

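import requests

url = 'https://m.ip138.com/iplookup.asp?ip=202.204.80.112'
try:
    # No custom headers: this is the version the site's anti-scraping check now blocks
    r = requests.get(url)
    r.raise_for_status()                # raise on 4xx/5xx status codes
    r.encoding = r.apparent_encoding    # guess the encoding from the page content
    print(r.text)
except requests.RequestException:
    print("Scraping failed")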

Improved code (adds a spoofed User-Agent, which is then passed to requests.get as headers=headers):

headers={
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
}

Full code:

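import requests

# Pretend to be a regular browser so the site does not reject the request
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
}
url = 'https://m.ip138.com/iplookup.asp?ip=202.204.80.112'
try:
    r = requests.get(url, headers=headers)
    r.raise_for_status()                # raise on 4xx/5xx status codes
    r.encoding = r.apparent_encoding    # guess the encoding from the page content
    print(r.text)
except requests.RequestException:
    # Catch only request-related errors instead of a bare except
    print("Scraping failed")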

Result: