Python Web Crawling and Information Extraction MOOC: Code Examples

Latest recommended article published on 2024-05-23 17:47:04

程序员学府

Category: Python programming basics. Tags: python, programming languages

Article link: https://blog.csdn.net/chengxun02/article/details/104998851

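This article walks through code examples from the MOOC "Python Web Crawling and Information Extraction". Each example is a short script built on the requests library and is meant as a reference for study or day-to-day work.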

Example 1: Fetching a page

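import requests

# Product page to fetch
url = "https://item.jd.com/2646846.html"
try:
    r = requests.get(url)
    r.raise_for_status()  # raise an HTTPError for 4xx/5xx responses
    r.encoding = r.apparent_encoding  # use the encoding detected from the body
    print(r.text[:1000])  # show the first 1000 characters
except requests.RequestException:
    print("Fetch failed")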

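This is an ordinary page fetch: the site places no restrictions on the request, so no extra headers are needed.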
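The try/except pattern above can be wrapped into a reusable helper. The following is a minimal sketch, not part of the original article: the function name `get_html_text`, the `timeout` default, and the injectable `get` parameter are all assumptions made for illustration.

```python
import requests


def get_html_text(url, timeout=10, get=requests.get):
    """Return the decoded page text, or None if the request fails.

    `get` is a hypothetical hook (defaulting to requests.get) that lets
    the function be exercised without touching the network.
    """
    try:
        r = get(url, timeout=timeout)
        r.raise_for_status()  # turn 4xx/5xx responses into exceptions
        r.encoding = r.apparent_encoding  # use the encoding detected from the body
        return r.text
    except requests.RequestException:
        # Covers connection errors, timeouts, and HTTP error statuses
        return None
```

Calling `get_html_text(url)` with the URL from the example above would return the page text on success and `None` on any request failure, which lets the caller decide how to report the error.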

Example 2: Fetching a page

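import requests

url = "https://www.amazon.cn/gp/product/B01M8L5Z3Y"
try:
    # Present the request as coming from a browser
    kv = {'user-agent': 'Mozilla/5.0'}
    r = requests.get(url, headers=kv)
    r.raise_for_status()  # raise an HTTPError for 4xx/5xx responses
    r.encoding = r.apparent_encoding  # use the encoding detected from the body
    print(r.text[1000:2000])  # show characters 1000 to 1999
except requests.RequestException:
    print("Fetch failed")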

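The site restricts access based on the User-Agent header of the request, so the script sets that header to imitate a browser.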
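To see what the `headers` argument changes, one can prepare a request without sending it and inspect the User-Agent it would carry. This is a small sketch under stated assumptions: the helper name `prepare_with_user_agent` is hypothetical, and it relies on `requests.Session.prepare_request`, which merges the session's default headers with the ones passed in.

```python
import requests


def prepare_with_user_agent(url, user_agent=None):
    """Build (but do not send) a GET request and return the prepared request.

    When `user_agent` is None, the session default applies, which
    identifies the client as python-requests.
    """
    headers = {"User-Agent": user_agent} if user_agent else {}
    session = requests.Session()
    request = requests.Request("GET", url, headers=headers)
    # prepare_request merges session defaults with the per-request headers
    return session.prepare_request(request)
```

Comparing `prepare_with_user_agent(url).headers["User-Agent"]` with `prepare_with_user_agent(url, "Mozilla/5.0").headers["User-Agent"]` shows the default library identifier on one side and the browser-like value on the other, which is the difference a site may use to reject a request.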

Example 3: Querying a search engine

# Baidu keyword interface: http://www.baidu.com/s?wd=keyword
# 360 keyword interface: http://www.so.com/s?q=keyword
import requests

keyword = "python"
try:
    kv = {'wd': keyword}
    r=requests.get("http://www.baidu.com/s",par
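The listing above is cut off in the source. As a hedged illustration of how a keyword query of this kind might look, the sketch below assumes the truncated argument is `params=kv`; the helper names `build_search_url` and `search_result_length` are hypothetical and are not taken from the original article.

```python
import requests

BAIDU_SEARCH = "http://www.baidu.com/s"  # keyword parameter: wd
SO_SEARCH = "http://www.so.com/s"        # keyword parameter: q


def build_search_url(base_url, param_name, keyword):
    """Return the URL requests would send for this keyword, without sending it."""
    request = requests.Request("GET", base_url, params={param_name: keyword})
    return request.prepare().url


def search_result_length(base_url, param_name, keyword, timeout=10):
    """Send the query and return the length of the response text, or None on failure."""
    try:
        r = requests.get(base_url, params={param_name: keyword}, timeout=timeout)
        r.raise_for_status()  # raise an HTTPError for 4xx/5xx responses
        return len(r.text)
    except requests.RequestException:
        return None
```

`build_search_url(BAIDU_SEARCH, "wd", "python")` shows how the `params` dictionary is encoded into the query string described in the comments, and `search_result_length(SO_SEARCH, "q", "python")` would run the same query against the 360 interface.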
