Installing the Python scapy module, Python_Scrapy: installing and using third-party modules

Latest recommended article published 2024-03-26 13:44:58

翻越时空你也难以忘记


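Installing third-party modules

1. Installing and using the requests library

The requests library essentially simulates what a browser does when it opens a web page and sends a request. It quickly retrieves the HTML source of the requested page, which can then be saved locally.

Installation

Press Win+R, type "cmd" to open the Command Prompt, and enter "pip install requests" to install the module with pip.

Checking the installation

Press Win+R, type "cmd" to open the Command Prompt, and enter "pip list" to see every third-party module installed through pip.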
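The same check can also be made from inside Python. The following is a minimal sketch using the standard library's importlib.metadata; the helper name installed_version is an illustrative assumption, not part of pip or requests.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report whether the package installed above is visible to this interpreter.
for name in ("requests",):
    print(name, installed_version(name) or "not installed")
```

This queries the interpreter that runs the script, which helps when several Python installations exist on one machine and "pip list" may be reporting on a different one.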

Basic usage

First, import the requests package:

import requests

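Next, fetch the HTML source of Baidu's index page and keep the response in the variable r.

Note that the http:// prefix must be written out explicitly; unlike a real browser, requests does not fill in the protocol for us.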

r = requests.get("http://www.baidu.com")
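To see what happens when the prefix is left out, the sketch below prepares a request without sending it, so no network access is needed. The helper name has_usable_scheme is a hypothetical name chosen for illustration; requests.Request, its prepare() method, and requests.exceptions.MissingSchema are part of the library.

```python
import requests


def has_usable_scheme(url):
    """Return False when requests rejects the URL for lacking a scheme such as http://."""
    try:
        # prepare() validates the URL but does not open a connection.
        requests.Request("GET", url).prepare()
    except requests.exceptions.MissingSchema:
        return False
    return True


print(has_usable_scheme("www.baidu.com"))         # scheme missing
print(has_usable_scheme("http://www.baidu.com"))  # scheme present
```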

Print the downloaded content:

print(r.text)

The output is the HTML source of the Baidu home page.
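Since the goal is to keep the HTML source locally, a slightly fuller sketch might check the response status, set the encoding, and write the text to a file. The function names fetch_html and save_html, the 10-second timeout, and the output file name are assumptions made for illustration.

```python
import requests


def fetch_html(url, timeout=10):
    """Download a page and return its HTML as text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    # Use the encoding guessed from the body; the declared one may be missing or wrong.
    response.encoding = response.apparent_encoding
    return response.text


def save_html(html_text, path):
    """Write HTML text to a local file as UTF-8 and return the path."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(html_text)
    return path
```

A call such as save_html(fetch_html("http://www.baidu.com"), "baidu.html") would then store the page next to the script.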

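2. Installing and using the bs4 library

The bs4 library provides functions for parsing, traversing, and maintaining the "tag tree".

Installation

Press Win+R, type "cmd" to open the Command Prompt, and enter "pip install beautifulsoup4" to install the module with pip.

Checking the installation

Press Win+R, type "cmd" to open the Command Prompt, and enter "pip list" to see every third-party module installed through pip.

Basic usage

1. We use the following HTML as an example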

<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>

2. Now let's parse this HTML with the bs4 library.

# Import the bs4 module
from bs4 import BeautifulSoup

# html is a string holding the HTML document above
soup = BeautifulSoup(html, 'html.parser')

# Print the result
print(soup.prettify())

'''
OUT:
# The Dormouse's story
# Once upon a time there were three little sisters; and their names were
# Elsie
# ,
# Lacie
# and
# Tillie
# ; and they lived at the bottom of a well.
# ...
'''

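Put simply, bs4 parses the HTML source into a tag tree, which prettify() prints with clean indentation, so that we can easily work with its nodes, tags, and attributes.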
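As a rough illustration of working with tags and attributes, the sketch below reads the title, collects the three links, and looks up one element by id. The HTML string follows the well-known sample from the Beautiful Soup documentation; its exact tag markup is an assumption wherever the example text above is ambiguous.

```python
from bs4 import BeautifulSoup

# Sample document; the markup is assumed to match the example used in this section.
html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# A tag can be reached by name; .string gives its text content.
title_text = soup.title.string

# find_all returns every matching tag; attributes are read like dictionary keys.
links = [(a.get_text(), a["href"]) for a in soup.find_all("a", class_="sister")]

# find returns the first match, here selected by its id attribute.
third = soup.find(id="link3")

print(title_text)
for text, href in links:
    print(text, href)
print(third.get_text())
```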

3. Installing and using a parser for bs4

We use the lxml parser.

Installation

pip install lxml

Usage

1. We again use the HTML document from the previous section

2. Parse it with lxml

import bs4

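# First, make a soup from the HTML file using the lxml parser
# (encoding='utf-8' assumes the file was saved as UTF-8)
with open('Beautiful Soup 爬虫/demo.html', encoding='utf-8') as f:
    soup = bs4.BeautifulSoup(f, 'lxml')

# Print the result; it is a clearly indented tree structure.
print(soup.prettify())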

'''
OUT:
The Dormouse's story
Once upon a time there were three little sisters; and their names were
Elsie
Lacie
and
Tillie
;
and they lived at the bottom of a well.
...
'''
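If a script may run on a machine where lxml is not installed, one option is to fall back to Python's built-in parser. The sketch below assumes a hypothetical helper named make_soup; FeatureNotFound is the exception bs4 raises when the requested parser is unavailable.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred="lxml"):
    """Build a soup with the preferred parser, falling back to html.parser."""
    try:
        return BeautifulSoup(markup, preferred)
    except FeatureNotFound:
        # The requested parser is not installed; use the one bundled with Python.
        return BeautifulSoup(markup, "html.parser")


soup = make_soup("<p>Hello <b>world</b></p>")
print(soup.b.get_text())
```

The two parsers can build slightly different trees from the same input. For example, lxml wraps a bare fragment in html and body tags while html.parser does not, so code that depends on the exact tree shape is safer with one parser chosen explicitly.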
