'''
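Using the BeautifulSoup library:
Built-in HTML parser: BeautifulSoup(mk, 'html.parser'), no extra install beyond bs4
lxml HTML parser: BeautifulSoup(mk, 'lxml'), requires pip install lxml
lxml XML parser: BeautifulSoup(mk, 'xml'), requires pip install lxml
html5lib parser: BeautifulSoup(mk, 'html5lib'), requires pip install html5lib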
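Basic elements of BeautifulSoup
Tag: a tag
Name: the tag's name, <Tag>.name
Attributes: the tag's attributes, organized as a dict, <Tag>.attrs
NavigableString: the non-attribute string inside a tag, <Tag>.string
Comment: the comment part of the string inside a tag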
'''
import requests
from bs4 import BeautifulSoup
# import bs4
# Fetch the demo page and parse it with the built-in HTML parser.
r = requests.get('http://python123.io/ws/demo.html')
demo = r.text
soup = BeautifulSoup(demo, 'html.parser')
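
# A minimal sketch of picking a parser, assuming lxml may or may not be
# installed. The helper name make_soup and the inline markup are hypothetical
# and only illustrate the parser names listed in the docstring above.
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred='lxml'):
    """Parse markup with the preferred parser, falling back to html.parser."""
    try:
        return BeautifulSoup(markup, preferred)
    except FeatureNotFound:
        # bs4 raises FeatureNotFound when the requested parser is not installed.
        return BeautifulSoup(markup, 'html.parser')


sample_soup = make_soup('<p class="title"><b>demo</b></p>')
print(sample_soup.p.b.string)  # demo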
# print(soup.prettify())
# tag = soup.a
# print(tag)
# print(tag.parent.name)  # p
# print(tag.attrs['class'])
# print(tag.string)
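
# A small sketch of the basic elements (Tag, Name, Attributes,
# NavigableString, Comment), run on an inline snippet so it needs no network.
# The markup in sample_html and the variable names are hypothetical.
from bs4 import BeautifulSoup
from bs4.element import Comment, NavigableString

sample_html = (
    '<p class="title">'
    '<a href="http://example.com/a" id="link1">Basic Python</a>'
    '<b><!--a comment--></b>'
    '</p>'
)
sample = BeautifulSoup(sample_html, 'html.parser')
link = sample.a                      # Tag: the first <a> in the document

print(link.name)                     # a
print(link.parent.name)              # p
print(link.attrs['href'])            # http://example.com/a
print(sample.p.attrs['class'])       # ['title'] (class is multi-valued, so a list)
print(link.string)                   # Basic Python

# .string returns a NavigableString for plain text and a Comment for comments;
# Comment is a subclass of NavigableString, so check for Comment first.
print(isinstance(link.string, NavigableString))  # True
print(isinstance(sample.b.string, Comment))      # True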
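# Downward traversal of the tag tree
# .contents: list of child nodes; holds every direct child of the tag
# .children: iterator over child nodes, for looping over the direct children
# .descendants: iterator over all descendant nodes, for looping over the whole subtree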
# print(soup.head.cont
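
# A minimal sketch of downward traversal with .contents, .children and
# .descendants, run on a hypothetical inline document (not the demo page)
# so the results are easy to check by hand.
from bs4 import BeautifulSoup
from bs4.element import Tag

tree = BeautifulSoup('<body><p>one<b>two</b></p><p>three</p></body>', 'html.parser')
body = tree.body

# .contents is a list, so it supports len() and indexing.
print(len(body.contents))            # 2

# .children is an iterator over the same direct children.
child_names = [child.name for child in body.children]
print(child_names)                   # ['p', 'p']

# .descendants walks the whole subtree, text nodes included;
# filtering on Tag keeps only the element nodes.
descendant_tags = [node.name for node in body.descendants if isinstance(node, Tag)]
print(descendant_tags)               # ['p', 'b', 'p']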