Getting Started with the Beautiful Soup Library
Installation
pip install beautifulsoup4
Testing
>>> # Fetch the demo page with requests
>>> import requests
>>> r = requests.get("http://python123.io/ws/demo.html")
>>> r.text
'<html><head><title>This is a python demo page</title></head>\r\n<body>\r\n<p class="title"><b>The demo python introduces several python courses.</b></p>\r\n<p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:\r\n<a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and <a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>\r\n</body></html>'
>>> demo = r.text
>>>
>>> # Import BeautifulSoup (the name is case-sensitive)
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(demo,'html.parser')
>>> # prettify() puts each tag (<>) and its content on its own line by adding '\n'
>>> # prettify() also works on a single tag: <tag>.prettify()
>>> print(soup.prettify())
<html>
 <head>
  <title>
   This is a python demo page
  </title>
 </head>
 <body>
  <p class="title">
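The session above needs network access to fetch the page. As a minimal offline sketch, the same parsing step can be tried on a string literal, assuming the page content is exactly the text shown in r.text. The variable names title_text, first_link and pretty_link are illustrative and do not come from the original post.

```python
from bs4 import BeautifulSoup

# Assumption: this literal mirrors the r.text value shown in the session above.
demo = (
    '<html><head><title>This is a python demo page</title></head>\r\n'
    '<body>\r\n'
    '<p class="title"><b>The demo python introduces several python courses.</b></p>\r\n'
    '<p class="course">Python is a wonderful general-purpose programming language. '
    'You can learn Python from novice to professional by tracking the following courses:\r\n'
    '<a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and '
    '<a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>\r\n'
    '</body></html>'
)

# Parse with the parser bundled with the standard library.
soup = BeautifulSoup(demo, "html.parser")

# soup.<tag> returns the first matching tag in the document.
title_text = soup.title.string
first_link = soup.a

# prettify() can be called on a single tag as well as on the whole soup.
pretty_link = first_link.prettify()

print(title_text)
print(first_link["href"])
print(pretty_link)
```

Working from a literal keeps the example reproducible when the demo URL is unreachable; swapping the literal for r.text should give the same soup object.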

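This post covers installing, testing, and basic usage of the Beautiful Soup library, with a focus on downward, upward, and sideways traversal of the tag tree. Beautiful Soup is a practical Python tool for parsing HTML/XML documents.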
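The three traversal directions mentioned above could be sketched roughly as follows. The traversal section of the post is not part of the surviving text, so this is an illustration built on Beautiful Soup's documented navigation attributes (.children, .descendants, .parent, .parents, .next_sibling, .previous_sibling), not the author's own code. The snippet is a shortened, whitespace-free variant of the demo page, and all variable names are hypothetical.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Assumption: a shortened variant of the demo page, without the \r\n separators,
# so that every child of <body> is a tag rather than a whitespace string.
snippet = (
    '<html><head><title>This is a python demo page</title></head>'
    '<body>'
    '<p class="title"><b>The demo python introduces several python courses.</b></p>'
    '<p class="course">Courses: '
    '<a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and '
    '<a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>'
    '</body></html>'
)
soup = BeautifulSoup(snippet, "html.parser")

# Downward: .children yields direct children, .descendants walks the whole subtree.
body_children = [c.name for c in soup.body.children if isinstance(c, Tag)]
body_descendants = [d.name for d in soup.body.descendants if isinstance(d, Tag)]

# Upward: .parent is the enclosing tag, .parents walks up to the document root.
link_parent = soup.a.parent.name
ancestors = [p.name for p in soup.a.parents]

# Sideways: siblings share a parent and may be plain strings, not only tags.
prev_node = soup.a.previous_sibling
next_node = soup.a.next_sibling
second_link = next_node.next_sibling

print(body_children)
print(body_descendants)
print(link_parent, ancestors)
print(repr(prev_node), repr(next_node), second_link["id"])
```

Sibling traversal returns text nodes as well as tags, which is why the isinstance check against Tag is used when only tag names are wanted.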



