Getting Started with the Beautiful Soup Library
-
Installation
pip install beautifulsoup4
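
The package is installed under the name beautifulsoup4 but imported as bs4, which is easy to mix up. A minimal sketch for checking that the install is visible to the current interpreter, using only the standard library; the helper name is_installed is hypothetical and not part of Beautiful Soup:

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    # find_spec returns None when no top-level module of that name exists.
    return importlib.util.find_spec(module_name) is not None


# Assumption: the import name is "bs4", not "beautifulsoup4".
print("bs4 importable:", is_installed("bs4"))
```

If this reports False right after a successful pip install, pip and the interpreter probably belong to different Python environments; running python -m pip install beautifulsoup4 with the same interpreter usually resolves that.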
-
Testing
>>> # Fetch the demo page with requests
>>> import requests
>>> r = requests.get("http://python123.io/ws/demo.html")
>>> r.text
'<html><head><title>This is a python demo page</title></head>\r\n<body>\r\n<p class="title"><b>The demo python introduces several python courses.</b></p>\r\n<p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:\r\n<a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and <a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>\r\n</body></html>'
>>> demo = r.text
>>>
>>> # Import BeautifulSoup (note that the name is case-sensitive)
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(demo, 'html.parser')
>>> # prettify() adds a '\n' after every HTML tag <> and after its content
>>> # prettify() also works on a single tag: <tag>.prettify()
>>> print(soup.prettify())
<html>
 <head>
  <title>
   This is a python demo page
  </title>
 </head>
 <body>
  <p class="title">