Overview
Use the requests library together with the Beautiful Soup library to parse HTML pages.
Install command: pip install beautifulsoup4
Unit 4: Getting Started with the Beautiful Soup Library
Installing the beautifulsoup4 library
Demo HTML page URL: http://python123.io/ws/demo.html
Fetch the page source code:
import requests
r = requests.get('http://python123.io/ws/demo.html')
print(r.text)
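The three-line fetch above assumes the request succeeds and that the encoding is detected correctly. A slightly more defensive variant might look like the sketch below. The helper name fetch_html and the 10-second timeout are illustrative assumptions, not part of the original notes.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the decoded text of url; raise on HTTP or network errors.

    fetch_html and the default timeout are hypothetical choices for this sketch.
    """
    r = requests.get(url, timeout=timeout)
    # Turn 4xx/5xx status codes into a requests.HTTPError instead of silently continuing.
    r.raise_for_status()
    # Use the encoding guessed from the body, which helps when headers omit the charset.
    r.encoding = r.apparent_encoding
    return r.text


# Example call (needs network access):
# html = fetch_html('http://python123.io/ws/demo.html')
```

Wrapping the call this way keeps the later parsing code free of status and encoding checks.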
HTML text content printed by print(r.text):
<html><head><title>This is a python demo page</title></head>
<body>
<p class="title"><b>The demo python introduces several python courses.</b></p>
<p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:
<a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and <a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>
</body></html>
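To see what this text yields once it is handed to Beautiful Soup, here is a small offline sketch. It assumes beautifulsoup4 is installed and uses Python's built-in "html.parser", which is a parser choice made for this example. DEMO_HTML is a hand-copied, abbreviated version of the output above, and the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Abbreviated copy of the demo page text (the long course paragraph is shortened).
DEMO_HTML = """<html><head><title>This is a python demo page</title></head>
<body>
<p class="title"><b>The demo python introduces several python courses.</b></p>
<p class="course">Courses:
<a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and <a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>
</body></html>"""

# Build the parse tree from the string; no network access is needed.
soup = BeautifulSoup(DEMO_HTML, "html.parser")

# The text of the <title> tag.
page_title = soup.title.string

# (link text, target URL) for every <a> tag in document order.
links = [(a.get_text(), a["href"]) for a in soup.find_all("a")]

print(page_title)
for text, href in links:
    print(text, href)
```

Working from a string rather than a live request keeps the example reproducible even if the demo URL is unavailable.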
Parse the text into a well-structured HTML document:
import requests
r = requests.get('http://python123.io/ws/demo.html')
demo = r.text
from bs4 import BeautifulSoup  # key point: the class is imported from the bs4 package
soup = BeautifulSoup(demo