1. Fetch the Baidu homepage and extract its title
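#!/usr/bin/env python
# -*- coding: utf-8 -*-
from lxml import html
import requests

# Fetch the HTML page
url = 'https://www.baidu.com/'
page = requests.get(url, timeout=10)  # time out instead of hanging forever
page.raise_for_status()               # raise an exception on an HTTP error status
page.encoding = 'UTF-8'               # decode the body as UTF-8 so the Chinese title is readable

# Parse the HTML page
tree = html.fromstring(page.text)
# xpath() returns a list of matching text nodes, so title is a list
title = tree.xpath('//head/title/text()')
print(title)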
2. The requests library
Reference documentation: http://cn.python-requests.org/zh_CN/latest/
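As a small illustration of how requests assembles a request, the sketch below prepares a GET request without sending it, so it runs without network access. The helper name build_search_request, the /s search path, and the wd query parameter are assumptions chosen for illustration only; they do not come from the script above.

```python
import requests


def build_search_request(keyword):
    """Hypothetical helper: build (but do not send) a GET request."""
    req = requests.Request(
        "GET",
        "https://www.baidu.com/s",              # assumed search path, for illustration
        params={"wd": keyword},                 # encoded into the query string
        headers={"User-Agent": "Mozilla/5.0"},  # some sites expect a browser-like UA
    )
    # prepare() resolves the final URL and headers without any network I/O
    return req.prepare()


prepared = build_search_request("python")
print(prepared.method, prepared.url)
```

To actually send a prepared request, it can be passed to requests.Session().send(prepared, timeout=10); for simple cases requests.get(url, params=..., timeout=...) does the same thing in one call.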
3. The lxml library
Reference documentation: https://lxml.de/
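A minimal sketch of working with the tree that lxml builds, using an inline HTML string instead of a live page so it can be run offline. The SAMPLE markup and the variable names are made up for this example.

```python
from lxml import html

# Made-up markup standing in for a downloaded page
SAMPLE = """
<html>
  <head><title>Example page</title></head>
  <body>
    <div id="nav">
      <a href="/news">News</a>
      <a href="/maps">Maps</a>
    </div>
  </body>
</html>
"""

tree = html.fromstring(SAMPLE)

# find() returns the first matching element, or None if nothing matches
title_el = tree.find(".//title")

# iter() walks every <a> element in document order;
# .text is the element's text and .get() reads an attribute
links = [(a.text, a.get("href")) for a in tree.iter("a")]

print(title_el.text)
print(links)
```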
4. XPath
Reference documentation: https://lxml.de/xpathxslt.html
XPath tutorial: http://www.w3school.com.cn/xpath/
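The following sketch tries a few common XPath patterns through lxml's xpath() method: text nodes, attribute values, predicates, positional indexes, and a function call. The SAMPLE document is invented for the example; the first expression is the same one used in the script in section 1.

```python
from lxml import html

# Invented markup used only to exercise the expressions below
SAMPLE = """<html><head><title>Example page</title></head>
<body>
<ul class="items">
<li class="item"><a href="/a">First</a></li>
<li class="item hot"><a href="/b">Second</a></li>
<li class="item"><a href="/c">Third</a></li>
</ul>
</body></html>"""

tree = html.fromstring(SAMPLE)

# text() selects text nodes; the result is always a list
title = tree.xpath("//head/title/text()")

# @href selects attribute values; [@class='items'] is an exact-match predicate
hrefs = tree.xpath("//ul[@class='items']/li/a/@href")

# contains() matches when the class attribute holds several names
hot = tree.xpath("//li[contains(@class, 'hot')]/a/text()")

# Positional predicates are 1-based, so li[2] is the second item
second = tree.xpath("//ul/li[2]/a/text()")

# XPath functions such as count() return a number (a float in lxml)
count = tree.xpath("count(//li)")

print(title, hrefs, hot, second, count)
```

Because xpath() returns a list for node selections, code that expects a single value usually takes the first element after checking that the list is not empty.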