Step 1: Fetch the page source with the requests module
import requests
keyword = input("Enter the keyword to scrape: ")
url = r"https://xian.zbj.com/search/f/?type=new&kw=%s"%keyword
resp = requests.get(url)
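The keyword is interpolated into the URL as typed, so input containing characters such as & or spaces could change the query string. A minimal sketch of percent-encoding the keyword first, using the standard library; build_search_url is a hypothetical helper name, not part of the original code:

```python
from urllib.parse import quote

# Same search URL as above, with a placeholder for the keyword.
BASE_URL = "https://xian.zbj.com/search/f/?type=new&kw=%s"

def build_search_url(keyword: str) -> str:
    """Hypothetical helper: percent-encode the keyword, then fill in the URL."""
    # quote() escapes characters such as '&', '#', and spaces so they
    # cannot be mistaken for query-string syntax.
    return BASE_URL % quote(keyword)

print(build_search_url("logo design"))
print(build_search_url("a&b"))
```

The resulting string can be passed to requests.get(url) exactly as before.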
Step 2: Check whether the page source actually contains the data you want
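One quick way to run this check is to search the raw response text for a value that is visible in the browser. The sketch below uses a made-up HTML string in place of resp.text, and source_contains is a hypothetical helper name. If the value is missing from the source, the page is probably filled in by JavaScript, and XPath on resp.text will not find it.

```python
def source_contains(html: str, needle: str) -> bool:
    """Return True if `needle` appears anywhere in the raw page source."""
    return needle in html

# Stand-in for resp.text (assumed markup, not the real page).
sample_html = "<html><body><div class='price'>500</div></body></html>"

print(source_contains(sample_html, "500"))   # value present in the source
print(source_contains(sample_html, "900"))   # value absent from the source
```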
Step 3: Parse the source with XPath
from lxml import etree
tree = etree.HTML(resp.text)
Step 4: Locate the data. In Google Chrome, right-click the element, choose Inspect, then copy its XPath.
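A rough sketch of how a copied XPath is typically used: one absolute path selects the repeating container nodes, then relative paths (starting with "./") pull fields out of each node. The HTML, class names, and paths below are invented for illustration and will not match the real page.

```python
from lxml import etree

# Hypothetical stand-in for resp.text; the real page structure will differ.
sample_html = """
<html><body>
  <div class="list">
    <div class="item"><span class="price">500</span><p class="title">Logo design</p></div>
    <div class="item"><span class="price">1200</span><p class="title">Website build</p></div>
  </div>
</body></html>
"""

tree = etree.HTML(sample_html)

# An absolute path in the style of a path copied from the browser (assumed).
items = tree.xpath("/html/body/div/div")

results = []
for item in items:
    # "./" keeps the search inside the current item node only.
    title = item.xpath("./p[@class='title']/text()")
    price = item.xpath("./span[@class='price']/text()")
    # xpath() returns a list; guard against an empty match.
    results.append((title[0] if title else None, price[0] if price else None))

print(results)
```

Looping over container nodes and querying relative to each one tends to be less brittle than copying a separate absolute path for every field.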
Code:
import requests
from lxml import etree
def spier():
    keyword = input("Enter the keyword to scrape: ")
    url = r"https://xian.zbj.com/search/f/?type=new&kw=%s"%keyword