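BeautifulSoup and requests, introduced earlier, mainly target static pages. When a page is dynamic and its content is produced by JavaScript code, PhantomJS can be used to scrape the page data instead.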
1. Introduction to PhantomJS
2. Test
The test code is as follows:
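# -*- coding: utf-8 -*-
"""
Created on Wed May 17 21:36:29 2017
@author: Administrator
"""
from selenium import webdriver

# Requires Selenium 3.x and a phantomjs executable on PATH;
# webdriver.PhantomJS was removed in Selenium 4.
driver = webdriver.PhantomJS()
try:
    # Load the page; PhantomJS runs the page's JavaScript
    driver.get('http://hotel.qunar.com/city/beijing_city/dt-20438/?in_track=hotel_recom_beijing_city02')
    # Read the text of the element whose id is "jd_comments"
    data = driver.find_element_by_id("jd_comments").text
    print(data)
finally:
    # quit() closes the window and also ends the PhantomJS process
    driver.quit()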
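On a dynamic page, the target element may not exist yet at the moment the page finishes loading, because the script that fills it in is still running. The following is a minimal sketch of one way to handle that, assuming a simple polling loop is acceptable. The helper name `wait_for_text` and its `timeout` and `poll` defaults are illustrative assumptions, not part of the original test; Selenium's own `WebDriverWait` serves the same purpose.

```python
import time


def wait_for_text(driver, element_id, timeout=10.0, poll=0.5):
    """Poll until the element with the given id has non-empty text.

    `driver` is any object exposing find_element_by_id(), such as the
    Selenium 3.x PhantomJS driver used in the test code. The helper
    name and the timeout/poll defaults are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            text = driver.find_element_by_id(element_id).text
            if text:
                return text
        except Exception:
            # Element not in the DOM yet; the page script may still be
            # running. A broad catch keeps this sketch free of Selenium
            # imports; real code would catch NoSuchElementException.
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "no text in #%s after %.1f s" % (element_id, timeout)
            )
        time.sleep(poll)
```

With this helper, the direct lookup could be written as `data = wait_for_text(driver, "jd_comments")`, which retries for up to ten seconds before giving up.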
The result of running the test code is shown in the figure below: