Handling sites that require cookie-based login and dynamically loaded content

phantomjsMiddleware

class PhantomJSMiddleware(object):
    """Scrapy downloader middleware: render the page in PhantomJS and replay its cookies."""

    @classmethod
    def process_request(cls, request, spider):
        from selenium import webdriver
        from scrapy.http import HtmlResponse

        phantomjs_path = r'C:\InstallFile\Phantomjs\bin\phantomjs.exe'

        # First browser: load the page once to collect the cookies the site sets.
        driver = webdriver.PhantomJS(phantomjs_path)
        driver.get(request.url)
        driver.implicitly_wait(1)  # only affects element lookups; it is not a sleep
        saved_cookies = driver.get_cookies()
        driver.quit()

        # Second browser: open the same URL first so cookies can be set for its domain.
        driver2 = webdriver.PhantomJS(phantomjs_path)
        driver2.get(request.url)
        driver2.implicitly_wait(1)
        driver2.delete_all_cookies()

        for cookie in saved_cookies:
            # Fill in a fallback expiry when the cookie has none (fixed timestamp from the original post).
            cookie.setdefault('expiry', 1475825481)
            driver2.add_cookie({k: cookie[k] for k in ('name', 'value', 'domain', 'path', 'expiry') if k in cookie})

        # Reload with the cookies in place and take the rendered HTML from this session.
        driver2.get(request.url)
        content = driver2.page_source.encode('utf-8')
        driver2.quit()

        # Returning a response here makes Scrapy skip its own download of this request.
        return HtmlResponse(request.url, encoding='utf-8', body=content, request=request)
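
The cookie handling inside the loop can also be factored into a small helper that is easy to test without a browser. The following is only a minimal sketch and is not part of the original post: `normalize_cookie`, `COOKIE_KEYS`, `default_ttl`, and `now` are hypothetical names, and computing the fallback expiry relative to the current time (instead of the fixed timestamp used in the middleware) is an assumption.

```python
import time

# Keys the middleware passes on to add_cookie(); everything else is dropped.
COOKIE_KEYS = ('name', 'value', 'domain', 'path', 'expiry')


def normalize_cookie(cookie, default_ttl=3600, now=None):
    """Return a copy of `cookie` limited to COOKIE_KEYS, with an expiry filled in.

    `default_ttl` (seconds) and `now` (epoch seconds) are hypothetical parameters
    used only for this illustration.
    """
    now = int(time.time()) if now is None else int(now)
    cleaned = {k: cookie[k] for k in COOKIE_KEYS if k in cookie}
    # Keep an existing expiry; otherwise fall back to now + default_ttl.
    cleaned.setdefault('expiry', now + default_ttl)
    return cleaned


if __name__ == '__main__':
    sample = {'name': 'sessionid', 'value': 'abc', 'domain': 'example.com',
              'path': '/', 'httpOnly': True}
    print(normalize_cookie(sample))
```

With such a helper, the loop body in the middleware could be reduced to `driver2.add_cookie(normalize_cookie(cookie))`.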

 

Reposted from: https://www.cnblogs.com/liyugeng/p/7908787.html
