Scrapy: simulating a login with FormRequest, then continuing the crawl

1. References

https://doc.scrapy.org/en/latest/topics/spiders.html#scrapy.spiders.Spider.start_requests

Automatically submitting the form returned by login.php:

https://doc.scrapy.org/en/latest/topics/request-response.html#using-formrequest-from-response-to-simulate-a-user-login

 

2. Simulating a login to Xueqiu (xueqiu.com)

# -*- coding: utf-8 -*-
import os
import scrapy
from scrapy.shell import inspect_response

# See the start_requests() section of https://doc.scrapy.org/en/latest/topics/spiders.html

class LoginSpider(scrapy.Spider):
    name = 'login'
    allowed_domains = ['xueqiu.com']
    # start_urls = ['http://xueqiu.com/']  # The default start_requests() generates Request(url, dont_filter=True) for each url in start_urls.

    url_login = 'https://xueqiu.com/snowman/login'
    url_somebody = 'https://xueqiu.com/u/6146070786'
    # Credentials are read from environment variables; set xueqiu_username and xueqiu_password before running.
    data_dict = {
        'remember_me': 'true',
        # 'username': 'fake',  # returns 200 {"error_description":"用户名或密码错误","error_uri":"/provider/oauth/token","error_code":"20082"} (the description means "wrong username or password")
        'username': os.getenv('xueqiu_username'),
        'password': os.getenv('xueqiu_password'),
    }
    
    def start_requests(self):
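        return [scrapy.FormRequest(url=self.url_login,
                                   headers={'X-Requested-With': 'XMLHttpRequest'},  # without this header the response is a 404 and the spider exits, even though the captured traffic shows a successful login
                                   meta={'proxy': 'http://127.0.0.1:8888'},  # send requests through Fiddler's proxy; otherwise responses come back slowly while Fiddler is running
                                   formdata=self.data_dict,
                                   callback=self.logged_in)]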

    def logged_in(self, response):
        # inspect_response(response, self)
        assert os.getenv('xueqiu_nickname') in response.text  # an AssertionError here ends the crawl
        return scrapy.Request(self.url_somebody, dont_filter=True, meta={'proxy': 'http://127.0.0.1:8888'})
        
    def parse(self, response):
        # inspect_response(response, self)
        self.log(os.getenv('xueqiu_nickname') in response.text)
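The commented-out 'fake' username line in the spider notes that a failed login still comes back with HTTP 200 and a JSON error body, so the status code alone says nothing about success. A minimal sketch of checking the body instead, using only the standard library; login_error is a hypothetical helper, and treating any body without an "error_code" key as success is an assumption, since the source only shows the failure payload.

```python
import json


def login_error(body_text):
    """Return the error description from a login response body, or None.

    Assumption: a failed login is a JSON object containing "error_code",
    as in the payload quoted in the spider. Anything else is treated as
    "no error reported", which may be too lenient for real use.
    """
    try:
        payload = json.loads(body_text)
    except ValueError:
        return None  # not JSON, e.g. an HTML page
    if isinstance(payload, dict) and "error_code" in payload:
        return payload.get("error_description", "unknown error")
    return None


# The failure payload quoted in the spider's comment.
failed = ('{"error_description":"用户名或密码错误",'
          '"error_uri":"/provider/oauth/token","error_code":"20082"}')
print(login_error(failed))
print(login_error('{"user_id": 1}'))  # hypothetical success payload
```

Inside logged_in, such a check could be applied to response.text before the nickname assertion, so that a wrong password produces a readable message rather than a bare AssertionError.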

 

Reposted from: https://www.cnblogs.com/my8100/p/scrapy_login.html
