Scrapy's Request class accepts a cookies argument. To send cookies with a spider's initial requests, override the Spider's start_requests method:
from scrapy.http import Request
from scrapy.spiders import Spider


class InfoqSpider(Spider):
    name = "techbrood"
    allowed_domains = ["techbrood.com"]
    start_urls = [
        "http://techbrood.com",
    ]

    def start_requests(self):
        # Build the initial requests by hand so each one carries the cookie.
        # The cookies dict maps cookie names to cookie values.
        for url in self.start_urls:
            yield Request(url, cookies={'techbrood.com': 'true'})
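To make clear what the cookies dict amounts to on the wire, here is a minimal sketch, independent of Scrapy, of how name/value pairs are joined into a Cookie request header. The helper name build_cookie_header and the extra "lang" cookie are hypothetical and only for illustration; in a real crawl Scrapy's cookies middleware builds the header and also takes care of domains and stored cookies.

```python
def build_cookie_header(cookies):
    """Join cookie name/value pairs into a single Cookie header value."""
    # Each pair becomes "name=value"; pairs are separated by "; ".
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


# "lang" is a made-up second cookie, added only to show the separator.
header = build_cookie_header({"techbrood.com": "true", "lang": "en"})
print(header)
```

Note that the dict key is the cookie's name, not the domain it is sent to, so the spider above sends a cookie that happens to be named techbrood.com.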
Reference documentation: