Web Crawlers
DwyanePeng
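Scrapy crawler
Create a new Scrapy spider (genspider adds a spider to an existing project; it does not create the project itself):
    scrapy genspider boatcompany www.sofreight.com
Use Scrapy to crawl data from carrier websites. To extract all of the text inside a div:
    text = response.xpath('//div[@class="carrier_desc"]').xpath('string(.)').extract_first()
When crawling links, remember to check for empty links ...
Original · 2019-06-20 17:18:12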
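The preview is cut off before it shows how the empty-link check is done, so the following is only a minimal sketch of one way to do it, not the author's code. The helper name clean_links and the list of skipped prefixes ("#", "javascript:", "mailto:") are assumptions made for illustration.

```python
from urllib.parse import urljoin


def clean_links(hrefs, base_url):
    """Return absolute URLs, skipping empty or non-navigable hrefs.

    clean_links is a hypothetical helper; the skipped prefixes are an
    assumption about what counts as an "empty" link.
    """
    links = []
    for href in hrefs:
        # A missing href attribute may come through as None.
        if href is None:
            continue
        href = href.strip()
        # Skip blank values and links that do not lead to another page.
        if not href or href.startswith(("#", "javascript:", "mailto:")):
            continue
        # Resolve relative paths against the page they were found on.
        links.append(urljoin(base_url, href))
    return links
```

Inside a spider callback this could be called as clean_links(response.xpath('//a/@href').extract(), response.url) before yielding follow-up requests.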
Crawler notes
Simulating browser access:
    from selenium import webdriver
    from scrapy.selector import Selector
    browser = webdriver.Firefox()
    browser.get("https://www.planespotters.net/deliveries/1960/01")
    res = Selector(text=browser...
Original · 2019-07-25 16:53:14
Simulating Firefox clicks with Selenium and optimizing memory use
    import scrapy
    from parse_tools.parseTools import get_text, get_js_webpage
    from parse_tools.Postgredata import Postgredata
    from customswords.items import realtime_flight_filter
    from selenium import we...
Original · 2019-08-02 09:29:54
Batch crawling with Scrapy by passing arguments on the command line
    class EypSpider(scrapy.Spider):
        name = 'eyp'

        def __init__(self, category=None, *args, **kwargs):
            super(EypSpider, self).__init__(*args, **kwargs)
            cat = [category, category]
            ...
Original · 2019-09-10 16:09:42