Spider code
# coding: utf-8
import scrapy

from seo1.items import Seo1Item

query = "手表回收"

class Dmozspider(scrapy.Spider):
    name = "seo1"
    start_urls = ['http://www.baidu.com/s?wd=%s' % query]

    def parse(self, response):
        print(response.url)      # the URL that was actually fetched
        print(response.headers)  # response headers, useful for debugging
        # Grab the text of the <title> element from the result page
        title = response.xpath("/html/head/title/text()").extract()[0]
        item = Seo1Item()
        item['title'] = title
        yield item
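The XPath above simply pulls the text inside the page's <title> element. As a standalone illustration of that extraction (without needing Scrapy or a live request to Baidu), here is a minimal stdlib sketch using html.parser; the sample HTML string is a stand-in for response.body:

    from html.parser import HTMLParser

    class TitleParser(HTMLParser):
        """Collects the text inside the <title> element."""
        def __init__(self):
            super().__init__()
            self.in_title = False
            self.title = ""

        def handle_starttag(self, tag, attrs):
            if tag == "title":
                self.in_title = True

        def handle_endtag(self, tag):
            if tag == "title":
                self.in_title = False

        def handle_data(self, data):
            if self.in_title:
                self.title += data

    # Stand-in for response.body; a real run would fetch Baidu's SERP.
    html = "<html><head><title>手表回收_百度搜索</title></head><body></body></html>"
    parser = TitleParser()
    parser.feed(html)
    print(parser.title)  # → 手表回收_百度搜索

In the spider itself, Scrapy's selector does this in one line, which is why the parse method stays so short.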
Items file code
import scrapy

class Seo1Item(scrapy.Item):
    # One field per piece of data we scrape; here only the page title
    title = scrapy.Field()
Finally, run scrapy crawl seo1 -o items.csv to write the scraped title into items.csv. Note that -o appends to the file, so running the crawl multiple times adds the same rows again each time.
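The append behavior is easy to verify without Scrapy. The sketch below (stdlib only; the file path and title string are illustrative stand-ins) mimics two successive -o runs by opening the CSV in append mode twice:

    import csv
    import os
    import tempfile

    # Stand-in for items.csv in a scratch directory
    path = os.path.join(tempfile.mkdtemp(), "items.csv")

    def run_once(title):
        # Like `scrapy crawl seo1 -o items.csv`: open in append mode and add rows
        with open(path, "a", newline="", encoding="utf-8") as f:
            csv.writer(f).writerow([title])

    run_once("手表回收_百度搜索")
    run_once("手表回收_百度搜索")

    with open(path, encoding="utf-8") as f:
        rows = list(csv.reader(f))
    print(len(rows))  # → 2: two runs, two copies of the same row

If you want a fresh file on every run, newer Scrapy versions (2.0+) support an uppercase -O flag that overwrites instead of appending; otherwise delete items.csv before re-crawling.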