scrapy crawl xmlfeed spider

Latest recommended article published on 2020-06-22 18:50:53

weixin_33734785

Tags: python

Original link: http://www.cnblogs.com/Erick-L/p/6835510.html


from scrapy.spiders import XMLFeedSpider
from myxml.items import MyxmlItem

class XmlspiderSpider(XMLFeedSpider):
    name = 'xmlspider'
    allowed_domains = ['sina.com.cn']
    start_urls = ['http://blog.sina.com.cn/rss/1165656262.xml']
    iterator = 'iternodes'  # you can change this; see the docs
    itertag = 'rss'  # node to iterate over; here the feed's root <rss> element

    def parse_node(self, response, selector):
        # Called once for each node matching itertag (here, the single <rss> root).
        i = MyxmlItem()
        # Collect the title text of every <item> in the feed as a list of strings.
        i['title'] = selector.xpath('/rss/channel/item/title/text()').extract()
        #i['url'] = selector.xpath('url').extract()
        #i['name'] = selector.xpath('name').extract()
        #i['description'] = selector.xpath('description').extract()
        for title in i['title']:
            print(title)
        return i
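
To see what the XPath in parse_node selects without running a crawl, here is a minimal sketch that uses only Python's standard library. The sample feed and the extract_titles helper are assumptions made up for illustration; they are not part of the original spider or of Scrapy, and the real Sina feed will contain different data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed for illustration; not taken from the real blog.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def extract_titles(xml_text):
    """Return the text of every rss/channel/item/title element."""
    root = ET.fromstring(xml_text)  # root is the <rss> element
    # Path is relative to <rss>, mirroring /rss/channel/item/title in the spider.
    return [node.text for node in root.findall('./channel/item/title')]


titles = extract_titles(SAMPLE_RSS)
for title in titles:
    print(title)
```

Note that the channel's own title is not matched, because the path only descends through item elements. The spider behaves the same way: it yields one item per feed whose title field holds the list of all post titles.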

Reposted from: https://www.cnblogs.com/Erick-L/p/6835510.html
