The difference between extract_first() and extract() in Scrapy

Latest recommended article published 2023-05-23 23:42:49

来时春尽

Category: web crawlers. Tags: web crawlers

Article link: https://blog.csdn.net/weixin_41998371/article/details/110477214

Site crawled for testing
[Image: screenshot of the test site]

In [11]: print(response.xpath('//h3/a/@title'))
# scrapy.selector.unified.SelectorList: a list of Selector objects
Out[11]:
# Line breaks added by hand for readability
[<Selector xpath='//h3/a/@title' data='A Light in the Attic'>,
<Selector xpath='//h3/a/@title' data='Tipping the Velvet'>, 
<Selector xpath='//h3/a/@title' data='Soumission'>, 
<Selector xpath='//h3/a/@title' data='Sharp Objects'>, 
<Selector xpath='//h3/a/@title' data='Sapiens: A Brief History of Humankind'>, 
<Selector xpath='//h3/a/@title' data='The Requiem Red'>,
<Selector xpath='//h3/a/@title' data='The Dirty Little Secrets of Getting Y...'>, 
<Selector xpath='//h3/a/@title' data='The Coming Woman: A Novel Based on th...'>,
<Selector xpath='//h3/a/@title' data='The Boys in the Boat: Nine Americans ...'>,]


In [9]: print(response.xpath('//h3/a/@title').extract())
# List
Out[9]:
# Line breaks added by hand for readability
['A Light in the Attic', 
 'Tipping the Velvet', 
 'Soumission', 
 'Sharp Objects', 
 'Sapiens: A Brief History of Humankind', 
 'The Requiem Red', 
 'The Dirty Little Secrets of Getting Your Dream Job', 
 'The Coming Woman: A Novel Based on the Life of the Infamous Feminist, Victoria Woodhull',
 'The Boys in the Boat: Nine Americans and Their Epic Quest for Gold at the 1936 Berlin Olympics', ]

In [7]: print(response.xpath('//h3/a/@title').extract_first())
# Str
Out[7]:
A Light in the Attic

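When the XPath matches many nodes	type()	Description
response.xpath(...)	SelectorList	A list of Selector objects
response.xpath(...).extract()	list	All scraped values collected into a list of strings
response.xpath(...).extract_first()	str	The first value of that list, returned as a string
response.xpath(...).get()	str	Same as extract_first(): the first value of that list, returned as a string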
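
The relationships in the table above can be modelled in a few lines of standard-library Python. This is a toy stand-in, not Scrapy's implementation: ToySelectorList and match_titles are hypothetical names, and xml.etree.ElementTree replaces Scrapy's XPath engine so that the sketch runs without Scrapy installed.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional


class ToySelectorList(list):
    """Hypothetical stand-in for SelectorList holding matched strings."""

    def extract(self) -> List[str]:
        # Every matched value, as a plain list of strings.
        return [str(value) for value in self]

    def extract_first(self, default: Optional[str] = None) -> Optional[str]:
        # The first matched value, or `default` when there is no match.
        for value in self:
            return str(value)
        return default

    # In this toy model the newer names are plain aliases.
    getall = extract
    get = extract_first


def match_titles(markup: str, path: str) -> ToySelectorList:
    """Collect the title attribute of every element matched by `path`."""
    root = ET.fromstring(markup)
    return ToySelectorList(node.get("title") for node in root.findall(path))


PAGE = (
    "<div>"
    "<h3><a title='A Light in the Attic'/></h3>"
    "<h3><a title='Soumission'/></h3>"
    "</div>"
)

found = match_titles(PAGE, ".//h3/a[@title]")
empty = match_titles(PAGE, ".//h4/a[@title]")

print(type(found.extract()).__name__, found.extract())
print(type(found.extract_first()).__name__, found.extract_first())
print(empty.extract(), empty.extract_first())
```

The point of the model is that extract_first() and get() are not a separate kind of query: they return the first element of what extract() would return, with a safe fallback when that list is empty.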

Supplementary notes from another blog:
https://www.ucloud.cn/yun/43396.html
