python selenium qwebengineview get page elements_Python Learning Day 83: Parsing Page Elements

Latest recommended article published on 2024-06-16 00:06:25

weixin_39992665

Article tags: python selenium qwebengineview get page elements

Article link: https://blog.csdn.net/weixin_39992665/article/details/111633626


1. Parsing field information

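When a spider runs, Scrapy downloads each page listed in start_urls and passes the result to the parse method as the response object, which wraps the response body. Inside parse, we locate elements with the response object's CSS selectors, which return a selector object, and then call extract() to pull the tag information out of that selector object.
Now let's try parsing field information on the dribbble site. Create a dribbble spider the same way we created the csdn spider earlier, then change the argument of execute() in the test script to the name attribute of the spider we want to test.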
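For reference, a test script of the kind mentioned above might look like the sketch below. The earlier script is not shown in this article, so the helper names crawl_argv and run are hypothetical, and the script is assumed to sit in the Scrapy project root.

```python
import sys
from pathlib import Path

# The value of the spider's `name` attribute.
SPIDER_NAME = "dribbble"


def crawl_argv(spider_name):
    """Build the argument list equivalent to `scrapy crawl <spider_name>`."""
    return ["scrapy", "crawl", spider_name]


def run(spider_name=SPIDER_NAME):
    """Start the named spider from inside a Python process (e.g. an IDE debugger)."""
    # Imported here so crawl_argv() stays usable without Scrapy installed.
    from scrapy.cmdline import execute

    # Assumption: this file lives in the project root, next to scrapy.cfg.
    sys.path.append(str(Path(__file__).resolve().parent))
    execute(crawl_argv(spider_name))
```

Calling run() at the bottom of the script starts the crawl; switching to another spider only requires changing SPIDER_NAME.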

import scrapy
from urllib import parse
from scrapy.http import Request


class DribbbleSpider(scrapy.Spider):
    name = 'dribbble'
    allowed_domains = ['dribbble.com']
    start_urls = ['https://dribbble.com/stories']

    def parse(self, response):
        # Get the href value of each <a> tag
        # urls = response.css('h2 a::attr(href)').extract()
        a_nodes = response.css('header div.teaser a')
        for a_node in a_nodes:
            # print(a_node)
            a_url = a_node.css('::attr(href)').extract()[0]
            a_i
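
The parse and Request imports are not used in the part of the spider shown here. One common use of urllib.parse in Scrapy spiders is turning a relative href into an absolute URL before requesting it. The sketch below illustrates only that step and is an assumption, not the author's code; the helper name absolute_url and the example paths are hypothetical.

```python
from urllib import parse


def absolute_url(page_url, href):
    """Resolve an href found on page_url into an absolute URL."""
    return parse.urljoin(page_url, href)


page = "https://dribbble.com/stories"

# A root-relative href keeps the scheme and host of the page.
rooted = absolute_url(page, "/stories/example")

# A bare relative href replaces the last path segment of the page URL.
relative = absolute_url(page, "example")

# An href that is already absolute is returned unchanged.
external = absolute_url(page, "https://other.example/x")

print(rooted, relative, external)
```

Inside a spider, response.urljoin(href) performs the same resolution against the response's own URL.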
