Elasticsearch Error Troubleshooting Notes (2): Scrapy pipelines

Hello-H

Categories: Elasticsearch, Scrapy. Tags: Scrapy, Elasticsearch

Article link: https://blog.csdn.net/m0_37897007/article/details/89762068

Elasticsearch Error Troubleshooting Notes (2)

Error 3:
Using elasticsearch_dsl from Scrapy to operate on ES raises the following error:

Traceback (most recent call last):
  File "D:/Workspace/Pycharm/ITnest/itnest_spider/itnest_spider/models/cs_new.py", line 29, in <module>
    NewType.init()
  File "D:\Users\fzhsm\Anaconda3\lib\site-packages\elasticsearch_dsl-6.4.0-py3.7.egg\elasticsearch_dsl\document.py", line 140, in init
  File "D:\Users\fzhsm\Anaconda3\lib\site-packages\elasticsearch_dsl-6.4.0-py3.7.egg\elasticsearch_dsl\index.py", line 293, in save
  File "D:\Users\fzhsm\Anaconda3\lib\site-packages\elasticsearch_dsl-6.4.0-py3.7.egg\elasticsearch_dsl\index.py", line 405, in exists
  File "D:\Users\fzhsm\Anaconda3\lib\site-packages\elasticsearch_dsl-6.4.0-py3.7.egg\elasticsearch_dsl\index.py", line 120, in _get_connection
ValueError: You cannot perform API calls on the default index.

The initial version of cs_news.py was as follows:

# -*- coding:utf-8 -*-

from elasticsearch_dsl import DocType, Nested, Date, Boolean, analyzer, Completion, Text, Keyword, Integer
from elasticsearch_dsl.connections import connections


conn = connections.create_connection(hosts=["localhost"])


class NewsType(DocType):
    title = Text(analyzer="ik_max_word", search_analyzer="ik_max_word")
    time = Date()
    url = Keyword()
    tags = Text(analyzer="ik_max_word", search_analyzer="ik_max_word")
    views = Integer()
    summary = Text(analyzer="ik_max_word", search_analyzer="ik_max_word")
    author = Keyword()
    source = Keyword()
    content = Text(analyzer="ik_max_word", search_analyzer="ik_max_word")
    support = Integer()
    oppose = Integer()

    class Meta:
        # index name
        index = "news"


if __name__ == '__main__':
    NewsType.init()

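The code above follows examples found online. So many copies of them are floating around that I won't link any of them.
But running it always produced Error 3.
I asked Baidu, and Baidu had nothing to say!
I asked Google, and only after a long search did it turn up a reference link.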
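
The traceback ends in _get_connection, which suggests the document class never received an index name. As far as I understand elasticsearch_dsl 6.x, the name is read from an inner class called Index, so an index = "news" setting inside class Meta is simply not seen. The sketch below is a simplified, stdlib-only model of that behaviour, not the library's actual code; ToyIndex, ToyDocumentMeta, ToyDocument and try_init are hypothetical names made up for illustration.

```python
# Simplified, stdlib-only model of the behaviour; NOT elasticsearch_dsl's real code.
# All names here (ToyIndex, ToyDocumentMeta, ToyDocument, try_init) are hypothetical.


class ToyIndex:
    def __init__(self, name=None):
        self.name = name

    def exists(self):
        # Mirrors the message seen in the traceback when no name was configured.
        if self.name is None:
            raise ValueError("You cannot perform API calls on the default index.")
        return False  # pretend the index has not been created yet


class ToyDocumentMeta(type):
    def __new__(mcs, cls_name, bases, attrs):
        # Only an inner class called "Index" is consulted; "Meta.index" is ignored.
        index_opts = attrs.pop("Index", None)
        new_cls = super().__new__(mcs, cls_name, bases, attrs)
        new_cls._index = ToyIndex(getattr(index_opts, "name", None))
        return new_cls


class ToyDocument(metaclass=ToyDocumentMeta):
    @classmethod
    def init(cls):
        # The real init() creates the index and mapping; here we only do the check.
        if not cls._index.exists():
            return "would create index %r" % cls._index.name


class OldStyle(ToyDocument):
    class Meta:
        index = "news"  # silently ignored by the toy metaclass


class NewStyle(ToyDocument):
    class Index:
        name = "itnest_news"


def try_init(doc_cls):
    """Return init()'s result, or the error message if it raises ValueError."""
    try:
        return doc_cls.init()
    except ValueError as exc:
        return "ValueError: %s" % exc
```

In this model, try_init(OldStyle) ends in the same ValueError message as the traceback, while try_init(NewStyle) gets past the check because the name was declared where the metaclass looks for it.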

The final version of cs_news.py is as follows:

# -*- coding:utf-8 -*-
from elasticsearch_dsl import connections, Document, Keyword, Text, Integer, Date

connections.create_connection(hosts=["localhost"])


class NewsIndex(Document):
    news_id = Keyword()
    title = Text(analyzer="ik_max_word")
    news_time = Date()
    crawl_time = Date()
    url = Keyword()
    tags = Text(analyzer="ik_max_word")
    views = Integer()
    summary = Text(analyzer="ik_max_word")
    author = Keyword()
    source = Keyword()
    content = Text(analyzer="ik_max_word")
    support = Integer()
    oppose = Integer()

    class Index:
        name = "itnest_news"


if __name__ == '__main__':
    NewsIndex.init()

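1. Change class Meta: to class Index:, and set the index name with name instead of index.
2. Change the parent class from DocType to Document.
3. Mapping types are slated for removal starting with Elasticsearch 7, so no type name is set here and the default name doc is used.
It ran successfully! Kibana shows the result:
[Image: Kibana screenshot]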
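
Since the title mentions Scrapy pipelines, here is a minimal sketch of how a pipeline might write items into the index defined above. It is an illustration under assumptions, not the author's actual pipeline: the class name ElasticsearchPipeline, the module path itnest_spider.models.cs_news, and the choice to reuse news_id as the document id are all hypothetical. The document class is injected through the constructor so the pipeline can be exercised without a running Elasticsearch node.

```python
class ElasticsearchPipeline:
    """Scrapy item pipeline that stores each item as an Elasticsearch document."""

    def __init__(self, doc_cls):
        # doc_cls is expected to behave like an elasticsearch_dsl Document subclass.
        self.doc_cls = doc_cls

    @classmethod
    def from_crawler(cls, crawler):
        # Imported lazily so this module loads even where elasticsearch_dsl is absent.
        # ASSUMPTION: this module path is a guess at the project layout; adjust it.
        from itnest_spider.models.cs_news import NewsIndex
        return cls(NewsIndex)

    def open_spider(self, spider):
        # Create the index and its mapping if they do not exist yet.
        self.doc_cls.init()

    def process_item(self, item, spider):
        fields = dict(item)
        doc = self.doc_cls(**fields)
        # ASSUMPTION: reuse news_id as the document id so a re-crawl
        # overwrites the earlier copy instead of adding a duplicate.
        if fields.get("news_id") is not None:
            doc.meta.id = fields["news_id"]
        doc.save()
        return item
```

To try it, the class would need to be registered in the project's ITEM_PIPELINES setting under whatever module path it actually lives in. Calling init() in open_spider means the mapping is created once per crawl rather than once per item.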
