When designing the mapping for a large text field, add the term_vector parameter, like so:
"description": {
"similarity": "customize_bm25",
"type": "text",
"store": true,
"analyzer": "my_jieba_index_analyzer",
"search_analyzer": "my_jieba_search_analyzer",
"term_vector" : "with_positions_offsets"
}
With this parameter configured, highlighting is noticeably faster: the highlighter can reuse the stored term vectors instead of re-analyzing the field at query time.
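As a sketch of how to exercise this (the index name my_index and the query text are placeholders, not from the original post), a search that highlights the field might look like the following. With "term_vector": "with_positions_offsets" in the mapping, Elasticsearch selects the fast vector highlighter for the field automatically; "type": "fvh" just makes that explicit:

```json
POST /my_index/_search
{
  "query": {
    "match": { "description": "白色衬衫" }
  },
  "highlight": {
    "fields": {
      "description": { "type": "fvh" }
    }
  }
}
```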
However, certain query terms can still trigger an error. The error is caused by Lucene choking on whitespace tokens in the field.
Solution: filter out the whitespace terms with a token filter.
But when adding the whitespace filter, a problem surfaced: with the jieba tokenizer, even the following stop filter fails to remove the whitespace terms:
"my_stop_filter": {
"ignore_case": "true",
"type": "stop",
"stopwords": [
" ",
"的",
"得",
"地"
]
},
The ik tokenizer, on the other hand, works with this filter, so I switched to ik. Two analyzers are defined, as follows:
"my_ik_index_analyzer": {
"filter": [
"my_stop_filter"
],
"type": "custom",
"tokenizer": "ik_max_word"
},
"my_ik_search_analyzer": {
"filter": [
"my_stop_filter"
],
"type": "custom",
"tokenizer": "ik_smart"
}
The mapping for the large field is then defined as follows:
"description": {
"similarity": "customize_bm25",
"type": "text",
"store": true,
"analyzer": "my_ik_index_analyzer",
"search_analyzer": "my_ik_search_analyzer",
"term_vector" : "with_positions_offsets"
}
With this change, the error above disappears.
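Putting the pieces together, a full index-creation request might look like the sketch below. The index name my_index is a placeholder, and the customize_bm25 similarity is referenced but not defined in the original post, so a hypothetical BM25 tuning is shown under settings.index.similarity as an assumption; it must exist under that name or the mapping will be rejected:

```json
PUT /my_index
{
  "settings": {
    "index": {
      "similarity": {
        "customize_bm25": {
          "type": "BM25",
          "b": 0.75,
          "k1": 1.2
        }
      }
    },
    "analysis": {
      "filter": {
        "my_stop_filter": {
          "ignore_case": "true",
          "type": "stop",
          "stopwords": [" ", "的", "得", "地"]
        }
      },
      "analyzer": {
        "my_ik_index_analyzer": {
          "type": "custom",
          "tokenizer": "ik_max_word",
          "filter": ["my_stop_filter"]
        },
        "my_ik_search_analyzer": {
          "type": "custom",
          "tokenizer": "ik_smart",
          "filter": ["my_stop_filter"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "description": {
        "similarity": "customize_bm25",
        "type": "text",
        "store": true,
        "analyzer": "my_ik_index_analyzer",
        "search_analyzer": "my_ik_search_analyzer",
        "term_vector": "with_positions_offsets"
      }
    }
  }
}
```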
Done.