Highlighting nested fields in Elasticsearch

I. Data preparation

Create the index mapping

PUT my_index
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      },
      "age": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      },
      "interests": {
        "type": "nested",
        "properties": {
          "name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword"
              }
            }
          },
          "level": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}

Index the test data

POST my_index/_bulk
{ "index": { "_id": "1" }}
{ "name": "John Doe", "age": "25", "interests": [{ "name": "skiing interesting", "level": "8" }, { "name": "hiking interesting", "level": "6" }] }
{ "index": { "_id": "2" }}
{ "name": "Jane Smith", "age": "30", "interests": [{ "name": "reading interesting", "level": "9" }, { "name": "traveling interesting", "level": "7" }] }
{ "index": { "_id": "3" }}
{ "name": "Bob Johnson", "age": "40", "interests": [{ "name": "cooking interesting", "level": "5" }, { "name": "painting interesting", "level": "4" }] }
{ "index": { "_id": "4" }}
{ "name": "John Doe1", "age": "35", "interests": [{ "name": "fitness interesting", "level": "8" }, { "name": "hiking interesting", "level": "6" }] }
{ "index": { "_id": "5" }}
{ "name": "Jane Smith1", "age": "31", "interests": [{ "name": "fitness interesting", "level": "9" }, { "name": "traveling interesting", "level": "7" }] }
{ "index": { "_id": "6" }}
{ "name": "Bob Johnson1", "age": "44", "interests": [{ "name": "running interesting", "level": "5" }, { "name": "painting interesting", "level": "4" }] }
{ "index": { "_id": "7" }}
{ "name": "John Doe2", "age": "27", "interests": [{ "name": "hurlbat interesting", "level": "8" }, { "name": "hiking interesting", "level": "6" }] }
{ "index": { "_id": "8" }}
{ "name": "Jane Smith2", "age": "36", "interests": [{ "name": "game interesting", "level": "9" }, { "name": "hurlbat interesting", "level": "7" }] }
{ "index": { "_id": "9" }}
{ "name": "Bob Johnson2", "age": "42", "interests": [{ "name": "running interesting", "level": "5" }, { "name": "painting interesting", "level": "4" }] }
{ "index": { "_id": "10" }}
{ "name": "John Doe3", "age": "22", "interests": [{ "name": "game interesting", "level": "8" }, { "name": "hurlbat interesting", "level": "6" }] }
{ "index": { "_id": "11" }}
{ "name": "Jane Smith3", "age": "39", "interests": [{ "name": "hurlbat interesting", "level": "9" }, { "name": "traveling interesting", "level": "7" }] }
{ "index": { "_id": "12" }}
{ "name": "Bob Johnson4", "age": "49", "interests": [{ "name": "swimming interesting", "level": "5" }, { "name": "painting interesting", "level": "4" }] }

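II. Highlighting with the query DSL

1. Highlighting non-nested fields

Add a highlight section to the search request to specify which fields to highlight and which tags to wrap around the matched terms. For example: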

GET /my_index/_search
{
  "query": {
    "match": {
      "name": "Bob Johnson4"
    }
  },
  "highlight": {
    "fields": {
      "name": {}  // highlight the name field
    },
    "pre_tags": ["<em>"],  // tag inserted before each matched term
    "post_tags": ["</em>"]  // tag inserted after each matched term
  }
}

Each hit in the response then carries a highlight object that holds the highlighted fragments. For example:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : 3.220356,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "12",
        "_score" : 3.220356,
        "_source" : {
          "name" : "Bob Johnson4",
          "age" : "49",
          "interests" : [
            {
              "name" : "swimming interesting",
              "level" : "5"
            },
            {
              "name" : "painting interesting",
              "level" : "4"
            }
          ]
        },
        "highlight" : {
          "name" : [
            "<em>Bob</em> <em>Johnson4</em>"
          ]
        }
      },
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.060872,
        "_source" : {
          "name" : "Bob Johnson",
          "age" : "40",
          "interests" : [
            {
              "name" : "cooking interesting",
              "level" : "5"
            },
            {
              "name" : "painting interesting",
              "level" : "4"
            }
          ]
        },
        "highlight" : {
          "name" : [
            "<em>Bob</em> Johnson"
          ]
        }
      },
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 1.060872,
        "_source" : {
          "name" : "Bob Johnson1",
          "age" : "44",
          "interests" : [
            {
              "name" : "running interesting",
              "level" : "5"
            },
            {
              "name" : "painting interesting",
              "level" : "4"
            }
          ]
        },
        "highlight" : {
          "name" : [
            "<em>Bob</em> Johnson1"
          ]
        }
      },
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "9",
        "_score" : 1.060872,
        "_source" : {
          "name" : "Bob Johnson2",
          "age" : "42",
          "interests" : [
            {
              "name" : "running interesting",
              "level" : "5"
            },
            {
              "name" : "painting interesting",
              "level" : "4"
            }
          ]
        },
        "highlight" : {
          "name" : [
            "<em>Bob</em> Johnson2"
          ]
        }
      }
    ]
  }
}

As the response shows, each hit's highlight object contains the field text with the matched terms wrapped in <em> and </em>. The front end can then style those tags with CSS.
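To make the fragment format concrete, here is a minimal sketch in plain Java (no Elasticsearch client involved) that pulls the matched terms back out of a fragment. The class name HighlightTerms and the method extract are hypothetical names chosen for this illustration, and the sketch assumes the same pre_tags and post_tags that the request used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any Elasticsearch API.
class HighlightTerms {

    /** Returns the substrings wrapped by preTag and postTag in one highlight fragment. */
    static List<String> extract(String fragment, String preTag, String postTag) {
        // Quote the tags so that characters such as '<' and '/' are matched literally.
        Pattern pattern = Pattern.compile(Pattern.quote(preTag) + "(.*?)" + Pattern.quote(postTag));
        Matcher matcher = pattern.matcher(fragment);
        List<String> terms = new ArrayList<>();
        while (matcher.find()) {
            terms.add(matcher.group(1));
        }
        return terms;
    }

    public static void main(String[] args) {
        // Fragments copied from the sample response above.
        System.out.println(extract("<em>Bob</em> <em>Johnson4</em>", "<em>", "</em>")); // [Bob, Johnson4]
        System.out.println(extract("<em>Bob</em> Johnson", "<em>", "</em>"));           // [Bob]
    }
}
```

The second fragment shows why documents 3, 6 and 9 score lower: only the term Bob matched in their name field.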

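2. Highlighting nested fields

Ordinary highlighting is a poor fit for nested queries. A document can hold several nested objects, so more than one of them may produce a highlight, yet ordinary highlighting only lists the highlighted fragments in an array and does not say which nested object each fragment came from. For nested fields, put the highlight section inside the inner_hits clause of the nested query instead, and list the sub-fields to highlight there. For example: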

GET /my_index/_search
{
  "query": {
    "nested": {
      "path": "interests",
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "interests.name": "hurlbat"
              }
            },
            {
              "match": {
                "interests.level": "8"
              }
            }
          ]
        }
      },
      "inner_hits": {
        "_source": false, // do not return _source inside the inner hits
        "highlight": {
          "fields": {
            "interests.name": {}
          },
          "pre_tags": [
            "<em>"
          ],
          "post_tags": [
            "</em>"
          ]
        }
      }
    }
  }
}

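The highlighted content now appears under inner_hits in the response. Note that with "_source": false an inner hit carries no _source of its own; the sample below still includes one, so it appears to have been captured without that setting. For example: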

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 3.429597,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 3.429597,
        "_source" : {
          "name" : "John Doe2",
          "age" : "27",
          "interests" : [
            {
              "name" : "hurlbat interesting",
              "level" : "8"
            },
            {
              "name" : "hiking interesting",
              "level" : "6"
            }
          ]
        },
        "inner_hits" : {
          "interests" : {
            "hits" : {
              "total" : {
                "value" : 1,
                "relation" : "eq"
              },
              "max_score" : 3.429597,
              "hits" : [
                {
                  "_index" : "my_index",
                  "_type" : "_doc",
                  "_id" : "7",
                  "_nested" : { // position of the matching object within the array
                    "field" : "interests", // the nested path
                    "offset" : 0 // zero-based offset; 0 means the first element of the array
                  },
                  "_score" : 3.429597,
                  "_source" : {
                    "level" : "8",
                    "name" : "hurlbat interesting"
                  },
                  "highlight" : {
                    "interests.name" : [
                      "<em>hurlbat</em> interesting"
                    ]
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}

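As the response shows, the highlighted content sits under inner_hits with the matched terms wrapped in <em> and </em>, and _nested.offset identifies which element of the interests array produced it. The front end can then style those tags with CSS.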
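The role of _nested.offset can be illustrated with a small sketch in plain Java. It models the interests array of _source as a list of maps and writes a fragment into the element the offset points at. NestedOffsetDemo and applyFragment are hypothetical names for this illustration, and the data is copied from the sample response for document 7.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; it only manipulates plain collections.
class NestedOffsetDemo {

    /**
     * Writes a highlight fragment into the nested object selected by offset.
     * nestedObjects stands for the array stored under the nested path in _source.
     */
    static void applyFragment(List<Map<String, String>> nestedObjects,
                              int offset, String subField, String fragment) {
        if (offset < 0 || offset >= nestedObjects.size()) {
            return; // the offset does not point into this array, so leave it unchanged
        }
        nestedObjects.get(offset).put(subField, fragment);
    }

    public static void main(String[] args) {
        List<Map<String, String>> interests = new ArrayList<>();
        interests.add(new HashMap<>(Map.of("name", "hurlbat interesting", "level", "8")));
        interests.add(new HashMap<>(Map.of("name", "hiking interesting", "level", "6")));

        // offset 0 and the fragment both come from the inner hit in the sample response.
        applyFragment(interests, 0, "name", "<em>hurlbat</em> interesting");

        System.out.println(interests.get(0).get("name")); // <em>hurlbat</em> interesting
        System.out.println(interests.get(1).get("name")); // hiking interesting
    }
}
```

Only the element at the given offset changes, which is exactly the information a flat list of highlight fragments cannot provide.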

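III. Highlighting Elasticsearch search results in Spring Boot

For integrating Elasticsearch with Spring Boot, see my other post, "Integrating Elasticsearch with Spring Boot".

The highlight is returned as a separate property of each hit and is not applied to the _source content, so the code has to read the highlight fields and use them to replace the corresponding fields of _source.

First, wrap the search call in a small ES component: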

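import java.util.Map;
import javax.annotation.Resource;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.core.mapping.IndexCoordinates;
import org.springframework.data.elasticsearch.core.query.Query;
import org.springframework.stereotype.Component;

@Component
public class EsComponent {

    @Resource
    private ElasticsearchRestTemplate elasticsearchRestTemplate;

    /**
     * Runs the query against my_index.
     *
     * @param query the search query to execute
     * @return the search hits, each mapped to a Map of its _source
     */
    public SearchHits<Map> search(Query query) {
        return elasticsearchRestTemplate.search(query, Map.class, IndexCoordinates.of("my_index"));
    }
}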

1. Highlighting non-nested fields

    // esComponent is the injected EsComponent shown above
    public void nonNestedFieldsHighlight() {
        // 1. Build the search query
        final HighlightBuilder highlightBuilder = new HighlightBuilder()
                .preTags("<em>").postTags("</em>")
                .highlighterType("unified")
                .field("name");
        final NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.matchQuery("name", "Bob Johnson4"))
                .withTrackTotalHits(true)
                .withHighlightBuilder(highlightBuilder);
        // 2. Run the search
        final SearchHits<Map> searchResponse = esComponent.search(nativeSearchQueryBuilder.build());

        // 3. Process the results
        final List<SearchHit<Map>> searchHits = searchResponse.getSearchHits();
        for (SearchHit<Map> searchHit : searchHits) {
            final Map<String, List<String>> highlightFields = searchHit.getHighlightFields();
            // Replace each highlighted field in the content with its list of highlight fragments
            highlightFields.forEach((field, fragments) -> searchHit.getContent().put(field, fragments));
        }
        searchHits.forEach(entry -> System.out.println(entry.getContent()));
    }
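The replacement step above can be tried without a cluster. The following is a minimal sketch in plain Java that models _source and the highlight object as maps; HighlightMerge and merge are hypothetical names for this illustration. Unlike the method above, which stores the whole fragment list in the field, this sketch joins the fragments into a single string so that the field keeps its string type. The " ... " separator is an arbitrary choice made for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; it does not call Spring Data or Elasticsearch.
class HighlightMerge {

    /** Returns a copy of source in which every highlighted field holds its joined fragments. */
    static Map<String, Object> merge(Map<String, Object> source,
                                     Map<String, List<String>> highlight) {
        Map<String, Object> merged = new HashMap<>(source); // leave the original map untouched
        for (Map.Entry<String, List<String>> entry : highlight.entrySet()) {
            if (!entry.getValue().isEmpty()) {
                merged.put(entry.getKey(), String.join(" ... ", entry.getValue()));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Values copied from the hit with _id 12 in the sample response.
        Map<String, Object> source = Map.of("name", "Bob Johnson4", "age", "49");
        Map<String, List<String>> highlight = Map.of("name", List.of("<em>Bob</em> <em>Johnson4</em>"));

        Map<String, Object> merged = merge(source, highlight);
        System.out.println(merged.get("name")); // <em>Bob</em> <em>Johnson4</em>
        System.out.println(merged.get("age"));  // 49
    }
}
```

Fields without a highlight entry, such as age here, pass through unchanged.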

2. Highlighting nested fields

    /**
     * Highlight nested fields.
     */
    public void nestedFieldsHighlight() {
        // 1. Build the search query
        InnerHitBuilder innerHitBuilder = new InnerHitBuilder();
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        highlightBuilder.preTags("<span style='color:red'>").postTags("</span>");
        // Select the highlighter implementation
        highlightBuilder.highlighterType("plain");
        // 0 disables fragmenting: the whole field value is returned with highlights applied
        highlightBuilder.numOfFragments(0);
        highlightBuilder.field("interests.name").field("interests.level");
        innerHitBuilder.setHighlightBuilder(highlightBuilder);
        innerHitBuilder.setName(UUID.randomUUID().toString());
        // Do not return _source inside the inner hits
        innerHitBuilder.setFetchSourceContext(new FetchSourceContext(false));

        final NestedQueryBuilder nestedQueryBuilder = QueryBuilders.nestedQuery(
                "interests",
                QueryBuilders.boolQuery()
                        .must(QueryBuilders.matchQuery("interests.name", "hurlbat"))
                        .must(QueryBuilders.matchQuery("interests.level", "8")),
                ScoreMode.None);
        nestedQueryBuilder.innerHit(innerHitBuilder);

        final NativeSearchQueryBuilder nativeSearchQueryBuilder = new NativeSearchQueryBuilder()
                .withQuery(nestedQueryBuilder)
                .withTrackTotalHits(true);
        // 2. Run the search
        final SearchHits<Map> searchResponse = esComponent.search(nativeSearchQueryBuilder.build());

        // 3. Process the results
        final List<SearchHit<Map>> searchHits = searchResponse.getSearchHits();
        for (SearchHit<Map> searchHit : searchHits) {
            // Top-level field name -> value; a nested field holds a Map or a List of Maps
            final Map<String, Object> map = searchHit.getContent();
            final Map<String, SearchHits<?>> innerHitMap = searchHit.getInnerHits();

            if (CollectionUtil.isNotEmpty(innerHitMap)) {
                for (SearchHits<?> innerHits : innerHitMap.values()) {
                    // Visit every matching nested object, not only the first one
                    for (SearchHit<?> innerHit : innerHits.getSearchHits()) {
                        final Map<String, List<String>> highlightFields = innerHit.getHighlightFields();
                        final NestedMetaData nestedMetaData = innerHit.getNestedMetaData();
                        if (CollectionUtil.isNotEmpty(highlightFields)) {
                            highlightFields.forEach((field, fragments) -> {
                                // "interests.name" -> ["interests", "name"]
                                final String[] keys = field.split("\\.");
                                // map.get(keys[0]) is either a single object or a list of objects
                                final Object o = map.get(keys[0]);
                                if (o instanceof Map) {
                                    ((Map<String, String>) o).put(keys[1], fragments.get(0));
                                } else if (o instanceof List) {
                                    // The offset selects the array element this inner hit came from
                                    ((List<Map<String, String>>) o).get(nestedMetaData.getOffset()).put(keys[1], fragments.get(0));
                                }
                            });
                        }
                    }
                }
            }
        }

        searchHits.forEach(entry -> System.out.println(entry.getContent()));
    }
