Elasticsearch Nested Queries

1. Background

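While building merchant search recommendations based on banquet-hall availability, I found that a traditional flat mapping structure could not cover the use case, so I turned to the nested type and nested query that Elasticsearch supports.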

2. How plain objects and nested objects are indexed

Suppose an object is not of the nested type. Take the following source document as an example:

PUT /my_index/blogpost/1  
{  
  "title":"Nest eggs",  
  "body":  "Making your money work...",  
  "tags":  [ "cash", "shares" ],  
  "comments":[  
     {  
      "name":    "John Smith",  
      "comment": "Great article",  
      "age":     28,  
      "stars":   4,  
      "date":    "2014-09-01"  
     },  
     {  
      "name":    "Alice White",  
      "comment": "More like this please",  
      "age":     31,  
      "stars":   5,  
      "date":    "2014-10-22"  
     }  
  ]  
}

Because the document is structured JSON, Elasticsearch flattens it into a simple key-value form inside the index, like this:

{  
  "title":  [ eggs, nest ],  
  "body":  [ making, money, work, your ],  
  "tags":    [ cash, shares ],  
  "comments.name":    [ alice, john, smith, white ],  
  "comments.comment":  [ article, great, like, more, please, this ],  
  "comments.age":      [ 28, 31 ],  
  "comments.stars":     [ 4, 5 ],  
  "comments.date":      [ 2014-09-01, 2014-10-22 ]  
}

Once flattened this way, the association within pairs such as john/28 and alice/31 is lost. Nested objects exist to solve exactly this problem.
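The loss of association can be reproduced without Elasticsearch. The following is a minimal sketch in plain Java, assuming a hypothetical `FlattenedCommentsDemo` class and a whitespace split as a crude stand-in for the analyzer. It pools the values per field the way the flattened form above does, and shows that a search for "alice AND 28" then matches even though Alice is 31.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical demo class: it imitates the flattening in plain Java
// and does not call Elasticsearch.
class FlattenedCommentsDemo {

    // A comment reduced to the two fields used in this example.
    static final class Comment {
        final String name;
        final int age;

        Comment(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Pools the values of each field across all comments, like the flattened
    // "comments.name" and "comments.age" entries.
    static Map<String, TreeSet<String>> flatten(List<Comment> comments) {
        Map<String, TreeSet<String>> index = new LinkedHashMap<>();
        index.put("comments.name", new TreeSet<>());
        index.put("comments.age", new TreeSet<>());
        for (Comment c : comments) {
            // Crude stand-in for an analyzer: lowercase and split on whitespace.
            String[] tokens = c.name.toLowerCase(Locale.ROOT).split("\\s+");
            index.get("comments.name").addAll(Arrays.asList(tokens));
            index.get("comments.age").add(String.valueOf(c.age));
        }
        return index;
    }

    // Checks that each value appears somewhere in its field; nothing ties
    // the two values to the same comment.
    static boolean matches(Map<String, TreeSet<String>> index, String nameToken, int age) {
        return index.get("comments.name").contains(nameToken)
                && index.get("comments.age").contains(String.valueOf(age));
    }

    public static void main(String[] args) {
        List<Comment> comments = Arrays.asList(
                new Comment("John Smith", 28),
                new Comment("Alice White", 31));
        Map<String, TreeSet<String>> index = flatten(comments);
        // Alice is 31, yet the flattened index still matches alice + 28.
        System.out.println("alice AND 28 -> " + matches(index, "alice", 28)); // prints true
    }
}
```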

Declare comments as the nested type with the following mapping:

# comments is declared as the nested type
curl -XPUT 'localhost:9200/my_index' -d '
{
  "mappings": {
    "blogpost": {
      "properties": {
        "comments": {
          "type": "nested",
          "properties": {
            "name":    { "type": "string" },
            "comment": { "type": "string" },
            "age":     { "type": "short" },
            "stars":   { "type": "short" },
            "date":    { "type": "date" }
          }
        }
      }
    }
  }
}'

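With this mapping, each nested object is indexed as a separate hidden document, which preserves the relationships among the fields inside each nested object, as shown below: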

{ ①  
  "comments.name":    [ john, smith ],  
  "comments.comment": [ article, great ],  
  "comments.age":     [ 28 ],  
  "comments.stars":   [ 4 ],  
  "comments.date":    [ 2014-09-01 ]  
}  
{
  "comments.name":    [ alice, white ],
  "comments.comment": [ like, more, please, this ],
  "comments.age":     [ 31 ],
  "comments.stars":   [ 5 ],
  "comments.date":    [ 2014-10-22 ]
}
{   
  "title":          [ eggs, nest ],  
  "body":         [ making, money, work, your ],  
  "tags":          [ cash, shares ]  
}  
① nested object
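To illustrate why keeping each nested object separate restores the association, here is a rough plain-Java sketch. The `NestedCommentsDemo` class and its whitespace tokenization are assumptions made for illustration only, not how Elasticsearch is implemented. The point is that both conditions are evaluated against one comment at a time.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class: it imitates per-object matching in plain Java
// and does not call Elasticsearch.
class NestedCommentsDemo {

    // A comment reduced to the two fields used in this example.
    static final class Comment {
        final String name;
        final int age;

        Comment(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Nested-style match: the name token and the age must both be found
    // inside the SAME comment for the parent document to match.
    static boolean matchesNested(List<Comment> comments, String nameToken, int age) {
        for (Comment c : comments) {
            List<String> tokens = Arrays.asList(c.name.toLowerCase(Locale.ROOT).split("\\s+"));
            if (tokens.contains(nameToken) && c.age == age) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Comment> comments = Arrays.asList(
                new Comment("John Smith", 28),
                new Comment("Alice White", 31));
        System.out.println("alice AND 28 -> " + matchesNested(comments, "alice", 28)); // prints false
        System.out.println("alice AND 31 -> " + matchesNested(comments, "alice", 31)); // prints true
    }
}
```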

3. Querying nested objects

Query from the command line (output 1):

curl -XGET 'localhost:9200/yzsshopv1/shop/_search?pretty' -d '{"query" : {"bool" : {"filter" : {"nested" : {"path":"hallList","query":{"bool":{"filter":{"term":{"hallList.capacityMin" : "11"}}}}}}}}}'
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.0,
    "hits" : [ {
      "_index" : "yzsshopv1",
      "_type" : "shop",
      "_id" : "89999988",
      "_score" : 0.0,
      "_source" : {
        "cityId" : "1",
        "shopName" : "xxxx婚宴(yyyy店)",
        "shopId" : "89999988",
        "categoryId" : [ "55", "165", "2738" ],
        "hallList" : [ {
          "hallId" : "20625",
          "schedule" : ["2017-11-10", "2017-11-09"],
          "capacityMax" : 16,
          "capacityMin" : 12
        },  {
          "hallId" : "21080",
          "schedule" : [ "2017-12-10", "2017-09-09",  "2017-02-25"],
          "capacityMax" : 20,
          "capacityMin" : 11
        } ],
        "wedHotelTagValue" : [ "12087", "9601", "9603", "9602" ],
        "regionId" : [ "9", "824" ]
      }
    } ]
  }
}
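The request body of the command-line query is dense when written on one line. As a reading aid, the sketch below assembles the same body from its three variable parts. `NestedQueryBodyDemo` and `nestedTermFilter` are hypothetical names, not part of any Elasticsearch client, and the helper does no JSON escaping, so it is only suitable for plain identifiers like the ones in this article.

```java
// Hypothetical demo class: builds the JSON body by string concatenation only.
class NestedQueryBodyDemo {

    // Returns a bool/filter/nested/term body for one nested path.
    // Assumption: path, field and value contain no characters that need JSON escaping.
    static String nestedTermFilter(String path, String field, String value) {
        // The term's field name must carry the nested path as a prefix.
        String fullField = path + "." + field;
        return "{\"query\":{\"bool\":{\"filter\":{\"nested\":{"
                + "\"path\":\"" + path + "\","
                + "\"query\":{\"bool\":{\"filter\":{\"term\":{\"" + fullField + "\":\"" + value + "\"}}}}"
                + "}}}}}";
    }

    public static void main(String[] args) {
        // Same path, field and value as the command-line query in this section.
        System.out.println(nestedTermFilter("hallList", "capacityMin", "11"));
    }
}
```

The outer bool/filter wraps the nested clause, and the inner bool/filter wraps the term that is evaluated against each hallList entry.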

Building the query with the Java API:

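import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.NestedQueryBuilder;
import org.elasticsearch.index.query.TermQueryBuilder;

BoolQueryBuilder boolBuilder = new BoolQueryBuilder();
NestedQueryBuilder nestedQuery = new NestedQueryBuilder("hallList", new TermQueryBuilder("hallList.capacityMin", "11"));   // Note: besides the path, the field name must also carry the path prefix (hallList)

boolBuilder.filter(nestedQuery);
searchRequest.setQuery(boolBuilder); // set the query condition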

Selecting the output fields with the Java API:

searchRequest.addField("shopId");
searchRequest.addField("hallList.schedule");
searchRequest.addField("hallList.capacityMin");
searchRequest.addField("hallList.capacityMax");

If the output field is added as searchRequest.addField("hallList"), the request fails with illegal_argument_exception, reason: field [hallList] isn't a leaf field.

If the output field is added as searchRequest.addField("capacityMin"), no error is raised, but no value comes back for capacityMin.

Output after a correct search call (output 2):

{
  "took" : 8,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.0,
    "hits" : [{
      "_index" : "yzsshopv1",
      "_type" : "shop",
      "_id" : "89999988",
      "_score" : 0.0,
      "fields" : {
        "shopId" : [ "89999988" ],
        "hallList.hallId" : [ "20625", "21080"],
        "hallList.capacityMin" : [12, 11 ],
        "hallList.capacityMax" : [16, 20 ],
        "hallList.schedule" : [ "2017-11-10", "2017-11-09",  "2017-12-10", "2017-09-09",  "2017-02-25"]
      }
    }]
  }
}

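Comparing outputs 1 and 2: the nested objects in output 1 from the command line are intact, but in output 2, produced through the Java API with addField, the relationships inside the nested objects are scrambled. For the hallList.schedule field, for example, there is no way to tell which values belong to hallList.hallId 20625 and which belong to 21080.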
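The ambiguity is easy to show in isolation. In the sketch below, `FlattenedFieldsAmbiguityDemo` is a hypothetical class and the second hall list is invented purely for contrast. Two different assignments of dates to halls produce exactly the same flattened hallList.schedule list, so the original grouping cannot be recovered from output 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: it imitates the per-field value lists of output 2
// in plain Java and does not call Elasticsearch.
class FlattenedFieldsAmbiguityDemo {

    // A hall reduced to its id and its schedule.
    static final class Hall {
        final String hallId;
        final List<String> schedule;

        Hall(String hallId, String... schedule) {
            this.hallId = hallId;
            this.schedule = Arrays.asList(schedule);
        }
    }

    // Concatenates the schedule values of all halls in order, like the
    // "hallList.schedule" entry in the fields section.
    static List<String> flattenSchedule(List<Hall> halls) {
        List<String> values = new ArrayList<>();
        for (Hall hall : halls) {
            values.addAll(hall.schedule);
        }
        return values;
    }

    public static void main(String[] args) {
        // Grouping taken from output 1.
        List<Hall> actual = Arrays.asList(
                new Hall("20625", "2017-11-10", "2017-11-09"),
                new Hall("21080", "2017-12-10", "2017-09-09", "2017-02-25"));
        // Invented grouping: same dates, split differently between the halls.
        List<Hall> invented = Arrays.asList(
                new Hall("20625", "2017-11-10"),
                new Hall("21080", "2017-11-09", "2017-12-10", "2017-09-09", "2017-02-25"));
        // Both groupings flatten to the same list.
        System.out.println(flattenSchedule(actual).equals(flattenSchedule(invented))); // prints true
    }
}
```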

//============ Update 20170331 ===========

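After further debugging, I found that to get correctly structured nested objects back from the Java API, the output fields cannot be added with searchRequest.addField, because a nested object is not a leaf field. The output fields have to be requested as follows instead: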

searchRequest.setFetchSource(new String[]{"shopId","hallList"},new String[]{});

One remaining drawback: a nested query returns the whole document, not just the nested objects that matched.
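Since the whole document comes back, one possible workaround is to repeat the condition on the client and keep only the matching halls. The sketch below assumes the hallList entries have already been parsed out of _source into simple objects. `MatchingHallsDemo`, `Hall` and `matchingHallIds` are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: filters already-parsed hallList entries on the client.
class MatchingHallsDemo {

    // A hall reduced to its id and capacity bounds.
    static final class Hall {
        final String hallId;
        final int capacityMin;
        final int capacityMax;

        Hall(String hallId, int capacityMin, int capacityMax) {
            this.hallId = hallId;
            this.capacityMin = capacityMin;
            this.capacityMax = capacityMax;
        }
    }

    // Re-applies the term condition (capacityMin equals the given value)
    // and returns the ids of the halls that satisfy it.
    static List<String> matchingHallIds(List<Hall> hallList, int capacityMin) {
        List<String> ids = new ArrayList<>();
        for (Hall hall : hallList) {
            if (hall.capacityMin == capacityMin) {
                ids.add(hall.hallId);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // The two halls returned in output 1.
        List<Hall> hallList = Arrays.asList(
                new Hall("20625", 12, 16),
                new Hall("21080", 11, 20));
        System.out.println(matchingHallIds(hallList, 11)); // prints [21080]
    }
}
```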

4. References

  1. https://www.elastic.co/guide/en/elasticsearch/guide/master/nested-objects.html
  2. http://stackoverflow.com/questions/23562192/unable-to-retrieve-nested-objects-using-elasticsearch-java-api
  3. http://elasticsearch.cn/book/elasticsearch_definitive_guide_2.x/nested-aggregation.html

Reposted from: https://my.oschina.net/weiweiblog/blog/1572727
