Joining queries in the Elasticsearch DSL (Has Child Query and Has Parent Query)

Preface

Performing full SQL-style joins in a distributed system like Elasticsearch is expensive. Instead, Elasticsearch offers two forms of join that are designed to scale horizontally.

1.Nested Query

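Nested query: first map the field with the nested type, then search it with the nested query. (I find this approach of limited value: if you are going to use a nested field anyway, why not create top-level fields that carry the same meaning?)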

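A nested field keeps the properties of each nested object associated with one another. Without it, a query that combines a property value from one object with a property value from a different object in the same array still matches the document, even though no single object holds both values and the document should not match. Nested objects exist for exactly this scenario.
The first screenshot in the original post showed that unwanted match. (image not preserved)
The second screenshot showed the field mapped as a nested object. (image not preserved)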
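The cross-object match described above can be sketched in plain Python. This is a toy model only, not how Elasticsearch is implemented; the `comments` document and the function names are hypothetical and do not come from the article.

```python
# Hypothetical document: one post holding an array of two comment objects.
doc = {
    "comments": [
        {"name": "alice", "age": 28},
        {"name": "bob", "age": 35},
    ]
}

def flatten(objects):
    """Mimic a plain object array: each property becomes one multi-valued field."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(key, []).append(value)
    return flat

def match_flattened(objects, name, age):
    """Match when each value appears somewhere, regardless of which object holds it."""
    flat = flatten(objects)
    return name in flat.get("name", []) and age in flat.get("age", [])

def match_nested(objects, name, age):
    """Match only when a single object carries both values."""
    return any(o.get("name") == name and o.get("age") == age for o in objects)

print(match_flattened(doc["comments"], "alice", 35))  # True: the unwanted match
print(match_nested(doc["comments"], "alice", 35))     # False
print(match_nested(doc["comments"], "bob", 35))       # True
```

Here "alice" and 35 belong to different objects, so only the flattened form matches; keeping each object separate is what the nested type is for.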

2.Has Child Query 和 Has Parent Query

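In SQL, a join normally runs across two tables. Likewise, a parent/child query runs across two types, and both types must belong to the same index. (Elasticsearch discourages multiple types in one index: indices created in 6.0 or later may contain only a single type, and types are deprecated in 7.x, so the _parent example below applies to versions before 6.0.)
Here is an example: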

PUT my_index1
{
  "mappings": {
    "my_parent": {
        "properties": {
            "parentId" :{
                "type": "keyword"
            },
            "name" :{
                "type": "keyword"
            },
            "age" :{
                "type": "integer"
            }
        }
    },
    "my_child": {
      "_parent": {
        "type": "my_parent" 
      },
      "properties": {
          "childId" :{
              "type": "keyword"
          },
          "name" :{
              "type": "keyword"
          },
          "age" :{
              "type": "integer"
          }
      }
    }
  }
}

The mapping of the newly created index:
"mappings": {
   "my_child": {
      "_parent": {
         "type": "my_parent"
      },
      "_routing": {
         "required": true
      },
      "properties": {
         "age": {
            "type": "integer"
         },
         "childId": {
            "type": "keyword"
         },
         "name": {
            "type": "keyword"
         },
         "parentId": {
            "type": "keyword"
         }
      }
   },
   "my_parent": {
      "properties": {
         "age": {
            "type": "integer"
         },
         "name": {
            "type": "keyword"
         },
         "parentId": {
            "type": "keyword"
         }
      }
   }
}

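Two things stand out:

  • my_child has a _parent meta-field whose "type": "my_parent" value establishes the parent/child relationship between the two types.
  • my_child has a _routing meta-field with "required": true, so the parent/child relationship between individual documents is established through routing.

Next, insert two parent documents: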
PUT my_index1/my_parent/parent100
{
  "parentId": "parent100",
  "name": "zhangsan",
  "age": "45"
}

PUT my_index1/my_parent/parent200
{
  "parentId": "parent200",
  "name": "lily",
  "age": "42"
}

Then insert the corresponding child documents:


PUT my_index1/my_child/1?parent=parent100
{
  "childId": "child100",
  "name": "xiaoming",
  "age": "14"
}

POST my_index1/my_child/2?parent=parent100
{
  "childId": "child200",
  "name": "xiaohong",
  "age": "17"
}

POST my_index1/my_child/3?parent=parent200
{
  "childId": "child300",
  "name": "lucy",
  "age": "21"
}

The documents relate to each other as follows:

Parent document                  Child documents
"parent100", "zhangsan", "45"    "child100", "xiaoming", "14"; "child200", "xiaohong", "17"
"parent200", "lily", "42"        "child300", "lucy", "21"
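To make the direction of each query concrete, the relationships above can be modelled in a few lines of Python. This is only an in-memory sketch using the article's documents; the function names mirror the query names but are hypothetical helpers, and ages are written as integers so they can be compared.

```python
# Parent documents from the article, keyed by _id.
parents = {
    "parent100": {"name": "zhangsan", "age": 45},
    "parent200": {"name": "lily", "age": 42},
}
# Child documents, keyed by _id; "parent" plays the role of the ?parent= parameter.
children = {
    "1": {"parent": "parent100", "name": "xiaoming", "age": 14},
    "2": {"parent": "parent100", "name": "xiaohong", "age": 17},
    "3": {"parent": "parent200", "name": "lucy", "age": 21},
}

def has_child(child_predicate):
    """Return IDs of parents with at least one child satisfying the predicate."""
    matched = {c["parent"] for c in children.values() if child_predicate(c)}
    return sorted(pid for pid in parents if pid in matched)

def has_parent(parent_predicate):
    """Return IDs of children whose parent satisfies the predicate."""
    return sorted(
        cid for cid, c in children.items() if parent_predicate(parents[c["parent"]])
    )

print(has_child(lambda c: c["name"] == "xiaoming"))   # ['parent100']
print(has_child(lambda c: c["age"] > 10))             # ['parent100', 'parent200']
print(has_parent(lambda p: p["name"] == "zhangsan"))  # ['1', '2']
print(has_parent(lambda p: p["age"] < 43))            # ['3']
```

The condition is always evaluated on one side of the relationship and the documents returned come from the other side.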

Query examples:

  • 1. Query parent documents by a condition on their children (has_child): find the parent of the child document xiaoming
GET my_index1/my_parent/_search
{
    "query": {
        "has_child": {
            "type": "my_child",
            "query": {
                "term": {
                   "name": {
                      "value": "xiaoming"
                   }
                }
            }
        }
    }
}

Result:

 "hits": [
    {
       "_index": "my_index1",
       "_type": "my_parent",
       "_id": "parent100",
       "_score": 1,
       "_source": {
          "parentId": "parent100",
          "name": "zhangsan",
          "age": "45"
       }
    }
 ]
  • 1. Query parent documents by a condition on their children (has_child): find the parents of child documents older than 10
GET my_index1/my_parent/_search
{
    "query": {
        "has_child": {
            "type": "my_child",
            "query": {
                "range": {
                   "age": {
                      "gt": "10"
                   }
                }
            }
        }
    }
}

Result:

 "hits": [
    {
       "_index": "my_index1",
       "_type": "my_parent",
       "_id": "parent100",
       "_score": 1,
       "_source": {
          "parentId": "parent100",
          "name": "zhangsan",
          "age": "45"
       }
    },
    {
       "_index": "my_index1",
       "_type": "my_parent",
       "_id": "parent200",
       "_score": 1,
       "_source": {
          "parentId": "parent200",
          "name": "lily",
          "age": "42"
       }
    }
 ]
  • 2. Query child documents by a condition on their parent (has_parent): find the child documents of zhangsan
GET my_index1/my_child/_search
{
    "query": {
        "has_parent": {
            "type": "my_parent",
            "query": {
                "term": {
                   "name": "zhangsan"
                }
            }
        }
    }
}

Result:

"hits": [
   {
      "_index": "my_index1",
      "_type": "my_child",
      "_id": "1",
      "_score": 1,
      "_routing": "parent100",
      "_parent": "parent100",
      "_source": {
         "childId": "child100",
         "name": "xiaoming",
         "age": "14"
      }
   },
   {
      "_index": "my_index1",
      "_type": "my_child",
      "_id": "2",
      "_score": 1,
      "_routing": "parent100",
      "_parent": "parent100",
      "_source": {
         "childId": "child200",
         "name": "xiaohong",
         "age": "17"
      }
   }
]
  • 2. Query child documents by a condition on their parent (has_parent): find the child documents whose parent is younger than 43
GET my_index1/my_child/_search
{
    "query": {
        "has_parent": {
            "type": "my_parent",
            "query": {
                "range": {
                   "age": {
                      "lt": "43"
                   }
                }
            }
        }
    }
}

Result:

"hits": [
   {
      "_index": "my_index1",
      "_type": "my_child",
      "_id": "3",
      "_score": 1,
      "_routing": "parent200",
      "_parent": "parent200",
      "_source": {
         "childId": "child300",
         "name": "lucy",
         "age": "21"
      }
   }
]
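  • 3. Combined query example:
    Finally, the results of has_parent and has_child can be filtered further with other conditions: use has_parent or has_child as one clause of a bool query. Here is an example (other combinations work the same way).

Find the child documents of zhangsan whose age is greater than 15.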

GET my_index1/my_child/_search
{
    "query": {
        "bool": {
            "must": [
               {
                   "range": {
                      "age": {
                         "gt": "15"
                      }
                   }
               },
               {
                   "has_parent": {
                        "type": "my_parent",
                        "query": {
                            "term": {
                               "name": "zhangsan"
                            }
                        }
                    }
               }
            ]
        }
    }
}

Result:

"hits": [
   {
      "_index": "my_index1",
      "_type": "my_child",
      "_id": "2",
      "_score": 2,
      "_routing": "parent100",
      "_parent": "parent100",
      "_source": {
         "childId": "child200",
         "name": "xiaohong",
         "age": "17"
      }
   }
]

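Has Child Query and Has Parent Query are slow; the official documentation advises against them when performance matters.
The has_child query accepts min_children and max_children parameters, which bound the number of matching child documents a parent must have in order to match.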
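As a rough illustration of what min_children and max_children do, the sketch below counts matching children per parent over the article's child documents and then applies the bounds. It is a toy model, not Elasticsearch's implementation; the helper name and the tuple layout are assumptions made for the example.

```python
from collections import Counter

# Child documents from the article as (child _id, parent _id, age).
CHILDREN = [
    ("1", "parent100", 14),
    ("2", "parent100", 17),
    ("3", "parent200", 21),
]

def has_child(predicate, min_children=1, max_children=None):
    """Toy has_child: count matching children per parent, then apply the bounds."""
    counts = Counter(parent for _, parent, age in CHILDREN if predicate(age))
    return sorted(
        parent
        for parent, n in counts.items()
        if n >= min_children and (max_children is None or n <= max_children)
    )

def older_than_10(age):
    return age > 10

print(has_child(older_than_10))                  # ['parent100', 'parent200']
print(has_child(older_than_10, min_children=2))  # ['parent100']
print(has_child(older_than_10, max_children=1))  # ['parent200']

# In a request body, the bounds sit next to "type" and "query" inside has_child.
request_body = {
    "query": {
        "has_child": {
            "type": "my_child",
            "min_children": 2,
            "query": {"range": {"age": {"gt": 10}}},
        }
    }
}
```

With these documents, only parent100 has two children older than 10, so it is the only parent left when min_children is 2.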

3.Parent Id Query

Query child documents by the ID of their parent document:

GET my_index1/my_child/_search
{
    "query": {
        "parent_id" : {
            "type" : "my_child",
            "id" : "parent200"
        }
    }
}
  • type: the child document type
  • id: the ID of the parent document

The query above is equivalent to the query below:

GET /my_index1/_search
{
  "query": {
    "has_parent": {
      "type": "my_parent",
      "query": {
        "term": {
          "_id": "parent200"
        }
      }
    }
  }
}

4.terms lookup mechanism: similar to a cascading lookup in SQL (it can look up across indices, or within the same index)

See my blog post "Term level query (term query) in the Elasticsearch DSL".

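5. Building parent/child relationships with the join field in version 7 and later

5.1 Setting up the Parent/Child mapping

blog_comments_relation is mapped as a join field, and relations defines the parent/child relationship.
Here blog is the parent and comment is the child.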

PUT my_blogs
{
  "settings": {
    "number_of_shards": 2
  },
  "mappings": {
    "properties": {
      "blog_comments_relation": {
        "type": "join",
        "relations": {
          "blog": "comment"
        }
      },
      "content": {
        "type": "text"
      },
      "title": {
        "type": "keyword"
      }
    }
  }
}

Insert a few documents.
Parent document:

PUT my_blogs/_doc/blog1
{
  "title":"Learning Elasticsearch",
  "content":"learning ELK @ geektime",
  "blog_comments_relation":{
    "name":"blog"
  }
}

Parent document:

PUT my_blogs/_doc/blog2
{
  "title":"Learning Hadoop",
  "content":"learning Hadoop",
  "blog_comments_relation":{
    "name":"blog"
  }
}

Child document (the routing parameter places the child on the same shard as its parent):

PUT my_blogs/_doc/comment1?routing=blog1
{
  "comment":"I am learning ELK",
  "username":"Jack",
  "blog_comments_relation":{
    "name":"comment",
    "parent":"blog1"
  }
}

Child document:

PUT my_blogs/_doc/comment2?routing=blog2
{
  "comment":"I like Hadoop!!!!!",
  "username":"Jack",
  "blog_comments_relation":{
    "name":"comment",
    "parent":"blog2"
  }
}

Child document:

PUT my_blogs/_doc/comment3?routing=blog2
{
  "comment":"Hello Hadoop",
  "username":"Bob",
  "blog_comments_relation":{
    "name":"comment",
    "parent":"blog2"
  }
}
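The reason for passing routing on every child document can be sketched as follows. This is a simplified model that assumes a shard is chosen by hashing the routing value modulo the number of shards, with the document ID used as the routing value when none is supplied. `zlib.crc32` is only a stand-in for Elasticsearch's own hash function, and the function names are hypothetical.

```python
import zlib

NUMBER_OF_SHARDS = 2  # matches "number_of_shards": 2 in the my_blogs settings

def shard_for(routing_value, shards=NUMBER_OF_SHARDS):
    """Toy placement rule: hash the routing value, then take it modulo the shard count."""
    return zlib.crc32(routing_value.encode("utf-8")) % shards

def shard_on_index(doc_id, routing=None):
    """The routing value defaults to the document ID when no routing is given."""
    return shard_for(routing if routing is not None else doc_id)

parent_shard = shard_on_index("blog2")
child_shard = shard_on_index("comment3", routing="blog2")
print(parent_shard == child_shard)  # True: same routing value, same shard

# Without routing, the shard is derived from "comment3" itself,
# which may or may not be the shard that holds blog2.
print(shard_on_index("comment3"))
```

Because the child is routed by its parent's ID, both always land on the same shard in this model. A request that omits routing computes the shard from the child's own ID instead, so it can end up looking on a different shard.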

Search all documents:

POST my_blogs/_search
{

}

Get a parent document by its ID:

GET my_blogs/_doc/blog2

Query by Parent Id:

POST my_blogs/_search
{
  "query": {
    "parent_id": {
      "type": "comment",
      "id": "blog2"
    }
  }
}

Has Child query, which returns parent documents:

POST my_blogs/_search
{
  "query": {
    "has_child": {
      "type": "comment",
      "query": {
        "match": {
          "username": "Jack"
        }
      }
    }
  }
}

Has Parent query, which returns the related child documents:

POST my_blogs/_search
{
  "query": {
    "has_parent": {
      "parent_type": "blog",
      "query": {
        "match": {
          "title": "Learning Hadoop"
        }
      }
    }
  }
}
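The three join queries above all read the same join field. The sketch below models them over the my_blogs documents to show which documents each one would select. It is an in-memory approximation with hypothetical helper names; conditions are simplified to exact equality, whereas the real match query analyzes text.

```python
JOIN = "blog_comments_relation"

# The five my_blogs documents, keyed by _id, reduced to the fields used here.
docs = {
    "blog1": {"title": "Learning Elasticsearch", JOIN: {"name": "blog"}},
    "blog2": {"title": "Learning Hadoop", JOIN: {"name": "blog"}},
    "comment1": {"username": "Jack", JOIN: {"name": "comment", "parent": "blog1"}},
    "comment2": {"username": "Jack", JOIN: {"name": "comment", "parent": "blog2"}},
    "comment3": {"username": "Bob", JOIN: {"name": "comment", "parent": "blog2"}},
}

def parent_id(child_type, pid):
    """Children of the given relation name whose join field points at pid."""
    return sorted(
        i for i, d in docs.items()
        if d[JOIN]["name"] == child_type and d[JOIN].get("parent") == pid
    )

def has_child(child_type, predicate):
    """Parents referenced by at least one matching child."""
    return sorted({
        d[JOIN]["parent"] for d in docs.values()
        if d[JOIN]["name"] == child_type and predicate(d)
    })

def has_parent(parent_type, predicate):
    """Children whose join field points at a matching parent."""
    matched = {
        i for i, d in docs.items()
        if d[JOIN]["name"] == parent_type and predicate(d)
    }
    return sorted(i for i, d in docs.items() if d[JOIN].get("parent") in matched)

print(parent_id("comment", "blog2"))                               # ['comment2', 'comment3']
print(has_child("comment", lambda d: d["username"] == "Jack"))     # ['blog1', 'blog2']
print(has_parent("blog", lambda d: d["title"] == "Learning Hadoop"))  # ['comment2', 'comment3']
```

The name entry of the join field decides which side of the relation a document is on, and the parent entry links a child to its parent.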

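Get a child document by ID alone (without routing, the request may go to a shard that does not hold the document, so it may not be found):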

GET my_blogs/_doc/comment3

Get a child document by ID and routing:

GET my_blogs/_doc/comment3?routing=blog2

Update a child document:

PUT my_blogs/_doc/comment3?routing=blog2
{
    "comment": "Hello Hadoop??",
    "blog_comments_relation": {
      "name": "comment",
      "parent": "blog2"
    }
}

If this helped, please like, comment, and bookmark.
