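Preface
Running full SQL-style joins in a distributed system such as Elasticsearch is prohibitively expensive. Instead, Elasticsearch offers two forms of join that are designed to scale horizontally.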
1.Nested Query
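To use a nested query, you first map a field with the `nested` type and then query it with the `nested` query. (My first reaction was that this has little value: if you are going to use a nested field anyway, why not create top-level fields that carry the same meaning?)
The value is that a nested field preserves the association between the properties of each nested object. With a plain object field, a query that combines a property value from one object with a property value from another object still matches the document, even though you want no match at all. Nested objects exist for exactly this scenario.
That cross-object match is what happens when the field is a plain object array.
To prevent it, map the field as a nested object.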
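The cross-object match described above can be sketched as follows. This is a minimal illustration, not taken from the original post: the index name `my_nested_index` and the `user.first` / `user.last` fields are hypothetical, and the typeless 7.x request syntax is assumed.

```
# Hypothetical index: "user" is mapped as nested so each object is indexed on its own
PUT my_nested_index
{
  "mappings": {
    "properties": {
      "user": {
        "type": "nested",
        "properties": {
          "first": { "type": "keyword" },
          "last":  { "type": "keyword" }
        }
      }
    }
  }
}

# One document holding two user objects
PUT my_nested_index/_doc/1
{
  "user": [
    { "first": "John",  "last": "Smith" },
    { "first": "Alice", "last": "White" }
  ]
}

# Both conditions must hold inside the SAME nested object
GET my_nested_index/_search
{
  "query": {
    "nested": {
      "path": "user",
      "query": {
        "bool": {
          "must": [
            { "term": { "user.first": "Alice" } },
            { "term": { "user.last": "Smith" } }
          ]
        }
      }
    }
  }
}
```

With the `nested` mapping this search should find nothing, because no single object is both "Alice" and "Smith". If `user` were a plain object field, the values would be flattened into `user.first: [John, Alice]` and `user.last: [Smith, White]`, and an equivalent `bool` query would match the document.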
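2.Has Child Query and Has Parent Query
In SQL, a join normally spans two tables. In the same way, a parent-child query spans two types, and both types must belong to the same index. (Multiple types per index are officially discouraged: since 6.0 a newly created index may contain only one type, and the `_parent` meta-field was replaced by the `join` field, so the example in this section targets 5.x. Section 5 shows the newer approach.)
Here is an example: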
PUT my_index1
{
  "mappings": {
    "my_parent": {
      "properties": {
        "parentId": {
          "type": "keyword"
        },
        "name": {
          "type": "keyword"
        },
        "age": {
          "type": "integer"
        }
      }
    },
    "my_child": {
      "_parent": {
        "type": "my_parent"
      },
      "properties": {
        "childId": {
          "type": "keyword"
        },
        "name": {
          "type": "keyword"
        },
        "age": {
          "type": "integer"
        }
      }
    }
  }
}
The mapping of the newly created index:
"mappings": {
  "my_child": {
    "_parent": {
      "type": "my_parent"
    },
    "_routing": {
      "required": true
    },
    "properties": {
      "age": {
        "type": "integer"
      },
      "childId": {
        "type": "keyword"
      },
      "name": {
        "type": "keyword"
      },
      "parentId": {
        "type": "keyword"
      }
    }
  },
  "my_parent": {
    "properties": {
      "age": {
        "type": "integer"
      },
      "name": {
        "type": "keyword"
      },
      "parentId": {
        "type": "keyword"
      }
    }
  }
}
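Two things stand out:
- `my_child` has a `_parent` meta-field, and its `"type": "my_parent"` establishes the parent-child relationship between the two types.
- `my_child` has a `_routing` meta-field with `"required": true`, so the parent-child link between concrete documents is established through routing.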
Next, insert two parent documents:
PUT my_index1/my_parent/parent100
{
  "parentId": "parent100",
  "name": "zhangsan",
  "age": "45"
}
PUT my_index1/my_parent/parent200
{
  "parentId": "parent200",
  "name": "lily",
  "age": "42"
}
Then insert the corresponding child documents. The `parent` parameter names the parent document's ID:
PUT my_index1/my_child/1?parent=parent100
{
  "childId": "child100",
  "name": "xiaoming",
  "age": "14"
}
POST my_index1/my_child/2?parent=parent100
{
  "childId": "child200",
  "name": "xiaohong",
  "age": "17"
}
POST my_index1/my_child/3?parent=parent200
{
  "childId": "child300",
  "name": "lucy",
  "age": "21"
}
The documents are related as follows:
Parent document | Child documents |
---|---|
"parent100", "zhangsan", "45" | "child100", "xiaoming", "14"; "child200", "xiaohong", "17" |
"parent200", "lily", "42" | "child300", "lucy", "21" |
Query examples:
- 1. Query parent documents by a condition on their children (`has_child`): find the parent of the child document `xiaoming`
GET my_index1/my_parent/_search
{
  "query": {
    "has_child": {
      "type": "my_child",
      "query": {
        "term": {
          "name": {
            "value": "xiaoming"
          }
        }
      }
    }
  }
}
Result:
"hits": [
  {
    "_index": "my_index1",
    "_type": "my_parent",
    "_id": "parent100",
    "_score": 1,
    "_source": {
      "parentId": "parent100",
      "name": "zhangsan",
      "age": "45"
    }
  }
]
- 1. Query parent documents by a condition on their children (`has_child`): find parents that have a child older than 10
GET my_index1/my_parent/_search
{
  "query": {
    "has_child": {
      "type": "my_child",
      "query": {
        "range": {
          "age": {
            "gt": "10"
          }
        }
      }
    }
  }
}
Result:
"hits": [
  {
    "_index": "my_index1",
    "_type": "my_parent",
    "_id": "parent100",
    "_score": 1,
    "_source": {
      "parentId": "parent100",
      "name": "zhangsan",
      "age": "45"
    }
  },
  {
    "_index": "my_index1",
    "_type": "my_parent",
    "_id": "parent200",
    "_score": 1,
    "_source": {
      "parentId": "parent200",
      "name": "lily",
      "age": "42"
    }
  }
]
- 2. Query child documents by a condition on their parent (`has_parent`): find the children of `zhangsan`
GET my_index1/my_child/_search
{
  "query": {
    "has_parent": {
      "type": "my_parent",
      "query": {
        "term": {
          "name": "zhangsan"
        }
      }
    }
  }
}
Result:
"hits": [
  {
    "_index": "my_index1",
    "_type": "my_child",
    "_id": "1",
    "_score": 1,
    "_routing": "parent100",
    "_parent": "parent100",
    "_source": {
      "childId": "child100",
      "name": "xiaoming",
      "age": "14"
    }
  },
  {
    "_index": "my_index1",
    "_type": "my_child",
    "_id": "2",
    "_score": 1,
    "_routing": "parent100",
    "_parent": "parent100",
    "_source": {
      "childId": "child200",
      "name": "xiaohong",
      "age": "17"
    }
  }
]
- 2. Query child documents by a condition on their parent (`has_parent`): find children whose parent is younger than 43
GET my_index1/my_child/_search
{
  "query": {
    "has_parent": {
      "type": "my_parent",
      "query": {
        "range": {
          "age": {
            "lt": "43"
          }
        }
      }
    }
  }
}
Result:
"hits": [
  {
    "_index": "my_index1",
    "_type": "my_child",
    "_id": "3",
    "_score": 1,
    "_routing": "parent200",
    "_parent": "parent200",
    "_source": {
      "childId": "child300",
      "name": "lucy",
      "age": "21"
    }
  }
]
- 3. Combined query example:
Finally, the results of `has_parent` and `has_child` can be narrowed further with ordinary conditions: use `has_parent` or `has_child` as one clause of a `bool` query. Here is one example (other combinations follow the same pattern).
Find the child documents of zhangsan whose age is greater than 15.
GET my_index1/my_child/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "age": {
              "gt": "15"
            }
          }
        },
        {
          "has_parent": {
            "type": "my_parent",
            "query": {
              "term": {
                "name": "zhangsan"
              }
            }
          }
        }
      ]
    }
  }
}
Result:
"hits": [
  {
    "_index": "my_index1",
    "_type": "my_child",
    "_id": "2",
    "_score": 2,
    "_routing": "parent100",
    "_parent": "parent100",
    "_source": {
      "childId": "child200",
      "name": "xiaohong",
      "age": "17"
    }
  }
]
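`has_child` and `has_parent` queries are slow. The official documentation advises against them when performance matters.
The `has_child` query also accepts `min_children` and `max_children` parameters, which set the minimum and maximum number of matching child documents a parent must have in order to match.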
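As a rough sketch of `min_children`, the request below reuses the `my_index1` index and the 5.x typed endpoint from this section. The threshold of 2 is an arbitrary value chosen for illustration.

```
# Return only parents that have at least 2 children older than 10
GET my_index1/my_parent/_search
{
  "query": {
    "has_child": {
      "type": "my_child",
      "min_children": 2,
      "query": {
        "range": {
          "age": {
            "gt": 10
          }
        }
      }
    }
  }
}
```

With the sample data above, `parent100` has two matching children (ages 14 and 17) and `parent200` has one, so this query should return only `parent100`. Adding `"max_children": 1` instead would flip the result to `parent200`.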
3.Parent Id Query
Query child documents by the ID of their parent document.
GET my_index1/my_child/_search
{
  "query": {
    "parent_id": {
      "type": "my_child",
      "id": "parent200"
    }
  }
}
- type: the type of the child documents
- id: the ID of the parent document
The query above is equivalent to the following one:
GET /my_index1/_search
{
  "query": {
    "has_parent": {
      "type": "my_parent",
      "query": {
        "term": {
          "_id": "parent200"
        }
      }
    }
  }
}
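4.terms lookup mechanism: comparable to a cascading lookup (subquery) in SQL. It can span indices, and an index can also look up values from itself.
See my blog post 《elasticsearch中DSL之Term level query(term query)》 for details.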
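For readers who do not have that post at hand, a terms lookup might look like the sketch below. Everything in it is hypothetical and for illustration only: the indices `my_users` and `my_posts`, the document IDs, and the `followed_authors` and `author` fields are not from the original post, and the typeless 7.x syntax is assumed.

```
# Hypothetical lookup document: the list of values to match against
PUT my_users/_doc/u1
{
  "followed_authors": ["jack", "bob"]
}

# Hypothetical document to be found
PUT my_posts/_doc/p1
{
  "author": "jack",
  "title": "Learning Elasticsearch"
}

# Fetch the terms from my_users/u1 at query time instead of listing them inline
GET my_posts/_search
{
  "query": {
    "terms": {
      "author": {
        "index": "my_users",
        "id": "u1",
        "path": "followed_authors"
      }
    }
  }
}
```

The `index`, `id`, and `path` parameters tell Elasticsearch which document and field to read the term list from. The lookup reads the values from `_source`, so `_source` must be enabled on the lookup index.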
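5. Building parent-child relationships with the join field (7.x syntax; the field itself replaced `_parent` in 6.0)
5.1 Defining the Parent/Child Mapping
The field `blog_comments_relation` is mapped with type `join`, and `relations` declares the parent-child relationship.
Here `blog` is the parent and `comment` is the child.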
PUT my_blogs
{
  "settings": {
    "number_of_shards": 2
  },
  "mappings": {
    "properties": {
      "blog_comments_relation": {
        "type": "join",
        "relations": {
          "blog": "comment"
        }
      },
      "content": {
        "type": "text"
      },
      "title": {
        "type": "keyword"
      }
    }
  }
}
Insert a few documents.
Parent document
PUT my_blogs/_doc/blog1
{
  "title": "Learning Elasticsearch",
  "content": "learning ELK @ geektime",
  "blog_comments_relation": {
    "name": "blog"
  }
}
Parent document
PUT my_blogs/_doc/blog2
{
  "title": "Learning Hadoop",
  "content": "learning Hadoop",
  "blog_comments_relation": {
    "name": "blog"
  }
}
Child document (the `routing` value is the parent ID, which places parent and child on the same shard)
PUT my_blogs/_doc/comment1?routing=blog1
{
  "comment": "I am learning ELK",
  "username": "Jack",
  "blog_comments_relation": {
    "name": "comment",
    "parent": "blog1"
  }
}
Child document
PUT my_blogs/_doc/comment2?routing=blog2
{
  "comment": "I like Hadoop!!!!!",
  "username": "Jack",
  "blog_comments_relation": {
    "name": "comment",
    "parent": "blog2"
  }
}
Child document
PUT my_blogs/_doc/comment3?routing=blog2
{
  "comment": "Hello Hadoop",
  "username": "Bob",
  "blog_comments_relation": {
    "name": "comment",
    "parent": "blog2"
  }
}
Query all documents
POST my_blogs/_search
{
}
Get a parent document by its ID
GET my_blogs/_doc/blog2
Query by Parent Id
POST my_blogs/_search
{
  "query": {
    "parent_id": {
      "type": "comment",
      "id": "blog2"
    }
  }
}
Has Child query, returns the parent documents
POST my_blogs/_search
{
  "query": {
    "has_child": {
      "type": "comment",
      "query": {
        "match": {
          "username": "Jack"
        }
      }
    }
  }
}
Has Parent query, returns the related child documents
POST my_blogs/_search
{
  "query": {
    "has_parent": {
      "parent_type": "blog",
      "query": {
        "match": {
          "title": "Learning Hadoop"
        }
      }
    }
  }
}
Access a child document by ID only (without routing, the request may be sent to the wrong shard and report the document as not found)
GET my_blogs/_doc/comment3
Access a child document by ID and routing
GET my_blogs/_doc/comment3?routing=blog2
Update a child document (this replaces the whole document, and routing is required again)
PUT my_blogs/_doc/comment3?routing=blog2
{
  "comment": "Hello Hadoop??",
  "blog_comments_relation": {
    "name": "comment",
    "parent": "blog2"
  }
}
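The combined `bool` pattern from section 2 carries over to the join field as well. The sketch below is an illustration built on the `my_blogs` sample data above, not part of the original examples: it looks for comments written by Jack on the blog titled "Learning Hadoop".

```
# A has_parent clause used as one clause of a bool query
POST my_blogs/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "username": "Jack"
          }
        },
        {
          "has_parent": {
            "parent_type": "blog",
            "query": {
              "term": {
                "title": "Learning Hadoop"
              }
            }
          }
        }
      ]
    }
  }
}
```

Given the documents inserted above, and assuming `comment3` has not been overwritten in a way that changes its `username`, this should return only `comment2`: `comment1` belongs to `blog1`, and `comment3` was written by Bob. The `term` query works on `title` here because that field is mapped as `keyword`.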