Elasticsearch parent-child documents
Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/7.13/parent-join.html
has parent query: https://www.elastic.co/guide/en/elasticsearch/reference/7.13/query-dsl-has-parent-query.html
has child query: https://www.elastic.co/guide/en/elasticsearch/reference/7.13/query-dsl-has-child-query.html
parent id query: https://www.elastic.co/guide/en/elasticsearch/reference/7.13/query-dsl-parent-id-query.html#parent-id-query-ex-request
children aggregation: https://www.elastic.co/guide/en/elasticsearch/reference/7.13/search-aggregations-bucket-children-aggregation.html
********************
join field type
The join field type creates parent/child relationships between documents within the same index.
PUT my-index-000001
{
"mappings": {
"properties": {
"my_id": {
"type": "keyword"
},
"my_join_field": {
"type": "join", # my_join_field is the join field
"relations": {
"question": "answer" # question: relation name of parent documents
} # answer: relation name of child documents
}
}
}
}
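The mapping above can also be generated programmatically. The following is a minimal sketch that only builds the request body as a Python dict; the helper name `join_mapping` is hypothetical and nothing is sent to a cluster.

```python
import json


def join_mapping(join_field, relations):
    """Build an index-creation body containing a single join field.

    `relations` maps a parent relation name to one child name
    or to a list of child names.
    """
    return {
        "mappings": {
            "properties": {
                join_field: {"type": "join", "relations": relations}
            }
        }
    }


# Same relation as in the request above: question is the parent of answer.
body = join_mapping("my_join_field", {"question": "answer"})
print(json.dumps(body, indent=2))
```

The resulting dict can be passed as the body of the index-creation request by whichever HTTP or Elasticsearch client is in use.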
Create parent documents
PUT my-index-000001/_doc/1?refresh
{
"my_id": "1",
"text": "This is a question",
"my_join_field": {
"name": "question"
}
}
# Shorthand notation
PUT my-index-000001/_doc/2?refresh
{
"my_id": "2",
"text": "This is another question",
"my_join_field": "question"
}
Create a child document
PUT my-index-000001/_doc/3?routing=1&refresh
{
"my_id": "3",
"text": "This is an answer",
"my_join_field": {
"name": "answer", # relation name of the child document
"parent": "1" # ID of the parent document
}
}
Note: the routing value is required, because parent and child documents must be indexed on the same shard.
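Why the routing value matters can be illustrated with a small simulation. This is a sketch under stated assumptions: it uses the simplified form `hash(routing) % number_of_primary_shards`, and `zlib.crc32` is only a stand-in for the Murmur3-based hash Elasticsearch actually uses, so the shard numbers are not the real ones. The function name `shard_for` and the shard count are hypothetical.

```python
import zlib


def shard_for(routing, num_primary_shards):
    """Pick a shard from a routing value (simplified routing formula).

    crc32 is a stand-in hash chosen only to keep the example deterministic;
    Elasticsearch uses its own Murmur3-based hash.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


NUM_SHARDS = 5  # hypothetical number of primary shards

# The parent document has _id "1"; without ?routing its _id is the routing value.
parent_shard = shard_for("1", NUM_SHARDS)

# The child document has _id "3" but is indexed with ?routing=1.
child_shard = shard_for("1", NUM_SHARDS)

# Without ?routing the child would be routed by its own _id instead.
unrouted_child_shard = shard_for("3", NUM_SHARDS)

print("parent and routed child share a shard:", parent_shard == child_shard)
print("unrouted child shares that shard:", unrouted_child_shard == parent_shard)
```

Because the routed child reuses the parent's routing value, it always hashes to the parent's shard; the unrouted child lands there only by coincidence.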
Performance notes
The join field shouldn’t be used like joins in a relation database.
In Elasticsearch the key to good performance is to de-normalize your
data into documents.
# Do not use the join field like joins in a relational database; for good performance, denormalize the data instead
Each join field, has_child or has_parent query adds a significant tax
to your query performance.
# Every join field, has_child query, and has_parent query adds a significant cost to query performance
The only case where the join field makes sense is if your data contains
a one-to-many relationship where one entity significantly outnumbers
the other entity.
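# The join field only makes sense for one-to-many relationships in which one entity significantly outnumbers the other
join field restrictions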
Only one join field mapping is allowed per index.
# An index can have at most one join field
Parent and child documents must be indexed on the same shard.
# Parent and child documents must be on the same shard
An element can have multiple children but only one parent.
# An element can have multiple children but at most one parent
It is possible to add a new relation to an existing join field.
# New relations can be added to an existing join field (a child can in turn become the parent of another relation)
It is also possible to add a child to an existing element but
only if the element is already a parent.
# A child can be added to an existing element only if that element is already a parent
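The "multiple children but only one parent" restriction can be checked on the client before a mapping is submitted. The sketch below is an illustrative client-side check, not Elasticsearch's own validation; the helper name `child_to_parent` is hypothetical.

```python
def child_to_parent(relations):
    """Invert a join `relations` mapping into {child: parent}.

    Raises ValueError if a child name is listed under more than one parent.
    """
    parents = {}
    for parent, children in relations.items():
        # A single child may be given as a plain string instead of a list.
        if isinstance(children, str):
            children = [children]
        for child in children:
            if child in parents:
                raise ValueError(
                    f"{child!r} already has parent {parents[child]!r}"
                )
            parents[child] = parent
    return parents


# One parent with two children, plus a second level: allowed.
print(child_to_parent({"question": ["answer", "comment"], "answer": "vote"}))
```

A mapping such as `{"a": "c", "b": "c"}` gives `c` two parents and is rejected by this check.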
******************
Multiple levels of parent join
PUT my-index-000001
{
"mappings": {
"properties": {
"my_join_field": {
"type": "join",
"relations": {
"question": ["answer", "comment"],
"answer": "vote"
}
}
}
}
}
Note: multiple levels of relations hurt query performance and are not recommended.
Create documents
# Document 1
PUT my-index-000001/_doc/1?refresh
{
"my_id": "1",
"text": "This is a question",
"my_join_field": "question"
}
# Document 2: child of document 1
PUT my-index-000001/_doc/2?routing=1&refresh # routing is the routing value of document 1
{
"my_id": "2",
"text": "This is an answer",
"my_join_field": {
"name": "answer",
"parent": "1" # ID of the parent document (document 1)
}
}
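# Document 3: child of document 2
PUT my-index-000001/_doc/3?routing=1&refresh # routing is the routing value of document 1, so all three documents share a shard
{
"text": "This is a vote",
"my_join_field": {
"name": "vote",
"parent": "2" # ID of the parent document (document 2)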
}
}
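In the requests above, the grandchild (document 3) names document 2 as its parent but is routed with document 1's value. A minimal sketch of that rule follows: walk up the parent chain until the root is reached and use the root's ID as the routing value. It assumes the root document is routed by its own ID; the helper name `routing_for` and the `parent_of` lookup table are hypothetical.

```python
def routing_for(doc_id, parent_of):
    """Return the ID of the root ancestor of `doc_id`.

    `parent_of` maps each document ID to its parent ID (None for a root).
    """
    current = doc_id
    seen = set()
    while parent_of[current] is not None:
        if current in seen:
            raise ValueError("cycle in parent chain")
        seen.add(current)
        current = parent_of[current]
    return current


# Documents from the example: 1 (question) <- 2 (answer) <- 3 (vote).
parent_of = {"1": None, "2": "1", "3": "2"}

for doc_id in parent_of:
    print(doc_id, "-> routing", routing_for(doc_id, parent_of))
```

All three documents resolve to the routing value "1", which matches the `routing=1` parameter used when indexing documents 2 and 3.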
********************
has parent query
The query matches parent documents; the results are their child documents.
# Index: my-index-000001
PUT /my-index-000001
{
"mappings": {
"properties": {
"my-join-field": {
"type": "join",
"relations": {
"parent": "child"
}
},
"tag": {
"type": "keyword"
}
}
}
}
# Query: child documents whose parent has tag = Elasticsearch
GET /my-index-000001/_search
{
"query": {
"has_parent": {
"parent_type": "parent",
"query": {
"term": {
"tag": {
"value": "Elasticsearch"
}
}
}
}
}
}
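Parameters
parent_type: required. Relation name of the parent documents.
query: required. Query run against parent documents; the children of matching parents are returned.
score: optional, true or false (default)
(Optional, Boolean) Indicates whether the relevance score of a
matching parent document is aggregated into its child documents.
false: the parent's score is not added to its child documents
true: the parent's score is added to its child documents
ignore_unmapped: optional, true or false (default)
(Optional, Boolean) Indicates whether to ignore an unmapped
parent_type and not return any documents instead of an error
false: an unmapped parent_type returns an error
true: an unmapped parent_type returns no documents instead of an error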
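The parameters above can be assembled into a request body with a small helper. This is a sketch that only builds the search body as a dict and sends nothing; the function name `has_parent_query` is hypothetical, and the defaults mirror the documented defaults (`score` and `ignore_unmapped` both false).

```python
import json


def has_parent_query(parent_type, query, score=False, ignore_unmapped=False):
    """Build a search body that returns children of parents matching `query`."""
    return {
        "query": {
            "has_parent": {
                "parent_type": parent_type,
                "query": query,
                "score": score,
                "ignore_unmapped": ignore_unmapped,
            }
        }
    }


# Children whose parent has tag = Elasticsearch, with the parent's score included.
body = has_parent_query(
    "parent",
    {"term": {"tag": {"value": "Elasticsearch"}}},
    score=True,
)
print(json.dumps(body, indent=2))
```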
********************
has child query
The query matches child documents; the results are their parent documents.
# Index: my-index-000001
PUT /my-index-000001
{
"mappings": {
"properties": {
"my-join-field": {
"type": "join",
"relations": {
"parent": "child"
}
}
}
}
}
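# Query: parents that have at least 2 and at most 10 matching child documents
GET /_search
{
"query": {
"has_child": {
"type": "child",
"query": {
"match_all": {} # match all child documents
},
"max_children": 10, # at most 10 matching child documents
"min_children": 2, # at least 2 matching child documents
"score_mode": "min" # use the lowest child score as the parent's score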
}
}
}
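Parameters
type: required. Relation name of the child documents.
query: required. Query run against child documents.
ignore_unmapped: optional, true or false (default)
(Optional, Boolean) Indicates whether to ignore an unmapped
type and not return any documents instead of an error
false: an unmapped type returns an error
true: an unmapped type returns no documents instead of an error
max_children: optional. Maximum number of matching child documents; a parent with more matching children is not returned.
min_children: optional. Minimum number of matching child documents; a parent with fewer matching children is not returned.
score_mode: optional. How the scores of matching child documents affect the parent's score. Valid values:
none (Default): child scores are ignored; the parent gets a score of 0
avg: mean score of all matching child documents
max: highest score of all matching child documents
min: lowest score of all matching child documents
sum: sum of the scores of all matching child documents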
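How `min_children`, `max_children`, and `score_mode` combine can be shown with an in-memory model. This is a sketch of the semantics described in the parameter list, not Elasticsearch's implementation; the function name `parent_score` is hypothetical, and returning `None` stands for "parent not returned".

```python
from statistics import mean

# One reducer per score_mode value listed above.
SCORE_MODES = {
    "none": lambda scores: 0.0,
    "avg": mean,
    "max": max,
    "min": min,
    "sum": sum,
}


def parent_score(child_scores, score_mode="none",
                 min_children=None, max_children=None):
    """Score a parent from its matching children, or None if it is excluded."""
    count = len(child_scores)
    if count == 0:
        return None  # no matching child, so the parent does not match
    if min_children is not None and count < min_children:
        return None
    if max_children is not None and count > max_children:
        return None
    return float(SCORE_MODES[score_mode](child_scores))


scores = [1.0, 3.0]  # hypothetical scores of two matching children
for mode in ("none", "avg", "max", "min", "sum"):
    print(mode, parent_score(scores, mode, min_children=2, max_children=10))
```

With a single matching child and `min_children=2`, the same function returns `None`, which corresponds to the parent being left out of the results.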
********************
parent id query
Returns child documents joined to the parent document with the given ID.
# Index: my-index-000001
PUT /my-index-000001
{
"mappings": {
"properties": {
"my-join-field": {
"type": "join",
"relations": {
"my-parent": "my-child"
}
}
}
}
}
# Index a parent document
PUT /my-index-000001/_doc/1?refresh
{
"text": "This is a parent document.",
"my-join-field": "my-parent"
}
# Index a child document
PUT /my-index-000001/_doc/2?routing=1&refresh
{
"text": "This is a child document.",
"my-join-field": {
"name": "my-child",
"parent": "1"
}
}
# Query: child documents whose parent has id = 1
GET /my-index-000001/_search
{
"query": {
"parent_id": {
"type": "my-child",
"id": "1"
}
}
}
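Parameters
type: required. Relation name of the child documents.
id: required. ID of the parent document.
ignore_unmapped: optional, true or false (default)
(Optional, Boolean) Indicates whether to ignore an unmapped
type and not return any documents instead of an error
false: an unmapped child type returns an error
true: an unmapped child type returns no documents instead of an error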
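What the parent id query selects can be modelled in memory: keep only documents whose join field names the given child relation and the given parent ID. The sketch below illustrates the semantics only; the function name `children_of` is hypothetical, and the join field name `my-join-field` follows the mapping of this section.

```python
def children_of(docs, child_type, parent_id, join_field="my-join-field"):
    """Return IDs of documents that are `child_type` children of `parent_id`."""
    matches = []
    for doc_id, source in docs.items():
        join = source.get(join_field)
        # Parents store the join field as a plain string, children as an object.
        if not isinstance(join, dict):
            continue
        if join.get("name") == child_type and str(join.get("parent")) == str(parent_id):
            matches.append(doc_id)
    return matches


docs = {
    "1": {"text": "This is a parent document.", "my-join-field": "my-parent"},
    "2": {
        "text": "This is a child document.",
        "my-join-field": {"name": "my-child", "parent": "1"},
    },
}

print(children_of(docs, "my-child", "1"))
```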
********************
children aggregation
Children aggregation: a special single bucket aggregation that selects child documents of the specified type, as defined in a join field.
# Create the index: child_example
PUT child_example
{
"mappings": {
"properties": {
"join": {
"type": "join",
"relations": {
"question": "answer"
}
}
}
}
}
# Parent document
PUT child_example/_doc/1
{
"join": {
"name": "question"
},
"body": "<p>I have Windows 2003 server and i bought a new Windows 2008 server...",
"title": "Whats the best way to file transfer my site from server to a newer one?",
"tags": [
"windows-server-2003",
"windows-server-2008",
"file-transfer"
]
}
# Child document
PUT child_example/_doc/2?routing=1
{
"join": {
"name": "answer",
"parent": "1"
},
"owner": {
"location": "Norfolk, United Kingdom",
"display_name": "Sam",
"id": 48
},
"body": "<p>Unfortunately you're pretty much limited to FTP...",
"creation_date": "2009-05-04T13:45:37.030"
}
# Child document
PUT child_example/_doc/3?routing=1&refresh
{
"join": {
"name": "answer",
"parent": "1"
},
"owner": {
"location": "Norfolk, United Kingdom",
"display_name": "Troll",
"id": 49
},
"body": "<p>Use Linux...",
"creation_date": "2009-05-05T13:45:37.030"
}
children aggregation
POST child_example/_search?size=0
{
"aggs": {
"top-tags": { # bucket parent documents by the tags field
"terms": {
"field": "tags.keyword",
"size": 10
},
"aggs": {
"to-answers": {
"children": { # children aggregation
"type" : "answer"
},
"aggs": {
"top-names": {
"terms": { # bucket child documents by owner.display_name
"field": "owner.display_name.keyword",
"size": 10
}
}
}
}
}
}
}
}
Aggregation result
{
"took": 25,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped" : 0,
"failed": 0
},
"hits": {
"total" : {
"value": 3,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"aggregations": {
"top-tags": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "file-transfer",
"doc_count": 1,
"to-answers": {
"doc_count": 2,
"top-names": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Sam",
"doc_count": 1
},
{
"key": "Troll",
"doc_count": 1
}
]
}
}
},
{
"key": "windows-server-2003",
"doc_count": 1,
"to-answers": {
"doc_count": 2,
"top-names": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Sam",
"doc_count": 1
},
{
"key": "Troll",
"doc_count": 1
}
]
}
}
},
{
"key": "windows-server-2008",
"doc_count": 1,
"to-answers": {
"doc_count": 2,
"top-names": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Sam",
"doc_count": 1
},
{
"key": "Troll",
"doc_count": 1
}
]
}
}
}
]
}
}
}
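The nested result above is easier to read once flattened into rows. The sketch below walks the `top-tags`, `to-answers`, and `top-names` levels of such a response; the function name `flatten_children_agg` is hypothetical, and the sample dict is a trimmed copy of the first bucket of the result shown above.

```python
def flatten_children_agg(response):
    """Turn the nested aggregation into (tag, answers, owner, count) rows."""
    rows = []
    for tag in response["aggregations"]["top-tags"]["buckets"]:
        answers = tag["to-answers"]  # children aggregation: single bucket
        for name in answers["top-names"]["buckets"]:
            rows.append(
                (tag["key"], answers["doc_count"], name["key"], name["doc_count"])
            )
    return rows


# Trimmed copy of the first bucket from the result above.
response = {
    "aggregations": {
        "top-tags": {
            "buckets": [
                {
                    "key": "file-transfer",
                    "doc_count": 1,
                    "to-answers": {
                        "doc_count": 2,
                        "top-names": {
                            "buckets": [
                                {"key": "Sam", "doc_count": 1},
                                {"key": "Troll", "doc_count": 1},
                            ]
                        },
                    },
                }
            ]
        }
    }
}

for row in flatten_children_agg(response):
    print(row)
```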
********************
Example
join ==> school: student
Parent document: id, name, join (school)
Child document: id, name, sex, join (student)
Create the index
PUT school_student
{
"mappings": {
"properties": {
"join": {
"type": "join",
"relations": {
"school": "student"
}
}
}
}
}
Insert parent documents
PUT school_student/_doc/1
{
"name":"火影忍者",
"join": "school"
}
PUT school_student/_doc/2
{
"name":"海贼王",
"join": "school"
}
Insert child documents: parent = 1
# Child documents of parent 1
PUT school_student/_doc/3?routing=1
{
"name":"鸣人",
"sex": "male",
"join": {
"name":"student",
"parent": 1
}
}
PUT school_student/_doc/4?routing=1
{
"name":"雏田",
"sex": "female",
"join": {
"name":"student",
"parent": 1
}
}
PUT school_student/_doc/5?routing=1
{
"name":"小樱",
"sex": "female",
"join": {
"name":"student",
"parent": 1
}
}
PUT school_student/_doc/6?routing=1
{
"name":"二柱子",
"sex": "male",
"join": {
"name":"student",
"parent": 1
}
}
Insert child documents: parent = 2
PUT school_student/_doc/7?routing=2
{
"name":"路飞",
"sex": "male",
"join": {
"name":"student",
"parent": 2
}
}
PUT school_student/_doc/8?routing=2
{
"name":"绿藻头",
"sex": "male",
"join": {
"name":"student",
"parent": 2
}
}
PUT school_student/_doc/9?routing=2
{
"name":"色厨子",
"sex": "male",
"join": {
"name":"student",
"parent": 2
}
}
PUT school_student/_doc/10?routing=2
{
"name":"撒谎布",
"sex": "male",
"join": {
"name":"student",
"parent": 2
}
}
PUT school_student/_doc/11?routing=2
{
"name":"娜美",
"sex": "female",
"join": {
"name":"student",
"parent": 2
}
}
PUT school_student/_doc/12?routing=2
{
"name":"罗宾",
"sex": "female",
"join": {
"name":"student",
"parent": 2
}
}
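The child documents above all follow one pattern, so they could also be sent in a single `_bulk` request. The sketch below only builds the newline-delimited body, with `routing` set in each action line so every child lands on its parent's shard; the helper name `bulk_children` is hypothetical and nothing is sent to a cluster.

```python
import json


def bulk_children(index, parent_id, students, first_id):
    """Build an NDJSON _bulk body indexing `students` as children of `parent_id`.

    `students` is a list of (name, sex) pairs; IDs are assigned
    sequentially starting at `first_id`.
    """
    lines = []
    for offset, (name, sex) in enumerate(students):
        action = {
            "index": {
                "_index": index,
                "_id": str(first_id + offset),
                "routing": str(parent_id),  # same shard as the parent
            }
        }
        source = {
            "name": name,
            "sex": sex,
            "join": {"name": "student", "parent": parent_id},
        }
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(source, ensure_ascii=False))
    # A bulk body must end with a newline.
    return "\n".join(lines) + "\n"


payload = bulk_children("school_student", 1, [("鸣人", "male"), ("雏田", "female")], 3)
print(payload)
```

The payload is meant as the body of a `POST _bulk` request with the `application/x-ndjson` content type.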
parent id query: child documents with parent id = 1
# Query
GET school_student/_search
{
"query": {
"parent_id": {
"type": "student",
"id": "1"
}
}
}
# Result: 4 child documents returned
{
"took" : 1021,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : 0.6325226,
"hits" : [
{
"_index" : "school_student",
"_type" : "_doc",
"_id" : "3",
"_score" : 0.6325226,
"_routing" : "1",
"_source" : {
"name" : "鸣人",
"sex" : "male",
"join" : {
"name" : "student",
"parent" : 1
}
}
},
{
"_index" : "school_student",
"_type" : "_doc",
"_id" : "4",
"_score" : 0.6325226,
"_routing" : "1",
"_source" : {
"name" : "雏田",
"sex" : "female",
"join" : {
"name" : "student",
"parent" : 1
}
}
},
{
"_index" : "school_student",
"_type" : "_doc",
"_id" : "5",
"_score" : 0.6325226,
"_routing" : "1",
"_source" : {
"name" : "小樱",
"sex" : "female",
"join" : {
"name" : "student",
"parent" : 1
}
}
},
{
"_index" : "school_student",
"_type" : "_doc",
"_id" : "6",
"_score" : 0.6325226,
"_routing" : "1",
"_source" : {
"name" : "二柱子",
"sex" : "male",
"join" : {
"name" : "student",
"parent" : 1
}
}
}
]
}
}
has child query: parent documents that have a child with name = 罗宾
# Query
GET school_student/_search
{
"query": {
"has_child": {
"type": "student",
"query": {
"match": {
"name": "罗宾"
}
}
}
}
}
# Result
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "school_student",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "海贼王",
"join" : "school"
}
}
]
}
}
has parent query: child documents whose parent has name = 海贼王
# Query
GET school_student/_search
{
"query": {
"has_parent": {
"parent_type": "school",
"query": {
"match": {
"name": "海贼王"
}
}
}
}
}
# Result
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "school_student",
"_type" : "_doc",
"_id" : "7",
"_score" : 1.0,
"_routing" : "2",
"_source" : {
"name" : "路飞",
"sex" : "male",
"join" : {
"name" : "student",
"parent" : 2
}
}
},
{
"_index" : "school_student",
"_type" : "_doc",
"_id" : "8",
"_score" : 1.0,
"_routing" : "2",
"_source" : {
"name" : "绿藻头",
"sex" : "male",
"join" : {
"name" : "student",
"parent" : 2
}
}
},
{
"_index" : "school_student",
"_type" : "_doc",
"_id" : "9",
"_score" : 1.0,
"_routing" : "2",
"_source" : {
"name" : "色厨子",
"sex" : "male",
"join" : {
"name" : "student",
"parent" : 2
}
}
},
{
"_index" : "school_student",
"_type" : "_doc",
"_id" : "10",
"_score" : 1.0,
"_routing" : "2",
"_source" : {
"name" : "撒谎布",
"sex" : "male",
"join" : {
"name" : "student",
"parent" : 2
}
}
},
{
"_index" : "school_student",
"_type" : "_doc",
"_id" : "11",
"_score" : 1.0,
"_routing" : "2",
"_source" : {
"name" : "娜美",
"sex" : "female",
"join" : {
"name" : "student",
"parent" : 2
}
}
},
{
"_index" : "school_student",
"_type" : "_doc",
"_id" : "12",
"_score" : 1.0,
"_routing" : "2",
"_source" : {
"name" : "罗宾",
"sex" : "female",
"join" : {
"name" : "student",
"parent" : 2
}
}
}
]
}
}
children aggregation: number of male and female students in each school
# Query
GET school_student/_search
{
"query": {
"terms": {
"name.keyword": ["火影忍者","海贼王"]
}
},
"aggs": {
"school_count": {
"terms": {
"field": "name.keyword",
"size": 10
},
"aggs": {
"student_count": {
"children": {
"type": "student"
},
"aggs": {
"sex_count": {
"terms": {
"field": "sex.keyword",
"size": 10
}
}
}
}
}
}
}
}
# Result
{
"took" : 11,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "school_student",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "火影忍者",
"join" : "school"
}
},
{
"_index" : "school_student",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "海贼王",
"join" : "school"
}
}
]
},
"aggregations" : {
"school_count" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "海贼王",
"doc_count" : 1,
"student_count" : {
"doc_count" : 6,
"sex_count" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "male",
"doc_count" : 4
},
{
"key" : "female",
"doc_count" : 2
}
]
}
}
},
{
"key" : "火影忍者",
"doc_count" : 1,
"student_count" : {
"doc_count" : 4,
"sex_count" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "female",
"doc_count" : 2
},
{
"key" : "male",
"doc_count" : 2
}
]
}
}
}
]
}
}
}
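The per-school counts can be pulled out of the response with a few lines of Python. The sketch below reads the `school_count`, `student_count`, and `sex_count` levels into a plain dict; the function name `sex_counts_by_school` is hypothetical, and the sample dict keeps only the aggregation keys and counts from the result above.

```python
def sex_counts_by_school(response):
    """Return {school: {sex: count}} from the nested aggregation result."""
    result = {}
    for school in response["aggregations"]["school_count"]["buckets"]:
        sex_buckets = school["student_count"]["sex_count"]["buckets"]
        result[school["key"]] = {b["key"]: b["doc_count"] for b in sex_buckets}
    return result


# Keys and counts copied from the aggregation result above.
response = {
    "aggregations": {
        "school_count": {
            "buckets": [
                {
                    "key": "海贼王",
                    "student_count": {
                        "doc_count": 6,
                        "sex_count": {
                            "buckets": [
                                {"key": "male", "doc_count": 4},
                                {"key": "female", "doc_count": 2},
                            ]
                        },
                    },
                },
                {
                    "key": "火影忍者",
                    "student_count": {
                        "doc_count": 4,
                        "sex_count": {
                            "buckets": [
                                {"key": "female", "doc_count": 2},
                                {"key": "male", "doc_count": 2},
                            ]
                        },
                    },
                },
            ]
        }
    }
}

counts = sex_counts_by_school(response)
print(counts)
```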