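0. How should one-to-many and many-to-many data be stored and implemented in ES 6.x?
Motivating question:
In a headline-news app, news articles and their comments form a one-to-many relationship.
How should such data be stored in ES 6.x, and how can it be searched and aggregated efficiently?
This article works through the answer.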
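1. Background of the new join type in ES 6.x
In MySQL, multi-table relations are expressed with LEFT JOIN, JOIN, and similar constructs.
ES 5.x implemented multi-table relations with parent-child documents, similar to a database join; this relied on ES 5.x supporting multiple types under a single index.
ES 6.x supports only a single type per index, so how to implement a join in ES 6.x became a common concern.
Fortunately, ES 6.x introduced the join field type, which addresses relations similar to multi-table joins in MySQL.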
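2. Introduction to the ES 6.x join type
The join type still works within a single index and uses parent-child relations to achieve something similar to a multi-table join in MySQL.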
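3. The ES 6.x join type in practice
3.1 Mapping definition for the join type
The mapping for a join field is shown below.
Key points:
- 1) "my_join_field" is the name of the join field.
- 2) "question": "answer" means that question is the parent of answer.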
PUT my_join_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "my_join_field": {
          "type": "join",
          "relations": {
            "question": "answer"
          }
        }
      }
    }
  }
}
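As a rough illustration, the same mapping body can be built as a plain Python dict and inspected before it is sent to a cluster. This is a minimal sketch using only the standard library; the helper names `build_join_mapping` and `child_types` are hypothetical and are not part of any Elasticsearch client.

```python
import json


def build_join_mapping(join_field, relations, doc_type="_doc"):
    """Build an ES 6.x mapping body containing a single join field.

    `relations` maps each parent relation name to one child name or a
    list of child names, in the same shape as the console request.
    """
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    join_field: {"type": "join", "relations": relations}
                }
            }
        }
    }


def child_types(mapping, join_field, parent, doc_type="_doc"):
    """Return the child relation names declared for `parent` as a list."""
    props = mapping["mappings"][doc_type]["properties"]
    children = props[join_field]["relations"].get(parent, [])
    return [children] if isinstance(children, str) else list(children)


mapping = build_join_mapping("my_join_field", {"question": "answer"})
print(json.dumps(mapping, indent=2))
print(child_types(mapping, "my_join_field", "question"))
```

Normalizing the children to a list keeps the helper usable for mappings where a parent declares several child types.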
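3.2 Defining parent documents with the join type
The simplified form below is easier to follow.
It indexes two parent documents.
Both documents use the parent relation "question".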
PUT my_join_index/_doc/1?refresh
{
  "text": "This is a question",
  "my_join_field": "question"
}

PUT my_join_index/_doc/2?refresh
{
  "text": "This is another question",
  "my_join_field": "question"
}
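3.3 Defining child documents with the join type
- The routing value is mandatory, because parent and child documents must be indexed on the same shard.
- "answer" is the join name of this child document.
- The parent document ID of this child document is 1.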
PUT my_join_index/_doc/3?routing=1&refresh
{
  "text": "This is an answer",
  "my_join_field": {
    "name": "answer",
    "parent": "1"
  }
}

PUT my_join_index/_doc/4?routing=1&refresh
{
  "text": "This is another answer",
  "my_join_field": {
    "name": "answer",
    "parent": "1"
  }
}
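The difference between the parent and child requests comes down to the routing parameter and the shape of the join field. The sketch below assembles both kinds of request as plain Python values so the two shapes can be compared side by side; `parent_request` and `child_request` are hypothetical helpers, and nothing is sent to a cluster.

```python
def parent_request(index, doc_id, text, join_field, relation):
    """Return (path, body) for indexing a parent document."""
    path = f"{index}/_doc/{doc_id}?refresh"
    body = {"text": text, join_field: relation}
    return path, body


def child_request(index, doc_id, text, join_field, relation, parent_id):
    """Return (path, body) for indexing a child document.

    The routing value is set to the parent id so that the child is
    indexed on the same shard as its parent.
    """
    path = f"{index}/_doc/{doc_id}?routing={parent_id}&refresh"
    body = {"text": text, join_field: {"name": relation, "parent": parent_id}}
    return path, body


parent_path, parent_body = parent_request(
    "my_join_index", "1", "This is a question", "my_join_field", "question")
child_path, child_body = child_request(
    "my_join_index", "3", "This is an answer", "my_join_field", "answer", "1")

print("PUT", parent_path, parent_body)
print("PUT", child_path, child_body)
```

Deriving the routing value from the parent id inside the helper makes it harder to index a child with a mismatched routing value by accident.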
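4. Constraints of the ES 6.x join type
- Only one join field mapping is allowed per index.
- Parent and child documents must be indexed on the same shard; this means the same routing value must be supplied when getting, deleting, or updating a child document.
- A document can have multiple children but only one parent.
- A new relation can be added to an existing join field.
- A child can be added to an existing element, but only if that element is already a parent.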
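The same-shard constraint follows from how a routing value is turned into a shard number: the routing value is hashed and taken modulo the number of primary shards. The sketch below imitates that with `zlib.crc32` as a stand-in hash. This is an assumption made for illustration only: Elasticsearch uses its own hash function, so the shard numbers printed here will not match a real cluster. Only the property "same routing value, same shard" carries over. The default of 5 primary shards mirrors the `_shards.total` value in the responses shown in this article.

```python
import zlib


def shard_for(routing, num_primary_shards=5):
    """Map a routing value to a shard number.

    Stand-in hash (crc32), NOT the hash Elasticsearch uses; the result
    only demonstrates that equal routing values give equal shards.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


parent_shard = shard_for("1")  # parent document, routed by its own id "1"
child_shard = shard_for("1")   # child document indexed with routing=1
other_shard = shard_for("3")   # a different routing value may map elsewhere

print(parent_shard == child_shard)
print(parent_shard, other_shard)
```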
5. Searching and aggregating with the ES 6.x join type
5.1 Searching all documents
GET my_join_index/_search
{
  "query": {
    "match_all": {}
  },
  "sort": ["_id"]
}
The response is as follows:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": null,
    "hits": [
      {
        "_index": "my_join_index",
        "_type": "_doc",
        "_id": "1",
        "_score": null,
        "_source": {
          "text": "This is a question",
          "my_join_field": "question"
        },
        "sort": ["1"]
      },
      {
        "_index": "my_join_index",
        "_type": "_doc",
        "_id": "2",
        "_score": null,
        "_source": {
          "text": "This is another question",
          "my_join_field": "question"
        },
        "sort": ["2"]
      },
      {
        "_index": "my_join_index",
        "_type": "_doc",
        "_id": "3",
        "_score": null,
        "_routing": "1",
        "_source": {
          "text": "This is an answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        },
        "sort": ["3"]
      },
      {
        "_index": "my_join_index",
        "_type": "_doc",
        "_id": "4",
        "_score": null,
        "_routing": "1",
        "_source": {
          "text": "This is another answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        },
        "sort": ["4"]
      }
    ]
  }
}
5.2 Finding child documents by parent document
GET my_join_index/_search
{
  "query": {
    "has_parent": {
      "parent_type": "question",
      "query": {
        "match": {
          "text": "This is"
        }
      }
    }
  }
}
Response:
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_join_index",
        "_type": "_doc",
        "_id": "3",
        "_score": 1,
        "_routing": "1",
        "_source": {
          "text": "This is an answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        }
      },
      {
        "_index": "my_join_index",
        "_type": "_doc",
        "_id": "4",
        "_score": 1,
        "_routing": "1",
        "_source": {
          "text": "This is another answer",
          "my_join_field": {
            "name": "answer",
            "parent": "1"
          }
        }
      }
    ]
  }
}
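To make the has_parent semantics concrete, here is a small in-memory model of the four documents indexed earlier. It is only a sketch: `matches` is a deliberately simplified stand-in for the match query (any shared lowercase token counts as a hit, with no analysis or scoring), and `has_parent` is a hypothetical helper, not an Elasticsearch API.

```python
DOCS = {
    "1": {"text": "This is a question", "join": {"name": "question"}},
    "2": {"text": "This is another question", "join": {"name": "question"}},
    "3": {"text": "This is an answer", "join": {"name": "answer", "parent": "1"}},
    "4": {"text": "This is another answer", "join": {"name": "answer", "parent": "1"}},
}


def matches(text, query):
    """Simplified `match`: true if any query token appears in the text."""
    return bool(set(text.lower().split()) & set(query.lower().split()))


def has_parent(docs, parent_type, query):
    """Return ids of children whose parent has `parent_type` and matches `query`."""
    hits = []
    for doc_id, doc in sorted(docs.items()):
        parent_id = doc["join"].get("parent")
        if parent_id is None:
            continue  # documents without a parent are never returned
        parent = docs[parent_id]
        if parent["join"]["name"] == parent_type and matches(parent["text"], query):
            hits.append(doc_id)
    return hits


print(has_parent(DOCS, "question", "This is"))
```

Document 2 also matches the inner query, but it has no children, so only the children of document 1 are returned, in line with the response above.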
5.3 Finding parent documents by child document
GET my_join_index/_search
{
  "query": {
    "has_child": {
      "type": "answer",
      "query": {
        "match": {
          "text": "This is question"
        }
      }
    }
  }
}
Response:
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_join_index",
        "_type": "_doc",
        "_id": "1",
        "_score": 1,
        "_source": {
          "text": "This is a question",
          "my_join_field": "question"
        }
      }
    ]
  }
}
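The has_child direction can be modelled the same way: the inner query runs against child documents, and the parents of the matching children are returned. The sketch below is self-contained and again uses a simplified token-overlap `matches` function in place of the real match query; `has_child` is a hypothetical helper.

```python
DOCS = {
    "1": {"text": "This is a question", "join": {"name": "question"}},
    "2": {"text": "This is another question", "join": {"name": "question"}},
    "3": {"text": "This is an answer", "join": {"name": "answer", "parent": "1"}},
    "4": {"text": "This is another answer", "join": {"name": "answer", "parent": "1"}},
}


def matches(text, query):
    """Simplified `match`: true if any query token appears in the text."""
    return bool(set(text.lower().split()) & set(query.lower().split()))


def has_child(docs, child_type, query):
    """Return ids of parents that have at least one matching child of `child_type`."""
    parents = set()
    for doc in docs.values():
        join = doc["join"]
        if join["name"] == child_type and "parent" in join and matches(doc["text"], query):
            parents.add(join["parent"])
    return sorted(parents)


print(has_child(DOCS, "answer", "This is question"))
```

Both answers match on the tokens "this" and "is", and both point at document 1, so a single parent comes back even though two children matched.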
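5.4 Join aggregations in practice
The request below does the following:
- 1) parent_id is a dedicated query; here it retrieves the child documents of type answer that belong to the parent document with id=1.
- 2) The terms aggregation groups on the parent id field of the parent type question (my_join_field#question).
- 3) The script field reads that same parent id field.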
GET my_join_index/_search
{
  "query": {
    "parent_id": {
      "type": "answer",
      "id": "1"
    }
  },
  "aggs": {
    "parents": {
      "terms": {
        "field": "my_join_field#question",
        "size": 10
      }
    }
  },
  "script_fields": {
    "parent": {
      "script": {
        "source": "doc['my_join_field#question']"
      }
    }
  }
}
Response:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.13353139,
    "hits": [
      {
        "_index": "my_join_index",
        "_type": "_doc",
        "_id": "3",
        "_score": 0.13353139,
        "_routing": "1",
        "fields": {
          "parent": ["1"]
        }
      },
      {
        "_index": "my_join_index",
        "_type": "_doc",
        "_id": "4",
        "_score": 0.13353139,
        "_routing": "1",
        "fields": {
          "parent": ["1"]
        }
      }
    ]
  },
  "aggregations": {
    "parents": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "1",
          "doc_count": 2
        }
      ]
    }
  }
}
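The shape of this response can be reproduced with a small in-memory model: first select the children of a given parent, then count the selected documents per parent id. This is a sketch of the logic only; `parent_id_query` and `terms_on_parent` are hypothetical helpers, and scoring and the script field are left out.

```python
from collections import Counter

DOCS = {
    "1": {"text": "This is a question", "join": {"name": "question"}},
    "2": {"text": "This is another question", "join": {"name": "question"}},
    "3": {"text": "This is an answer", "join": {"name": "answer", "parent": "1"}},
    "4": {"text": "This is another answer", "join": {"name": "answer", "parent": "1"}},
}


def parent_id_query(docs, child_type, parent_id):
    """Return ids of `child_type` documents whose parent is `parent_id`."""
    return [
        doc_id
        for doc_id, doc in sorted(docs.items())
        if doc["join"]["name"] == child_type
        and doc["join"].get("parent") == parent_id
    ]


def terms_on_parent(docs, hit_ids, size=10):
    """Bucket the given child hits by parent id, largest bucket first."""
    counts = Counter(docs[doc_id]["join"]["parent"] for doc_id in hit_ids)
    return [{"key": key, "doc_count": n} for key, n in counts.most_common(size)]


hits = parent_id_query(DOCS, "answer", "1")
print(hits)
print(terms_on_parent(DOCS, hits))
```

Because the query already restricts the hits to the children of parent 1, the aggregation can only produce a single bucket, which is what the response above shows.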
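6. One-to-many relations with the ES 6.x join type
6.1 One-to-many definition
The mapping below defines one parent type, question, with multiple child types, answer and comment.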
PUT join_ext_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "my_join_field": {
          "type": "join",
          "relations": {
            "question": ["answer", "comment"]
          }
        }
      }
    }
  }
}
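6.2 Multi-level (one-to-many-to-many) definition
The mapping below defines the three-generation relation shown in the following diagram.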
   question
    /    \
   /      \
comment  answer
           |
           |
          vote
PUT join_multi_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "my_join_field": {
          "type": "join",
          "relations": {
            "question": ["answer", "comment"],
            "answer": "vote"
          }
        }
      }
    }
  }
}
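One way to sanity-check a multi-level relations definition is to invert it into a child-to-parent map and walk upwards. The sketch below does that for the relations used in this section; `parent_of` and `ancestors` are hypothetical helpers, and the check that each relation name has a single parent mirrors the "only one parent" constraint from section 4.

```python
def parent_of(relations):
    """Invert a join `relations` mapping into a child -> parent dict."""
    parents = {}
    for parent, children in relations.items():
        for child in [children] if isinstance(children, str) else children:
            if child in parents:
                raise ValueError(
                    f"{child!r} already has parent {parents[child]!r}")
            parents[child] = parent
    return parents


def ancestors(relations, name):
    """Return the ancestors of relation `name`, nearest first."""
    parents = parent_of(relations)
    chain = []
    while name in parents:
        name = parents[name]
        chain.append(name)
    return chain


RELATIONS = {"question": ["answer", "comment"], "answer": "vote"}
print(parent_of(RELATIONS))
print(ancestors(RELATIONS, "vote"))
```

For this definition the chain for vote is answer followed by question, matching the diagram: vote hangs under answer, which hangs under question.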
Indexing a grandchild document looks like this:
PUT join_multi_index/_doc/3?routing=1&refresh
{
  "text": "This is a vote",
  "my_join_field": {
    "name": "vote",
    "parent": "2"
  }
}
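Notes:
- A grandchild document must be on the same shard as its parent and grandparent.
- The parent id of a grandchild document must point to its parent, an answer document.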
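Put together, the two notes mean that the routing value of a grandchild is the id of the document at the top of its chain, while its parent field points one level up. The sketch below walks the parent chain to find that root id. The document set is an assumption made for illustration: only the vote with id 3 and parent 2 appears in this article, and the question with id 1 and the answer with id 2 are hypothetical. `routing_for` is likewise a hypothetical helper.

```python
def routing_for(docs, doc_id):
    """Walk up the parent chain and return the id of the root ancestor."""
    current = doc_id
    while "parent" in docs[current]:
        current = docs[current]["parent"]
    return current


# Assumed documents: ids 1 and 2 are hypothetical; only the vote (id 3,
# parent 2, routing 1) is shown in the request above.
DOCS = {
    "1": {"name": "question"},
    "2": {"name": "answer", "parent": "1"},
    "3": {"name": "vote", "parent": "2"},
}

print(routing_for(DOCS, "3"))
print(routing_for(DOCS, "2"))
```

Under these assumptions both the answer and the vote resolve to routing value 1, which agrees with `routing=1` in the request above even though the vote's parent field is 2.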
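7. Summary
The official ES documentation is already quite detailed; see:
http://t.cn/RnBBLgp
Even so, typing the examples out and translating them by hand really does refresh and deepen one's understanding.
Let's dig into the ELK Stack together!
Written at home, 2018-03-31 23:18
Author: 铭毅天下
Please credit the source when reposting. Original article:
https://blog.csdn.net/laoyang360/article/details/79774481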