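1. Introduction
Elasticsearch is not good at handling the kind of relationships found in relational databases, such as an article table blog and a comment table comment linked by blog_id. In Elasticsearch this can be worked around with either nested objects or a parent-child relationship.
The parent-child approach is cumbersome to use and has lower query performance, so nested objects are the recommended choice. The pros and cons of the two approaches are compared below.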
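| | Nested Object | Parent Child |
|---|---|---|
| Pros | Documents are stored together, so read performance is high | Parent and child documents can be updated independently without affecting each other |
| Cons | Updating the parent or any child requires updating the whole document | Maintaining the join relation takes up some memory, and read performance is poorer |
| Use case | Child documents are updated occasionally and queried frequently | Child documents are updated frequently |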
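2. Requirements
Building on the requirements from the earlier section 3.6, add a comments field that embeds the following fields.
- Commenter: username
- Comment content: content
- Comment date: date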
3. Nested Object
(1). Create the index
PUT /blog_comment_index
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  }
}
PUT /blog_comment_index/_mapping
{
  "properties": {
    "title": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 100
        }
      }
    },
    "author": {
      "type": "keyword",
      "ignore_above": 100
    },
    "comments": {
      "type": "nested",
      "properties": {
        "username": {
          "type": "keyword",
          "ignore_above": 100
        },
        "content": {
          "type": "text"
        },
        "date": {
          "type": "date"
        }
      }
    }
  }
}
(2). Insert a document
POST /blog_comment_index/_doc
{
  "title": "hello world",
  "author": "steven",
  "comments": [
    {
      "username": "lee",
      "date": "2020-02-02",
      "content": "awesome article!"
    },
    {
      "username": "fax",
      "date": "2020-02-03",
      "content": "thanks!"
    }
  ]
}
(3). Query
POST /blog_comment_index/_search
{
  "query": {
    "nested": {
      "path": "comments",
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "comments.username": "lee"
              }
            },
            {
              "match": {
                "comments.content": "thanks"
              }
            }
          ]
        }
      }
    }
  }
}
Response:
{
  "took" : 10,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
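A query against a nested field is wrapped in a `nested` clause that names the path, so that all conditions are evaluated within a single nested object. As a minimal sketch, the hypothetical Python helper `build_nested_comment_query` below builds such a request body for any commenter and keyword. It only constructs and prints the JSON; it does not contact a cluster, and the helper name and its parameters are assumptions introduced for illustration.

```python
import json


def build_nested_comment_query(username, keyword, path="comments"):
    """Return a search body that requires both conditions to match
    within the same nested comment object (hypothetical helper)."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "bool": {
                        "must": [
                            {"match": {f"{path}.username": username}},
                            {"match": {f"{path}.content": keyword}},
                        ]
                    }
                },
            }
        }
    }


# Build the body for a commenter/keyword pair and show it as JSON.
body = build_nested_comment_query("fax", "thanks")
print(json.dumps(body, indent=2))
```

With the sample document above, the body built for fax and thanks should match, because both values occur in the same comment, whereas lee and thanks come from different comments.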
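(4). Result analysis
The query returns no hits, which matches the expected result for these conditions: no single comment was written by lee and also contains thanks. The reason is that comments was mapped as nested when the index was created, so Elasticsearch stores each comment as a separate hidden document next to the parent, conceptually structured as follows.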
{
  "title": "hello world",
  "author": "steven"
}
{
  "comments.username": "lee",
  "comments.date": "2020-02-02",
  "comments.content": "awesome article!"
}
{
  "comments.username": "fax",
  "comments.date": "2020-02-03",
  "comments.content": "thanks!"
}
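To make the difference concrete, here is a minimal Python sketch (not Elasticsearch code) that simulates two storage models on the sample document: the default object mapping, where inner fields are merged into multi-valued fields on the root document, and the nested mapping shown above. The helpers `tokens`, `flatten_as_object`, `matches_object`, and `matches_nested` are hypothetical names introduced for illustration, and matching is simplified to lowercase word tokens rather than real text analysis.

```python
import re

doc = {
    "title": "hello world",
    "author": "steven",
    "comments": [
        {"username": "lee", "date": "2020-02-02", "content": "awesome article!"},
        {"username": "fax", "date": "2020-02-03", "content": "thanks!"},
    ],
}


def tokens(text):
    """Rough stand-in for an analyzer: the set of lowercase word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def flatten_as_object(document):
    """Simulate the default object mapping: inner fields are merged into
    multi-valued fields, so per-comment boundaries are lost."""
    flat = {}
    for comment in document["comments"]:
        for field, value in comment.items():
            flat.setdefault(f"comments.{field}", []).append(value)
    return flat


def matches_object(document, username, word):
    """Each condition is checked against the merged arrays independently."""
    flat = flatten_as_object(document)
    has_user = username in flat["comments.username"]
    has_word = any(word in tokens(c) for c in flat["comments.content"])
    return has_user and has_word


def matches_nested(document, username, word):
    """Simulate the nested mapping: both conditions must hold in one comment."""
    return any(
        c["username"] == username and word in tokens(c["content"])
        for c in document["comments"]
    )


print(matches_object(doc, "lee", "thanks"))  # True: values come from different comments
print(matches_nested(doc, "lee", "thanks"))  # False: no single comment satisfies both
print(matches_nested(doc, "fax", "thanks"))  # True: fax wrote "thanks!"
```

In this simplified model, the flattened form wrongly reports a match for lee and thanks, while the per-comment form does not, which is the behavior the nested mapping is meant to provide.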