Elasticsearch 2.2: the not_analyzed value of the index mapping parameter

zhifeng687

Category: ES & lucene

Article link: https://blog.csdn.net/qq_26222859/article/details/52806698

Official documentation: index

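The index parameter
This parameter controls how a field is indexed and therefore how it can be searched. It accepts three values:
· no: the field is not added to the index, so it cannot be queried.
· not_analyzed: the raw field value is added to the index unchanged, as a single term. This is the default for every field type that supports the option, except string.
· analyzed: the default for string fields. The value is analyzed first, and the resulting terms are stored in the index.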

Original text (from the official documentation)
index
The index option controls how field values are indexed and, thus, how they are searchable. It accepts three values:

no

Do not add this field value to the index. With this setting, the field will not be queryable.

not_analyzed

Add the field value to the index unchanged, as a single term. This is the default for all fields that support this option except for string fields. not_analyzed fields are usually used with term-level queries for structured search.

analyzed

This option applies only to string fields, for which it is the default. The string field value is first analyzed to convert the string into terms (e.g. a list of individual words), which are then indexed. At search time, the query string is passed through (usually) the same analyzer to generate terms in the same format as those in the index. It is this process that enables full text search.

For example, you can create a not_analyzed string field with the following:

PUT /my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "status_code": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
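
To put the three values side by side, here is a minimal sketch of a mapping that uses each of them once. The index name my_index_demo and the field names title and internal_note are hypothetical and chosen only for illustration; status_code is taken from the example above.

```json
PUT /my_index_demo
{
  "mappings": {
    "my_type": {
      "properties": {
        "title": {
          "type": "string",
          "index": "analyzed"
        },
        "status_code": {
          "type": "string",
          "index": "not_analyzed"
        },
        "internal_note": {
          "type": "string",
          "index": "no"
        }
      }
    }
  }
}
```

Under this sketch, title would be searched with full-text queries such as match, status_code with exact-value queries such as term, and internal_note would still be returned in _source but could not be queried.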

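Notes on a not_analyzed pitfall

If no static mapping is created before an ES index receives data, ES generates a dynamic mapping when documents are inserted, for example: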


"keywords": {
  "type": "string"
},
"vtype": {
  "type": "long"
},
"source": {
  "type": "string"
}

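That is, when ES detects that keywords holds string values, it sets "type": "string" and leaves "index" at its default of "analyzed". Sorting and aggregations still work on such a field, but in ES 2.x they run on fielddata built from the analyzed terms, because doc_values are not available for analyzed string fields. The effective mapping is therefore:

"keywords": {
  "type": "string",
  "index": "analyzed"
}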

Even though the documents stored in ES look like this:

 "keywords": [
   "充电"
 ],
.....

 "keywords": [
   "九阳",
   "阳姐"
 ],

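a terms aggregation on the keywords field aggregates over the indexed terms rather than the original values. Because the default standard analyzer emits one term per Chinese character, the buckets come back split into single characters: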

{
  "key": "电",
  "doc_count": 31098
},
{
  "key": "小",
  "doc_count": 26901
},
{
  "key": "阳",
  "doc_count": 26265
},
........
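
For reference, buckets like the ones above would come from a terms aggregation of roughly the following shape. This is a sketch, not the author's actual request: the index name my_index and the aggregation name keywords_terms are assumptions.

```json
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "keywords_terms": {
      "terms": {
        "field": "keywords"
      }
    }
  }
}
```

Setting "size": 0 suppresses the search hits so that only the aggregation buckets are returned.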

Likewise, a term query on the keywords field is matched against the indexed terms, so the following query also finds documents whose keywords value is "充电":

{
  "query": {
    "term": {
      "keywords": {
        "value": "电"
      }
    }
  }
}
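
One way to see why the single character matches is to run the text through the analyzer directly. The sketch below assumes the field uses the default standard analyzer, which the source does not state explicitly.

```json
GET /_analyze
{
  "analyzer": "standard",
  "text": "充电"
}
```

With the standard analyzer, the response should list two separate tokens, 充 and 电. Those are the terms stored in the index for an analyzed field, which is why a term query for 电 finds the document.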

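It is not realistic to define a static mapping for every field when the index is created, because new fields get added as the business grows. ES dynamic templates let us predefine the mapping that newly added fields receive: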

"dynamic_templates": [
  {
    "template_1": {
      "mapping": {
        "index": "not_analyzed",
        "type": "string",
        "doc_values": true
      },
      "match": "*",
      "match_mapping_type": "string"
    }
  }
],
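
The fragment above lives inside a type mapping. As a minimal sketch of where it goes, a complete index-creation request might look like the following; the index name my_index and the type name my_type are borrowed from the official example and are assumptions about the author's setup.

```json
PUT /my_index
{
  "mappings": {
    "my_type": {
      "dynamic_templates": [
        {
          "template_1": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "not_analyzed",
              "doc_values": true
            }
          }
        }
      ]
    }
  }
}
```

Because "match": "*" applies to every field name, this maps all dynamically detected string fields as not_analyzed. A field that needs full-text search would then have to be mapped explicitly under "properties".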

With this template in place, a newly added keywords field is mapped as:

          "keywords": {
            "type": "string",
            "index": "not_analyzed",
            "doc_values": true
          },
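
With keywords mapped as not_analyzed, a term query has to supply the whole stored value. The request below is a sketch that assumes an index named my_index created with this mapping before the data was indexed; the mapping of an existing field cannot be switched from analyzed to not_analyzed in place, so existing data would need to be reindexed.

```json
GET /my_index/_search
{
  "query": {
    "term": {
      "keywords": {
        "value": "充电"
      }
    }
  }
}
```

Under that assumption, this query matches documents whose keywords array contains "充电", while the earlier query for the single character "电" would no longer match them. A terms aggregation on the field would likewise return whole values such as "充电" and "九阳" as bucket keys.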

References: "About the Elasticsearch not_analyzed attribute: a pitfall"

"Elasticsearch: not indexing fields that are never queried"
