ElasticSearch6.x 之映射参数

本文转载至https://blog.csdn.net/chengyuqiang/article/details/79059958

映射参数概述:

       ElasticSearch提供了丰富的映射参数。

       官网地址:https://www.elastic.co/guide/en/elasticsearch/reference/6.3/mapping-params.html

1、analyzer

指定分词器,对应索引和查询都有效。如下:IK分词配置

第一步:定义索引并定义映射

第二步:向指定索引填充数据

Post http://192.168.1.74:9200/my_index/it/1
{
   "name" : "中国是个幸福美丽的国度"
}

Post http://192.168.1.74:9200/my_index/it/2
{
   "name" : "广东省的省会城市是羊城(广州)"
}

Post http://192.168.1.74:9200/my_index/it/3
{
   "name" : "中国最南的岛屿叫曾母暗沙"
}

第三步:查询

2、normalizer

elasticsearch标准化配置属性,如下:所有字符转换为小写。

第一步:定义索引,并设置elasticsearch 分析标准。

参数内容:

{
"settings": {
    "analysis": {
      "normalizer":{
        "my_normalizer":{
          "type":"custom",
          "char_filter" : [],
          "filter" : ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "it" : {
      "properties" : {
        "name" : {
          "type": "keyword",
          "normalizer": "my_normalizer"
        }
      }
    }
  }
}

第二步:插入数据:

Post http://192.168.1.74:9200/my_index/it/1
{
   "name" : "bar"
}

Post http://192.168.1.74:9200/my_index/it/2
{
   "name" : "BAR"
}

Post http://192.168.1.74:9200/my_index/it/3
{
   "name" : "BaR"
}

第三步:查询数据:

3、boost

通过指定一个boost值来控制每个查询子句的相对权重,该值默认为1。一个大于1的boost会增加该查询子句的相对权重。

第一步:定义索引,设置权重比值

第二步:填充数据

Post http://192.168.1.74:9200/my_index/it/1
{
   "title" : "client me"
}

Post http://192.168.1.74:9200/my_index/it/2
{
   "title" : "follow"
}

第三步:查询数据

4、coerce

coerce属性用于清除脏数据,coerce的默认值是true。整型数字5有可能会被写成字符串“5”或者浮点数5.0.coerce属性可以用来清除脏数据:

  1. 字符串会被强制转换为整数
  2. 浮点数被强制转换为整数 

【例子】 
(1)重新创建my_index

DELETE my_index

PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "number_one": {
          "type": "integer"
        },
        "number_two": {
          "type": "integer",
          "coerce": false
        }
      }
    }
  }
}

(2)写入一条测试文档

PUT my_index/my_type/1
{
  "number_one": "10" 
}
{
  "_index": "my_index",
  "_type": "my_type",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}

(3)写入另一条测试文档

PUT my_index/my_type/2
{
  "number_two": "10" 
}
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse [number_two]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "failed to parse [number_two]",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Integer value passed as String"
    }
  },
  "status": 400
}

5、copy-to

copy_to属性用于配置自定义的_all字段。换言之,就是多个字段可以合并成一个超级字段。比如,first_name和last_name可以合并为full_name字段。 
【例子】 
(1)

DELETE my_index

PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "first_name": {
          "type": "text",
          "copy_to": "full_name" 
        },
        "last_name": {
          "type": "text",
          "copy_to": "full_name" 
        },
        "full_name": {
          "type": "text"
        }
      }
    }
  }
}

PUT my_index/my_type/1
{
  "first_name": "John",
  "last_name": "Smith"
}

(2)查询

GET my_index/_search
{
  "query": {
    "match": {
      "full_name": { 
        "query": "John Smith",
        "operator": "and"
      }
    }
  }
}
{
  "took": 22,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "1",
        "_score": 0.5753642,
        "_source": {
          "first_name": "John",
          "last_name": "Smith"
        }
      }
    ]
  }
}

6、doc_values

doc_values是为了加快排序、聚合操作,在建立倒排索引的时候,额外增加一个列式存储映射,是一个空间换时间的做法。默认是开启的,对于确定不需要聚合或者排序的字段可以关闭。

PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "status_code": { 
          "type":       "keyword"
        },
        "session_id": { 
          "type":       "keyword",
          "doc_values": false
        }
      }
    }
  }
}

7、dynamic

dynamic属性用于检测新发现的字段,有三个取值:

  • true:新发现的字段添加到映射中。(默认)
  • flase:新检测的字段被忽略。必须显式添加新字段。
  • strict:如果检测到新字段,就会引发异常并拒绝文档

【例子】 
(1)新建索引 
取值为strict,非布尔值要加引号

DELETE my_index

PUT my_index
{
  "mappings": {
    "my_type": {
      "dynamic": "strict", 
      "properties": {
        "title": { "type": "text"}
      }
    }
  }
}

(2)插入新文档

PUT my_index/my_type/1
{
  "title": "test",
  "content": "test dynamic"
} 

抛出异常

{
  "error": {
    "root_cause": [
      {
        "type": "strict_dynamic_mapping_exception",
        "reason": "mapping set to strict, dynamic introduction of [content] within [my_type] is not allowed"
      }
    ],
    "type": "strict_dynamic_mapping_exception",
    "reason": "mapping set to strict, dynamic introduction of [content] within [my_type] is not allowed"
  },
  "status": 400
}

 

8、enabled

ELasticseaech默认会索引所有的字段,enabled设为false的字段,es会跳过字段内容,该字段只能从_source中获取,但是不可搜。而且字段可以是任意类型。

【例子】 
(1)新建索引,插入文档

DELETE my_index

PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "name":{"enabled": false}
      } 
    }
  }
}
PUT my_index/my_type/1
{
  "title": "test enabled",
  "name":"chengyuqiang"
}

(2)查看文档

GET my_index/my_type/1
{
  "_index": "my_index",
  "_type": "my_type",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "title": "test enabled",
    "name": "chengyuqiang"
  }
}

(3)搜索字段

GET my_index/_search
{
  "query": {
    "match": {
      "name": "chengyuqiang"
    }
  }
}
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

9、ignore_above

ignore_above用于指定字段索引和存储的长度最大值,超过最大值的会被忽略

DELETE my_index

PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "message": {
          "type": "keyword",
          "ignore_above": 20 
        }
      }
    }
  }
}

PUT my_index/my_type/1 
{
  "message": "Syntax error"
}

PUT my_index/my_type/2 
{
  "message": "Syntax error with some long stacktrace"
}
GET my_index/_search 
{
  "size":0,
  "aggs": {
    "messages": {
      "terms": {
        "field": "message"
      }
    }
  }
}

mapping中指定了ignore_above字段的最大长度为20,第一个文档的字段长小于20,因此索引成功,第二个超过20,因此不索引,返回结果只有”Syntax error”,结果如下

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "messages": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "Syntax error",
          "doc_count": 1
        }
      ]
    }
  }
}

10、ignore_malformed

ignore_malformed可以忽略不规则数据。对于账号userid字段,有人可能填写的是 整数类型,也有人填写的是邮件格式。给一个字段索引不合适的数据类型发生异常,导致整个文档索引失败。如果ignore_malformed参数设为true,异常会被忽略,出异常的字段不会被索引,其它字段正常索引。
 

DELETE my_index

PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "number_one": {
          "type": "integer",
          "ignore_malformed": true
        },
        "number_two": {
          "type": "integer"
        }
      }
    }
  }
}

PUT my_index/my_type/1
{
  "text":       "Some text value",
  "number_one": "foo" 
}

PUT my_index/my_type/2
{
  "text":       "Some text value",
  "number_two": "foo" 
}

上面的例子中number_one接受integer类型,ignore_malformed属性设为true,因此文档一种number_one字段虽然是字符串但依然能写入成功;number_two接受integer类型,默认ignore_malformed属性为false,因此写入失败。
 

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse [number_two]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "failed to parse [number_two]",
    "caused_by": {
      "type": "number_format_exception",
      "reason": "For input string: \"foo\""
    }
  },
  "status": 400
}

11、index_options

index_options参数控制将哪些信息添加到倒排索引,用于搜索和突出显示目的。

注意:The index_options parameter has been deprecated for Numeric fields in 6.0.0。6.0.0中的数字字段已弃用index_options参数。

DELETE my_index

PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "text": {
          "type": "text",
          "index_options": "offsets"
        }
      }
    }
  }
}

PUT my_index/my_type/1
{
  "text": "Quick brown fox"
}

GET my_index/_search
{
  "query": {
    "match": {
      "text": "brown fox"
    }
  },
  "highlight": {
    "fields": {
      "text": {} 
    }
  }
}
{
  "took": 50,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "1",
        "_score": 0.5753642,
        "_source": {
          "text": "Quick brown fox"
        },
        "highlight": {
          "text": [
            "Quick <em>brown</em> <em>fox</em>"
          ]
        }
      }
    ]
  }
}

12、 index

index属性指定字段是否索引,不索引也就不可搜索,取值可以为true或者false。

13、fields

fields可以让同一文本有多种不同的索引方式,比如一个String类型的字段,可以使用text类型做全文检索,使用keyword类型做聚合和排序。

DELETE my_index

PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "city": {
          "type": "text",
          "fields": {
            "raw": { 
              "type":  "keyword"
            }
          }
        }
      }
    }
  }
}

PUT my_index/my_type/1
{
  "city": "New York"
}

PUT my_index/my_type/2
{
  "city": "York"
}

GET my_index/_search
{
  "query": {
    "match": {
      "city": "york" 
    }
  },
  "sort": {
    "city.raw": "asc" 
  },
  "aggs": {
    "Cities": {
      "terms": {
        "field": "city.raw" 
      }
    }
  }
}

{
  "took": 31,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": null,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "1",
        "_score": null,
        "_source": {
          "city": "New York"
        },
        "sort": [
          "New York"
        ]
      },
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "2",
        "_score": null,
        "_source": {
          "city": "York"
        },
        "sort": [
          "York"
        ]
      }
    ]
  },
  "aggregations": {
    "Cities": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "New York",
          "doc_count": 1
        },
        {
          "key": "York",
          "doc_count": 1
        }
      ]
    }
  }
}

说明:The city.raw field is a keyword version of the city field (.city.raw字段是城市字段的关键字版本。) 
The city field can be used for full text search.( city字段可用于全文搜索。) 
The city.raw field can be used for sorting and aggregations.( city.raw字段可用于排序和聚合)
 

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值