ES8 - Mapping meta-fields

weixin_33694172

Tags: big data, python, json

Original post: https://my.oschina.net/u/3100849/blog/1843034

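1. Meta-field overview

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-fields.html#_document_source_meta_fields

Mapping meta-fields are the fields in a mapping that describe the document itself rather than its content. They fall roughly into five groups: identity meta-fields (such as _index, _type, _id), document source meta-fields (such as _source), indexing meta-fields (such as _field_names), routing meta-fields (_routing), and custom metadata.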

2. The main fields explained

_index

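When searching across several indices, you sometimes want part of the query to apply only to specific index names. The _index field makes this possible: you can run term and terms queries on the index name, aggregate on it, read it in scripts, and sort by it.

_index is a virtual field; it is not actually added to the Lucene index. It supports term and terms queries (as well as queries that are rewritten to term queries, such as match, query_string, and simple_query_string), but it does not support prefix, wildcard, regexp, or fuzzy queries.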
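
As a sketch of what such a request can look like, the Python snippet below builds a search body that filters, aggregates, and sorts on _index. The function name `index_query` and the index names `index_1` and `index_2` are hypothetical; sending the body to a cluster needs an HTTP or Elasticsearch client, which is left out here.

```python
import json


def index_query(index_names, agg_size=10):
    """Build a search body that uses _index in a query, an aggregation, and a sort."""
    return {
        # Restrict matches to documents living in the listed indices.
        "query": {"terms": {"_index": list(index_names)}},
        # Count matching documents per index.
        "aggs": {"indices": {"terms": {"field": "_index", "size": agg_size}}},
        # Order hits by the name of the index they come from.
        "sort": [{"_index": {"order": "asc"}}],
    }


body = index_query(["index_1", "index_2"])
print(json.dumps(body, indent=2))
```

The body would be sent to a multi-index search endpoint such as `GET index_1,index_2/_search`.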

_type

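Deprecated in 6.0.0. This is the name of the document's mapping type. It is indexed automatically and can be used in queries, aggregations, sorting, and scripts.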

_id

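The document's ID, supplied when the document is indexed. Before 6.0 the _id field was not indexed on its own: its value was derived from _uid, so it could be used in queries but not in aggregations or sorting, where _uid had to be used instead. From 6.0 on, _id is indexed and can be queried directly with term, terms, match, query_string, and simple_query_string queries.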

PUT my_index

PUT my_index/my_type/1
{
  "text":"this is a doc"
}

PUT my_index/my_type/2
{
  "text": "Document with ID 2"
}

GET my_index/_search
{
  "query": {
    "terms": {
      "_id": ["1","2"]
    }
  }
}

The requests above create the index, add two documents, and look them up by _id. The response:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "2",
        "_score": 1,
        "_source": {
          "text": "Document with ID 2"
        }
      },
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "1",
        "_score": 1,
        "_source": {
          "text": "this is a doc"
        }
      }
    ]
  }
}

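Versions before 6.0 behaved differently: because they supported multiple types per index, _type and _id were combined into a composite primary key named _uid.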

_uid

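Deprecated in 6.0.0. Now that types have been removed, documents are uniquely identified by their _id, and the _uid field is kept only as a view over the _id field for backward compatibility.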
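To make the pre-6.0 composite key concrete, here is a minimal Python sketch of how a _uid value of the form `type#id` can be assembled and taken apart. The helper names `make_uid` and `split_uid` are hypothetical, and the sketch only mirrors the string format; it is not how Elasticsearch implements the field internally.

```python
def make_uid(doc_type, doc_id):
    """Join a mapping type and a document ID into a pre-6.0 style _uid string."""
    return f"{doc_type}#{doc_id}"


def split_uid(uid):
    """Split a _uid string back into (type, id).

    Assumption: the type name contains no '#', so the first '#' is the
    separator; anything after it belongs to the ID.
    """
    doc_type, _, doc_id = uid.partition("#")
    return doc_type, doc_id


uid = make_uid("my_type", "1")
print(uid)
print(split_uid(uid))
```

With a single type per index, the type part carries no extra information, which is why _id alone is enough to identify a document from 6.0 on.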

_source

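The _source field contains the original JSON document body that was passed at index time. The _source field itself is not indexed (and is therefore not searchable), but it is stored so that it can be returned by fetch requests such as get or search.
The _source field is enabled by default; in other words, the original document is stored unless you say otherwise.

If a field is very large (a whole novel, for example), or the application only needs to search on it, get back the document ID, and read the original text from some other store, there is no need to keep _source. Disabling _source makes Elasticsearch keep only the inverted index and not the original field values. Note that features which depend on _source, such as the update and reindex APIs, stop working once it is disabled.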

Disabling _source

DELETE my_index

PUT my_index
{
  "mappings": {
    "my_type":{
      "_source": {"enabled": false}
    }
  }
}

PUT my_index/my_type/1
{
  "text":"this is a doc"
}

Fetch the document by ID:

GET my_index/my_type/1

The response contains no _source content:

{
  "_index": "my_index",
  "_type": "my_type",
  "_id": "1",
  "_version": 1,
  "found": true
}

Including or excluding fields from _source

DELETE my_index

PUT my_index
{
  "mappings": {
    "blog": {
      "_source": {
        "includes": [ "title", "url" ],
        "excludes": [ "content" ]
      },
      "properties": {
        "title": {
          "type": "text"
        },
        "content": {
          "type": "text"
        },
        "url": {
          "type": "text"
        }
      }
    }
  }
}

This defines a blog type in my_index with three properties: title, content, and url. The _source setting includes title and url and excludes content.

PUT my_index/blog/1
{
  "title":"百度搜索",
  "content":"搜索查询的内容有哪些",
  "url":"http://www.baidu.com"
}

GET my_index/blog/1

The response shows only the title and url fields:

{
  "_index": "my_index",
  "_type": "blog",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "title": "百度搜索",
    "url": "http://www.baidu.com"
  }
}
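
The effect of includes and excludes on the stored _source can be imitated with a few lines of Python. This is a simplified sketch, not Elasticsearch's implementation: the function name `filter_source` is hypothetical, and it only matches exact top-level field names, whereas the real setting also accepts wildcard patterns and nested paths.

```python
def filter_source(doc, includes=None, excludes=None):
    """Return the part of doc that would be kept in _source.

    Simplification: exact top-level field names only, no wildcards.
    """
    kept = {}
    for name, value in doc.items():
        # When an includes list is given, anything not listed is dropped.
        if includes and name not in includes:
            continue
        # Excluded fields are dropped as well.
        if excludes and name in excludes:
            continue
        kept[name] = value
    return kept


doc = {
    "title": "百度搜索",
    "content": "搜索查询的内容有哪些",
    "url": "http://www.baidu.com",
}
stored = filter_source(doc, includes=["title", "url"], excludes=["content"])
print(stored)
```

The content field is still analyzed and searchable in the index; it is only left out of the stored copy that get and search return.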

_field_names

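The _field_names field indexes the name of every field in a document that contains any value other than null. The exists query uses it to find documents that have, or lack, a non-null value for a given field.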
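
As a rough illustration of which names end up in _field_names, the Python sketch below collects the top-level fields of a document that hold a non-null value and builds the body of an exists query. The helper names are hypothetical, and the null handling is a simplification that treats null, an empty array, and an array containing only nulls as "no value".

```python
def has_value(value):
    """True if the value counts as non-null (simplified rule)."""
    if isinstance(value, list):
        # Empty arrays and arrays holding only nulls count as no value.
        return any(item is not None for item in value)
    return value is not None


def field_names(doc):
    """Names of the top-level fields in doc that hold a non-null value."""
    return sorted(name for name, value in doc.items() if has_value(value))


def exists_query(field):
    """Search body that matches documents with a non-null value in field."""
    return {"query": {"exists": {"field": field}}}


doc = {"title": "this is 3", "body": None, "tags": [None], "url": ""}
print(field_names(doc))
print(exists_query("title"))
```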

_routing

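A document is routed to a particular shard in an index using the following formula, where _routing defaults to the document's _id: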

shard_num = hash(_routing) % num_primary_shards
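
The formula can be tried out with a small Python sketch. Note the assumption: Elasticsearch uses its own Murmur3-based hash, while this sketch substitutes `zlib.crc32` as a stand-in, so the shard numbers it prints will not match a real cluster. The function name `shard_for` is hypothetical.

```python
import zlib


def shard_for(routing, num_primary_shards):
    """Pick a shard as hash(_routing) % num_primary_shards.

    Stand-in hash: crc32 replaces the Murmur3 hash Elasticsearch uses,
    so only the shape of the calculation is illustrated here.
    """
    hashed = zlib.crc32(routing.encode("utf-8"))
    return hashed % num_primary_shards


num_primary_shards = 5
first = shard_for("user1", num_primary_shards)
second = shard_for("user1", num_primary_shards)
print(first, second)
```

The same routing value always lands on the same shard, which is what lets a search restricted to one routing value skip the other shards. It is also why the number of primary shards cannot be changed freely after the index is created: the modulus would change and existing documents would no longer be found where the formula points.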

A custom routing pattern can be implemented by specifying a routing value for each document:

PUT my_index/my_type/3?routing=user1
{
  "title":"this is 3",
  "body":"this is 3 body"
}

GET my_index/my_type/3?routing=user1

Response:

{
  "_index": "my_index",
  "_type": "my_type",
  "_id": "3",
  "_version": 2,
  "_routing": "user1",
  "found": true,
  "_source": {
    "title": "this is 3",
    "body": "this is 3 body"
  }
}

Find all documents routed with "user1":

GET my_index/_search
{
  "query": {
    "term": {
      "_routing": {
        "value": "user1"
      }
    }
  }
}

Response:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "3",
        "_score": 0.2876821,
        "_routing": "user1",
        "_source": {
          "title": "this is 3",
          "body": "this is 3 body"
        }
      }
    ]
  }
}