Using _all, _source, store, and index

This article covers the roles and configuration of the _all, _source, and store fields in Elasticsearch, including how to disable them and how to include or exclude individual fields.


1. _all

1.1 The _all field

The _all field is rarely used directly: it concatenates the values of all other fields into one big space-separated string, which is analyzed and indexed but not stored. It is useful when you do not yet know or care about the structure of your documents. For example, index a document:

curl -XPUT 'http://127.0.0.1:9200/myindex/order/0508' -d '{
    "name": "Scott",
    "age": "24"
}'

Search against the _all field:

curl -XGET "http://127.0.0.1:9200/myindex/order/_search?pretty" -d '{
    "query": {
        "match": {
            "_all": "Scott 24"
        }
    }
}'

You can also use query_string, which searches _all by default:

curl -XGET "http://127.0.0.1:9200/myindex/order/_search?pretty" -d '{
    "query": {
        "query_string": {
            "query": "Scott 24"
        }
    }
}'

Output:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2712221,
    "hits" : [ {
      "_index" : "myindex",
      "_type" : "order",
      "_id" : "0508",
      "_score" : 0.2712221
    } ]
  }
}

Note that _all is just one big analyzed string, so a date value ends up split into separate year, month, and day tokens. For example, given this document:

{
  "first_name":    "John",
  "last_name":     "Smith",
  "date_of_birth": "1970-10-24"
}

curl -XGET "http://127.0.0.1:9200/myindex/order/_search?pretty" -d '{
    "query": {
        "match": {
            "_all": "john smith 1970"
        }
    }
}'

The _all field will contain the terms ["john", "smith", "1970", "10", "24"], so the query above matches.

So the _all field is simply an analyzed string field. It analyzes its values with the default analyzer, regardless of which analyzer is configured on the fields the values originally came from. And, like any other string field, you can configure which analyzer _all should use:

PUT /myindex/order/_mapping
{
    "order": {
        "_all": { "analyzer": "whitespace" }
    }
}
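
For instance (a sketch, assuming the analyzer is changed before any documents are indexed), with the whitespace analyzer the birth date from the earlier example stays in _all as the single token 1970-10-24, so searching for just the year no longer matches:

curl -XGET "http://127.0.0.1:9200/myindex/order/_search?pretty" -d '{
    "query": {
        "match": {
            "_all": "1970"
        }
    }
}'

With the default standard analyzer this query did match, because the date was split into the separate tokens 1970, 10, and 24.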

1.2 Disabling the _all field

The _all field costs extra CPU cycles at index time and extra disk space. If you do not need it, it is best to disable it:

curl -XPUT 'http://127.0.0.1:9200/myindex/order/_mapping' -d '{
    "order": {
        "_all": {
            "enabled": true
        },
        "properties": {
            .......
        }
    }
}'

1.3 Excluding fields from _all

Rather than disabling _all entirely, you may want it to contain only certain fields. The include_in_all option controls, per field, whether a field's value is added to _all; the default is true. Setting include_in_all on an object changes that default for all fields beneath it. For example, to have _all include only name:

PUT /myindex/order/_mapping
{
    "order": {
        "include_in_all": false,
        "properties": {
            "name": {
                "type": "string",
                "include_in_all": true
            },
            ...
        }
    }
}
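
As a quick check (a sketch, assuming the mapping above is in place before the document from section 1.1 is indexed), a match on _all for the name value still finds the document, while the age value no longer does, because age is excluded from _all:

curl -XGET "http://127.0.0.1:9200/myindex/order/_search?pretty" -d '{
    "query": {
        "match": {
            "_all": "Scott"
        }
    }
}'

curl -XGET "http://127.0.0.1:9200/myindex/order/_search?pretty" -d '{
    "query": {
        "match": {
            "_all": "24"
        }
    }
}'

The first query returns the document; the second returns no hits.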

2. _source

2.1 Disabling the _source field

Elasticsearch keeps the JSON string representing the document body in the _source field. Like any other stored field, _source is compressed before being written to disk. _source is not indexed, so you cannot search on it, but it is stored, so it still takes up disk space. If that is a problem, you can disable it:

curl -XPUT 'http://127.0.0.1:9200/myindex/order/_mapping' -d '{
    "order": {
        "_source": {
            "enabled": false
        },
        "properties": {
			......
        }
    }
}'

However, with _source disabled, the following features are no longer supported:

  1. Update requests no longer work.
  2. On-the-fly highlighting.
  3. Reindexing from one Elasticsearch index to another, whether to change the mapping or analysis, or to upgrade an index to a new major version.
  4. Debugging queries or aggregations by viewing the document body that was used at index time.
  5. Potentially, in the future, the ability to repair index corruption automatically.

If disk space is the concern, you can increase the compression level instead of disabling _source.
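
A minimal sketch of that alternative, assuming Elasticsearch 2.x or later where the index.codec setting exists (it must be set when the index is created, or while the index is closed); older releases exposed compression options on the _source mapping instead:

PUT /myindex
{
  "settings": {
    "index.codec": "best_compression"
  }
}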

2.2 Including / Excluding fields from _source

You can also prune the contents of _source: fields are removed after the document has been indexed but before the _source field is stored. Removing fields from _source has downsides similar to disabling it altogether, in particular that you can no longer reindex from one Elasticsearch index to another; consider using source filtering at search time instead (shown at the end of this section). The following example comes from the official documentation:

PUT logs
{
  "mappings": {
    "event": {
      "_source": {
        "includes": [
          "*.count",
          "meta.*"
        ],
        "excludes": [
          "meta.description",
          "meta.other.*"
        ]
      }
    }
  }
}

PUT logs/event/1
{
  "requests": {
    "count": 10,
    "foo": "bar" 
  },
  "meta": {
    "name": "Some metric",
    "description": "Some metric description", 
    "other": {
      "foo": "one", 
      "baz": "two" 
    }
  }
}

GET logs/event/_search
{
  "query": {
    "match": {
      "meta.other.foo": "one" 
    }
  }
}

Note that in the example above the query on meta.other.foo still matches, even though that field is excluded from the stored _source: includes/excludes only affect what is stored, not what is indexed. And of course, even with _source fully enabled ({"_source": {"enabled": true}}), you can ask for just the fields you need at search time by filtering _source:

GET /_search
{
    "query":   { "match_all": {}},
    "_source": [ "title", "created" ]
}

3. store

store is a per-field attribute in the mapping, for example:

curl -XPUT 'http://127.0.0.1:9200/myindex/order/_mapping' -d '{
    "order": {
        ......
        "properties": {
            "name": {
                "type": "string",
                "store": "no",
                ......
            },
            ......
        }
    }
}'

Fields marked as stored are kept in a data structure separate from the inverted index so that their original values can be retrieved quickly. Storing a field costs disk space but saves work at retrieval time. store accepts yes/no (or true/false) and defaults to no/false.

Stored fields can be retrieved like this (for multiple fields, use fields=f1,f2,f3...):

curl -XGET 'http://hadoop:9200/myindex/order/0508?fields=age&pretty=true'
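
Stored fields can also be requested through the search API. A sketch, assuming the same pre-5.x Elasticsearch used throughout this article, where the search body accepts a top-level fields parameter (renamed stored_fields in 5.x):

curl -XGET 'http://127.0.0.1:9200/myindex/order/_search?pretty' -d '{
    "query":  { "match_all": {} },
    "fields": [ "name", "age" ]
}'

Each hit then carries a fields section with the requested values.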

4. index

Like store, index is a per-field attribute. It controls how each field is indexed, and the default is analyzed. index takes one of three values (a short sketch contrasting analyzed and not_analyzed follows the mapping example below):

  1. no: the field is not indexed at all. Use this for fields that never need to be searched.
  2. analyzed: the field is passed through the configured analyzer; by default Elasticsearch uses the StandardAnalyzer, which lowercases and tokenizes the value.
  3. not_analyzed: the field is indexed, but its value is not analyzed. Elasticsearch uses the KeywordAnalyzer, which treats the entire value as a single token.

curl -XPUT 'http://127.0.0.1:9200/myindex/order/_mapping' -d '{
    "order": {
        ......
        "properties": {
            "name": {
                "type": "string",
                "index": "no",
                ......
            },
            ......
        }
    }
}'
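
To make the difference between analyzed and not_analyzed concrete, here is a sketch using a hypothetical myindex2 index with a status field mapped as not_analyzed (both names are illustrative, not from the article). Because the whole value is kept as a single token, an exact-value term query matches; on an analyzed field the same value would have been lowercased and split into separate tokens, and this term query would find nothing:

curl -XPUT 'http://127.0.0.1:9200/myindex2' -d '{
    "mappings": {
        "order": {
            "properties": {
                "status": { "type": "string", "index": "not_analyzed" }
            }
        }
    }
}'

curl -XPUT 'http://127.0.0.1:9200/myindex2/order/1' -d '{ "status": "In Progress" }'

curl -XGET 'http://127.0.0.1:9200/myindex2/order/_search?pretty' -d '{
    "query": { "term": { "status": "In Progress" } }
}'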




