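Introduction
Most fields are indexed by default, which makes them searchable. Elasticsearch (ES) breaks the content of each document field into terms and builds an inverted index, so that a search can go from a term to the documents that contain it.
Sorting, aggregations, and scripts need the opposite access pattern, and the inverted index cannot serve it. Instead of finding documents through a term, these operations start from a document and need the terms (values) stored in its field.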
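To make the two directions concrete, here is a minimal Python sketch of a toy inverted index. It is not how Lucene or ES is implemented; the whitespace tokenizer and the helper names `build_inverted_index` and `terms_of` are assumptions made for illustration only.

```python
from collections import defaultdict

# Toy documents; the names mirror the example data used later in this post.
docs = {
    1: "zhang san",
    2: "li si",
    3: "wang wu",
}

def build_inverted_index(documents):
    """Map each whitespace-separated term to the set of doc IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.split():
            index[term].add(doc_id)
    return index

inverted = build_inverted_index(docs)

# Search direction: term -> documents is a single dictionary lookup.
matching_docs = inverted["san"]

def terms_of(doc_id, index):
    """Sort/aggregation direction: document -> terms.

    With only the inverted index available, every term has to be scanned.
    """
    return sorted(term for term, ids in index.items() if doc_id in ids)

terms_for_doc_1 = terms_of(1, inverted)
```

The lookup for a term is cheap, while recovering the terms of one document means walking the whole index. That cost is the reason a second, document-oriented structure is useful.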
Doc_values
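Doc values are an on-disk data structure, built when a document is indexed, that supports sorting, aggregations, and scripts. They store the same values as _source, but in a column-oriented layout, which is much more efficient for sorting and aggregating. Almost all field types support doc values; the exceptions are text and annotated_text fields.
Doc values are enabled by default on every field that supports them. If you are sure you will never sort or aggregate on a field, or access its value from a script, you can disable doc values for that field to save disk space.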
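The difference between the row-oriented _source and a column-oriented layout can be sketched in a few lines of Python. This is a simplified model, not the actual on-disk format; `to_columns` is a hypothetical helper and it assumes every record carries the same fields.

```python
from collections import Counter

# Row-oriented view, similar in spirit to _source: one record per document.
rows = [
    {"name": "zhang san", "age": 18},
    {"name": "li si", "age": 20},
    {"name": "wang wu", "age": 22},
]

def to_columns(records):
    """Pivot rows into one list per field; position i belongs to document i."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns

columns = to_columns(rows)

# Sorting by age only has to read the "age" column.
doc_order_by_age_desc = sorted(
    range(len(rows)), key=lambda i: columns["age"][i], reverse=True
)

# A terms-style aggregation is a count over a single column.
name_buckets = Counter(columns["name"])
```

Both operations read one column and ignore the rest of each document, which is the property that makes a columnar layout a good fit for sorting and aggregations.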
Example
Create an index in which the age field is mapped with "doc_values": false:
PUT /emp/
{
"mappings": {
"properties": {
"name":{
"type": "keyword"
},
"age":{
"type": "integer",
"doc_values": false
}
}
}
}
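Before running sorts or aggregations, it can help to check which fields in a mapping are unable to use doc values. The following Python sketch does that on a plain dictionary shaped like the mapping above. The function name `fields_without_doc_values` is hypothetical, it only inspects top-level properties (no nested objects or multi-fields), and it does not talk to a cluster.

```python
# Field types that, per the text above, do not support doc values.
TYPES_WITHOUT_DOC_VALUES = {"text", "annotated_text"}

def fields_without_doc_values(mapping):
    """Return the names of top-level fields that cannot use doc values."""
    properties = mapping.get("mappings", {}).get("properties", {})
    blocked = []
    for name, config in properties.items():
        unsupported_type = config.get("type") in TYPES_WITHOUT_DOC_VALUES
        # doc_values defaults to true when the mapping does not mention it.
        disabled = config.get("doc_values", True) is False
        if unsupported_type or disabled:
            blocked.append(name)
    return sorted(blocked)

# The same mapping as the PUT /emp/ request, written as a Python dict.
emp_mapping = {
    "mappings": {
        "properties": {
            "name": {"type": "keyword"},
            "age": {"type": "integer", "doc_values": False},
        }
    }
}

blocked_fields = fields_without_doc_values(emp_mapping)
```

For this mapping the check flags only age, which matches the behaviour demonstrated in the rest of the example.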
Insert three documents:
PUT /emp/_doc/1
{
"name":"zhang san",
"age":18
}
PUT /emp/_doc/2
{
"name":"li si",
"age":20
}
PUT /emp/_doc/3
{
"name":"wang wu",
"age":22
}
List all documents (for example with GET /emp/_search); the response is:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "emp",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "zhang san",
"age" : 18
}
},
{
"_index" : "emp",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "li si",
"age" : 20
}
},
{
"_index" : "emp",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"name" : "wang wu",
"age" : 22
}
}
]
}
}
Now try a terms aggregation on age:
GET /emp/_search
{
"aggs": {
"age_bucket": {
"terms": {
"field": "age"
}
}
}
}
The request fails with an error:
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "Can't load fielddata on [age] because fielddata is unsupported on fields of type [integer]. Use doc values instead."
}
],
"type" : "search_phase_execution_exception",
"reason" : "all shards failed",
"phase" : "query",
"grouped" : true,
"failed_shards" : [
{
"shard" : 0,
"index" : "emp",
"node" : "8XpIzmvZSwqoag95T5e_yg",
"reason" : {
"type" : "illegal_argument_exception",
"reason" : "Can't load fielddata on [age] because fielddata is unsupported on fields of type [integer]. Use doc values instead."
}
}
],
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "Can't load fielddata on [age] because fielddata is unsupported on fields of type [integer]. Use doc values instead.",
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "Can't load fielddata on [age] because fielddata is unsupported on fields of type [integer]. Use doc values instead."
}
}
},
"status" : 400
}
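The error body repeats the same message at several levels, so a client usually only needs the root cause. Below is a minimal Python sketch that pulls it out of a response that has already been parsed into a dictionary. The helper `root_cause_reasons` is hypothetical, and the sample dictionary is an abbreviated copy of the response above.

```python
def root_cause_reasons(response):
    """Collect the distinct 'reason' strings listed under error.root_cause."""
    causes = response.get("error", {}).get("root_cause", [])
    reasons = []
    for cause in causes:
        reason = cause.get("reason")
        if reason and reason not in reasons:
            reasons.append(reason)
    return reasons

# Abbreviated copy of the error response shown above.
error_response = {
    "error": {
        "root_cause": [
            {
                "type": "illegal_argument_exception",
                "reason": "Can't load fielddata on [age] because fielddata is "
                          "unsupported on fields of type [integer]. "
                          "Use doc values instead.",
            }
        ],
        "type": "search_phase_execution_exception",
        "reason": "all shards failed",
    },
    "status": 400,
}

reasons = root_cause_reasons(error_response)
```

The top-level reason ("all shards failed") says little on its own; the root cause is what points at the disabled doc values on age.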
Ordinary queries are not affected, because age is still indexed and only its doc values are disabled:
GET /emp/_search
{
"query": {
"match": {
"age": "18"
}
}
}
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "emp",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "zhang san",
"age" : 18
}
}
]
}
}
The name field does not set doc_values, so it keeps the default of true and therefore supports aggregations:
GET /emp/_search
{
"aggs": {
"name_bucket": {
"terms": {
"field": "name"
}
}
}
}
The aggregation returns normally:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "emp",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "zhang san",
"age" : 18
}
},
{
"_index" : "emp",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "li si",
"age" : 20
}
},
{
"_index" : "emp",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"name" : "wang wu",
"age" : 22
}
}
]
},
"aggregations" : {
"name_bucket" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "li si",
"doc_count" : 1
},
{
"key" : "wang wu",
"doc_count" : 1
},
{
"key" : "zhang san",
"doc_count" : 1
}
]
}
}
}
Sorting on age, like aggregating on it, is not supported and fails with the same error:
GET /emp/_search
{
"sort": [
{
"age": {
"order": "desc"
}
}
]
}
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "Can't load fielddata on [age] because fielddata is unsupported on fields of type [integer]. Use doc values instead."
}
],
"type" : "search_phase_execution_exception",
"reason" : "all shards failed",
"phase" : "query",
"grouped" : true,
"failed_shards" : [
{
"shard" : 0,
"index" : "emp",
"node" : "8XpIzmvZSwqoag95T5e_yg",
"reason" : {
"type" : "illegal_argument_exception",
"reason" : "Can't load fielddata on [age] because fielddata is unsupported on fields of type [integer]. Use doc values instead."
}
}
],
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "Can't load fielddata on [age] because fielddata is unsupported on fields of type [integer]. Use doc values instead.",
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "Can't load fielddata on [age] because fielddata is unsupported on fields of type [integer]. Use doc values instead."
}
}
},
"status" : 400
}