Counting groups in ES: grouping Elasticsearch documents by a field and counting occurrences

My Elasticsearch 6.5.2 index looks like this:

{
  "_index" : "searches",
  "_type" : "searches",
  "_id" : "cCYuHW4BvwH6Y3jL87ul",
  "_score" : 1.0,
  "_source" : {
    "querySearched" : "telecom"
  }
},
{
  "_index" : "searches",
  "_type" : "searches",
  "_id" : "cSYuHW4BvwH6Y3jL_Lvt",
  "_score" : 1.0,
  "_source" : {
    "querySearched" : "telecom"
  }
},
{
  "_index" : "searches",
  "_type" : "searches",
  "_id" : "eCb6O24BvwH6Y3jLP7tM",
  "_score" : 1.0,
  "_source" : {
    "querySearched" : "industry"
  }
}

And I would like a query that returns this result:

"result": [
  {
    "querySearched" : "telecom",
    "number" : 2
  },
  {
    "querySearched" : "industry",
    "number" : 1
  }
]

I just want to group by the searched term and count the occurrences of each, limited to the ten highest counts. I tried aggregations, but the buckets come back empty.

Thanks!

Solution

Suppose your mapping is:

PUT /index
{
  "mappings": {
    "doc": {
      "properties": {
        "querySearched": {
          "type": "text",
          "fielddata": true
        }
      }
    }
  }
}

Your query should look like this:

GET index/_search
{
  "size": 0,
  "aggs": {
    "result": {
      "terms": {
        "field": "querySearched",
        "size": 10
      }
    }
  }
}
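A terms aggregation like this returns its counts under aggregations.result.buckets, where each bucket carries a key and a doc_count. As a minimal Python sketch of reshaping that into the format asked for in the question (the function name buckets_to_result and the sample response are illustrative assumptions, not part of any Elasticsearch client):

```python
def buckets_to_result(response, agg_name="result", field="querySearched"):
    """Turn a terms-aggregation response into [{field: key, "number": count}, ...]."""
    buckets = response.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    # Terms buckets are ordered by doc_count descending by default,
    # so the order of the response is simply preserved here.
    return [{field: b["key"], "number": b["doc_count"]} for b in buckets]


# Hand-written sample shaped like a terms-aggregation response (not real output).
sample_response = {
    "aggregations": {
        "result": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "telecom", "doc_count": 2},
                {"key": "industry", "doc_count": 1},
            ],
        }
    }
}

print(buckets_to_result(sample_response))
```

If the buckets list comes back empty, as described in the question, the function returns an empty list, which usually points at the field mapping rather than at the query.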

You need to set fielddata: true to enable aggregations on a field of type text (see the fielddata documentation for more on that).

"size": 10 limits the result to the 10 largest buckets.

After a short discussion with @Kamal, I feel obligated to point out that if you choose to enable fielddata: true, it can consume a lot of heap space.

From the link I've shared:

Fielddata can consume a lot of heap space, especially when loading high cardinality text fields. Once fielddata has been loaded into the heap, it remains there for the lifetime of the segment. Also, loading fielddata is an expensive process which can cause users to experience latency hits. This is why fielddata is disabled by default.

Another alternative (a more efficient one):

PUT /index
{
  "mappings": {
    "doc": {
      "properties": {
        "querySearched": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}

Then your aggregation query becomes:

GET index/_search
{
  "size": 0,
  "aggs": {
    "result": {
      "terms": {
        "field": "querySearched.keyword",
        "size": 10
      }
    }
  }
}

Both solutions work, but you should take the heap cost of fielddata into consideration.
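Apart from heap usage, the two approaches can also bucket differently: with fielddata on a text field the aggregation runs over analyzed tokens, while the keyword sub-field buckets on the whole original value. The Python sketch below imitates this with a deliberately crude stand-in for the default analyzer (lowercase, split on anything that is not an ASCII letter or digit). The function names and the sample searches are hypothetical, and the real analyzer is more elaborate.

```python
import re
from collections import Counter


def crude_analyze(text):
    """Rough stand-in for a text analyzer: lowercase, split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def count_as_text(values, size=10):
    """Bucket by token, roughly as a terms aggregation on a text field would."""
    counts = Counter()
    for value in values:
        # doc_count counts documents, so each token counts once per document.
        counts.update(set(crude_analyze(value)))
    return counts.most_common(size)


def count_as_keyword(values, size=10):
    """Bucket by the exact stored value, as a terms aggregation on a keyword field would."""
    return Counter(values).most_common(size)


searches = ["telecom", "telecom", "Telecom industry"]  # hypothetical data
print(count_as_text(searches))
print(count_as_keyword(searches))
```

For single-word, lowercase values such as those in the question, both approaches give the same counts; they diverge once a search contains several words or mixed case.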

Hope it helps
