python elasticsearch dsl: Aggregating a field in elasticsearch-dsl using Python

weixin_39628498

Tags: python elasticsearch dsl

小编典典

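First of all, I now notice that what I wrote here does not actually define an aggregation. The documentation on how to use this is not very readable to me. I will expand on what I wrote above, and I am changing the index name to make it a better example.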

from elasticsearch_dsl import Search
from elasticsearch_dsl.connections import connections

# Define a default Elasticsearch client
client = connections.create_connection(hosts=['http://blahblahblah:9200'])

s = Search(using=client, index="airbnb", doc_type="sleep_overs")

# Invalid! No aggregation has been defined yet, so the response
# would contain no 'per_tag' buckets to iterate over.
# response = s.execute()
# for tag in response.aggregations.per_tag.buckets:
#     print(tag.key)

# Let's make an aggregation.
# 'by_house' is a name you choose, 'terms' is a keyword for the type of aggregator.
# 'field' is also a keyword, and 'house_number' is a field in our ES index.
# Define it on the Search object, before calling execute().
s.aggs.bucket('by_house', 'terms', field='house_number', size=0)

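Above, we create one bucket per house number, so each bucket's key is a house number. Elasticsearch (ES) always returns the count of documents that fall into each bucket. size=0 means return all buckets, because by default ES returns only 10 results (or whatever your developers have configured). Note that size=0 on a terms aggregation is accepted only by Elasticsearch versions before 5.0; later versions reject it, so pass an explicit bucket count there.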
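To make the bucket idea concrete, here is a minimal pure-Python sketch of what a terms aggregation computes. It does not talk to Elasticsearch; the sample documents and the helper name `terms_buckets` are assumptions made up for illustration, and treating size=0 as "all buckets" simply mirrors the convention described above.

```python
from collections import Counter

# Hypothetical documents standing in for the 'sleep_overs' type; not real data.
docs = [
    {"house_number": 12, "guest": "a"},
    {"house_number": 12, "guest": "b"},
    {"house_number": 7, "guest": "c"},
    {"house_number": 12, "guest": "d"},
    {"house_number": 7, "guest": "e"},
    {"house_number": 3, "guest": "f"},
]


def terms_buckets(documents, field, size=10):
    """Mimic the shape of a terms aggregation: one bucket per distinct value.

    size=0 is treated as "return every bucket"; any other value keeps
    only the `size` buckets with the highest document counts.
    """
    counts = Counter(doc[field] for doc in documents if field in doc)
    ranked = counts.most_common()  # highest doc_count first
    if size:
        ranked = ranked[:size]
    return [{"key": key, "doc_count": n} for key, n in ranked]


buckets = terms_buckets(docs, "house_number", size=0)
for bucket in buckets:
    print(bucket["key"], bucket["doc_count"])
```

Each bucket carries a key (the house number) and a doc_count, which are the same two attributes read from the real buckets later on.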

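# This runs the query. Keep the response in its own variable
# so the Search object 's' is not overwritten.
response = s.execute()

# Let's see what's in our results.
# A terms aggregation has no doc_count of its own; only its buckets do.
print(response.hits.total)
print(response.aggregations.by_house.buckets)

for item in response.aggregations.by_house.buckets:
    print(item.doc_count)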

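My earlier mistake was assuming that an Elasticsearch query comes with aggregations by default. You define them yourself and then execute the search. The response can then be split up by the aggregators you named.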
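The same point can be seen on the raw response. The sketch below uses a hand-written dictionary shaped like the body Elasticsearch returns for a terms aggregation; the numbers are invented, the integer `hits.total` assumes an older Elasticsearch version like the one used here, and `bucket_counts` is a hypothetical helper, not part of any library.

```python
# Hypothetical response body for a terms aggregation named 'by_house'.
# The values are invented for illustration.
response = {
    "hits": {"total": 6, "hits": []},
    "aggregations": {
        "by_house": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": 12, "doc_count": 3},
                {"key": 7, "doc_count": 2},
                {"key": 3, "doc_count": 1},
            ],
        }
    },
}


def bucket_counts(resp, agg_name):
    """Return {bucket key: doc_count} for one named bucket aggregation."""
    aggs = resp.get("aggregations", {})
    if agg_name not in aggs:
        # Only aggregations that were defined in the request come back.
        raise KeyError("aggregation %r was not defined in the request" % agg_name)
    return {b["key"]: b["doc_count"] for b in aggs[agg_name]["buckets"]}


counts = bucket_counts(response, "by_house")
print(counts)
```

Asking for a name that was never defined, such as 'per_tag', raises an error here, which is the situation the commented-out loop in the first listing ran into.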

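The curl equivalent of the 'by_house' search built with s.aggs.bucket(...) should look like:

Note: I use Sense, an Elasticsearch plugin/extension/add-on for the Google Chrome browser. In Sense, you can comment things out with //.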

POST /airbnb/sleep_overs/_search
{
  // the size 0 here actually means to not return any hits, just the aggregation part of the result
  "size": 0,
  "aggs": {
    "by_house": {
      "terms": {
        // the size 0 here means to return all results, not just the default 10 results
        "field": "house_number",
        "size": 0
      }
    }
  }
}
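The two "size" values are easy to confuse, so here is a small sketch that builds the same request body in Python and names them separately. The helper `terms_agg_request` and its parameter names are assumptions made for illustration. The // comments are left out because they are a Sense convenience and are not valid JSON.

```python
import json


def terms_agg_request(agg_name, field, bucket_size=0, hits_size=0):
    """Build the search body for a single terms aggregation.

    hits_size is the top-level "size" (how many matching documents to return).
    bucket_size is the "size" inside "terms" (how many buckets to return).
    """
    return {
        "size": hits_size,
        "aggs": {
            agg_name: {
                "terms": {"field": field, "size": bucket_size}
            }
        },
    }


body = terms_agg_request("by_house", "house_number")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into Sense or sent with curl as the request body.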

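A workaround. Someone on the DSL's GitHub told me to forget about translating and just use this method. It is simpler: you write the hard part as a curl request body and pass it in as is. That is why I call it a workaround.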

# Define a default Elasticsearch client
client = connections.create_connection(hosts=['http://blahblahblah:9200'])

# How simple: we just paste the curl request body here.
body = {
    "size": 0,
    "aggs": {
        "by_house": {
            "terms": {
                "field": "house_number",
                "size": 0
            }
        }
    }
}

# from_dict() builds a new Search from the body, so the index and
# doc_type are set on it afterwards.
s = Search.from_dict(body)
s = s.index("airbnb")
s = s.doc_type("sleep_overs")

# Optional: inspect the request that will be sent.
body = s.to_dict()

t = s.execute()

for item in t.aggregations.by_house.buckets:
    # item.key will be the house number
    print(item.key, item.doc_count)

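Hope this helps. I now design everything in curl, then use Python statements to peel apart the results and get what I want. This helps with multi-level aggregations (sub-aggregations).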
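As a rough illustration of picking apart a multi-level result, the sketch below nests a second terms aggregation inside 'by_house' and flattens a response of the matching shape. The field name 'check_in_month', the aggregation name 'by_month', the sample counts, and the helper `flatten` are all hypothetical.

```python
# Hypothetical two-level request: bucket by house, then by month within each house.
# 'check_in_month' is an invented field name used only for illustration.
body = {
    "size": 0,
    "aggs": {
        "by_house": {
            "terms": {"field": "house_number"},
            "aggs": {
                "by_month": {"terms": {"field": "check_in_month"}}
            },
        }
    },
}

# Hypothetical response fragment with the same nesting; the counts are invented.
aggregations = {
    "by_house": {
        "buckets": [
            {"key": 12, "doc_count": 3, "by_month": {"buckets": [
                {"key": 6, "doc_count": 2},
                {"key": 7, "doc_count": 1},
            ]}},
            {"key": 7, "doc_count": 2, "by_month": {"buckets": [
                {"key": 6, "doc_count": 2},
            ]}},
        ]
    }
}


def flatten(aggs, outer, inner):
    """Yield (outer key, inner key, doc_count) for a two-level bucket aggregation."""
    for outer_bucket in aggs[outer]["buckets"]:
        for inner_bucket in outer_bucket[inner]["buckets"]:
            yield outer_bucket["key"], inner_bucket["key"], inner_bucket["doc_count"]


rows = list(flatten(aggregations, "by_house", "by_month"))
for row in rows:
    print(row)
```

Each sub-aggregation appears inside its parent bucket under the name chosen in the request, so the Python side is just nested loops over "buckets".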

2020-06-22
