Working with the Elasticsearch object in Python using the elasticsearch client

buside

Original link: https://www.cnblogs.com/Alexephor/p/11398060.html

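This post walks through working with Elasticsearch from Python: index management (including custom mappings), cluster and node operations, search queries, result filtering, and creating, reading, updating, and deleting documents. It also covers retrieving cluster health, node information, and statistics, and using the filter_path and _source parameters to trim what a response returns. Example code illustrates each operation.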

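Areas covered

Result filtering: filter the response that comes back, mainly to trim the returned content.
Working directly with the Elasticsearch object: simple document and index operations. All of the areas below build on the es object.
Indices: detailed index operations, such as creating custom mappings.
Cluster: cluster-related operations.
Nodes: node-related operations.
Cat API: an alternative way to query. Most responses are JSON, whereas cat returns compact, human-readable output.
Snapshot: a snapshot is a backup taken from a running Elasticsearch cluster. You can snapshot a single index or the whole cluster and store it in a repository on a shared file system. Plugins add support for remote repositories on S3, HDFS, Azure, Google Cloud Storage, and others.
Task Management API: at the time of writing, the task management API was new and should still be treated as a beta feature. The API may change in ways that are not backward compatible.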

Result filtering

The filter_path parameter reduces the response returned by Elasticsearch to the parts you name. It also supports wildcards such as *.

from elasticsearch import Elasticsearch

# Assumption: a node is reachable at this address; adjust it for your cluster.
es = Elasticsearch(["http://localhost:9200"])

body = {
    "query": {
        "match": {
            "name": "成都"
        }
    }
}
# print(es.search(index="p1", body=body))
print(es.search(index="p1", body=body, filter_path=["hits.hits"]))
print(es.search(index="p1", body=body, filter_path=["hits.hits._source"]))
print(es.search(index="p1", body=body, filter_path=["hits.hits._source", "hits.total"]))
print(es.search(index="p1", body=body, filter_path=["hits.*"]))
print(es.search(index="p1", body=body, filter_path=["hits.hits._*"]))
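To make the effect of a filter path concrete without a running cluster, here is a minimal client-side imitation. It is only a sketch: the real filtering is done by the server, which also supports features not modeled here (for example `**` and exclusions), and the helper names `filter_response` and `_walk`, as well as the sample `response`, are made up for illustration. It handles a single path, where `*` matches any part of one key.

```python
from fnmatch import fnmatchcase

_EMPTY = ({}, [], None)


def _walk(node, parts):
    """Return the sub-structure of `node` selected by the remaining path parts."""
    if not parts:
        return node  # path exhausted: keep everything below this point
    if isinstance(node, list):
        # Lists are transparent: apply the same path to every element.
        kept = [_walk(item, parts) for item in node]
        return [item for item in kept if item not in _EMPTY]
    if not isinstance(node, dict):
        return None  # a scalar cannot satisfy a longer path
    head, rest = parts[0], parts[1:]
    selected = {}
    for key, value in node.items():
        if fnmatchcase(key, head):
            sub = _walk(value, rest)
            if not rest or sub not in _EMPTY:
                selected[key] = sub
    return selected


def filter_response(data, path):
    """Hypothetical helper: apply one dotted filter path to a response dict."""
    return _walk(data, path.split("."))


# A made-up search response, shaped like the ones printed above.
response = {
    "took": 3,
    "hits": {
        "total": 1,
        "max_score": 0.28,
        "hits": [
            {"_index": "p1", "_id": "1", "_score": 0.28,
             "_source": {"name": "成都", "age": 18}},
        ],
    },
}

print(filter_response(response, "hits.total"))  # {'hits': {'total': 1}}
print(filter_response(response, "hits.hits._source"))
```

The path "hits.hits._*" keeps every key of each hit that starts with an underscore, which is why it returns _index, _id, _score, and _source together.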

Elasticsearch (the es object)

es.index: adds or updates a document in the given index. If the index does not exist, it is created first.

print(es.index(index="p2", doc_type="doc", id=1, body={"name": "棒槌", "age": "18"}))  # OK
print(es.index(index="p2", doc_type="doc", id=2, body={"name": "棒棒哒", "age": 20}))  # OK
print(es.index(index="p2", doc_type="doc", body={"name": "熊大", "age": "10"}))  # without an id, one is generated automatically

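es.search: runs a search query and returns the matching hits. It accepts complex query conditions.

index: comma-separated list of index names to search; use _all or an empty string to search all indices.
doc_type: comma-separated list of document types to search; leave empty to search all types. Mapping types were removed in Elasticsearch 8, so this parameter applies only to older clusters and client versions.
body: the search definition, written in the Query DSL (Domain Specific Language).
_source: true or false to return the _source field or not, or a list of fields to return.
_source_exclude: list of fields to exclude from the returned _source. Clients for Elasticsearch 7 and later spell this _source_excludes.
_source_include: list of fields to extract and return from _source, much like passing a list to _source. Clients for Elasticsearch 7 and later spell this _source_includes.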

print(es.search(index='p2', body={"query": {"match": {"age": "10"}}}))
print(es.search(index='p2', body={"query": {"match": {"age": "10"}}}, _source=['name', 'age']))
print(es.search(index='p2', body={"query": {"match": {"age": "10"}}}, _source_exclude=['age']))
print(es.search(index='p2', body={"query": {"match": {"age": "10"}}}, _source_include=['age']))
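Source filtering can also be written inside the request body, using a `_source` object with `includes` and `excludes` keys, which avoids depending on how a given client version spells the keyword arguments. The helper below is a sketch of that idea; `build_search_body` is a hypothetical name, not part of the client.

```python
def build_search_body(field, value, includes=None, excludes=None):
    """Build a match query, optionally with source filtering in the body."""
    body = {"query": {"match": {field: value}}}
    source = {}
    if includes:
        source["includes"] = list(includes)
    if excludes:
        source["excludes"] = list(excludes)
    if source:
        # Only add the key when a filter was requested, so the default
        # behaviour (return the full _source) is left untouched.
        body["_source"] = source
    return body


body = build_search_body("age", "10", includes=["name"])
print(body)
# With a client at hand, this would be passed as: es.search(index="p2", body=body)
```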

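es.get_source: fetches a document's source by index, type, and ID. In effect it returns just the document dictionary, without the surrounding metadata.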

print(es.get_source(index='p2', doc_type='doc', id='1'))

es.count: runs a query and returns the number of matching documents, for example the documents whose age is 18.

body = {
    "query": {
        "match": {
            "age": 18
        }
    }
}
# print(es.count(index='p2', doc_type='doc', body=body))
# print(es.count(index='p2', doc_type='doc', body=body)['count'])
print(es.count(index='p2'))  # total number of documents in index p2, e.g. {'count': 4, '_shards': {'total': 5, 'successful': 5, 'skipped': 0, 'failed': 0}}
print(es.count(index='p2', doc_type='doc')) # {'count': 4, '_shards': {'total': 5, 'successful': 5, 'skipped': 0, 'failed': 0}}

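es.create: creates the index if it does not exist and adds one document; if the index already exists, it only adds the document. It can only add: running it again with the same ID raises an error. In practice es.index tends to be more convenient.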

# print(es.create(index='p2', doc_type='doc', id=3, body={"city": "成都", "desc": "旅游的地方颇多， 小吃出名"}))
print(es.get_source(index='p2', doc_type='doc', id=3))
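The difference between create and index is easiest to see in a toy model. The in-memory class below is only an illustration of the semantics described above and talks to no cluster; `ToyIndex` and `DocumentExistsError` are invented names. With the real client, a repeated create is reported as a conflict (HTTP 409), which I believe surfaces as `elasticsearch.ConflictError`.

```python
class DocumentExistsError(Exception):
    """Stand-in for the conflict error raised on a duplicate create."""


class ToyIndex:
    """In-memory imitation of one index, keyed by document id."""

    def __init__(self):
        self.docs = {}

    def index(self, doc_id, body):
        # index: add the document, or overwrite it if the id is taken.
        result = "updated" if doc_id in self.docs else "created"
        self.docs[doc_id] = body
        return result

    def create(self, doc_id, body):
        # create: add only; an existing id is an error.
        if doc_id in self.docs:
            raise DocumentExistsError("document {!r} already exists".format(doc_id))
        self.docs[doc_id] = body
        return "created"


p2 = ToyIndex()
print(p2.index(1, {"name": "棒槌"}))   # created
print(p2.index(1, {"name": "熊大"}))   # updated
print(p2.create(3, {"city": "成都"}))  # created
try:
    p2.create(3, {"city": "成都"})
except DocumentExistsError as exc:
    print("second create failed:", exc)
```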

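es.delete: deletes the specified document, for example the document with id 4. It cannot delete an index; to delete an index, use es.indices.delete.
es.delete_by_query: deletes all documents that match a query.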

print(es.delete(index='p2', doc_type='doc', id=1))
print(es.delete_by_query(index='p2', body={"query": {"match": {"age": 20}}}))
print(es.search(index='p2'))

es.exists: checks whether the specified document exists in Elasticsearch and returns a boolean.

print(es.exists(index='p2', doc_type='doc', id='1'))

es.info: returns basic information about the current cluster.

print(es.info())

es.ping: returns True if the cluster is up, otherwise False.

print(es.ping())
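Because ping returns a plain boolean instead of raising, it fits naturally into a small startup check. The function below is a sketch of that pattern; `wait_until_up` and its parameters are hypothetical, and it takes any zero-argument callable so that it can be tried without a cluster.

```python
import time


def wait_until_up(ping, attempts=5, delay=1.0, sleep=time.sleep):
    """Call `ping` until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if ping():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # pause between attempts, but not after the last one
    return False


# Dry run with a fake ping that succeeds on the third call.
answers = iter([False, False, True])
print(wait_until_up(lambda: next(answers), delay=0))  # True
```

With a real client the call would be wait_until_up(es.ping).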

Indices (es.indices, index operations)

Creating an index

body = {
    "mappings": {
        "doc": {
            "dynamic": "strict",
            "properties": {
                "title": {
                    "type": "text",
                    "analyzer": "ik_max_word"
                },
                "url": {
                    "type": "text"
                },
                "action_type": {
                    "type": "text"
                },
                "content": {
                    "type": "text"
                }
            }
        }
    }
}

print(es.indices.create(index='p3', body=body))
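The mapping above is nested under the type name "doc", which is the Elasticsearch 6.x form. From 7.x on, mappings are typeless by default, so the same definition sits directly under "mappings". The helper below sketches that conversion; `strip_mapping_type` is a made-up name and the detection is a simple heuristic, not something the client provides.

```python
import copy


def strip_mapping_type(body):
    """Return a copy of an index body with a single mapping type removed."""
    converted = copy.deepcopy(body)
    mappings = converted.get("mappings", {})
    if "properties" in mappings or len(mappings) != 1:
        return converted  # already typeless, or not a shape we recognise
    (type_name, type_mapping), = mappings.items()
    if isinstance(type_mapping, dict) and "properties" in type_mapping:
        # Heuristic: one key whose value holds "properties" is a type name.
        converted["mappings"] = type_mapping
    return converted


typed = {
    "mappings": {
        "doc": {
            "dynamic": "strict",
            "properties": {"title": {"type": "text"}, "url": {"type": "text"}},
        }
    }
}
print(strip_mapping_type(typed))
```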

es.indices.analyze: returns the result of analyzing (tokenizing) a piece of text.

print(es.indices.analyze(body={'analyzer': 'ik_max_word', 'text': "皮特和茱丽当选“年度模范情侣”Brad Pitt and Angelina Jolie"}))

es.indices.delete: deletes an index in Elasticsearch.

print(es.indices.delete(index='p3'))
print(es.indices.delete(index='p2'))    # {'acknowledged': True}

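es.indices.put_alias: creates an alias for one or more indices. The alias can then be used to query several indices at once.

print(es.indices.put_alias(index='p3', name='p3_alias'))  # alias for a single index
print(es.indices.put_alias(index=['p3', 'p2'], name='p23_alias'))  # one alias for several indices, for querying them together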

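es.indices.delete_alias: deletes one or more aliases. Both the index and the alias name are required.

print(es.indices.delete_alias(index='p3', name='p3_alias'))
print(es.indices.delete_alias(index=['p3', 'p2'], name='p23_alias'))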

es.indices.get_mapping: retrieves the mapping definition of an index or of an index/type.

print(es.indices.get_mapping(index='p3'))

es.indices.get_settings: retrieves the settings of one or more (or all) indices.

print(es.indices.get_settings(index='p3'))

es.indices.get: retrieves information about one or more indices.

print(es.indices.get(index='p2'))    # information about one index; raises NotFoundError if it does not exist
print(es.indices.get(index=['p2', 'p3']))

es.indices.get_alias: retrieves one or more aliases.

print(es.indices.get_alias(index='p2'))
print(es.indices.get_alias(index=['p2', 'p3']))

es.indices.get_field_mapping: retrieves the mapping information of specific fields.

print(es.indices.get_field_mapping(fields='url', index='p3', doc_type='doc'))
print(es.indices.get_field_mapping(fields=['url', 'title'], index='p3', doc_type='doc'))

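es.indices.delete_alias: deletes a specific alias.
es.indices.exists: returns a boolean indicating whether the given index exists.
es.indices.exists_type: checks whether a type (or types) exists in an index (or indices).
es.indices.flush: explicitly flushes one or more indices.
es.indices.get_field_mapping: retrieves the mapping of specific fields.
es.indices.get_template: retrieves an index template by name.
es.indices.open: opens a closed index so that it is available for search.
es.indices.close: closes an index to remove its overhead from the cluster. A closed index is blocked for read and write operations.
es.indices.clear_cache: clears all caches, or specific caches, associated with one or more indices.
es.indices.put_alias: creates an alias for a specific index or indices.
es.indices.get_upgrade: monitors how far one or more indices have been upgraded.
es.indices.put_mapping: registers a specific mapping definition for a specific type.
es.indices.put_settings: changes specific index-level settings in real time.
es.indices.put_template: creates an index template that is applied automatically to newly created indices.
es.indices.rollover: when an existing index is considered too large or too old, the rollover API rolls an alias over to a new index. The API accepts a single alias and a list of conditions. The alias must point to a single index only. If the index meets the specified conditions, a new index is created and the alias is switched to point to the new index.
es.indices.segments: provides low-level segment information for the Lucene indices (shard level).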

Cluster (cluster operations)

es.cluster.get_settings: returns the cluster settings.

print(es.cluster.get_settings())

es.cluster.health: returns a very simple status of the cluster's health.

print(es.cluster.health())
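The health response carries a "status" field that is green, yellow, or red. A small helper can turn that into a readable line. This is a sketch: `summarize_health` is a hypothetical name, and the sample dictionary is invented, using only field names that the health response is known to contain.

```python
STATUS_MEANING = {
    "green": "all primary and replica shards are allocated",
    "yellow": "all primary shards are allocated, but some replicas are not",
    "red": "at least one primary shard is not allocated",
}


def summarize_health(health):
    """Describe a cluster health response in one line."""
    status = health.get("status")
    meaning = STATUS_MEANING.get(status, "unknown status")
    return "{} ({} node(s)): {}".format(status, health.get("number_of_nodes"), meaning)


# Invented sample; with a client this would be es.cluster.health().
sample = {"cluster_name": "demo", "status": "yellow",
          "number_of_nodes": 1, "unassigned_shards": 5}
print(summarize_health(sample))
```

A single-node cluster with replicas configured commonly reports yellow, because a replica cannot be placed on the same node as its primary.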

es.cluster.state: returns comprehensive state information for the whole cluster.

print(es.cluster.state())

es.cluster.stats: returns statistics for the whole cluster.

print(es.cluster.stats())

Nodes (node operations)

es.nodes.info: returns information about the nodes in the cluster.

print(es.nodes.info())  # all nodes
print(es.nodes.info(node_id='node1'))   # a single node
print(es.nodes.info(node_id=['node1', 'node2']))   # a list of nodes
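The nodes.info response is keyed by node id under "nodes", which is awkward to read at a glance. The sketch below pulls out a name-to-version mapping; `node_versions` is a hypothetical helper and the sample response is invented, trimmed to the fields used here.

```python
def node_versions(nodes_info):
    """Map each node's name to its Elasticsearch version."""
    result = {}
    for node_id, node in nodes_info.get("nodes", {}).items():
        # Fall back to the node id if a node has no "name" field.
        result[node.get("name", node_id)] = node.get("version")
    return result


# Invented sample; with a client this would be es.nodes.info().
sample_info = {
    "cluster_name": "demo",
    "nodes": {
        "aB3xK": {"name": "node1", "version": "6.5.4"},
        "Zq9mT": {"name": "node2", "version": "6.5.4"},
    },
}
print(node_versions(sample_info))
```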

es.nodes.stats: returns statistics for the nodes in the cluster.

print(es.nodes.stats())
print(es.nodes.stats(node_id='node1'))
print(es.nodes.stats(node_id=['node1', 'node2']))

es.nodes.hot_threads: returns information about the hot threads on the specified nodes.

print(es.nodes.hot_threads(node_id='node1'))
print(es.nodes.hot_threads(node_id=['node1', 'node2']))

es.nodes.usage: returns feature usage information for the nodes in the cluster.

print(es.nodes.usage())
print(es.nodes.usage(node_id='node1'))
print(es.nodes.usage(node_id=['node1', 'node2']))

Official documentation: API Documentation - Elasticsearch 8.0.0 documentation

Reference: Python操作Elasticsearch对象 - MC_Hotdog - 博客园 (cnblogs)
