Simple Elasticsearch Operations with Python

SaySeaKing

Article link: https://blog.csdn.net/SaySeaKing/article/details/81211216

Dependencies

urllib3 <2.0, >=1.8
elasticsearch 6.2.0

Connecting

Without authentication

from elasticsearch import Elasticsearch

ip = "0.0.0.0"
port = "9200"

# The host string is "ip:port"; note the class name is Elasticsearch (lowercase "s").
es = Elasticsearch(['%s:%s' % (ip, port)])

With authentication

from elasticsearch import Elasticsearch

ip = "0.0.0.0"
port = "9200"
user = "admin"
pwd = "admin"

# The host string is "user:password@ip:port".
es = Elasticsearch(['%s:%s@%s:%s' % (user, pwd, ip, port)])
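The two connection variants differ only in the host string, so it can be built in one place. The following is a minimal sketch; `build_host` is a hypothetical helper name, not part of the elasticsearch package, and percent-encoding the credentials is an assumption added here so that characters such as `@` or `:` in a password do not make the string ambiguous. Check that your client version decodes them before relying on it.

```python
from urllib.parse import quote


def build_host(ip, port, user=None, pwd=None):
    """Return 'ip:port', or 'user:pwd@ip:port' when credentials are given."""
    if user is None or pwd is None:
        return "%s:%s" % (ip, port)
    # Percent-encode credentials so reserved characters stay unambiguous.
    return "%s:%s@%s:%s" % (quote(user, safe=""), quote(pwd, safe=""), ip, port)


print(build_host("0.0.0.0", "9200"))
print(build_host("0.0.0.0", "9200", "admin", "admin"))
```

The result would be passed to the client as a one-element list, in the same way as the hand-built strings above.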

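Configuring mappings and settings

API overview

indices.create(): creates the specified index
indices.exists(): checks whether the specified index exists
indices.exists_type(): checks whether the specified type exists
indices.get_mapping(): retrieves the mapping
indices.put_mapping(): sets the mapping for the specified type; index defaults to _all
indices.get_settings(): retrieves the settings
indices.put_settings(): updates the settings; index defaults to _all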

Code example

For the full syntax of mappings and settings, consult the Elasticsearch documentation.

# Placeholder names; a real index name must be lowercase.
doc_index = "indexName"
doc_type = "typeName"
# Mapping: store "rule" as an exact, non-analyzed value.
# Elasticsearch 5.x and later use "keyword" for this;
# 2.x used "string" with "index": "not_analyzed".
mappings = {
    "properties": {
        "rule": {
            "type": "keyword"
        }
    }
}
# Settings
settings = {
    "index": {
        "max_result_window": "10000000"
    }
}
# If the index does not exist, create it and apply the settings
if not es.indices.exists(index=doc_index):
    es.indices.create(index=doc_index)
    es.indices.put_settings(body=settings, index=doc_index)
# If the type does not exist, create it by applying the mapping
if not es.indices.exists_type(index=doc_index, doc_type=doc_type):
    es.indices.put_mapping(doc_type=doc_type, body=mappings, index=doc_index)
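The check-then-create steps can be wrapped in one function so that running them twice is harmless. This is only a sketch: `ensure_index` is a hypothetical name, and `FakeIndices` / `FakeClient` are in-memory stand-ins written for this example so that it runs without a cluster. They imitate only the five `indices` calls used here and are not part of the elasticsearch package.

```python
def ensure_index(client, index, doc_type, mappings, settings):
    """Create the index and the type mapping only if they are missing.

    `client` is any object exposing the `indices` calls listed above.
    Returns the list of steps that were actually performed.
    """
    done = []
    if not client.indices.exists(index=index):
        client.indices.create(index=index)
        client.indices.put_settings(body=settings, index=index)
        done.append("index")
    if not client.indices.exists_type(index=index, doc_type=doc_type):
        client.indices.put_mapping(doc_type=doc_type, body=mappings, index=index)
        done.append("mapping")
    return done


class FakeIndices:
    """In-memory stand-in for `es.indices`, for demonstration only."""

    def __init__(self):
        self.store = {}  # index name -> {"settings": ..., "types": {...}}

    def exists(self, index):
        return index in self.store

    def create(self, index):
        self.store[index] = {"settings": {}, "types": {}}

    def put_settings(self, body, index):
        self.store[index]["settings"] = body

    def exists_type(self, index, doc_type):
        return doc_type in self.store[index]["types"]

    def put_mapping(self, doc_type, body, index):
        self.store[index]["types"][doc_type] = body


class FakeClient:
    """Stand-in for the client object; only carries `indices`."""

    def __init__(self):
        self.indices = FakeIndices()


client = FakeClient()
demo_mappings = {"properties": {"rule": {"type": "keyword"}}}
demo_settings = {"index": {"max_result_window": "10000000"}}
# The first call performs both steps, the second finds nothing left to do.
print(ensure_index(client, "demo_index", "demo_type", demo_mappings, demo_settings))
print(ensure_index(client, "demo_index", "demo_type", demo_mappings, demo_settings))
```

With a real cluster, the `es` object from the connection section would be passed in place of `FakeClient()`.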

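Inserting data

API overview

create: requires index, type, id, and body; omitting any of them raises an error.
index: more flexible than create. The id is optional: if given, it becomes the document's id; if omitted, Elasticsearch generates a globally unique id for the document.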
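The difference between the two calls can be illustrated with a toy in-memory model. `ToyIndex` below is purely hypothetical and does not talk to Elasticsearch; the id format it generates (a UUID hex string) is an assumption and differs from what a real cluster produces. It only mirrors the rule that create needs an explicit id and will not overwrite an existing document, while index accepts a missing id and overwrites an existing one.

```python
import uuid


class ToyIndex:
    """Tiny in-memory model of the create/index semantics (not the real client)."""

    def __init__(self):
        self.docs = {}

    def create(self, id, body):
        # create needs an explicit id and refuses to overwrite an existing document.
        if id in self.docs:
            raise ValueError("document %r already exists" % id)
        self.docs[id] = body
        return id

    def index(self, body, id=None):
        # index generates an id when none is given and overwrites otherwise.
        if id is None:
            id = uuid.uuid4().hex
        self.docs[id] = body
        return id


toy = ToyIndex()
toy.create("1", {"name": "jackaaa"})
generated_id = toy.index({"name": "jackbbb"})   # id chosen by the model
toy.index({"name": "renamed"}, id="1")          # overwrites document "1"
print(sorted(toy.docs) == sorted(["1", generated_id]))
```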

Bulk operations

doc_index = "indexName"
doc_type = "typeName"
doc_body = [      
    {"index": {}},     
    {'name': 'jackaaa', 'age': 2000, 'sex': 'female', 'address': u'北京'},     
    {"index": {}},      
    {'name': 'jackbbb', 'age': 3000, 'sex': 'male', 'address': u'上海'},      
    {"index": {}},      
    {'name': 'jackccc', 'age': 4000, 'sex': 'female', 'address': u'广州'},      
    {"index": {}},      
    {'name': 'jackddd', 'age': 1000, 'sex': 'male', 'address': u'深圳'}
]

es.bulk(index=doc_index, doc_type=doc_type, body=doc_body)
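Writing the alternating header and document lines by hand gets error-prone for longer lists. A minimal sketch of a helper that builds the same structure from a plain list of documents follows; `build_bulk_body` is a hypothetical name, and it covers only the `index` action with an empty header, as in the example above.

```python
def build_bulk_body(docs):
    """Interleave an {"index": {}} action header before every document."""
    body = []
    for doc in docs:
        body.append({"index": {}})  # action header
        body.append(doc)            # document source
    return body


people = [
    {'name': 'jackaaa', 'age': 2000, 'sex': 'female', 'address': u'北京'},
    {'name': 'jackbbb', 'age': 3000, 'sex': 'male', 'address': u'上海'},
]
bulk_body = build_bulk_body(people)
print(len(bulk_body))
```

The resulting list would be passed as `body` to `es.bulk` exactly like the hand-written `doc_body`.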

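Querying data

API overview

get: retrieves the document identified by index, type, and id
search: finds the documents that match a query. It takes no id, and index, type, and body may each be None. The body must follow the query DSL (Domain Specific Language).

Single-document operation

doc_index = "indexName"
doc_type = "typeName"
doc_id = "idName"

# get takes no body; the document is identified by index, type, and id alone.
es.get(index=doc_index, doc_type=doc_type, id=doc_id)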

Query-based operation

doc_index = "indexName"
doc_type = "typeName"
doc_id = "idName"
# Match all documents
doc_body_query = {
    'query': {
        'match_all': {}
    }
}

allDoc = es.search(index=doc_index, doc_type=doc_type, body=doc_body_query)
# Print the first hit (raises IndexError if nothing matched)
print(allDoc['hits']['hits'][0])
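Indexing `['hits']['hits'][0]` directly fails when the query matches nothing. The sketch below shows one way to pull the stored documents out of a search response defensively. `extract_sources` and `first_source` are hypothetical helper names, and `sample_response` is a hand-written dictionary shaped like a search response, not output from a real cluster.

```python
def extract_sources(response):
    """Return the `_source` of every hit; an empty list when nothing matched."""
    hits = response.get("hits", {}).get("hits", [])
    return [hit["_source"] for hit in hits]


def first_source(response):
    """Return the first matching document, or None when there are no hits."""
    sources = extract_sources(response)
    return sources[0] if sources else None


# Hand-written sample with the same nesting as a search response.
sample_response = {
    "hits": {
        "total": 2,
        "hits": [
            {"_id": "a1", "_source": {"name": "jackaaa", "age": 2000}},
            {"_id": "b2", "_source": {"name": "jackbbb", "age": 3000}},
        ],
    }
}

print(first_source(sample_response))
print(first_source({"hits": {"total": 0, "hits": []}}))
```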

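Deleting data

API overview

delete: deletes the document identified by index, type, and id
delete_by_query: deletes all documents matching a query; the query must follow the query DSL

Single-document operation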

doc_index = "indexName"
doc_type = "typeName"
doc_id = "idName"

es.delete(index=doc_index, doc_type=doc_type, id=doc_id)

Query-based operation

doc_index = "indexName"
doc_type = "typeName"
# Match all documents
doc_body_query = {
    'query': {
        'match_all': {}
    }
}

es.delete_by_query(index=doc_index, doc_type=doc_type, body=doc_body_query)
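A `match_all` query removes every document of the type. In practice the body is usually narrower. As an illustrative sketch, the helper below builds a `range` query on the `age` field used in the bulk example; `age_at_least` is a hypothetical name, and the assumption is that `age` is indexed as a number.

```python
def age_at_least(min_age):
    """Build a query body matching documents whose age is >= min_age."""
    return {
        "query": {
            "range": {
                "age": {"gte": min_age}
            }
        }
    }


# Body that would select only documents with age 3000 or more.
narrow_query = age_at_least(3000)
print(narrow_query)
```

Passing `narrow_query` as `body` to `es.delete_by_query` would then delete only the matching documents instead of all of them.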

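Updating data

API overview

update: updates the document identified by index, type, and id
update_by_query: updates all documents matching a query; the query must follow the query DSL

Single-document operation

doc_index = "indexName"
doc_type = "typeName"
doc_id = "idName"
# The three bodies below are alternatives; use one of them per update call.
# Remove the age field
doc_body = {
    'script': "ctx._source.remove('age')"
}
# Add a field
doc_body = {
    'script': "ctx._source.address = '合肥'"
}
# Modify some fields (partial update)
doc_body = {
    "doc": {"name": 'jackaaa'}
}

es.update(index=doc_index, doc_type=doc_type, id=doc_id, body=doc_body)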
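Embedding values directly in the script string means quoting them by hand. A hedged alternative is to pass them as script parameters. The sketch below builds both kinds of update body; `partial_doc` and `set_field_script` are hypothetical helper names, and the `source` key assumes a cluster recent enough to accept it (older releases used `inline` instead).

```python
def partial_doc(**fields):
    """Body for a partial update: only the given fields are changed."""
    return {"doc": dict(fields)}


def set_field_script(field, value):
    """Body for a scripted update that sets one field from parameters."""
    return {
        "script": {
            "lang": "painless",
            # The field name and value travel as params, not inside the script text.
            "source": "ctx._source[params.field] = params.value",
            "params": {"field": field, "value": value},
        }
    }


print(partial_doc(name="jackaaa"))
print(set_field_script("address", u"合肥"))
```

Either result would be passed as `body` to `es.update`.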

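Query-based operation

doc_index = "indexName"
doc_type = "typeName"
# The three bodies below are alternatives; use one of them per call.
# For all documents, remove the age field
doc_body_query = {
    'query': {
        'match_all': {}
    },
    'script': "ctx._source.remove('age')"
}
# For all documents, add an address field
doc_body_query = {
    'query': {
        'match_all': {}
    },
    'script': "ctx._source.address = '合肥'"
}
# For all documents, change the name field
# (update_by_query takes a script, not a partial "doc", so the change is written as a script)
doc_body_query = {
    'query': {
        'match_all': {}
    },
    'script': "ctx._source.name = 'jackaaa'"
}

es.update_by_query(index=doc_index, doc_type=doc_type, body=doc_body_query)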

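Common pitfalls

The bulk API

When using bulk, every operation must be preceded by the action header that matches its type (e.g. {"index": {}}, {'delete': {…}}, …); otherwise the request fails with TransportError(400, u'illegal_argument_exception').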
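A missing header can be caught before the request is sent. The following is a rough sketch of such a check for a bulk body given as a list of dictionaries. `check_bulk_body` is a hypothetical helper, and it is a heuristic: it treats any single-key dictionary whose key is `index`, `create`, `update`, or `delete` as a header, and assumes that `delete` is the only action without a following document line.

```python
ACTIONS_WITH_SOURCE = {"index", "create", "update"}
ACTIONS_WITHOUT_SOURCE = {"delete"}


def check_bulk_body(body):
    """Return a list of problems found in a bulk body (a list of dicts)."""
    problems = []
    i = 0
    while i < len(body):
        line = body[i]
        keys = list(line.keys()) if isinstance(line, dict) else []
        action = keys[0] if len(keys) == 1 else None
        if action in ACTIONS_WITHOUT_SOURCE:
            i += 1  # a delete header stands alone
        elif action in ACTIONS_WITH_SOURCE:
            if i + 1 >= len(body):
                problems.append("line %d: %r header has no document after it" % (i, action))
            i += 2  # skip the header and its document
        else:
            problems.append("line %d: expected an action header" % i)
            i += 1
    return problems


good = [{"index": {}}, {"name": "jackaaa", "age": 2000}, {"delete": {"_id": "1"}}]
bad = [{"name": "jackaaa", "age": 2000}, {"name": "jackbbb", "age": 3000}]
print(check_bulk_body(good))
print(check_bulk_body(bad))
```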

References

https://blog.csdn.net/YHYR_YCY/article/details/78882011
https://github.com/YHYR/ElasticSearchUtils/blob/master/utils/elasticsearchUtil.py
https://blog.csdn.net/yhyr_ycy/article/details/78876391