Updating an Elasticsearch Database with a Python Script

Latest recommended article published on 2024-05-30 13:51:18

jia_xue

Tags: elasticsearch, big data

Article link: https://blog.csdn.net/jia_xue/article/details/136055312

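This article shows how to use Python and the Elasticsearch API to read callids from a local text file and then update the languagename field of the matching documents in an Elasticsearch index. It includes both the original code and a version optimized for newer releases, demonstrating a bulk update of _doc-type documents.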

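The code below works against an Elasticsearch index: it reads callids from a local text file and, for each callid, updates a field value on the matching document in the index. The two versions of the script follow: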

# -*- coding: utf-8 -*-
# @Time    : 2023/3/29 11:27
# @Author  : hjcui
# @Site    : 
# @File    : Update_ES.py
# @Software: PyCharm

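# Original version (for an older client that still accepts doc_type)
from elasticsearch import Elasticsearch

es_conn = Elasticsearch(['194.169.55.12:9200'])
index = 'cr-all-2023.03'
doc_type = 'doc'  # renamed from `type`, which shadows the Python built-in
suffix = '505_'   # despite the name, this is prepended to each callid to form the document ID
src_file = r'./wewr_callid_0328.txt'

with open(src_file, 'r', encoding='utf-8') as sf:
    for line in sf:
        callid = line.strip()
        if not callid:
            continue  # skip blank lines
        guid = suffix + callid
        es_query = {
            "query": {
                "term": {
                    "callid.keyword": {
                        "value": callid
                    }
                }
            }
        }
        docs = es_conn.search(index=index, body=es_query)
        if not docs['hits']['hits']:
            # No document carries this callid; skip it instead of raising IndexError
            print('callid not found: ' + callid)
            continue
        # If the new value differs from the stored one, the response shows 'successful': 1; otherwise 'successful': 0
        result = es_conn.update(index=index, doc_type=doc_type, id=guid, body={'doc': {'languagename': "other"}})
        print(result)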
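
The 'successful' counter mentioned in the comment above sits under `_shards` in the update response; the top-level `result` field is usually a more direct signal of whether anything changed. A minimal sketch, assuming the response behaves like a dict with the usual update-API shape (`result` being `updated`, `noop`, or `created`). The helper name `describe_update` and the sample dictionaries are hypothetical, written by hand for illustration rather than captured from a real cluster:

```python
def describe_update(response):
    """Classify an update response dict as 'changed', 'unchanged', or 'unknown'."""
    result = response.get("result")
    if result in ("updated", "created"):
        return "changed"
    if result == "noop":
        return "unchanged"
    return "unknown"


# Hand-written sample responses (illustrative, not real cluster output).
changed = {"_id": "505_abc", "result": "updated",
           "_shards": {"total": 2, "successful": 1, "failed": 0}}
unchanged = {"_id": "505_abc", "result": "noop",
             "_shards": {"total": 0, "successful": 0, "failed": 0}}

print(describe_update(changed))
print(describe_update(unchanged))
```

In the loop above, `describe_update(result)` could replace the bare `print(result)` when only the outcome matters.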



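# Version optimized for newer Elasticsearch releases
# -*- coding: utf-8 -*-
from elasticsearch import Elasticsearch, helpers

es_conn = Elasticsearch(['http://194.169.55.12:9200'])  # 8.x clients require the scheme in the URL
index = 'index-2023.03'
doc_type = '_doc'  # since Elasticsearch 7.x, _doc is the recommended default type (not needed by the bulk actions below)
suffix = '505_'
src_file = r'./wewr_callid_0328.txt'

def update_docs(callids):
    actions = []
    for callid in callids:
        guid = suffix + callid.strip()
        # Action layout expected by helpers.bulk: metadata keys plus the update body
        action = {
            "_op_type": "update",
            "_id": guid,
            "_index": index,
            "doc": {"languagename": "other"},
            "doc_as_upsert": True  # if the document does not exist, create it (upsert)
        }
        actions.append(action)

    if actions:
        # Run the updates in bulk; with stats_only=True the result is (success_count, failure_count)
        result = helpers.bulk(es_conn, actions, stats_only=True, raise_on_error=False)
        print(f"Successfully updated/inserted {result[0]} documents; {result[1]} failed.")

with open(src_file, 'r', encoding='utf-8') as sf:
    # Read the callids, skipping blank lines
    callids = [line.strip() for line in sf if line.strip()]

# Update the documents for the callids that were read
update_docs(callids)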
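
For large callid files, the bulk actions can also be produced lazily instead of being collected in one list first. A sketch, assuming the `_op_type: update` action layout that `helpers.bulk` accepts; `iter_update_actions` is a hypothetical helper name and the sample callids are made up for illustration:

```python
def iter_update_actions(callids, index, prefix, languagename="other"):
    """Yield one bulk 'update' action per non-empty callid."""
    for callid in callids:
        callid = callid.strip()
        if not callid:
            continue  # skip blank lines
        yield {
            "_op_type": "update",
            "_index": index,
            "_id": prefix + callid,
            "doc": {"languagename": languagename},
        }


# Illustrative input, shaped like lines read from a text file.
sample = ["1001\n", "\n", "1002\n"]
actions = list(iter_update_actions(sample, "index-2023.03", "505_"))
print(len(actions), actions[0]["_id"])
```

The generator can be passed directly as the second argument to `helpers.bulk`, which consumes it in chunks. This sketch leaves out `doc_as_upsert`, so a callid without an existing document would be reported as a failure rather than created.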