Notes on Using Elasticsearch 7.17.6 with Python

Latest recommended article published on 2024-03-07 16:20:52


License: CC 4.0 BY-SA

Tags:

#elasticsearch #python


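Operating system: Windows 10

I started with the latest version, 8.4.1, but Kibana would not start, so I fell back to the older 7.17.6 release.

Downloads:

Elasticsearch 7.17.6 download page

Kibana 7.17.6 download page

Unzip both archives to a suitable location, then edit elasticsearch.yml.

Add the following settings: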

cluster.name: robin-es
node.name: node-1
network.host: 0.0.0.0
http.port: 9200
cluster.initial_master_nodes: ["node-1"]

Edit kibana.yml (here the UI locale is set to Chinese):

i18n.locale: "zh-CN"

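In each bin directory, run the startup .bat file to start the two services (elasticsearch.bat and kibana.bat).

Then open http://localhost:9200 in a browser.

If a JSON response like the following comes back, Elasticsearch is running: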

{
  "name" : "node-1",
  "cluster_name" : "robin-es",
  "cluster_uuid" : "pAvuRyRESuCHtbTnfdWrvA",
  "version" : {
    "number" : "7.17.6",
    "build_flavor" : "default",
    "build_type" : "zip",
    "build_hash" : "f65e9d338dc1d07b642e14a27f338990148ee5b6",
    "build_date" : "2022-08-23T11:08:48.893373482Z",
    "build_snapshot" : false,
    "lucene_version" : "8.11.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
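
The same check can be scripted instead of done in a browser. The following is a minimal sketch, assuming the node listens on http://localhost:9200 without authentication; `fetch_node_info` and `summarize_node_info` are hypothetical helper names for illustration, not part of any library.

```python
import json
from urllib.request import urlopen


def fetch_node_info(url="http://localhost:9200", timeout=5):
    """GET the root endpoint and return the parsed JSON (needs a running node)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_node_info(info):
    """Pull out the fields worth checking from the root-endpoint response."""
    return {
        "node": info["name"],
        "cluster": info["cluster_name"],
        "version": info["version"]["number"],
    }


# Usage against a live node:
# print(summarize_node_info(fetch_node_info()))
```

Keeping the parsing separate from the HTTP call makes it easy to check the summary logic without a running cluster.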

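To use Elasticsearch from Python, install the client library. Here I install from a PyPI mirror in China, through a local proxy.

Note: use the client version that matches the server version; otherwise the two may be incompatible.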

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/   elasticsearch==7.17.6 --proxy="http://127.0.0.1:1081"
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/   elasticsearch[async]==7.17.6 --proxy="http://127.0.0.1:1081"
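
After installing, it can help to confirm which client version actually ended up in the environment and whether it lines up with the server. This is a small sketch using only the standard library; `installed_client_version` and `same_major_minor` are hypothetical helper names, and comparing only the major.minor prefix is an assumption about what "matching version" should mean here.

```python
from importlib import metadata


def installed_client_version(package="elasticsearch"):
    """Return the installed version string of the package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def same_major_minor(client_version, server_version):
    """True when both version strings share the same major.minor prefix."""
    return client_version.split(".")[:2] == server_version.split(".")[:2]


# Usage: compare the installed client with the server version shown at port 9200
# print(installed_client_version(), same_major_minor("7.17.6", "7.17.6"))
```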

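Connect to Elasticsearch:

from elasticsearch import Elasticsearch

indexName = "student"
client = Elasticsearch(
    ['127.0.0.1:9200'],
    # Sniff the cluster nodes before doing anything else (disabled here)
    # sniff_on_start=True,
    # Re-sniff and reconnect when a node stops responding
    sniff_on_connection_fail=True,
    # Refresh the node list every 60 seconds
    sniffer_timeout=60
)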
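
Creating the client object does not by itself prove the node is reachable. A minimal sketch of a startup check follows, assuming a client whose `ping()` returns True or False, as the 7.x `Elasticsearch` client does; `wait_for_cluster` is a hypothetical helper name.

```python
import time


def wait_for_cluster(client, attempts=5, delay=1.0):
    """Call client.ping() until it returns True or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if client.ping():
            return True
        if attempt < attempts:
            # Give the node a moment to finish starting before retrying
            time.sleep(delay)
    return False


# Usage with the client created above:
# if not wait_for_cluster(client):
#     raise SystemExit("Elasticsearch is not reachable")
```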

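Next, write a few CRUD helper functions.

Note: from client version 7.15 on, the APIs take new-style keyword arguments (for example document=, query=, mappings=); the older body= style is deprecated.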

# The official elasticsearch client is recommended; mind the version
from elasticsearch import Elasticsearch
import json
# es 7.17


def checkIndexByName(client, indexName):
    # Return True if the index exists; False if it is missing or the call fails
    try:
        return bool(client.indices.exists(index=indexName))
    except Exception as ex:
        print(ex)
        return False

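# Create an index from a dict holding "mappings" and, optionally, "settings"
def createIndex(client, name, doc):
    try:
        kwargs = {"mappings": doc["mappings"]}
        # Pass the settings too, so values from the config file are not dropped
        if "settings" in doc:
            kwargs["settings"] = doc["settings"]
        client.indices.create(index=name, **kwargs)
        return True
    except Exception as ex:
        print(ex)
        return False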

# Delete an index
def dropIndex(client, name):
    try:
        client.indices.delete(index=name)
        return True
    except Exception as ex:
        print(ex)
        return False


def addDoc(client, index, doc, id):
    # Indexing the same id again overwrites the existing document
    try:
        resp = client.index(index=index, document=doc, id=id)
        print(resp['result'])
        return True
    except Exception as e:
        print("index document error: %s" % e)
        return False


def delDocFromIndex(client, index, id):
    try:
        res = client.delete(index=index, id=id)
        print(res['_shards']['successful'])
        return '1'
    except Exception as e:
        print(e)
        return '0'


def findDocById(client, index, id):
    try:
        res = client.get(index=index, id=id)
        return res['_source']
    except Exception as e:
        print(e)
        return 'nil'
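
The helpers above cover add, delete, and get; changing a document is only possible by re-indexing it in full through addDoc. For a partial update, a sketch along the following lines could be added. `updateDoc` is a hypothetical name, and the sketch assumes the 7.x client's `update` method with the classic `body={"doc": ...}` request form; newer clients may prefer keyword arguments instead.

```python
def updateDoc(client, index, id, fields):
    """Partially update a document: merge `fields` into the stored source."""
    try:
        # {"doc": {...}} is the Update API's partial-document request body
        resp = client.update(index=index, id=id, body={"doc": fields})
        return resp["result"]
    except Exception as e:
        print(e)
        return None


# Usage with the client and index from above:
# updateDoc(client, indexName, 13810501001, {"age": 23})
```

Unlike addDoc, this leaves fields that are not listed in `fields` untouched.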

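The index settings and mappings can be kept in an external configuration file.

For example, to create an index for students, put the definition in a file named student.json: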

{
    "settings": {
        "index": {
            "number_of_shards": 1,
            "number_of_replicas": 0
        }
    },
    "mappings": {
        "dynamic": "strict",
        "properties": {
            "name": {
                "type": "keyword"
            },
            "age": {
                "type": "long"
            },
            "birthday": {
                "type": "date"
            },
            "tags": {
                "type": "keyword"
            }
        }
    }
}
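
With "dynamic": "strict", Elasticsearch rejects any document that carries a field the mapping does not declare. The sketch below shows one way to catch that before sending a document. `unknown_fields` is a hypothetical helper, it only looks at top-level keys, and the example body and the `nickname` field are made up for illustration.

```python
def unknown_fields(index_body, doc):
    """List top-level keys of `doc` that the mapping's properties do not declare."""
    declared = index_body["mappings"]["properties"]
    return sorted(key for key in doc if key not in declared)


# Illustrative index body with three declared fields
example_body = {
    "mappings": {
        "dynamic": "strict",
        "properties": {
            "name": {"type": "keyword"},
            "age": {"type": "long"},
            "birthday": {"type": "date"},
        },
    }
}

# "nickname" is not declared, so a strict mapping would refuse this document
print(unknown_fields(example_body, {"name": "灰太狼", "age": 22, "nickname": "x"}))
```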

Then create the index like this:

def load_json(filePath):
    # Read and parse a UTF-8 JSON file; the with-block closes the file
    with open(filePath, 'r', encoding='utf-8') as f:
        return json.load(f)


docMapping = load_json("./student.json")
# print(docMapping)


# dropIndex(client, indexName)
ret = checkIndexByName(client, indexName)
if not ret:
    print("index exists = %s, creating it" % ret)
    createIndex(client, indexName, docMapping)

If the index does not exist yet, it is created.

The result can be checked in Kibana's Dev Tools:
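
The mapping can also be inspected from Python. A minimal sketch follows, assuming the response shape of `client.indices.get_mapping(index=...)`, which is a dict keyed by index name with a "mappings" entry; `mapped_field_types` is a hypothetical helper name.

```python
def mapped_field_types(mapping_response, index):
    """Flatten a get_mapping response into {field_name: field_type}."""
    props = mapping_response[index]["mappings"].get("properties", {})
    # Fields without an explicit "type" are object fields
    return {field: spec.get("type", "object") for field, spec in props.items()}


# Usage against a live cluster (client and indexName as defined earlier):
# print(mapped_field_types(client.indices.get_mapping(index=indexName), indexName))
```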

Next, add two records (documents) as a test:

doc = {

    "name": "灰太狼",
    "age": 22,
    "birthday": "2000-02-02",
    "tags": ["男"]

}
res = addDoc(client, indexName, doc, 13810501001)
# print(res)

doc = {

    "name": "美羊羊",
    "age": 21,
    "birthday": "2000-01-01",
    "tags": ["女"]

}
res = addDoc(client, indexName, doc, 13810501002)
# print(res)
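
Adding documents one call at a time is fine for two records. For larger batches, the client package ships `elasticsearch.helpers.bulk`, which takes an iterable of action dicts. The following is a sketch, not a tuned implementation; `build_bulk_actions` and `bulk_add` are hypothetical names.

```python
def build_bulk_actions(index, docs_by_id):
    """Yield one bulk 'index' action per (id, document) pair."""
    for doc_id, source in docs_by_id.items():
        yield {"_index": index, "_id": doc_id, "_source": source}


def bulk_add(client, index, docs_by_id):
    """Send all documents in one request; returns (success_count, errors)."""
    # Imported lazily so the action builder works without the client package
    from elasticsearch import helpers
    return helpers.bulk(client, build_bulk_actions(index, docs_by_id))


# Usage with the client from above:
# bulk_add(client, "student", {13810501001: {"name": "灰太狼", "age": 22}})
```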

The documents now show up in Kibana:

So far the basic add, delete, update (by re-indexing), and get operations are in place; next come more complex queries:

bodyQueryAll = {
    "query": {
        "match_all": {}
    }
}
res = client.search(index=indexName, query=bodyQueryAll["query"])
print("Found %d hits" % res['hits']['total']['value'])

items = res["hits"]["hits"]
# print(items)
for item in items:
    print("index=%s, id=%s doc=%s" %
          (item['_index'], item['_id'], item['_source']))

Found 2 hits
index=student, id=13810501001 doc={'name': '灰太狼', 'age': 22, 'birthday': '2000-02-02', 'tags': ['男']}
index=student, id=13810501002 doc={'name': '美羊羊', 'age': 21, 'birthday': '2000-01-01', 'tags': ['女']}
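
The response nests each document under hits.hits[]._source, with the id stored next to it. A small sketch that flattens this into plain rows may make the structure easier to work with; `hits_to_rows` is a hypothetical helper name, and storing the id under the `_id` key is a choice made here for illustration.

```python
def hits_to_rows(response):
    """Turn a search response into (total, [source dict plus its _id])."""
    total = response["hits"]["total"]["value"]
    rows = [dict(hit["_source"], _id=hit["_id"]) for hit in response["hits"]["hits"]]
    return total, rows


# Usage with the response from above:
# total, rows = hits_to_rows(res)
```

Note that in 7.x the total is an object with "value" and "relation"; by default the count is exact only up to 10,000 hits.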

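In Kibana it looks like this:

Now that the structure of the search response is known, the data we want can be extracted from it.

Add two more query functions: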

def queryAll(client, indexName):
    # Match every document; returns (total hit count, list of hits)
    bodyQueryAll = {
        "query": {
            "match_all": {}
        }
    }
    res = client.search(index=indexName, query=bodyQueryAll["query"])
    n = res['hits']['total']['value']
    # print("Found %d hits" % n)

    items = res["hits"]["hits"]
    # print(items)
    # for item in items:
    #     print("index=%s, id=%s doc=%s" %
    #           (item['_index'], item['_id'], item['_source']))
    return (n, items)


def queryByDoc(client, indexName, query):
    # Run an arbitrary query; returns (total hit count, list of hits)
    res = client.search(index=indexName, query=query)
    n = res['hits']['total']['value']
    items = res["hits"]["hits"]
    return (n, items)
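
One caveat with these two functions: a search returns only the first 10 hits by default, so the item list can be shorter than the total. A sketch of simple paging with the `from_` and `size` parameters of `client.search` follows; `iter_all_hits` is a hypothetical helper name. This style is limited by the index's result window (10,000 by default), so very large result sets need a different approach such as scroll or search_after.

```python
def iter_all_hits(client, index, query, page_size=100):
    """Yield every hit by paging with from_/size until a page comes back short."""
    offset = 0
    while True:
        res = client.search(index=index, query=query, from_=offset, size=page_size)
        hits = res["hits"]["hits"]
        for hit in hits:
            yield hit
        # A short (or empty) page means there is nothing left to fetch
        if len(hits) < page_size:
            break
        offset += page_size


# Usage with the client from above:
# for hit in iter_all_hits(client, "student", {"match_all": {}}):
#     print(hit["_id"], hit["_source"])
```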

Test code:

print("Query all:")
res = queryAll(client, indexName)
n = res[0]
items = res[1]
# print(items)
for item in items:
    print("index=%s, id=%s doc=%s" %
          (item['_index'], item['_id'], item['_source']))


# Match documents whose name is either of the two values
queryNames = {
    "bool":
    {
        "should": [
            {"match":
                {"name": "美羊羊"}
             },
            {
                "match": {"name": "喜羊羊"}
            }
        ]
    }
}

print("Query by name:")
res = queryByDoc(client, indexName, queryNames)
n = res[0]
items = res[1]
# print(items)
for item in items:
    print("index=%s, id=%s doc=%s" %
          (item['_index'], item['_id'], item['_source']))

Output:

Query all:
index=student, id=13810501001 doc={'name': '灰太狼', 'age': 22, 'birthday': '2000-02-02', 'tags': ['男']}
index=student, id=13810501002 doc={'name': '美羊羊', 'age': 21, 'birthday': '2000-01-01', 'tags': ['女']}
Query by name:
index=student, id=13810501002 doc={'name': '美羊羊', 'age': 21, 'birthday': '2000-01-01', 'tags': ['女']}
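
The bool/should query above was written out by hand for two names. When the list of names varies, it could be generated instead. This is a minimal sketch; `names_query` is a hypothetical helper name, and the result is meant to be passed as the `query` argument of queryByDoc.

```python
def names_query(names, field="name"):
    """Build a bool/should query that matches any of the given names."""
    return {"bool": {"should": [{"match": {field: n}} for n in names]}}


# Usage with the helper from above:
# n, items = queryByDoc(client, "student", names_query(["美羊羊", "喜羊羊"]))
```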

In Kibana it looks like this: