Elasticsearch Exercises in Python

天青如水

Category: python. Tags: elasticsearch, python

Article link: https://blog.csdn.net/qq_16829085/article/details/107142542

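Exercises

I. Employee queries
1. Add the following three documents to Elasticsearch, with index megacorp, type employee, and ids 1, 2, and 3.
{
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests": [ "sports", "music" ]
}

{
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests": [ "music" ]
}

{
"first_name" : "Douglas",
"last_name" : "Fir",
"age" : 35,
"about": "I like to build cabinets",
"interests": [ "forestry" ]
}

2. Retrieve the employee whose id is 1.

3. Search for all employees.
4. Search for employees matching a condition: find employees whose last name is Smith without sending a JSON request body (for example, a query-string search).
5. Use a match query to find employees whose last name is Smith.
6. Find employees whose last name is Smith and who are older than 30.
7. Find all employees who like rock climbing.
8. Match only employee records in which "rock" and "climbing" both appear, adjacent to each other, as the phrase "rock climbing".
9. Repeat the search from exercise 8 and highlight the matched text in the results.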
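
Exercise 4 asks for a search that does not send a JSON body. One reading of that is the query-string ("lite") search, where the condition travels in the `q` parameter as `field:value`. The sketch below assumes the `q` keyword argument of the elasticsearch-py `search` method; the helper name `search_by_query_string` and the `RecordingClient` stand-in are hypothetical and exist only so the snippet runs without a cluster.

```python
def search_by_query_string(es, index, field, value):
    """Run a query-string search: the condition is passed as q=field:value."""
    return es.search(index=index, q=f"{field}:{value}")


class RecordingClient:
    """Hypothetical stand-in for an Elasticsearch client; it records each call."""

    def __init__(self):
        self.calls = []

    def search(self, **kwargs):
        self.calls.append(kwargs)
        return {"hits": {"hits": []}}  # empty placeholder response


client = RecordingClient()
search_by_query_string(client, "megacorp", "last_name", "Smith")
print(client.calls[0])
```

With a real client, the same helper would be called as `search_by_query_string(es, 'megacorp', 'last_name', 'Smith')`.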

II. Student record queries
1. Import the following documents into Elasticsearch.
{"id": 1, "studentNo": "TH-CHEM-2016-C001", "name": "Jonh Smith", "major": "Chemistry", "gpa": 4.8, "yearOfBorn": 2000, "classOf": 2016, "interest": "soccer, basketball, badminton, chess"}

{"id": 2, "studentNo": "TH-PHY-2018-C001", "name": "Isaac Newton", "major": "Physics", "gpa": 3.6, "yearOfBorn": 2001, "classOf": 2018, "interest": "novel, soccer, cooking"}

{"id": 3, "studentNo": "BU-POLI-2016-C001", "name": "John Kennedy", "major": "Politics", "gpa": 4.2, "yearOfBorn": 2000, "classOf": 2016, "interest": "talking, dating, boxing, shooting, chess"}

{"id": 4, "studentNo": "BU-POLI-2015-C001", "name": "John Kerry", "major": "Politics", "gpa": 4.1, "yearOfBorn": 1999, "classOf": 2015, "interest": "money, basketball"}

{"id": 5, "studentNo": "BU-ARTS-2016-C002", "name": "Da Vinci", "major": "Arts", "gpa": 4.8, "yearOfBorn": 1995, "classOf": 2016, "interest": "drawing, music, wine"}

Then complete the following tasks:

1) Fetch the documents with ids 1, 3, and 5 in a single query.
2) Find the documents whose name is not John.
3) Find the documents for students who enrolled before 2016.
4) Add the interest "poker" to the document with id 4.
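
The last task appends a value to an existing field. One way to do that is a scripted update whose Painless script concatenates onto the stored string rather than overwriting it. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, `interest` is assumed to be stored as a comma-separated string as in the documents above, and the script has not been run against a cluster here. The second function repeats the same logic in plain Python to show the intended result.

```python
def build_append_interest_body(new_interest):
    """Build an update body that appends one interest to the stored string."""
    return {
        "script": {
            "lang": "painless",
            # Concatenate onto the existing comma-separated value
            "source": "ctx._source.interest += ', ' + params.new_interest",
            "params": {"new_interest": new_interest},
        }
    }


def apply_locally(doc, new_interest):
    """Plain-Python equivalent of the script, for checking the expected value."""
    updated = dict(doc)
    updated["interest"] = doc["interest"] + ", " + new_interest
    return updated


body = build_append_interest_body("poker")
print(body["script"]["params"])
print(apply_locally({"interest": "money, basketball"}, "poker")["interest"])
```

With a client that still accepts `doc_type`, the body could be sent with something like `es.update(index='stu', doc_type='doc', id=4, body=body)`.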

III. Index operations

1. List all indices and paste the result here.
2. Create an index named website with 3 shards and 2 replicas.
3. Delete the website index.
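
The shard and replica settings in exercise 2 multiply, because each primary shard gets its own replica copies. A small sketch of the arithmetic (the function name is hypothetical):

```python
def total_shard_copies(number_of_shards, number_of_replicas):
    """Return primaries plus the replica copies of every primary."""
    return number_of_shards * (1 + number_of_replicas)


# 3 primary shards, each with 2 replicas
print(total_shard_copies(3, 2))  # 9
```

On a single-node cluster the replicas cannot be allocated, since a replica is never placed on the same node as its primary, so an index created with these settings reports yellow health there.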

IV. Product operations
{ "index": { "_id": 1 }}
{ "price" : 10, "productID" : "XHDK-A-1293-#fJ3" }
{ "index": { "_id": 2 }}
{ "price" : 20, "productID" : "KDKE-B-9947-#kL5" }
{ "index": { "_id": 3 }}
{ "price" : 30, "productID" : "JODL-X-1937-#pV7" }
{ "index": { "_id": 4 }}
{ "price" : 30, "productID" : "QQPX-R-3956-#aD8" }
1. Import the data above into Elasticsearch, with index my_store and type products.
2. Find products priced at 20. Use a constant_score query to run the term query in non-scoring (filter) mode, so that every hit gets a uniform score of 1.
3. Find the product whose productID is "XHDK-A-1293-#fJ3".
4. Find products priced between 20 and 40.
5. Find products priced at 20 or 30.
6. Find products whose price is 30 or whose productID is "XHDK-A-1293-#fJ3", excluding any product whose productID is "QQPX-R-3956-#aD8".
7. Find products whose productID is "KDKE-B-9947-#kL5", or whose productID is "JODL-X-1937-#pV7" and whose price is also 30.
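
The product listing above is already laid out in bulk format: an action line followed by a document line. As a sketch of how exercise 1 could use that layout directly, the helper below builds the alternating action/document list that the client's `bulk` method takes as its body. The helper name `build_bulk_actions` is hypothetical, and the snippet only builds the payload, so it runs without a cluster.

```python
products = [
    {"price": 10, "productID": "XHDK-A-1293-#fJ3"},
    {"price": 20, "productID": "KDKE-B-9947-#kL5"},
    {"price": 30, "productID": "JODL-X-1937-#pV7"},
    {"price": 30, "productID": "QQPX-R-3956-#aD8"},
]


def build_bulk_actions(docs, start_id=1):
    """Pair every document with an index action carrying its _id."""
    actions = []
    for offset, doc in enumerate(docs):
        actions.append({"index": {"_id": start_id + offset}})
        actions.append(doc)
    return actions


actions = build_bulk_actions(products)
print(len(actions))
```

With a client that still accepts `doc_type`, the list could be sent in one request with something like `es.bulk(index='my_store', doc_type='products', body=actions)`.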

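Python solutions

'''
@Description: Elasticsearch exercises
@version: 1.0.0
@Author: blsm
@Date: 2020-07-02 19:08:18
@LastEditTime: 2020-07-03 06:39:06
Note: the calls below pass doc_type, so they target a client and server
that still support mapping types (removed in Elasticsearch 8).
'''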

from elasticsearch import Elasticsearch


def create_mega(es):
    """Add the employee documents to the megacorp index.

    Args:
        es (Elasticsearch): client connected to the cluster
    """
    es.indices.create(index='megacorp', ignore=400)

    data = [{"first_name": "John",
             "last_name": "Smith",
             "age": 25,
             "about": "I love to go rock climbing",
             "interests": ["sports", "music"]
             },
            {"first_name": "Jane",
             "last_name": "Smith",
             "age": 32,
             "about": "I like to collect rock albums",
             "interests": ["music"]
             },
            {"first_name": "Douglas",
             "last_name": "Fir",
             "age": 35,
             "about": "I like to build cabinets",
             "interests": ["forestry"]
             },
            ]

    # Add the employee documents to Elasticsearch
    for i, item in enumerate(data):
        res = es.create(
            index='megacorp', doc_type='employee', id=i+1, body=item)
        print(res)


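def search_id(es, id, opt):
    # The ids query takes a list of values, so wrap a single id
    ids = list(id) if isinstance(id, (list, tuple)) else [id]
    query_id = {
        'query': {
            'ids': {
                "values": ids
            }
        }
    }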
    res = es.search(index=opt['index'], doc_type=opt['type'],
                    body=query_id)
    print(res)


def search_all(es):
    res = es.search(index='megacorp', doc_type='employee')
    print(res)


def search_field(es, last_name):
    query = {
        "query": {
            "multi_match": {
                "query": last_name,
                "fields": ["last_name"]
            }
        }
    }
    res = es.search(index='megacorp', body=query)
    # print(res['hits']['total'])
    print(res)


def search_match(es, last_name):
    query = {
        "query": {
            "match": {
                "last_name": last_name}
        }
    }
    res = es.search(index='megacorp', body=query)
    print(res)


def search_name_age(es, name, age):
    query = {
        "query": {
            "bool": {
                "must": {"match": {"last_name": name}
                         },
                "filter": {
                    "range": {"age": {"gt": age}}  # strictly older than age
                }}
        }
    }
    res = es.search(index='megacorp', body=query)
    print(res)


def search_about(es, about):
    query = {
        "query": {
            "multi_match": {
                "query": about,
                "fields": ["about"]
            }
        }
    }
    res = es.search(index='megacorp', body=query)
    print(res)


def search_phrase(es, str1, highlight=False):
    query = {
        "query": {
            "match_phrase": {
                "about": str1
            }
        }
    }
    if highlight:
        query['highlight'] = {"fields": {"about": {}}}
    res = es.search(index='megacorp', body=query)
    print(res)


def create_stu(es):
    es.indices.create(index='stu', ignore=400)

    data = [{"studentNo": "TH-CHEM-2016-C001",
             "name": "Jonh Smith",
             "major": "Chemistry",
             "gpa": 4.8,
             "yearOfBorn": 2000,
             "classOf": 2016,
             "interest": "soccer, basketball, badminton, chess"
             },
            {"studentNo": "TH-PHY-2018-C001",
             "name": "Isaac Newton",
             "major": "Physics",
             "gpa": 3.6,
             "yearOfBorn": 2001,
             "classOf": 2018,
             "interest": "novel, soccer, cooking"},
            {"studentNo": "BU-POLI-2016-C001",
             "name": "John Kennedy",
             "major": "Politics",
             "gpa": 4.2,
             "yearOfBorn": 2000,
             "classOf": 2016,
             "interest": "talking, dating, boxing, shooting, chess"},
            {"studentNo": "BU-POLI-2015-C001",
             "name": "John Kerry",
             "major": "Politics",
             "gpa": 4.1,
             "yearOfBorn": 1999,
             "classOf": 2015,
             "interest": "money, basketball"},
            {"studentNo": "BU-ARTS-2016-C002",
             "name": "Da Vinci",
             "major": "Arts",
             "gpa": 4.8,
             "yearOfBorn": 1995,
             "classOf": 2016,
             "interest": "drawing, music, wine"},
            ]

    # Add the student documents to Elasticsearch
    for i, item in enumerate(data):
        res = es.create(
            index='stu', doc_type='doc', id=i+1, body=item)
        print(res)


def search_not_name(es, name):
    query = {
        "query": {
            "bool": {
                "must_not": {
                    "match": {
                        "name": name
                    }
                }
            }
        }
    }

    res = es.search(index='stu', body=query)
    print(res)


def search_classOf(es, year):
    query = {
        "query": {
            "range": {
                "classOf": {
                    "lte": year
                }
            }
        }
    }

    res = es.search(index='stu', body=query)
    print(res)


def update_interest(es, id):
    data = {
        "script": {
            "source": "ctx._source.interest=params.interest",
            "params": {"interest": "money, basketball, poker"},
            "lang": "painless"
        },
        "query": {
            "bool": {
                "must": [
                    {"term": {"_id": id}}
                ],
            }
        }
    }

    result = es.update_by_query(index='stu', body=data)
    print(result)


def search_index(es):
    for index in es.indices.get('*'):
        print(index)


def create_web(es):
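    data = {
        "settings": {
            "index": {
                "number_of_shards": 3,  # split the data into 3 primary shards, which can be spread across nodes to speed up searches
                # Keep 2 replica copies of each shard. More replicas improve data safety, but each write goes to the primary first and is then copied to the replicas, which slows indexing
                "number_of_replicas": 2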
            }
        }
    }

    result = es.indices.create(index='website', body=data)
    print(result)


def delete(es, index):
    result = es.indices.delete(index=index)
    print(result)


def create_goods(es):

    es.indices.create(index='my_store', ignore=400)

    data = [
        {"price": 10, "productID": "XHDK-A-1293-#fJ3"},
        {"price": 20, "productID": "KDKE-B-9947-#kL5"},
        {"price": 30, "productID": "JODL-X-1937-#pV7"},
        {"price": 30, "productID": "QQPX-R-3956-#aD8"},
    ]

    # Add the product documents to Elasticsearch
    for i, item in enumerate(data):
        res = es.create(
            index='my_store', doc_type='products', id=i+1, body=item)
        print(res)


def search_goods(es, price=20):
    query = {
        "query": {
            "constant_score": {
                "filter": {
                    "term": {
                        "price": price
                    }
                }
            }
        }
    }
    res = es.search(index='my_store', body=query)
    print(res)


def search_productID(es, productID='XHDK-A-1293-#fJ3'):
    query = {
        "query": {
            "match": {
                "productID": productID
            }
        }
    }

    res = es.search(index='my_store', body=query)
    print(res)


def search_range_preice(es, low, high):
    query = {
        "query": {
            "range": {
                "price": {
                    "gte": low,
                    "lte": high,
                }
            }
        }
    }

    res = es.search(index='my_store', body=query)
    print(res)


def search_age(es, prices):
    query = {
        "query": {
            "terms": {
                "price": prices
            }
        }
    }

    res = es.search(index='my_store', body=query)
    print(res)


def search_priceORproductID(es, price, productID, notproductID):
    query = {
        "query": {
            "bool": {
                "should": [
                    {
                        "match": {
                            "price": price
                        }},
                    {"match": {
                        "productID": productID
                    }
                    }
                ],
                "must_not": {
                    "match": {
                        "productID": notproductID
                    }
                }
            }
        }
    }

    res = es.search(index='my_store', body=query)
    print(res)


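def search_priceANDproductID(es, productIDs, price):
    # Logic: productIDs[0] OR (productIDs[1] AND price).
    # A top-level must on price would make the should clauses optional
    # and return every product at that price, so the AND is nested instead.
    query = {
        "query": {
            "bool": {
                "should": [
                    {"match": {"productID": productIDs[0]}},
                    {"bool": {
                        "must": [
                            {"match": {"productID": productIDs[1]}},
                            {"match": {"price": price}}
                        ]
                    }}
                ]
            }
        }
    }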

    res = es.search(index='my_store', body=query)
    print(res)


if __name__ == "__main__":
    es = Elasticsearch()
    # Employee queries
    # 1 Add the data
    # create_mega(es)
    # 2 Retrieve the employee whose id is 1
    # opt = {'index': 'megacorp', 'type': 'employee'}
    # search_id(es, 1, opt)
    # 3 Search for all employees
    # search_all(es)
    # 4 Search for employees matching a condition
    # search_field(es, 'Smith')
    # 5 Use a match query to find employees whose last name is Smith
    # search_match(es, 'Smith')
    # 6 Find employees whose last name is Smith and who are older than 30
    # search_name_age(es, 'Smith', 30)
    # 7 Find all employees who like rock climbing
    # search_about(es, 'climbing')
    # 8 Match only records in which "rock" and "climbing" appear adjacent, as the phrase "rock climbing"
    # search_phrase(es, 'rock climbing')
    # 9 Highlighting
    # search_phrase(es, 'rock climbing', highlight=True)

    # Student record queries
    # 1 Bulk import
    # create_stu(es)
    # 1) Fetch ids 1, 3, and 5 in a single query
    # opt = {'index': 'stu', 'type': 'doc'}
    # search_id(es, [1, 3, 5], opt)
    # 2) Documents whose name is not John
    # search_not_name(es, 'John')
    # 3) Documents for students who enrolled before 2016
    # search_classOf(es, 2015)
    # 4) Add the interest "poker" to the document with id 4
    # update_interest(es, 4)

    # Index operations
    # 1 List all indices
    # search_index(es)
    # 2 Create the website index
    # create_web(es)
    # 3 Delete the index
    # delete(es, 'website')

    # Product operations
    # 1 Import the data
    # create_goods(es)
    # 2 Find products priced at 20
    # search_goods(es,price=20)
    # 3 Find the product whose productID is "XHDK-A-1293-#fJ3"
    # search_productID(es, productID='XHDK-A-1293-#fJ3')
    # 4 Find products priced between 20 and 40
    # search_range_preice(es, low=20, high=40)
    # 5 Find products priced at 20 or 30
    # search_age(es, prices=[20, 30])
    # 6 Find products whose price is 30 or whose productID is "XHDK-A-1293-#fJ3", excluding productID "QQPX-R-3956-#aD8"
    # search_priceORproductID(
    #     es, price=30, productID="XHDK-A-1293-#fJ3", notproductID="QQPX-R-3956-#aD8")

    # 7 Find products whose productID is "KDKE-B-9947-#kL5", or whose productID is "JODL-X-1937-#pV7" and whose price is also 30
    search_priceANDproductID(
        es, productIDs=["KDKE-B-9947-#kL5", "JODL-X-1937-#pV7"], price=30)
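
As a cross-check for section IV, the expected hits for exercises 4 to 7 can be worked out in plain Python, without a cluster. This is a sketch of the boolean logic only, under the assumption that productID is compared as a whole string; it does not model Elasticsearch analysis or scoring, and the helper name `ids_where` is hypothetical.

```python
products = {
    1: {"price": 10, "productID": "XHDK-A-1293-#fJ3"},
    2: {"price": 20, "productID": "KDKE-B-9947-#kL5"},
    3: {"price": 30, "productID": "JODL-X-1937-#pV7"},
    4: {"price": 30, "productID": "QQPX-R-3956-#aD8"},
}


def ids_where(predicate):
    """Return the sorted ids of the products that satisfy the predicate."""
    return sorted(doc_id for doc_id, doc in products.items() if predicate(doc))


# Exercise 4: price between 20 and 40 (inclusive, like gte/lte)
ex4 = ids_where(lambda d: 20 <= d["price"] <= 40)
# Exercise 5: price is 20 or 30
ex5 = ids_where(lambda d: d["price"] in (20, 30))
# Exercise 6: (price 30 OR productID XHDK...) AND NOT productID QQPX...
ex6 = ids_where(
    lambda d: (d["price"] == 30 or d["productID"] == "XHDK-A-1293-#fJ3")
    and d["productID"] != "QQPX-R-3956-#aD8"
)
# Exercise 7: productID KDKE... OR (productID JODL... AND price 30)
ex7 = ids_where(
    lambda d: d["productID"] == "KDKE-B-9947-#kL5"
    or (d["productID"] == "JODL-X-1937-#pV7" and d["price"] == 30)
)

print(ex4, ex5, ex6, ex7)  # [2, 3, 4] [2, 3, 4] [1, 3] [2, 3]
```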

References

ElasticSearch comprehensive exercises
