Getting Started with Elasticsearch + Python (Part 2)

天天天天天天天天d

Last modified: 2022-04-19 14:31:38


First published: 2022-04-12 18:11:10

Article link: https://blog.csdn.net/gtd54789/article/details/124119727

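Before you read

The Python elasticsearch module behaves differently across versions; I am using version 8.1.2.
If an API call raises an error, upgrade or downgrade the module. This article is aimed at readers with no Elasticsearch background who want an introduction and some practice, and at readers who want to drive ES through the Python API.
ES deployment is covered here

Data preparation:

See the data preparation section of this page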

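1. Queries

1.1 Conditional query

ES:

# Fetch documents matching a given title

# Passing the query in the URL is not recommended
GET http://ip:9200/web/_search?q=title:天天搜题

# Preferred: send the query in the request body
GET http://ip:9200/web/_search
{
	"query":{
	    "match":{
	        "title":"天天搜题"
	    }
	}
}

Python:

from elasticsearch import Elasticsearch

# Assumed setup: adjust the address (and any authentication) to your deployment
es = Elasticsearch("http://ip:9200")

# Fetch documents matching a given title

query = {
    "match":{
        "title":"天天搜题"
    }
}
response = es.search(index="web",query=query)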
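
The matching documents sit under hits.hits in the response, each with its original document in _source. Here is a minimal sketch of pulling them out. The sample_response dict and the extract_sources helper are made up for illustration, so the snippet runs without a server; a real response from es.search can be indexed the same way.

```python
# Sample dict shaped like a search response body; the values are invented for illustration.
sample_response = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "hits": [
            {"_index": "web", "_id": "1", "_score": 1.2,
             "_source": {"title": "天天搜题", "type": "网站"}},
            {"_index": "web", "_id": "2", "_score": 0.4,
             "_source": {"title": "天天向上", "type": "网站"}},
        ],
    }
}


def extract_sources(response):
    """Return the _source document of every hit, in the order ES ranked them."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


titles = [doc["title"] for doc in extract_sources(sample_response)]
print(titles)
```

With a live cluster you would call extract_sources(response) on the result of es.search instead of the sample dict.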

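1.2 Paginated query

ES:
from: offset of the first hit to return; size: number of hits to return

# Paginated query

GET http://ip:9200/web/_search
{
	"query":{
	    "match_all":{}
	},
	"from":0,
	"size":10
}

Python:

# Paginated query (from is a reserved word in Python, so the client parameter is from_)

query = {
    "match_all":{}
}
response = es.search(index="web",query=query,from_=0,size=10)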
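
Most applications think in page numbers rather than offsets. The sketch below converts a 1-based page number into the from_/size pair used above. page_to_window is a hypothetical helper name, not part of the elasticsearch module. Keep in mind that by default from + size may not exceed 10000 (the index.max_result_window setting).

```python
def page_to_window(page, page_size=10):
    """Translate a 1-based page number into from_/size keyword arguments."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {"from_": (page - 1) * page_size, "size": page_size}


# Page 3 with 10 hits per page starts at offset 20.
window = page_to_window(3, page_size=10)
print(window)

# Usage with the client from this article (not executed here):
# response = es.search(index="web", query={"match_all": {}}, **window)
```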

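1.3 Source filtering

ES:

# Return only title in each hit's _source

GET http://ip:9200/web/_search
{
	"query":{
	    "match_all":{}
	},
	"_source":["title"]
}

Python:

# fields=['title'] adds a fields entry holding title to each hit; _source is still returned in full.
# To trim _source itself, as in the ES request above, pass source=['title'] instead.

query = {
    "match":{
        "title":"天天搜题"
    }
}
response = es.search(index="web",query=query,fields=['title'])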
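
Values under fields always come back as lists, even for a single value, which is easy to trip over. A small sketch of flattening them follows. The sample_response dict, its URLs, and the field_values helper are assumptions made up for illustration.

```python
# Sample dict shaped like a response to a search with fields=['title']; values are invented.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1",
             "_source": {"title": "天天搜题", "url": "http://example.com/1"},
             "fields": {"title": ["天天搜题"]}},
            {"_id": "2",
             "_source": {"title": "天天向上", "url": "http://example.com/2"},
             "fields": {"title": ["天天向上"]}},
        ]
    }
}


def field_values(response, name):
    """Collect every value returned under fields[name], flattening the per-hit lists."""
    values = []
    for hit in response["hits"]["hits"]:
        # A hit without the requested field simply has no entry for it.
        values.extend(hit.get("fields", {}).get(name, []))
    return values


print(field_values(sample_response, "title"))
```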

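1.4 Sorted query

ES:

# Sort the results by uuId, descending

GET http://ip:9200/web/_search
{
	"query":{
	    "match_all":{}
	},
	"sort":{
		"uuId":{
			"order":"desc"
		}
	}
}

Python:

# Query
query = {
    "match_all":{
    }
}

# Sort (ascending here; use "desc" to mirror the ES request above)
sort = {
    "uuId": {
        "order": "asc"
    }
}

# Submit
response = es.search(index="web",query=query,sort=sort)

# The sort above uses uuId. Sorting on a non-numeric field such as a text field will most likely raise an error.
# If the field has no explicit mapping, appending .keyword to the field name avoids the error.
sort = {
    "title.keyword": {
        "order": "asc"
    }
}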
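
The sort argument also accepts a list, which is useful when a second field should break ties. The sketch below builds such a list from (field, order) pairs. build_sort is a hypothetical helper, and combining title.keyword with uuId is only an example.

```python
def build_sort(*keys):
    """Turn (field, order) pairs into a list usable as the sort argument."""
    sort = []
    for field, order in keys:
        if order not in ("asc", "desc"):
            raise ValueError(f"unknown sort order: {order!r}")
        sort.append({field: {"order": order}})
    return sort


# Sort by title first, then by uuId (descending) among equal titles.
sort = build_sort(("title.keyword", "asc"), ("uuId", "desc"))
print(sort)

# Usage with the client from this article (not executed here):
# response = es.search(index="web", query={"match_all": {}}, sort=sort)
```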

1.5 Multi-condition query

ES:

# Constrain both title and type

GET http://ip:9200/web/_search
{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "title": "天天搜题"
                    }
                },
                {
                    "match": {
                        "type": "网站"
                    }
                }
            ]
        }
    }
}

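Python:

"""
	The pattern should be clear by now: with version 8.x of the elasticsearch
	module, take the contents of "query" from the raw request body and pass
	them as the query argument.
"""

# Constrain both title and type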
query = {
    "bool": {
        "must": [
            {
                "match": {
                    "title": "天天搜题"
                }
            },
            {
                "match": {
                    "type": "网站"
                }
            }
        ]
    }
}
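response = es.search(index="web",query=query)

"""
	In the query above, must can be replaced with should; the same holds for the raw ES request.
		- must :    every clause has to match, similar to AND
		- should :  matching any one clause is enough, similar to OR
"""
# Example with must replaced by should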
query = {
    "bool": {
        "should": [
            {
                "match": {
                    "type": "摆烂"
                }
            },
            {
                "match": {
                    "type": "网站"
                }
            }
        ]
    }
}
response = es.search(index="web",query=query)
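
One detail worth knowing when must and should appear in the same bool query: the should clauses then become optional and only influence scoring, unless minimum_should_match is set. The sketch below builds such a combined query. bool_query is a hypothetical helper name, and the field values are reused from the examples above.

```python
def bool_query(must=(), should=(), minimum_should_match=None):
    """Build a bool query from (field, value) pairs, using match for every clause."""
    body = {}
    if must:
        body["must"] = [{"match": {field: value}} for field, value in must]
    if should:
        body["should"] = [{"match": {field: value}} for field, value in should]
    if minimum_should_match is not None:
        # Without this, should clauses next to must only affect the score.
        body["minimum_should_match"] = minimum_should_match
    return {"bool": body}


# title must match, and at least one of the two type values must match as well.
query = bool_query(
    must=[("title", "天天搜题")],
    should=[("type", "摆烂"), ("type", "网站")],
    minimum_should_match=1,
)
print(query)

# Usage with the client from this article (not executed here):
# response = es.search(index="web", query=query)
```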

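1.6 Range query

ES:

# Fetch documents whose title matches 天天搜题 and whose uuId is greater than 3. match is analyzed (tokenized), so titles that are not exactly 天天搜题 are returned as well.

GET http://ip:9200/web/_search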
{
    "query": {
	    "bool": {
	        "must": [
	            {
	                "match": {
	                    "title": "天天搜题"
	                }
	            }
	        ],
	        "filter":{
	            "range":{
	                "uuId":{
	                    "gt":3
	                }
	            }
	        }
	    }
    }
}

Python:

# Fetch documents whose title matches 天天搜题 and whose uuId is greater than 3

query = {
    "bool": {
        "must": [
            {
                "match": {
                    "title": "天天搜题"
                }
            },
        ],
        "filter":{
            "range":{
                "uuId":{
                    "gt":3
                }
            }
        }
    }
}
response = es.search(index="web",query=query)
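
A range clause accepts gt, gte, lt and lte, and several bounds can be combined to express an interval. Below is a minimal sketch of a builder that rejects misspelled bounds. range_filter is a hypothetical helper, and the upper bound of 10 is an arbitrary example value.

```python
ALLOWED_BOUNDS = {"gt", "gte", "lt", "lte"}


def range_filter(field, **bounds):
    """Build a range clause for field, e.g. range_filter("uuId", gt=3)."""
    if not bounds:
        raise ValueError("at least one bound is required")
    unknown = set(bounds) - ALLOWED_BOUNDS
    if unknown:
        raise ValueError(f"unknown bounds: {sorted(unknown)}")
    return {"range": {field: bounds}}


# title matches 天天搜题 and 3 < uuId <= 10
query = {
    "bool": {
        "must": [{"match": {"title": "天天搜题"}}],
        "filter": [range_filter("uuId", gt=3, lte=10)],
    }
}
print(query)

# Usage with the client from this article (not executed here):
# response = es.search(index="web", query=query)
```

Clauses under filter do not contribute to the relevance score, so the ranking here is decided by the match clause alone.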

1.7 Full-text query and highlighting

ES:

# Full-text query: match is analyzed, so this finds documents whose title contains 天 or 冬, and highlights the matches in title

GET http://ip:9200/web/_search
{
    "query":{
        "match":{
            "title":"天冬"
        }
    },
    "highlight":{
        "fields":{
            "title":{}
        }
    }
}

Python:

"""
	match can be replaced with match_phrase, in which case 天冬 is not split
	into separate terms but matched as a single phrase.
"""
# Full-text query with title highlighting

# Query
query = {
    "match":{
        "title":"天冬"
    }
}

# Highlight
highlight = {
    "fields":{
        "title":{}
    }
}

# Submit
response = es.search(index="web",query=query,highlight=highlight)
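
The highlighted fragments come back in a highlight entry on each hit, with matches wrapped in <em> tags by default. Here is a minimal sketch of collecting them. The sample_response dict and the highlighted_fragments helper are invented for illustration.

```python
# Sample dict shaped like a highlighted search response; values are invented.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1",
             "_source": {"title": "天天搜题"},
             "highlight": {"title": ["<em>天</em><em>天</em>搜题"]}},
            {"_id": "2",
             "_source": {"title": "冬日"},
             "highlight": {"title": ["<em>冬</em>日"]}},
        ]
    }
}


def highlighted_fragments(response, field):
    """Map each hit's _id to the highlighted fragments returned for field."""
    result = {}
    for hit in response["hits"]["hits"]:
        # Hits with nothing to highlight in this field carry no entry for it.
        result[hit["_id"]] = hit.get("highlight", {}).get(field, [])
    return result


print(highlighted_fragments(sample_response, "title"))
```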

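1.8 Aggregation query

ES:

# Aggregate on title, grouping documents by value. terms can be replaced with other aggregations such as avg (which needs a numeric field).

GET http://ip:9200/web/_search
{
    "aggs":{    // aggregation keyword
        "type_group":{ // aggregation name, any name you like
            "terms":{	// group by distinct value
                "field":"title.keyword" 	// without .keyword I get an error; drop it if yours works without it
            }
        }
    },
    "size":0	// return no hits, only the aggregation results
}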

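Python:

# Aggregate on title, grouping documents by value. terms can be replaced with other aggregations such as avg (which needs a numeric field).

aggs = {
    "type_group": {
        "terms": {
            "field": "title.keyword"
        }
    }
}
# size=0 returns no hits, only the aggregation results
response = es.search(index="web",aggs=aggs,size=0)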
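
The result of a terms aggregation lives under aggregations, keyed by the name chosen in the request (type_group here), as a list of buckets with key and doc_count. A minimal sketch of turning that into a plain dict follows. The sample_response and the counts in it are invented, and bucket_counts is a hypothetical helper.

```python
# Sample dict shaped like a terms aggregation response; counts are invented.
sample_response = {
    "hits": {"hits": []},
    "aggregations": {
        "type_group": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "天天搜题", "doc_count": 3},
                {"key": "天天向上", "doc_count": 1},
            ],
        }
    },
}


def bucket_counts(response, agg_name):
    """Return {bucket key: doc_count} for the named terms aggregation."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


counts = bucket_counts(sample_response, "type_group")
print(counts)
```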

2. Mappings

ES:

# Create the index web with an explicit mapping (field constraints)

PUT http://ip:9200/web
{
    "mappings":{
        "properties":{
            "title": {
                "type": "text",
                "index": true
            },
            "type": {
                "type": "keyword",
                "index": true
            },
            "uuId": {
                "type": "keyword",
                "index": true
            },
            "url": {
                "type": "keyword",
                "index": false
            }
        }
    }
}

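Python:

# Create the index web with an explicit mapping.
# Note: uuId is mapped as keyword here, so sorting and range filters on it compare strings, not numbers.
mappings = {
    "properties":{
        "title": {
            "type": "text",
            "index": True
        },
        "type": {
            "type": "keyword",
            "index": True
        },
        "uuId": {
            "type": "keyword",
            "index": True
        },
        "url": {
            "type": "keyword",
            "index": False   # url is stored but cannot be searched
        }
    }
}
response = es.indices.create(index='web',mappings=mappings)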
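
Creating an index that already exists raises an error, so it can help to check first. The sketch below wraps that check in a hypothetical ensure_index helper. FakeClient and FakeIndices are in-memory stand-ins written only so the snippet runs without a server; with a real cluster you would pass the es client instead, whose indices.exists and indices.create take the same keyword arguments.

```python
class FakeIndices:
    """In-memory stand-in for es.indices so the sketch runs without a server."""

    def __init__(self):
        self.created = {}

    def exists(self, *, index):
        return index in self.created

    def create(self, *, index, mappings=None):
        self.created[index] = mappings
        return {"acknowledged": True, "index": index}


class FakeClient:
    """Stand-in for the Elasticsearch client; only the indices namespace is modelled."""

    def __init__(self):
        self.indices = FakeIndices()


def ensure_index(client, index, mappings):
    """Create index with mappings unless it already exists; return True if created."""
    if client.indices.exists(index=index):
        return False
    client.indices.create(index=index, mappings=mappings)
    return True


client = FakeClient()
mappings = {"properties": {"title": {"type": "text"}}}
first = ensure_index(client, "web", mappings)   # index is created
second = ensure_index(client, "web", mappings)  # index already exists, nothing happens
print(first, second)
```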