2 Elasticsearch 8.12.2 DSL Search

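1. Data Preparation

  • Custom dictionary (the mappings below use the ik_max_word analyzer, so the IK analysis plugin must be installed)

  • Create the index dsl_search (any name works)
  • Create the mappings manually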

POST        /dsl_search/_mapping
{
    "properties": {
        "id": {
            "type": "long"
        },
        "age": {
            "type": "integer"
        },
        "username": {
            "type": "keyword"
        },
        "nickname": {
            "type": "text",
            "analyzer": "ik_max_word"
        },
        "money": {
            "type": "float"
        },
        "desc": {
            "type": "text",
            "analyzer": "ik_max_word"
        },
        "sex": {
            "type": "byte"
        },
        "birthday": {
            "type": "date"
        },
        "face": {
            "type": "text",
            "index": false
        }
    }
}

  • Index the data (one request per document; replace 1001 in the URL with each document's id)
  • POST         /dsl_search/_doc/1001
    
    {
        "id": 1001,
        "age": 18,
        "username": "chinanewsAmazing",
        "nickname": "中国新闻网",
        "money": 88.8,
        "desc": "我在中国新闻网到了很多新闻",
        "sex": 0,
        "birthday": "2022-09-01",
        "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/527bc4b462d946be81eb900d7c8e63fe.jpg"
    }
    
    {
        "id": 1002,
        "age": 19,
        "username": "justbuy",
        "nickname": "周杰棍",
        "money": 77.8,
        "desc": "今天上下班都很堵,车流量很大",
        "sex": 1,
        "birthday": "1993-01-24",
        "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
    }
    
    {
        "id": 1003,
        "age": 20,
        "username": "bigFace",
        "nickname": "飞翔的巨鹰",
        "money": 66.8,
        "desc": "中国新闻网团队和导游坐飞机去海外旅游,去了新马泰和欧洲",
        "sex": 1,
        "birthday": "1996-01-14",
        "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
    }
    
    {
        "id": 1004,
        "age": 22,
        "username": "flyfish",
        "nickname": "水中鱼",
        "money": 55.8,
        "desc": "昨天在学校的池塘里,看到有很多鱼在游泳,然后就去中国新闻网学习了",
        "sex": 0,
        "birthday": "1988-02-14",
        "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
    }
    
    {
        "id": 1005,
        "age": 25,
        "username": "gotoplay",
        "nickname": "ps游戏机",
        "money": 155.8,
        "desc": "今年生日,女友送了我一台play station游戏机,非常好玩,非常不错",
        "sex": 1,
        "birthday": "1989-03-14",
        "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
    }
    
    {
        "id": 1006,
        "age": 19,
        "username": "missimooc",
        "nickname": "我叫小髦",
        "money": 156.8,
        "desc": "我叫髦髦,今年20岁,是一名律师,我在琦䯲星球做演讲",
        "sex": 1,
        "birthday": "1993-04-14",
        "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
    }
    
    {
        "id": 1007,
        "age": 19,
        "username": "msgame",
        "nickname": "gamexbox",
        "money": 1056.8,
        "desc": "明天去进货,最近微软处理很多游戏机,还要买xbox游戏卡带",
        "sex": 1,
        "birthday": "1985-05-14",
        "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
    }
    
    {
        "id": 1008,
        "age": 19,
        "username": "muke",
        "nickname": "新闻学习",
        "money": 1056.8,
        "desc": "大学毕业后,可以到i2.chinanews.com.cn进修",
        "sex": 1,
        "birthday": "1995-06-14",
        "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
    }
    
    {
        "id": 1009,
        "age": 22,
        "username": "shaonian",
        "nickname": "骚年轮",
        "money": 96.8,
        "desc": "骚年在大学毕业后,考研究生去了",
        "sex": 1,
        "birthday": "1998-07-14",
        "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
    }
    
    {
        "id": 1010,
        "age": 30,
        "username": "tata",
        "nickname": "隔壁老王",
        "money": 100.8,
        "desc": "隔壁老外去国外出差,带给我很多好吃的",
        "sex": 1,
        "birthday": "1988-07-14",
        "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
    }
    
    {
        "id": 1011,
        "age": 31,
        "username": "sprder",
        "nickname": "皮特帕克",
        "money": 180.8,
        "desc": "它是一个超级英雄",
        "sex": 1,
        "birthday": "1989-08-14",
        "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
    }
    
    {
        "id": 1012,
        "age": 31,
        "username": "super hero",
        "nickname": "super hero",
        "money": 188.8,
        "desc": "BatMan, GreenArrow, SpiderMan, IronMan... are all Super Hero",
        "sex": 1,
        "birthday": "1980-08-14",
        "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
    }
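
Sending twelve separate requests is tedious. As a rough sketch, the documents could instead be packed into one `_bulk` request body, which is newline-delimited JSON: an action line followed by a source line for each document. The helper name `build_bulk_body` and the shortened two-document sample are illustrative assumptions, not part of the original setup.

```python
import json

def build_bulk_body(index, docs):
    """Return an NDJSON body for POST /_bulk that indexes each doc under its own id."""
    lines = []
    for doc in docs:
        # Action line: which index and _id the following source line belongs to.
        lines.append(json.dumps({"index": {"_index": index, "_id": str(doc["id"])}}))
        # Source line: the document itself; keep Chinese text unescaped.
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"

# Shortened sample documents (only a few of the fields from the data above).
docs = [
    {"id": 1001, "age": 18, "username": "chinanewsAmazing", "nickname": "中国新闻网"},
    {"id": 1002, "age": 19, "username": "justbuy", "nickname": "周杰棍"},
]
body = build_bulk_body("dsl_search", docs)
print(body)
```

The resulting text would be sent as the body of `POST /_bulk` with the `Content-Type: application/x-ndjson` header.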
    

2. Basic Syntax

Query via request parameters (QueryString)

Find documents whose [field] contains [content]

GET     /dsl_search/_search?q=desc:新闻网
GET     /dsl_search/_search?q=nickname:新 AND age:25

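Comparing text and keyword search (a keyword field is still indexed, but its value is kept as one un-analyzed term, so only the exact full value matches)

GET     /dsl_search/_search?q=nickname:super
GET     /dsl_search/_search?q=username:super
GET     /dsl_search/_search?q=username:"super hero"

The quotes in the last request keep super hero together as one value; without them, only super is searched in username and hero falls back to the default fields.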

This is called the QueryString approach: all parameters are passed in the URL as request parameters.
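
When such a URL is built from a program rather than typed into a console, the query text has to be percent-encoded. The sketch below shows one way to do that with the Python standard library; the base address `http://localhost:9200` and the helper name `build_search_url` are assumptions made for the example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_search_url(base, index, query):
    """Build a URI search URL with the query string safely percent-encoded."""
    return f"{base}/{index}/_search?{urlencode({'q': query})}"

# A quoted keyword value and a Chinese text query, both taken from the examples above.
url = build_search_url("http://localhost:9200", "dsl_search", 'username:"super hero"')
url_cn = build_search_url("http://localhost:9200", "dsl_search", "desc:新闻网")

print(url)
print(url_cn)
# Decoding the URL again gives back the original query text.
print(parse_qs(urlsplit(url_cn).query)["q"][0])
```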

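Basic DSL syntax

The QueryString approach is rarely used, because it becomes hard to build once the parameters get complex; most queries are better written in the DSL.

  • Domain Specific Language
  • JSON-based data queries
  • More flexible, well suited to complex queries

DSL syntax:

# Search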
POST     /dsl_search/_search
{
    "query": {
        "match": {
            "desc": "新闻网"
        }
    }
}
# Check whether a field exists
{
    "query": {
        "exists": {
            "field": "desc"
        }
    }
}

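  • The syntax is a JSON object made of key-value pairs; JSON objects can be nested.
  • A key can be an Elasticsearch keyword or the name of a field, as later examples show.

Locating invalid searches

DSL queries often fail, mostly because Elasticsearch cannot parse the JSON. Like Java, it reports an exception message, and the cause can be inferred from that message: malformed JSON, an unknown or unregistered keyword, and so on. When the message alone is not enough, pasting it into a search engine usually locates the problem.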
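
Malformed JSON can also be caught before the request is sent at all. The following is a minimal sketch using Python's `json` module; the function name `check_query_body` is made up for this example, and it only checks JSON syntax, not whether the keys are valid Elasticsearch keywords.

```python
import json

def check_query_body(text):
    """Return None if text is valid JSON, else a short message with the error position."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
    return None

good = '{"query": {"match": {"desc": "新闻网"}}}'
bad = '{"query": {"match": {"desc": "新闻网"}}'  # missing the last closing brace

print(check_query_body(good))
print(check_query_body(bad))
```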

3. Match All and Pagination

 match_all

Query all documents in the index

GET     /dsl_search/_search

Demo:

http://192.168.110.129:9200/dsl_search/_search

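 Result (the hit with _id "_search" looks like a stray document created by accidentally indexing a query body, which would explain why total is 13 instead of 12):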

{
    "took": 2,
    "timed_out": false,
    "_shards": {
        "total": 3,
        "successful": 3,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 13,
            "relation": "eq"
        },
        "max_score": 1,
        "hits": [
            {
                "_index": "dsl_search",
                "_id": "1002",
                "_score": 1,
                "_source": {
                    "id": 1002,
                    "age": 19,
                    "username": "justbuy",
                    "nickname": "周杰棍",
                    "money": 77.8,
                    "desc": "今天上下班都很堵,车流量很大",
                    "sex": 1,
                    "birthday": "1993-01-24",
                    "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
                }
            },
            {
                "_index": "dsl_search",
                "_id": "1003",
                "_score": 1,
                "_source": {
                    "id": 1003,
                    "age": 20,
                    "username": "bigFace",
                    "nickname": "飞翔的巨鹰",
                    "money": 66.8,
                    "desc": "中国新闻网团队和导游坐飞机去海外旅游,去了新马泰和欧洲",
                    "sex": 1,
                    "birthday": "1996-01-14",
                    "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
                }
            },
            {
                "_index": "dsl_search",
                "_id": "1007",
                "_score": 1,
                "_source": {
                    "id": 1007,
                    "age": 19,
                    "username": "msgame",
                    "nickname": "gamexbox",
                    "money": 1056.8,
                    "desc": "明天去进货,最近微软处理很多游戏机,还要买xbox游戏卡带",
                    "sex": 1,
                    "birthday": "1985-05-14",
                    "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
                }
            },
            {
                "_index": "dsl_search",
                "_id": "1008",
                "_score": 1,
                "_source": {
                    "id": 1008,
                    "age": 19,
                    "username": "muke",
                    "nickname": "新闻学习",
                    "money": 1056.8,
                    "desc": "大学毕业后,可以到i2.chinanews.com.cn进修",
                    "sex": 1,
                    "birthday": "1995-06-14",
                    "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
                }
            },
            {
                "_index": "dsl_search",
                "_id": "1011",
                "_score": 1,
                "_source": {
                    "id": 1011,
                    "age": 31,
                    "username": "sprder",
                    "nickname": "皮特帕克",
                    "money": 180.8,
                    "desc": "它是一个超级英雄",
                    "sex": 1,
                    "birthday": "1989-08-14",
                    "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
                }
            },
            {
                "_index": "dsl_search",
                "_id": "_search",
                "_score": 1,
                "_source": {
                    "query": {
                        "match": {
                            "desc": "新闻网"
                        }
                    }
                }
            },
            {
                "_index": "dsl_search",
                "_id": "1001",
                "_score": 1,
                "_source": {
                    "id": 1001,
                    "age": 18,
                    "username": "chinanewsAmazing",
                    "nickname": "中国新闻网",
                    "money": 88.8,
                    "desc": "我在中国新闻网到了很多新闻",
                    "sex": 0,
                    "birthday": "2022-09-01",
                    "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/527bc4b462d946be81eb900d7c8e63fe.jpg"
                }
            },
            {
                "_index": "dsl_search",
                "_id": "1004",
                "_score": 1,
                "_source": {
                    "id": 1004,
                    "age": 22,
                    "username": "flyfish",
                    "nickname": "水中鱼",
                    "money": 55.8,
                    "desc": "昨天在学校的池塘里,看到有很多鱼在游泳,然后就去中国新闻网学习了",
                    "sex": 0,
                    "birthday": "1988-02-14",
                    "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
                }
            },
            {
                "_index": "dsl_search",
                "_id": "1005",
                "_score": 1,
                "_source": {
                    "id": 1005,
                    "age": 25,
                    "username": "gotoplay",
                    "nickname": "ps游戏机",
                    "money": 155.8,
                    "desc": "今年生日,女友送了我一台play station游戏机,非常好玩,非常不错",
                    "sex": 1,
                    "birthday": "1989-03-14",
                    "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
                }
            },
            {
                "_index": "dsl_search",
                "_id": "1006",
                "_score": 1,
                "_source": {
                    "id": 1006,
                    "age": 19,
                    "username": "missimooc",
                    "nickname": "我叫小髦",
                    "money": 156.8,
                    "desc": "我叫髦髦,今年20岁,是一名律师,我在琦䯲星球做演讲",
                    "sex": 1,
                    "birthday": "1993-04-14",
                    "face": "https://i2.chinanews.com.cn/simg/cmshd/2022/09/01/ea8d5d4bc6c146239201034cf7731dce.jpg"
                }
            }
        ]
    }
}

POST     /dsl_search/_search
{
    "query": {
        "match_all": {}
    },
    "_source": ["id", "nickname", "age"]
}

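Demo:

 The same documents are returned, but each hit's _source now contains only id, nickname and age.

  • Operating through the Head visual tool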

 Pagination

A search returns only 10 hits by default; use from and size to page through the results.

POST     /dsl_search/_search

{
    "query": {
        "match_all": {}
    },
    "from": 0,
    "size": 10
}

 Custom pagination

{
    "query": {
        "match_all": {}
    },
    "_source": [
        "id",
        "nickname",
        "age"
    ],
    "from": 5,
    "size": 5
}
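
User interfaces usually think in page numbers rather than offsets. As a small sketch, a 1-based page number can be converted into from and size like this; the helper name `page_request` is an assumption for illustration.

```python
def page_request(page, size=10):
    """Build a match_all body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * size,  # number of hits to skip
        "size": size,               # number of hits to return
    }

print(page_request(1))          # from=0, size=10: the default first page
print(page_request(2, size=5))  # from=5, size=5: same window as the custom example
```

Note that by default from + size may not exceed 10,000 (the index.max_result_window setting), so this style of paging suits the first pages of a result set rather than deep scrolling.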

  • Operating through the Head visual tool

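4. term/match/match_phrase

 term exact search vs. match analyzed search

A term query takes the user's input, for example "中国新闻网强大", as one whole term and searches for it as-is, without analyzing it into tokens first.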

POST     /dsl_search/_search
{
    "query": {
        "term": {
            "desc": "新闻网"
        }
    }
}
# Compare with match
{
    "query": {
        "match": {
            "desc": "新闻网"
        }
    }
}
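  • Note: match first analyzes 新闻网 into tokens (this is full-text search) and then queries, whereas term does not and searches for 新闻网 as a single whole term.
  • Compare the two in the Head visual tool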
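
The difference can be illustrated with a toy inverted index. This is only a simulation: the analyzer below lowercases and splits on whitespace, which is an assumption standing in for ik_max_word, and the two sample documents are made up.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace (NOT ik_max_word)."""
    return text.lower().split()

def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index

def term_query(index, value):
    """term: look the value up as-is, with no analysis."""
    return index.get(value, set())

def match_query(index, text):
    """match: analyze the text, then return docs containing any resulting term."""
    hits = set()
    for term in analyze(text):
        hits |= index.get(term, set())
    return hits

docs = {1: "Super Hero", 2: "super market"}
index = build_index(docs)
print(term_query(index, "Super Hero"))   # set(): no single indexed term equals "Super Hero"
print(match_query(index, "Super Hero"))  # {1, 2}: "super" and "hero" are looked up separately
```

This is also why term queries are normally aimed at keyword fields, where the indexed term is the whole original value.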

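 terms: matching any of several terms

This works like a tag query: news items may be tagged international/religion/culture/entertainment, and terms returns the documents that contain any of the given values exactly.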

POST     /dsl_search/_search
{
    "query": {
        "terms": {
            "desc": ["新闻网", "学习", "骚年"]
        }
    }
}

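match_phrase: phrase matching

match returns a document as soon as any token matches. match_phrase is stricter: every token of the query must appear in the analyzed text field, in the same order, and adjacent to one another.

  • slop: how many positions the tokens are allowed to be apart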
POST     /dsl_search/_search
{
    "query": {
        "match_phrase": {
            "desc": {
                "query": "大学 毕业 研究生",
                "slop": 3
            }
        }
    }
}
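
To make slop more concrete, here is a simplified model of sloppy phrase matching: each query term's position in the document is shifted back by its position in the query, and the phrase matches when the spread of those shifted positions is at most slop. It ignores repeated terms and scoring, and the tokenization of the sample sentence is invented for the example (real ik_max_word output will differ).

```python
from itertools import product

def phrase_matches(doc_tokens, query_tokens, slop=0):
    """Simplified sloppy phrase check over token positions."""
    positions = []
    for term in query_tokens:
        hits = [i for i, tok in enumerate(doc_tokens) if tok == term]
        if not hits:
            return False  # every query term must occur somewhere
        positions.append(hits)
    for combo in product(*positions):
        # Shift each position back by the term's offset inside the query.
        shifted = [pos - offset for offset, pos in enumerate(combo)]
        if max(shifted) - min(shifted) <= slop:
            return True
    return False

# Hypothetical tokenization of document 1009's desc field.
doc = ["骚年", "在", "大学", "毕业", "后", "考", "研究生", "去了"]
query = ["大学", "毕业", "研究生"]

print(phrase_matches(doc, query, slop=0))  # False: 研究生 is two positions too far
print(phrase_matches(doc, query, slop=2))  # True
print(phrase_matches(doc, query, slop=3))  # True, as in the request above
```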

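5. match (operator)/ids

 match options

  • operator

        (1) or: after the search text is analyzed, a document matches if at least one token matches (this is the default)

        (2) and: after the search text is analyzed, every token must match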

POST     /dsl_search/_search
{
    "query": {
        "match": {
            "desc": "xbox游戏机"
        }
    }
}
# Equivalent to
{
    "query": {
        "match": {
            "desc": {
                "query": "xbox游戏机",
                "operator": "or"
            }
        }
    }
}
# Roughly: select * from shop where desc='xbox' or|and desc='游戏机'

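  • minimum_should_match: the minimum match precision. With a percentage, the required count is [number of tokens after analysis] x percentage, rounded down. For example, with 70%: if the search text is analyzed into 10 tokens, 10 x 70% = 7, so at least 7 tokens must match in desc for a document to be returned; with 8 tokens, 8 x 70% = 5.6, so at least 5 tokens must match.
  • minimum_should_match also accepts a plain number, meaning a token count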
POST     /dsl_search/_search
{
    "query": {
        "match": {
            "desc": {
                "query": "女友生日送我好玩的xbox游戏机",
                "minimum_should_match": "60%"
            }
        }
    }
}
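
The arithmetic described above can be checked with a few lines of Python. This sketch covers only the two forms mentioned here, a positive percentage and a plain positive number; the function name `required_matches` is an assumption for illustration.

```python
def required_matches(token_count, minimum_should_match):
    """Number of tokens that must match for a given minimum_should_match value."""
    if isinstance(minimum_should_match, str) and minimum_should_match.endswith("%"):
        percent = int(minimum_should_match[:-1])
        # Integer division rounds the percentage result down.
        return token_count * percent // 100
    return int(minimum_should_match)

print(required_matches(10, "70%"))  # 7
print(required_matches(8, "70%"))   # 5
print(required_matches(8, 3))       # 3
```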

Fetch a single document by its id

GET /dsl_search/_doc/1001

Search by a list of document ids

Official documentation:

IDs | Elasticsearch Guide [8.12] | Elastic

POST     /dsl_search/_search

{
    "query": {
        "ids": {
            "values": [
                "1001",
                "1010",
                "1008"
            ]
        }
    }
}

6. multi_match/boost

Official documentation: Multi-match query | Elasticsearch Guide [8.12] | Elastic

multi_match

Runs a match query across several fields

POST     /dsl_search/_search
{
    "query": {
        "multi_match": {
                "query": "皮特帕克新闻网",
                "fields": ["desc", "nickname"]

        }
    }
}

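boost

A weight assigned to a field: the higher the weight, the higher the relevance score of documents that match on that field. For example, a product name usually deserves a higher weight than the product description.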

POST     /dsl_search/_search
{
    "query": {
        "multi_match": {
                "query": "皮特帕克新闻网",
                "fields": ["desc", "nickname^10"]

        }
    }
}

nickname^10 multiplies the score contribution of nickname matches by 10, so the search relies mainly on nickname, with desc as a secondary field.
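
When the boosts come from configuration rather than being hard-coded, the fields list can be generated. A minimal sketch, assuming a hypothetical helper `multi_match` that takes a field-to-boost mapping:

```python
def multi_match(query, field_boosts):
    """Build a multi_match body; a boost of 1 is left implicit."""
    fields = [
        name if boost == 1 else f"{name}^{boost}"
        for name, boost in field_boosts.items()
    ]
    return {"query": {"multi_match": {"query": query, "fields": fields}}}

body = multi_match("皮特帕克新闻网", {"desc": 1, "nickname": 10})
print(body["query"]["multi_match"]["fields"])  # ['desc', 'nickname^10']
```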

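7. Bool Query

  • must: the document must match every clause, like and
  • should: the document should match at least one clause, like or
  • must_not: the document must match none of the clauses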
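
The way the three clause types combine can be sketched as a toy evaluator in which every clause is a plain Python predicate. It models only whether a document matches, not scoring, and it leaves out the filter clause; the predicates and the sample document are invented for the example.

```python
def bool_matches(doc, must=(), should=(), must_not=()):
    """Toy evaluation of bool semantics; each clause is a predicate on the doc."""
    if not all(clause(doc) for clause in must):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    # With no must clause, at least one should clause has to match;
    # alongside must, should clauses only influence the score.
    if should and not must:
        return any(clause(doc) for clause in should)
    return True

doc = {"sex": 1, "age": 20, "nickname": "飞翔的巨鹰"}
is_male = lambda d: d["sex"] == 1
is_adult = lambda d: d["age"] >= 18
is_thirty = lambda d: d["age"] == 30

print(bool_matches(doc, must=[is_male, is_adult]))             # True
print(bool_matches(doc, should=[is_thirty]))                   # False
print(bool_matches(doc, must=[is_male], should=[is_thirty]))   # True
print(bool_matches(doc, must=[is_male], must_not=[is_adult]))  # False
```

The third call shows a point that is easy to miss: once a must clause is present, a should clause no longer has to match for the document to be returned.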

Hands-on 1:

POST     /dsl_search/_search

{
    "query": {
        "bool": {
            "must": [
                {
                    "multi_match": {
                        "query": "新闻网",
                        "fields": ["desc", "nickname"]
                    }
                },
                {
                    "term": {
                        "sex": 1
                    }
                },
                {
                    "term": {
                        "birthday": "1996-01-14"
                    }
                }
            ]
        }
    }
}

# Replace "should" with "must_not" to try the other clause
{
    "query": {
        "bool": {
            "should": [
                {
                    "multi_match": {
                        "query": "学习",
                        "fields": ["desc", "nickname"]
                    }
                },
                {
                    "match": {
                        "desc": "游戏"
                    }
                },
                {
                    "term": {
                        "sex": 0
                    }
                }
            ]
        }
    }
}

Hands-on 2:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "desc": "新"
                    }
                },
                {
                    "match": {
                        "nickname": "新"
                    }
                }
            ],
            "should": [
                {
                    "match": {
                        "sex": "0"
                    }
                }
            ],
            "must_not": [
                {
                    "term": {
                        "birthday": "1992-12-24"
                    }
                }
            ]
        }
    }
}

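Combined queries in the Head visual tool

 Boosting specific terms

In special cases individual terms can be boosted on their own so that documents matching them rank higher.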

POST     /dsl_search/_search
{
    "query": {
        "bool": {
            "should": [
                {
                    "match": {
                        "desc": {
                            "query": "律师",
                            "boost": 18
                        }
                    }
                },
                {
                    "match": {
                        "desc": {
                            "query": "进修",
                            "boost": 2
                        }
                    }
                }
            ]
        }
    }
}

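8. Filters

A filter narrows down the hits that a search has produced. It runs in filter context, so no relevance score is computed for it, which keeps filtering cheap, and it can be combined with full-text search.
The post_filter element is a top-level element that only filters the search results. It does not compute relevance scores and does not change the score-based ordering; query does the opposite: it computes scores, and the hits are sorted by them.
Use cases:

  1. query: retrieve the records that match the user's search terms
  2. post_filter: narrow the result set after the query has run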

Hands-on: search desc for 新闻网游戏, then keep only the users whose account balance is greater than 60 and less than 1000.

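  • gte: greater than or equal to
  • lte: less than or equal to
  • gt: greater than
  • lt: less than

(Other queries such as match can also be used inside post_filter.)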

POST     /dsl_search/_search
{
    "query": {
        "match": {
            "desc": "新闻网游戏"
        }
    },
    "post_filter": {
        "range": {
            "money": {
                "gt": 60,
                "lt": 1000
            }
        }
    }
}
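
The effect of the range bounds and of filtering after scoring can be simulated in a few lines. This is a toy model: the scores in the sample hits are invented, and only the money values come from the data set.

```python
def in_range(value, gt=None, gte=None, lt=None, lte=None):
    """Check a value against the same bounds a range clause accepts."""
    if gt is not None and not value > gt:
        return False
    if gte is not None and not value >= gte:
        return False
    if lt is not None and not value < lt:
        return False
    if lte is not None and not value <= lte:
        return False
    return True

def post_filter(hits, field, **bounds):
    """Drop hits outside the range; scores and order of the rest are untouched."""
    return [hit for hit in hits if in_range(hit["_source"][field], **bounds)]

# Hypothetical scored hits, already sorted by _score as a query would return them.
hits = [
    {"_id": "1007", "_score": 2.1, "_source": {"money": 1056.8}},
    {"_id": "1001", "_score": 1.4, "_source": {"money": 88.8}},
    {"_id": "1004", "_score": 0.9, "_source": {"money": 55.8}},
]
kept = post_filter(hits, "money", gt=60, lt=1000)
print([hit["_id"] for hit in kept])  # ['1001']
```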

9. Sorting

Sorting in Elasticsearch works as in SQL: desc or asc, and several sort keys can be combined.

Hands-on:

POST     /dsl_search/_search
{
    "query": {
        "match": {
            "desc": "新闻网游戏"
        }
    },
    "post_filter": {
        "range": {
            "money": {
                "gt": 55.8,
                "lte": 155.8
            }
        }
    },
    "sort": [
        {
            "age": "desc"
        },
        {
            "money": "desc"
        }
    ]
}
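
A combined sort compares the first key and falls back to the next key only on ties. The same ordering can be reproduced locally with a tuple sort key; the four documents below reuse id, age and money values from the data set.

```python
docs = [
    {"id": 1002, "age": 19, "money": 77.8},
    {"id": 1006, "age": 19, "money": 156.8},
    {"id": 1010, "age": 30, "money": 100.8},
    {"id": 1007, "age": 19, "money": 1056.8},
]

# Equivalent of "sort": [{"age": "desc"}, {"money": "desc"}]:
# compare by age first, and by money only when ages are equal.
ordered = sorted(docs, key=lambda d: (d["age"], d["money"]), reverse=True)
print([d["id"] for d in ordered])  # [1010, 1007, 1006, 1002]
```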

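Sorting on text

A text field is analyzed into tokens, so sorting on it directly usually fails with an error. The usual approach is to add a sub-field of type keyword to the field and sort on that.

  • Create a new index (dsl_search2) and add its mappings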
POST        /dsl_search2/_mapping
{
    "properties": {
        "id": {
            "type": "long"
        },
        "nickname": {
            "type": "text",
            "analyzer": "ik_max_word",
            "fields": {
                "keyword": {
                    "type": "keyword"
                }
            }
        }
    }
}
  • Insert the data (one request per document)
POST         /dsl_search2/_doc
{
    "id": 1001,
    "nickname": "美丽的风景"
}
{
    "id": 1002,
    "nickname": "漂亮的小哥哥"
}
{
    "id": 1003,
    "nickname": "飞翔的巨鹰"
}
{
    "id": 1004,
    "nickname": "完美的天空"
}
{
    "id": 1005,
    "nickname": "广阔的海域"
}

  • Sort
POST     /dsl_search2/_search
{
    "sort": [
        {
            "nickname.keyword": "desc"
        }
    ]
}

10. Highlighting (highlight)

Highlight the matched text

POST     /dsl_search/_search
{
    "query": {
        "match": {
            "desc": "新闻网"
        }
    },
    "highlight": {
        "pre_tags": ["<tag>"],
        "post_tags": ["</tag>"],
        "fields": {
            "desc": {}
        }
    }
}

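11. prefix-fuzzy-wildcard

prefix: search by prefix

Scenario: users cannot remember a whole English word, only its first few letters.

match will not work here, because match only matches complete tokens.

In this case, use prefix.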

POST     /dsl_search/_search
{
    "query": {
        "prefix": {
            "desc": "新"
        }
    }
}

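fuzzy: fuzzy search

Fuzzy search here does not mean SQL-style pattern matching. It handles typing mistakes in the user's input: the search engine tolerates small misspellings and still tries to match the indexed data.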

POST     /dsl_search/_search
{
  "query": {
    "fuzzy": {
      "desc": "i2.chinanews.com.co"
    }
  }
}
# Or search across several fields
{
  "query": {
    "multi_match": {
      "fields": [ "desc", "nickname"],
      "query": "i2.chinaneww supor",
      "fuzziness": "AUTO"
    }
  }
}

{
  "query": {
    "multi_match": {
      "fields": [ "desc", "nickname"],
      "query": "演说",
      "fuzziness": "1"
    }
  }
}

Official documentation:

Fuzzy query | Elasticsearch Guide [8.12] | Elastic
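
Fuzziness is measured as an edit distance between the search term and an indexed term. The sketch below uses plain Levenshtein distance together with the default thresholds of "fuzziness": "AUTO" (terms of 1 to 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two). Elasticsearch by default also counts a swap of two adjacent characters as a single edit, which this simplified version does not.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance: insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def auto_fuzziness(term):
    """Edits allowed by "fuzziness": "AUTO" with its default thresholds (3 and 6)."""
    if len(term) < 3:
        return 0
    if len(term) < 6:
        return 1
    return 2

def fuzzy_match(query_term, indexed_term):
    """True when the indexed term is within the allowed number of edits."""
    return edit_distance(query_term, indexed_term) <= auto_fuzziness(query_term)

print(edit_distance("chinaneww", "chinanews"))  # 1
print(fuzzy_match("supor", "super"))            # True
print(fuzzy_match("ab", "ac"))                  # False
```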

wildcard

Wildcard (placeholder) query

  • ?: exactly one character
  • *: zero or more characters
POST     /dsl_search/_search
{
  "query": {
    "wildcard": {
      "desc": "*chinanews.com.c?"
    }
  }
}
{
    "query": {
        "wildcard": {
            "desc": "演*"
        }
    }
}

Official documentation:

Wildcard query | Elasticsearch Guide [8.12] | Elastic
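
The meaning of the two wildcard placeholders can be checked locally by translating a pattern into a regular expression. This is only a sketch of the matching rule: in Elasticsearch the pattern is compared against individual indexed terms, so whether a whole domain name is a single term depends on the analyzer.

```python
import re

def wildcard_to_regex(pattern):
    """Translate ? (one character) and * (zero or more characters) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def wildcard_match(pattern, term):
    """True when the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None

print(wildcard_match("*chinanews.com.c?", "i2.chinanews.com.cn"))  # True
print(wildcard_match("演*", "演讲"))  # True
print(wildcard_match("演*", "演"))    # True: * may match nothing
print(wildcard_match("c?", "c"))      # False: ? needs exactly one character
```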
