1 Data preparation
PUT student_index
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"properties": {
"birthday": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"age": {
"type": "integer"
},
"desc": {
"type": "text"
},
"nationality": {
"type": "keyword"
}
}
}
}
POST /student_index/_doc
{
"name":"Kobe Bryant",
"age":18,
"desc":"superstar",
"birthday":"1978-08-23",
"nationality":"America"
}
POST /student_index/_doc
{
"name":"LeBron James",
"age":18,
"desc":"superstar",
"birthday":"1984-12-30",
"nationality":"America"
}
POST /student_index/_doc
{
"name":"Michael Jordan",
"age":18,
"desc":"superstar",
"birthday":"1963-02-17",
"nationality":"America"
}
POST /student_index/_doc
{
"name":"James Harden",
"age":18,
"desc":"superstar",
"birthday":"1989-08-26",
"nationality":"America"
}
POST /student_index/_doc
{
"name":"姚明",
"age":18,
"desc":"中国篮球杰出贡献奖",
"birthday":"1980-09-12",
"nationality":"China"
}
POST /student_index/_doc
{
"name":"Giannis Antetokounmpo",
"age":18,
"desc":"superstar",
"birthday":"1994-12-06",
"nationality":"The Greek"
}
2 match
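The match query is a high-level (full-text) query; how it behaves depends on the type of the field being queried.
- If the field is a date or numeric field, the query string is converted to a date or number before matching.
- If the field cannot be analyzed (keyword), match does not tokenize the query string; the whole string must equal the stored value.
- If the field is analyzed (text), match runs the query string through the field's analyzer and looks the resulting tokens up in the inverted index; hits are scored by relevance.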
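The difference between the keyword and text cases can be modelled in a few lines of Python. This is only a rough sketch, not how Elasticsearch is implemented: `analyze` is a hypothetical stand-in for the standard analyzer (it lowercases and splits on non-word characters, and does not reproduce the real analyzer's handling of CJK text), and `match_text` / `match_keyword` are illustrative names.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())

def match_text(field_value, query):
    """match on a text field: both sides are analyzed; one shared token is enough (default operator is OR)."""
    return bool(set(analyze(field_value)) & set(analyze(query)))

def match_keyword(field_value, query):
    """match on a keyword field: nothing is analyzed, so the whole value must be equal."""
    return field_value == query

names = ["Kobe Bryant", "LeBron James", "Michael Jordan", "James Harden"]
print([n for n in names if match_text(n, "James")])  # ['LeBron James', 'James Harden']
print(match_keyword("The Greek", "The Greek"))        # True
print(match_keyword("The Greek", "Greek"))            # False
```

Under this model, "James" finds both names that contain the token james, while the keyword value "The Greek" is only found by the complete string.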
# Analyzed query on a text field
GET student_index/_search
{
"query": {
"match": {
"name": "James"
}
}
}
# Query on a keyword field
GET student_index/_search
{
"query": {
"match": {
"nationality": "The Greek"
}
}
}
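3 term (exact match)
A term query means an exact match: the search keyword is not analyzed before the search. For example, the keyword 手机 ("mobile phone") is not split into 手 and 机; it is looked up as-is among the terms in the inverted index. Because text fields are analyzed at index time, a term query on a text field only matches when the value equals one of the indexed tokens.
# Exact match against the value of a keyword field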
GET student_index/_search
{
"query": {
"term": {
"nationality": {
"value": "The Greek"
}
}
}
}
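# Exact match against a text field: no hits, because the indexed tokens are "michael" and "jordan", not "Michael Jordan"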
GET student_index/_search
{
"query": {
"term": {
"name": {
"value": "Michael Jordan"
}
}
}
}
# Inspect the tokens the standard analyzer produces for the name
GET _analyze
{
"analyzer": "standard",
"text":"Michael Jordan"
}
# A term query for one of the lowercased tokens does match
GET /student_index/_search
{
"query": {
"term": {
"name": {
"value": "michael"
}
}
}
}
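Why the first term query on name returns nothing while the last one hits can be seen with a toy inverted index. This is a minimal sketch under stated assumptions: `analyze` is a hypothetical simplification of the standard analyzer, and `build_index` / `term_query` are illustrative helpers, not Elasticsearch APIs.

```python
import re
from collections import defaultdict

def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())

def build_index(docs):
    """Map every analyzed token to the ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, name in docs.items():
        for token in analyze(name):
            index[token].add(doc_id)
    return index

def term_query(index, value):
    """term lookup: the value is used as-is, with no analysis."""
    return sorted(index.get(value, set()))

docs = {1: "Kobe Bryant", 2: "LeBron James", 3: "Michael Jordan", 4: "James Harden"}
index = build_index(docs)
print(term_query(index, "Michael Jordan"))  # [] : no such token in the index
print(term_query(index, "michael"))         # [3]
print(term_query(index, "james"))           # [2, 4]
```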
4 filter
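A filter selects documents by your conditions without computing a relevance score, and Elasticsearch caches the results of frequently used filters so that later lookups are faster.
Because no score is computed, a filter is generally cheaper than the equivalent scoring query.
# Students born between 1985-01-01 and 2022-01-01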
GET /student_index/_search
{
"query": {
"bool": {
"filter": {
"range": {
"birthday": {
"gte": "1985-01-01",
"lte": "2022-01-01"
}
}
}
}
}
}
# Superstars born between 1985-01-01 and 2022-01-01
GET /student_index/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"desc": "superstar"
}
}
],
"filter": {
"range": {
"birthday": {
"gte": "1985-01-01",
"lte": "2022-01-01"
}
}
}
}
}
}
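5 bool query
A bool query is a compound query that combines several query clauses with boolean logic:
- must: returned documents must match the clause, and it contributes to the score.
- filter: returned documents must match the clause, but no relevance score is computed for it.
- should: the clause should match (OR semantics); a document that matches it scores higher. When the bool query has no must or filter clause, at least one should clause has to match.
- must_not: returned documents must not match the clause; it is not scored either.
# Superstars born between 1985-01-01 and 2022-01-01 whose nationality is not America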
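The way the four clause types combine can be sketched in Python over a few of the sample documents. This is an illustration only: `bool_query` is a hypothetical helper, clauses are plain predicates, and the score (one point per matching must or should clause) is a toy assumption that has nothing to do with Elasticsearch's real relevance scoring.

```python
from datetime import date

# Four of the sample documents, reduced to the fields the queries use.
DOCS = [
    {"name": "Kobe Bryant", "desc": "superstar", "birthday": date(1978, 8, 23), "nationality": "America"},
    {"name": "James Harden", "desc": "superstar", "birthday": date(1989, 8, 26), "nationality": "America"},
    {"name": "姚明", "desc": "中国篮球杰出贡献奖", "birthday": date(1980, 9, 12), "nationality": "China"},
    {"name": "Giannis Antetokounmpo", "desc": "superstar", "birthday": date(1994, 12, 6), "nationality": "The Greek"},
]

def is_superstar(doc):
    return doc["desc"] == "superstar"

def born_1985_to_2022(doc):
    return date(1985, 1, 1) <= doc["birthday"] <= date(2022, 1, 1)

def is_american(doc):
    return doc["nationality"] == "America"

def bool_query(docs, must=(), filter_=(), should=(), must_not=()):
    """Return (name, toy_score) for every document that passes the bool logic."""
    hits = []
    for doc in docs:
        required = list(must) + list(filter_)
        if not all(clause(doc) for clause in required):
            continue  # must and filter are both mandatory
        if any(clause(doc) for clause in must_not):
            continue  # must_not excludes the document
        should_hits = sum(1 for clause in should if clause(doc))
        if should and not required and should_hits == 0:
            continue  # without must/filter, at least one should clause has to match
        # Toy score (assumption): only must and should clauses add points, filter adds none.
        hits.append((doc["name"], len(must) + should_hits))
    return hits

print(bool_query(DOCS, must=[is_superstar], filter_=[born_1985_to_2022], must_not=[is_american]))
# [('Giannis Antetokounmpo', 1)]
```

Dropping the must_not clause from the call adds James Harden to the result, which mirrors the superstar-plus-date-range request in section 4.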
GET /student_index/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"desc": "superstar"
}
}
],
"must_not": [
{
"term": {
"nationality": {
"value": "America"
}
}
}
],
"filter": {
"range": {
"birthday": {
"gte": "1985-01-01",
"lte": "2022-01-01"
}
}
}
}
}
}
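6 wildcard query
Concept: wildcard operators are placeholders that stand in for characters in a pattern.
- ?: matches any single character
- *: matches zero or more characters
Note: on a text field the pattern is matched against the individual terms in the inverted index; on the keyword sub-field (name.keyword) it is matched against the entire original value.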
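The two placeholders, and the difference between matching a term and matching a whole value, can be tried out with a small Python model. It is a sketch only: `wildcard_match` is a hypothetical helper that translates the pattern to a regular expression, and the lowercase terms are what a standard-analyzed text field is assumed to hold for "LeBron James".

```python
import re

def wildcard_to_regex(pattern):
    """Translate ? (exactly one character) and * (zero or more characters) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def wildcard_match(pattern, value):
    """The pattern has to cover the whole value, not just a part of it."""
    return wildcard_to_regex(pattern).fullmatch(value) is not None

text_terms = ["lebron", "james"]   # assumed index terms of the text field
keyword_value = "LeBron James"     # original value kept by name.keyword

print(any(wildcard_match("jam*s", t) for t in text_terms))  # True: the term "james" matches
print(wildcard_match("jam*s", keyword_value))               # False: whole value, case-sensitive
print(wildcard_match("LeBron Jam*s", keyword_value))        # True
print(wildcard_match("姚*", "姚明"))                          # True
```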
# text field: the pattern is compared with single terms, so it matches the term "james"
GET student_index/_search
{
"query": {
"wildcard": {
"name": {
"value": "jam*s"
}
}
}
}
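# keyword sub-field: the pattern must cover the whole case-sensitive value, so "jam*s" finds nothing
GET student_index/_search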
{
"query": {
"wildcard": {
"name.keyword": {
"value": "jam*s"
}
}
}
}
# keyword sub-field: matches the whole value 姚明
GET student_index/_search
{
"query": {
"wildcard": {
"name.keyword": {
"value": "姚*"
}
}
}
}
# keyword sub-field: matches the whole value "LeBron James"
GET student_index/_search
{
"query": {
"wildcard": {
"name.keyword": {
"value": "LeBron Jam*s"
}
}
}
}
7 fuzzy
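A fuzzy query lets us type an approximation of a term, and ES returns terms that roughly match it.
It therefore tolerates typos in the keyword, but it is less precise and less stable than other queries: too many typos can still produce no results, so use it with discretion.
fuzziness: the maximum edit distance (0, 1 or 2). Bigger is not always better: recall goes up but precision goes down. It can also be set to AUTO, in which case ES picks the fuzziness from the length of the keyword (0 edits for 1-2 characters, 1 for 3-5, 2 for longer terms). If fuzziness is omitted in a fuzzy query, it defaults to AUTO.
match also supports fuzziness; the difference is that match analyzes the query text first, while fuzzy does not.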
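Edit distance and the AUTO rule can be made concrete with a short Python sketch. It is not Elasticsearch's implementation: `edit_distance` is a textbook dynamic-programming version that counts insertions, deletions, substitutions and swaps of two adjacent characters as one edit each, and `auto_fuzziness` is a hypothetical helper encoding the default AUTO thresholds.

```python
def edit_distance(a, b):
    """Edits needed to turn a into b (insert, delete, substitute, swap two adjacent characters)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[-1][-1]

def auto_fuzziness(term):
    """Allowed edits under AUTO: 0 for 1-2 characters, 1 for 3-5, 2 for longer terms."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2

print(edit_distance("jjmes", "james"))  # 1: one substitution
print(edit_distance("jmaes", "james"))  # 1: one adjacent swap
print(edit_distance("James", "james"))  # 1: case counts, since fuzzy does not analyze its input
print(auto_fuzziness("james"))          # 1
```

With a limit of 1, the misspelled token jjmes is still close enough to the indexed term james, which is the effect the match example with "LeBron Jjmes" relies on.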
# fuzzy query with a fixed maximum edit distance of 1
GET student_index/_search
{
"query": {
"fuzzy": {
"name": {
"value": "James",
"fuzziness": 1
}
}
}
}
# fuzzy query that lets ES choose the edit distance from the term length
GET student_index/_search
{
"query": {
"fuzzy": {
"name": {
"value": "James",
"fuzziness": "AUTO"
}
}
}
}
# match with fuzziness: the query text is analyzed first, then each token is matched fuzzily
GET student_index/_search
{
"query": {
"match": {
"name": {
"query": "LeBron Jjmes",
"fuzziness": 1
}
}
}
}