Elasticsearch Query Summary

synda@hzy

Last modified 2024-06-13 09:44:57

First published 2020-05-30 12:51:40

Original link: https://xdclass.net/#/index

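Note: this article is based on the Elasticsearch course from 小d课堂 (xdclass).
All queries below were run in Kibana. This is a beginner-level write-up, so if you spot a mistake or know a more convenient approach, please leave a comment and I will correct it. Thanks.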

1. Term-level (exact) queries

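Note: do not use a term query on a field mapped as text. A text field is analyzed (tokenized and lowercased) at index time, but the value in a term query is not analyzed, so the exact value often fails to match. Use a keyword or numeric field instead.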
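
To see why, here is a simplified Python model of what happens at index time. It assumes the analyzer behaves roughly like "lowercase and split on whitespace", which is only an approximation of the standard analyzer. `analyze_text` and `term_matches` are hypothetical helpers rather than Elasticsearch APIs, and the value "Houston Rockets" is made up for illustration.

```python
def analyze_text(value):
    """Rough stand-in for the standard analyzer: lowercase and split on whitespace."""
    return value.lower().split()


def term_matches(indexed_terms, query_value):
    """A term query compares the query value with the indexed terms as-is."""
    return query_value in indexed_terms


# A text field stores the analyzed tokens; a keyword field stores the value unchanged.
text_terms = analyze_text("Houston Rockets")   # ['houston', 'rockets']
keyword_terms = ["Houston Rockets"]

print(term_matches(text_terms, "Rockets"))             # False: the indexed token is 'rockets'
print(term_matches(text_terms, "rockets"))             # True: equals one lowercased token
print(term_matches(keyword_terms, "Houston Rockets"))  # True: the exact value was kept
```

This is why the examples in this article use match for text fields and term for keyword fields.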

(1) term: exact match on a field value

Find the documents whose jerseyNo is 23

GET /nba/_search
{
  "query": {
    "term": {
      "jerseyNo": {
        "value": "23"
      }
    }
  },
  "from": 0,
  "size": 20
}

(2) exists: find documents in which the given field has a value. Here: documents whose teamNameEn field is not empty

GET /nba/_search
{
  "query": {
    "exists": {
      "field": "teamNameEn"
    }
  }
}

(3) prefix: prefix match. Here: documents whose displayNameEn field starts with Clint

GET /nba/_search
{
  "query": {
    "prefix": {
      "displayNameEn": {
        "value": "Clint"
      }
    }
  }
}

(4) wildcard: wildcard match. Here: documents whose teamConferenceEn field starts with East and ends with n

GET /nba/_search
{
  "query": {
    "wildcard": {
      "teamConferenceEn": {
        "value": "East*n"
      }
    }
  }
}

(5) regexp: regular-expression match. Here: documents whose teamNameEn starts with Ro and ends with s

GET /nba/_search
{
  "query": {
    "regexp": {
      "teamNameEn": "Ro.*s"
    }
  }
}
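
One detail worth knowing: a regexp query pattern has to match the entire term, not just a part of it. The Python sketch below approximates that behaviour with `re.fullmatch`. Python's `re` syntax is not identical to the Lucene syntax Elasticsearch uses, `regexp_matches` is a hypothetical helper, and the sample names are made up for illustration.

```python
import re

PATTERN = "Ro.*s"  # same pattern as in the query above


def regexp_matches(term, pattern=PATTERN):
    """Approximate the regexp query: the pattern must cover the whole term."""
    return re.fullmatch(pattern, term) is not None


for name in ["Rockets", "Rocketship", "Raptors"]:
    print(name, regexp_matches(name))
```

"Rocketship" contains a part that fits the pattern, but the term as a whole does not end in s, so it is not a match.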

(6) ids: query by document ID

GET /nba/_search
{
  "query": {
    "ids": {
      "values": [1,2,3,4]
    }
  }
}

2. Range queries on times, dates, numbers, and strings

(1) Find players with 2 to 20 years of playing experience (playYear)

GET /nba/_search
{
  "query": {
    "range": {
      "playYear": {
        "gte": 2,
        "lte": 20
      }
    }
  }
}

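(2) Find players born between 1989 and 1999

Note: pay attention to the date format here and adjust format to match your values. Epoch milliseconds can also be used.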

GET /nba/_search
{
  "query": {
    "range": {
      "birthDay": {
        "gte": 1989,
        "lte": 1999,
        "format": "dd/MM/yyyy||yyyy"
      }
    }
  }
}
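
For the millisecond variant mentioned in the note, the boundaries can be computed as below. This is a minimal sketch that assumes birthDay is a date field and that the dates are interpreted as UTC. `to_epoch_millis` is a hypothetical helper, and the half-open interval (gte / lt) is my own choice for covering whole years.

```python
from datetime import datetime, timezone


def to_epoch_millis(year, month=1, day=1):
    """Milliseconds since 1970-01-01 UTC for midnight of the given date."""
    dt = datetime(year, month, day, tzinfo=timezone.utc)
    return int(dt.timestamp()) * 1000


# Half-open interval: from 1989-01-01 (inclusive) to 2000-01-01 (exclusive).
range_body = {
    "query": {
        "range": {
            "birthDay": {
                "gte": to_epoch_millis(1989),
                "lt": to_epoch_millis(2000),
                "format": "epoch_millis",
            }
        }
    }
}

print(range_body["query"]["range"]["birthDay"])
```

The resulting dictionary has the same shape as the request body above and can be pasted into Kibana after converting it to JSON.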

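3. Bool queries

Clause	Meaning
must	Must match
should	Should match
must_not	Must not match
filter	Must match, but does not affect the score

(1) Start with must: find players whose English name matches james.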

GET /nba/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "displayNameEn": "james"
          }
        }
      ]
    }
  }
}

must is an array, so it can hold several conditions. For example, to find players whose name matches james and whose teamConferenceEn is Western: displayNameEn is mapped as text, so it uses match, while teamConferenceEn is mapped as keyword, so it uses term.

GET /nba/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "displayNameEn": "james"
          }
        },
        {
          "term": {
            "teamConferenceEn": {
              "value": "Western"
            }
          }
        }
      ]
    }
  }
}

(2) must_not is the opposite of must: documents that match a must_not clause are excluded. For example, find players whose name matches james and who are not in the Eastern conference:

GET /nba/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "displayNameEn": "james"
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "teamConferenceEn": {
              "value": "Eastern"
            }
          }
        }
      ]
    }
  }
}

(3) filter: works much like must, but the clause is not scored

GET /nba/_search
{
  "query": {
    "bool": {
      "filter": {
        "match":{
          "displayNameEn":"james"
        }
      }
    }
  }
}

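(4) should
Note that a should clause does not filter the results when the bool query also contains must or filter clauses. In that case minimum_should_match defaults to 0, so should only raises the score of the documents that match it. The query below looks as if it returns only players with 11 to 20 years of playing experience, but without minimum_should_match it also returns players outside that range. Setting minimum_should_match to 1 requires at least one should clause to match, so only players within the range come back. For more on minimum_should_match, see the article "elasticsearch中minimum_should_match的一些理解" (Some notes on minimum_should_match in Elasticsearch).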

GET /nba/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "displayNameEn": "james"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "teamConferenceEn": "Eastern"
          }
        }
      ],
      "should": [
        {"range": {
          "playYear": {
            "gte": 11,
            "lte": 20
          }
        }}
      ],
      "minimum_should_match": 1
    }
  }
}
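
The behaviour described above can be modelled in a few lines of Python. This is a simplified sketch, not how Elasticsearch is implemented: it ignores scoring and filter clauses, each clause is just a predicate on a dictionary, and `bool_matches`, the predicates, and the two sample players are all hypothetical.

```python
def bool_matches(doc, must=(), must_not=(), should=(), minimum_should_match=None):
    """Simplified model of how a bool query decides whether a document matches."""
    if minimum_should_match is None:
        # Default: 0 when must clauses are present, otherwise 1 if there are should clauses.
        minimum_should_match = 0 if must else (1 if should else 0)
    if not all(clause(doc) for clause in must):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    matched_should = sum(1 for clause in should if clause(doc))
    return matched_should >= minimum_should_match


def is_james(doc):
    return "james" in doc["displayNameEn"].lower().split()


def is_eastern(doc):
    return doc["teamConferenceEn"] == "Eastern"


def plays_11_to_20(doc):
    return 11 <= doc["playYear"] <= 20


# Hypothetical documents, not taken from the real nba index.
rookie = {"displayNameEn": "Rookie James", "teamConferenceEn": "Western", "playYear": 3}
veteran = {"displayNameEn": "Veteran James", "teamConferenceEn": "Western", "playYear": 15}

clauses = dict(must=[is_james], must_not=[is_eastern], should=[plays_11_to_20])
print(bool_matches(rookie, **clauses))                           # True: should is optional
print(bool_matches(rookie, minimum_should_match=1, **clauses))   # False: range now required
print(bool_matches(veteran, minimum_should_match=1, **clauses))  # True
```

The first call shows the surprising case: the player with 3 years of experience still matches because the should clause is optional by default.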

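4. Sorting

Sorting uses the sort parameter, which accepts multiple fields.
For example: find the Rockets players, sorted by years played in descending order and then by height from tallest to shortest.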

GET /nba/_search
{
  "query": {
    "match": {
      "teamNameEn": "Rockets"
    }
  },
  "sort": [
    {
      "playYear": {
        "order": "desc"
      }
    },
    {
      "heightValue": {
        "order": "desc"
      }
    }
  ]
}

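4. Aggregation queries

The aggs keyword marks an aggregation request.
avgAge is a user-defined name for the aggregation.
avg is the type of calculation; others include sum, max, and min.

(1) Single-value metrics: sum, average, maximum, minimum, and so on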

GET /nba/_search
{
  "query": {
    "term": {
       "teamNameEn": {
        "value": "Rockets"
      }
    }
  },
  "aggs": {
    "avgAge": {
      "avg": {
        "field": "age"
      }
    }
  },
  "size": 0
}

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 21,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "avgAge" : { // the average
      "value" : 26.761904761904763
    }
  }
}

(2) value_count: count the documents in which the given field has a value

Count the players whose age is not empty

GET /nba/_search
{
  "query": {
    "term": {
       "teamNameEn": {
        "value": "Rockets"
      }
    }
  },
  "aggs": {
    "countBy": {
      "value_count": {
        "field": "age"
      }
    }
  },
  "size": 0
}

Returns 21:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 21,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "countBy" : {
      "value" : 21
    }
  }
}

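(3) cardinality: distinct count

Find how many different ages there are among the Rockets players.
The cardinality aggregation counts players of the same age only once.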

GET /nba/_search
{
   "query": {
    "term": {
       "teamNameEn": {
        "value": "Rockets"
      }
    }
  },
  "aggs": {
    "countBy": {
      "cardinality": {
        "field": "age"
      }
    }
  },
  "size": 0
}

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 21,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "countBy" : {
      "value" : 13
    }
  }
}

5. Getting the average, sum, and other metrics in one request with stats


GET /nba/_search
{
  "query": {
    "term": {
      "teamNameEn": {
        "value": "Rockets"
      }
    }
  },
  "aggs": {
    "all": {
      "stats": {
        "field": "age"
      }
    }
  },
  "size": 0
}

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 21,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "all" : {
      "count" : 21,
      "min" : 21.0,
      "max" : 37.0,
      "avg" : 26.761904761904763,
      "sum" : 562.0
    }
  }
}

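Using extended_stats instead of stats adds the sum of squares, the variance, the standard deviation, and the interval of the mean plus or minus two standard deviations to the response.
Response: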

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 21,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "all" : {
      "count" : 21,
      "min" : 21.0,
      "max" : 37.0,
      "avg" : 26.761904761904763,
      "sum" : 562.0,
      "sum_of_squares" : 15534.0,
      "variance" : 23.5147392290249,
      "std_deviation" : 4.84919985451465,
      "std_deviation_bounds" : {
        "upper" : 36.46030447093406,
        "lower" : 17.063505052875463
      }
    }
  }
}
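
The extra fields can be recomputed from count, sum, and sum_of_squares, which is a handy way to check how they relate to each other. The sketch below uses only the numbers from the response above and assumes the variance is the population variance, which is what these numbers are consistent with.

```python
import math

# Values copied from the response above.
count = 21
total = 562.0
sum_of_squares = 15534.0

avg = total / count
variance = sum_of_squares / count - avg ** 2  # population variance
std_deviation = math.sqrt(variance)
upper = avg + 2 * std_deviation               # std_deviation_bounds.upper
lower = avg - 2 * std_deviation               # std_deviation_bounds.lower

print(avg, variance, std_deviation, lower, upper)
```

The printed values agree with avg, variance, std_deviation, and std_deviation_bounds in the response, up to floating-point rounding.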

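percentiles: computes the values below which given percentages of the field's values fall

field: the field to aggregate on
percents: the percentiles to compute; these can be customized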

GET /nba/_search
{
   "query": {
    "term": {
      "teamNameEn": {
        "value": "Rockets"
      }
    }
  },
  "aggs": {
    "all": {
      "percentiles": {
        "field": "age",
        "percents": [
          1,
          5,
          25,
          50,
          75,
          95,
          99
        ]
      }
    }
  },
  "size": 0
}

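How to read the response:
1% of the players are younger than 21
75% are younger than 30.25
And so on. Put the other way round: the age below which 75% of the players fall is 30.25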

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 21,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "all" : {
      "values" : {
        "1.0" : 21.0,
        "5.0" : 21.0,
        "25.0" : 22.75,
        "50.0" : 25.0,
        "75.0" : 30.25,
        "95.0" : 35.349999999999994,
        "99.0" : 37.0
      }
    }
  }
}
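
To make the idea of a percentile concrete, here is a small Python sketch using linear interpolation over a made-up list of ages (not the real nba data). Elasticsearch computes percentiles with an approximate algorithm, so its numbers can differ from this simple model. `percentile` is a hypothetical helper.

```python
import math


def percentile(values, p):
    """Value below which roughly p percent of the values fall (linear interpolation)."""
    ordered = sorted(values)
    rank = (p / 100) * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


ages = [21, 23, 25, 30, 37]  # hypothetical sample
for p in (25, 50, 75):
    print(p, percentile(ages, p))
```

The 50th percentile is the median of the sample, and the 25th and 75th mark the lower and upper quarters.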

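6. query_string queries

If you are familiar with Lucene query syntax, a query_string query lets you write the query directly as a string in that syntax. When ES receives the request, its query parser turns the string into the corresponding query.
It can also search across multiple fields, which is not covered here.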

GET /nba/_search
{
  "query": {
    "query_string": {
      "default_field": "FIELD",
      "query": "this AND that OR thus"
    }
  }
}

My own use case:
one of my fields stores an array, for example

PUT /hzy
POST /hzy/_mapping
{
  "properties":{
    "labelIds":{
      "type":"text"
    }
  }
}
GET /hzy/_mapping


POST /hzy/_doc/1
{
  "labelIds":["13124312","1234213","12322134213","123423432423"]
}

GET /hzy/_search
{
  "query": {
    "query_string": {
      "default_field": "labelIds",
      "query": "13124312 OR 1234213 OR 12322134213"
    }
  }
}
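
When the list of IDs comes from application code, the query string can be assembled as in the sketch below. `build_label_query` is a hypothetical helper. It assumes the IDs are plain digits, so nothing needs escaping, and it joins them with upper-case OR because query_string only treats upper-case AND, OR, and NOT as operators.

```python
def build_label_query(label_ids, field="labelIds"):
    """Build a query_string body that matches any of the given label IDs."""
    # Operators must be upper case; a lower-case "or" would be searched as a word.
    query = " OR ".join(label_ids)
    return {"query": {"query_string": {"default_field": field, "query": query}}}


body = build_label_query(["13124312", "1234213", "12322134213"])
print(body["query"]["query_string"]["query"])
```

The printed string is the same query text as in the request above.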

7. Analyzing text (tokenization)

GET /index_mapping/_analyze
{
  "field": "realname",
  "text":"imooc is very good"
}

{
  "tokens" : [
    {
      "token" : "imooc",
      "start_offset" : 0,
      "end_offset" : 5,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "is",
      "start_offset" : 6,
      "end_offset" : 8,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "very",
      "start_offset" : 9,
      "end_offset" : 13,
      "type" : "<ALPHANUM>",
      "position" : 2
    },
    {
      "token" : "good",
      "start_offset" : 14,
      "end_offset" : 18,
      "type" : "<ALPHANUM>",
      "position" : 3
    }
  ]
}

8. Updating the documents matched by a query: _update_by_query

POST /synda_receive/_update_by_query
{
   "query": {
    "ids": {
      "values": ["1669241655485595650"]
    }
  },
  "script":{
    "source": "ctx._source.docTitle='关于端午节放假的通知222222';ctx._source.qianFaRen='111----'"
  }
}