Common Elasticsearch Queries

Latest recommended article published on 2024-04-18 15:49:57

barnett_y

Original link: https://www.cnblogs.com/stellar/p/9825598.html

Query DSL
match query
{ 
    "match": { 
        "tweet": "About Search" 
    } 
}
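Note: a match query runs a full-text search against one specified field, and the query text is analyzed before matching.
For exact-value matching, prefer a filter, because filter results can be cached.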

match_phrase query
{
    "query": {
        "match_phrase": {
            "title": "quick brown fox"
        }
    }
}
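Note: unlike match, which by default returns documents containing any of the analyzed terms, match_phrase treats the query as one phrase: all terms must appear, in the same order and next to each other.
Reference: https://blog.csdn.net/liuxiao723846/article/details/78365078?locationNum=2&fps=1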
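
To make the difference between match and match_phrase concrete, here is a minimal Python sketch. It is not how Elasticsearch is implemented: the `analyze` function is a hypothetical stand-in for a real analyzer (lowercase plus whitespace split), and scoring and slop are ignored.

```python
def analyze(text):
    """Toy analyzer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


def match(field_value, query):
    """match-like behaviour: at least one query term occurs in the field."""
    field_terms = set(analyze(field_value))
    return any(term in field_terms for term in analyze(query))


def match_phrase(field_value, query):
    """match_phrase-like behaviour: query terms occur adjacent and in order."""
    field_terms = analyze(field_value)
    query_terms = analyze(query)
    n = len(query_terms)
    if n == 0:
        return False
    # Slide a window of n terms over the field and compare it to the phrase.
    return any(field_terms[i:i + n] == query_terms
               for i in range(len(field_terms) - n + 1))


title = "The quick brown fox"
print(match(title, "quick fox"))               # terms present, so match succeeds
print(match_phrase(title, "quick fox"))        # not adjacent, so the phrase fails
print(match_phrase(title, "quick brown fox"))  # adjacent and ordered
```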


multi_match query: search across multiple fields
{ 
    "multi_match": { 
        "query":    "full text search", 
        "fields":   [ "title", "body" ] 
    } 
}

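bool query
must: the clause must match, and it contributes to the score.
filter: like must, but the clause is not scored.
must_not: the clause must not match; documents that match it are excluded.
should: optional clauses; matching ones raise the score, and minimum_should_match sets how many of them have to match.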
POST _search
{
  "query": {
    "bool" : {
      "must" : {
        "term" : { "user" : "kimchy" }
      },
      "filter": {
        "term" : { "tag" : "tech" }
      },
      "must_not" : {
        "range" : {
          "age" : { "gte" : 10, "lte" : 20 }
        }
      },
      "should" : [
        { "term" : { "tag" : "wow" } },
        { "term" : { "tag" : "elasticsearch" } }
      ],
      "minimum_should_match" : 1,
      "boost" : 1.0
    }
  }
}
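
The way the four clause types combine can be sketched in plain Python. This is only an illustration of the matching logic under simplifying assumptions: scoring and boost are not modelled, and the helper names (`term`, `in_range`, `bool_matches`) and the sample document are made up for this example.

```python
def term(field, value):
    """Predicate: the field equals value (or contains it, for list fields)."""
    def check(doc):
        actual = doc.get(field)
        if isinstance(actual, list):
            return value in actual
        return actual == value
    return check


def in_range(field, gte, lte):
    """Predicate: gte <= doc[field] <= lte."""
    def check(doc):
        actual = doc.get(field)
        return actual is not None and gte <= actual <= lte
    return check


def bool_matches(doc, must=(), filter_=(), must_not=(), should=(),
                 minimum_should_match=0):
    """Return True if doc satisfies the combined clauses (no scoring)."""
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filter_):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    matched_should = sum(1 for clause in should if clause(doc))
    return matched_should >= minimum_should_match


clauses = dict(
    must=[term("user", "kimchy")],
    filter_=[term("tag", "tech")],
    must_not=[in_range("age", 10, 20)],
    should=[term("tag", "wow"), term("tag", "elasticsearch")],
    minimum_should_match=1,
)
doc = {"user": "kimchy", "tag": ["tech", "wow"], "age": 30}  # hypothetical document
print(bool_matches(doc, **clauses))
```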

prefix query: matches terms that start with the given characters
{ 
  "query": { 
    "prefix": { 
      "hostname": "wx" 
    } 
  } 
}

wildcard query: pattern matching with wildcards
{
    "query": {
        "wildcard": {
            "postcode": "W?F*HW" 
        }
    }
}
Note: ? matches any single character, and * matches zero or more characters
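
The two wildcard symbols can be tried out locally with a small Python sketch that translates a wildcard pattern into an anchored regular expression. The function names and the sample postcodes are assumptions for illustration; this mimics the matching rule, not the Elasticsearch implementation.

```python
import re


def wildcard_to_regex(pattern):
    """Translate ? and * into regex equivalents and escape everything else."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")      # exactly one character
        elif ch == "*":
            parts.append(".*")     # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wildcard_match(term, pattern):
    """True if the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None


for postcode in ["W1F7HW", "WAFHW", "W2F8UE", "WFHW"]:  # sample values
    print(postcode, wildcard_match(postcode, "W?F*HW"))
```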

regexp query: regular expression matching
{
    "query": {
        "regexp": {
            "postcode": "W[0-9].+" 
        }
    }
}

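Notes:
1. Fields and terms
When analyzed, the field value "Quick brown fox" produces
the terms "quick", "brown" and "fox"
2. The prefix, wildcard and regexp queries operate on these terms, not on the original field value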
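
The consequence of working on terms is easy to see in a short Python sketch. The `analyze` function below is a toy stand-in for an analyzer (an assumption, as are the function names), and Python's `re` syntax is used in place of the Lucene regular expression syntax, which differs in details.

```python
import re


def analyze(text):
    """Toy analyzer (assumption): lowercase the text and split it into words."""
    return re.findall(r"\w+", text.lower())


def prefix_query(field_value, prefix):
    """True if any single term of the field starts with the prefix."""
    return any(term.startswith(prefix) for term in analyze(field_value))


def regexp_query(field_value, pattern):
    """True if any single term matches the pattern in full (anchored)."""
    return any(re.fullmatch(pattern, term) for term in analyze(field_value))


value = "Quick brown fox"
print(analyze(value))
print(prefix_query(value, "qu"))       # matches the term "quick"
print(prefix_query(value, "Qu"))       # the terms are lowercase, the prefix is not
print(prefix_query(value, "quick b"))  # no single term spans two words
print(regexp_query(value, "br.+"))     # matches the term "brown"
```

Because the prefix, wildcard and regexp patterns are not analyzed themselves, a pattern that contains uppercase letters or spaces cannot match terms produced by a lowercasing, word-splitting analyzer.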
---------------------------------------------------------------------

Filter DSL
term filter: exact match
{ 
  "query": { 
    "term": { 
      "age": 26
    } 
  } 
}

terms filter: match any of several values
{ 
  "query": { 
    "terms": { 
      "status": [ 304, 302 ] 
    } 
  } 
}

range filter
{ 
    "range": { 
        "age": { 
            "gte":  20, 
            "lt":   30 
        } 
    } 
}

The range operators are:
gt :: greater than
gte:: greater than or equal to
lt :: less than
lte:: less than or equal to
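
As a quick way to check which boundary values a range includes, the four operators can be mapped onto Python's comparison functions. This is an illustrative sketch; `RANGE_OPERATORS` and `in_range` are names invented for the example.

```python
import operator

# Each range keyword corresponds to one comparison of the form value OP bound.
RANGE_OPERATORS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value, bounds):
    """True if value satisfies every bound, e.g. {"gte": 20, "lt": 30}."""
    return all(RANGE_OPERATORS[name](value, bound)
               for name, bound in bounds.items())


bounds = {"gte": 20, "lt": 30}
for age in (19, 20, 29, 30):
    print(age, in_range(age, bounds))
```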

exists/missing filter: test whether a field exists
{ 
    "exists":   { 
        "field":    "title" 
    } 
}
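
Only the exists form is shown above. The missing filter was removed in Elasticsearch 5.0, and the usual replacement is an exists clause wrapped in bool must_not. The Python helpers below are a small sketch that builds both bodies as dictionaries; the function names are invented for this example.

```python
import json


def exists_query(field):
    """Body that matches documents where the field has a value."""
    return {"exists": {"field": field}}


def missing_query(field):
    """Body that matches documents where the field has no value."""
    return {"bool": {"must_not": exists_query(field)}}


print(json.dumps({"query": missing_query("title")}, indent=2))
```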

bool filter: combines the results of multiple filter conditions with Boolean logic
{ 
    "bool": { 
        "must":     { "term": { "folder": "inbox" }}, 
        "must_not": { "term": { "tag":    "spam"  }}, 
        "should": [ 
                    { "term": { "starred": true   }}, 
                    { "term": { "unread":  true   }} 
        ] 
    } 
}
Note:
must :: all conditions must match; equivalent to AND.
must_not :: none of the conditions may match; equivalent to NOT.
should :: at least one condition must match; equivalent to OR.

must example: wrap multiple conditions in []
{
    "query": {
        "bool": {
            "must": [{
                    "term": {
                        "category": "38"
                    }
                },
                {
                    "range": {
                        "time": {
                            "gte": "1539827880",
                            "lt": "1539827881"
                        }
                    }
                }
            ],
            "must_not": [],
            "should": []
        }
    },
    "from": 0,
    "size": 10,
    "sort": [],
    "aggs": {}
}

Aggregations:

1. Select the records at a given time and compute sum(connection), where the connection field is numeric
{
    "query" : {
        "match" : {"time":1539827880}
    },
    "aggs" : {
        "connections" : { "sum" : { "field" : "connection" } }
    }
}

2. Select the data within a time range, group it by category, compute the total connection for each category, and sort the buckets by result
{
    "query": {
        "range": {
            "time": {
                "gte": 1539827880,
                "lt": 1539827881
            }
        }
    },
    "aggs": {
        "_result": {
            "terms": {
                "field": "category",
                "order": {"result":"desc"}

            },
            "aggs": {
                "result": {
                    "sum": {
                        "field": "connection"
                    }
                }
            }
        }

    }
}
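
What this terms aggregation with a sum sub-aggregation computes can be reproduced on a handful of in-memory records. The sketch below is a plain-Python emulation under stated assumptions: the sample documents are invented, the function name `terms_sum` is hypothetical, and the bucket layout only imitates the general shape of a terms aggregation result.

```python
from collections import defaultdict


def terms_sum(docs, group_field, sum_field, size=10):
    """Group docs by group_field, sum sum_field per group, sort by the sum (desc)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for doc in docs:
        key = doc[group_field]
        totals[key] += doc[sum_field]
        counts[key] += 1
    buckets = [
        {"key": key, "doc_count": counts[key], "result": {"value": totals[key]}}
        for key in totals
    ]
    # Same idea as "order": {"result": "desc"}: sort buckets by the sub-aggregation.
    buckets.sort(key=lambda bucket: bucket["result"]["value"], reverse=True)
    return buckets[:size]


docs = [  # made-up records that already passed the time range query
    {"category": "38", "connection": 5},
    {"category": "12", "connection": 2},
    {"category": "38", "connection": 4},
    {"category": "12", "connection": 1},
    {"category": "7", "connection": 20},
]
buckets = terms_sum(docs, "category", "connection")
for bucket in buckets:
    print(bucket["key"], bucket["doc_count"], bucket["result"]["value"])
```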

3. Nested aggregations
{
    "size": 0,
    "query": {
      "range": {
        "time": {
          "gte": 1539741480,
          "lte": 1539827880
        }
      }
    },
    "aggs": {
      "_result": {
        "terms": {
          "field": "category",
          "order": {
            "result": "desc"
          },
          "size": 5
        },
        "aggs": {
          "result": {
            "sum": {
              "field": "connection"
            }
          },
          "IPs": {
            "terms": {
              "field": "ip",
              "order": {
                "ip_result" : "desc"
              },
              "size": 5
            },
            "aggs": {
              "ip_result": {
                "sum": {
                  "field": "connection"
                }
              }
            }
          }
        }
      }
    }
}
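
A nested aggregation returns nested buckets, so client code usually has to walk two levels to get at the per-IP sums. The following Python sketch flattens such a result into rows. The response dictionary is a hand-written, hypothetical sample that follows the usual aggregations / buckets layout; the keys and values in it are not real data, and `flatten_buckets` is a name made up for the example.

```python
def flatten_buckets(response):
    """Turn category -> IP buckets into (category, ip, connection_sum) rows."""
    rows = []
    for category in response["aggregations"]["_result"]["buckets"]:
        for ip in category["IPs"]["buckets"]:
            rows.append((category["key"], ip["key"], ip["ip_result"]["value"]))
    return rows


sample_response = {  # hypothetical response for the query above
    "aggregations": {
        "_result": {
            "buckets": [
                {
                    "key": "38",
                    "doc_count": 3,
                    "result": {"value": 12.0},
                    "IPs": {
                        "buckets": [
                            {"key": "10.0.0.1", "doc_count": 2,
                             "ip_result": {"value": 9.0}},
                            {"key": "10.0.0.2", "doc_count": 1,
                             "ip_result": {"value": 3.0}},
                        ]
                    },
                }
            ]
        }
    }
}

for row in flatten_buckets(sample_response):
    print(row)
```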

# DE log: first select the logs with type=2, then sum repeat_times per source_address (that is, how many times logs from that source_address occurred), sorted in descending order
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "type": 2 
          }
        }
      ]
    }
  },
  "aggs": {
    "all_source_address": {
        "terms": {
            "field": "source_address",
            "order": {"repeat_times_total": "desc"}  # Note: sort by the value of the sub-aggregation
        },
        "aggs": {
            "repeat_times_total": {
                "sum": {"field": "repeat_times"}
            }
        }
    }
  }
}