Elasticsearch Query Basics

Latest recommended article published 2024-04-16 10:51:16

乌索普大神

Article link: https://blog.csdn.net/ewteunafsew/article/details/51427149

The Empty Search

GET /_search
GET /_search?timeout=10ms

{
   "hits" : {
      "total" :       14,
      "hits" : [
        {
          "_index":   "us",
          "_type":    "tweet",
          "_id":      "7",
          "_score":   1,
          "_source": {
             "date":    "2014-09-17",
             "name":    "John Smith",
             "tweet":   "The Query DSL is really powerful and flexible",
             "user_id": 2
          }
       },
        ... 9 RESULTS REMOVED ...
      ],
      "max_score" :   1
   },
   "took" :           4,
   "_shards" : {
      "failed" :      0,
      "successful" :  10,
      "total" :       10
   },
   "timed_out" :      false
}
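
As a rough illustration of how a client might read the response above, here is a small Python sketch. The `summarize` helper and its output keys are hypothetical names chosen for this example, and the response literal is a trimmed copy of the JSON above with only the one hit that was shown.

```python
# Trimmed copy of the response above; only the single hit that was shown is kept.
response = {
    "hits": {
        "total": 14,
        "hits": [
            {
                "_index": "us",
                "_type": "tweet",
                "_id": "7",
                "_score": 1,
                "_source": {
                    "date": "2014-09-17",
                    "name": "John Smith",
                    "tweet": "The Query DSL is really powerful and flexible",
                    "user_id": 2,
                },
            }
        ],
        "max_score": 1,
    },
    "took": 4,
    "_shards": {"failed": 0, "successful": 10, "total": 10},
    "timed_out": False,
}


def summarize(resp):
    """Collect the headline numbers of a search response (hypothetical helper)."""
    shards = resp["_shards"]
    return {
        "total_hits": resp["hits"]["total"],      # documents that matched
        "returned": len(resp["hits"]["hits"]),    # documents in this page
        "took_ms": resp["took"],                  # request time in milliseconds
        # Results are complete only if no shard failed and the timeout was not hit.
        "complete": not resp["timed_out"] and shards["failed"] == 0,
    }


summary = summarize(response)
print(summary)
```

Checking `timed_out` and `_shards.failed` matters mostly when a `timeout` is set, as in the second request above, because the results returned may then be partial.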

Multiple Indices and Types

endpoint	meaning
/_search	Search all types in all indices
/gb/_search	Search all types in the gb index
/gb,us/_search	Search all types in the gb and us indices
/g*,u*/_search	Search all types in any indices beginning with g or beginning with u
/gb/user/_search	Search type user in the gb index
/gb,us/user,tweet/_search	Search types user and tweet in the gb and us indices
/_all/user,tweet/_search	Search types user and tweet in all indices
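
The endpoint patterns in the table can be assembled mechanically. The following Python sketch shows one way to do it; `search_path` is a hypothetical helper written for this note, not part of any Elasticsearch client.

```python
def search_path(indices=(), types=()):
    """Build a search endpoint path from index and type names.

    When types are given without any index, the path falls back to
    `_all`, matching the /_all/user,tweet/_search row in the table.
    """
    parts = []
    if indices:
        parts.append(",".join(indices))
    elif types:
        parts.append("_all")
    if types:
        parts.append(",".join(types))
    parts.append("_search")
    return "/" + "/".join(parts)


print(search_path())                                   # all types, all indices
print(search_path(["gb", "us"]))                       # two named indices
print(search_path(["g*", "u*"]))                       # wildcard index names
print(search_path(["gb", "us"], ["user", "tweet"]))    # indices and types
print(search_path(types=["user", "tweet"]))            # types in all indices
```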

Pagination

GET /_search?size=5
GET /_search?size=5&from=5
GET /_search?size=5&from=10

To understand why deep paging is problematic, let’s imagine that we are searching within a single index with five primary shards. When we request the first page of results (results 1 to 10), each shard produces its own top 10 results and returns them to the coordinating node, which then sorts all 50 results in order to select the overall top 10.

Now imagine that we ask for page 1,001, that is, results 10,001 to 10,010. Everything works in the same way except that each shard has to produce its top 10,010 results. The coordinating node then sorts through all 50,050 results and discards 50,040 of them!

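You can see that, in a distributed system, the cost of sorting results keeps growing the deeper we page: each shard must produce from + size results, and the coordinating node must sort number_of_shards × (from + size) of them. There is a good reason that web search engines don’t return more than 1,000 results for any query.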
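
The arithmetic in the paragraphs above can be checked with a short Python sketch. `paging_cost` is a hypothetical helper; it assumes 1-based page numbers, a page size of 10, and five primary shards, as in the example.

```python
def paging_cost(page, size=10, shards=5):
    """Return (from_offset, per_shard, total_sorted, discarded) for a 1-based page.

    Each shard must produce from + size results; the coordinating node
    sorts all of them and keeps only `size`.
    """
    from_offset = (page - 1) * size      # value of the `from` parameter
    per_shard = from_offset + size       # results each shard has to produce
    total_sorted = shards * per_shard    # results the coordinating node sorts
    discarded = total_sorted - size      # results thrown away afterwards
    return from_offset, per_shard, total_sorted, discarded


first_page = paging_cost(1)
deep_page = paging_cost(1001)   # from=10000, i.e. results 10,001 to 10,010
print(first_page)
print(deep_page)
```

The work per request grows in step with from + size, so a request deep in the result set costs about a thousand times more than the first page while returning the same ten documents.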

Search Lite

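There are two forms of the search API: a "lite" version, in which all search parameters are passed in the query string, and a full version, in which a JSON request body expresses the search in the Query DSL.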

GET /_all/tweet/_search?q=tweet:elasticsearch

+name:john +tweet:mary // + must match, - must not match
GET /_search?q=%2Bname%3Ajohn+%2Btweet%3Amary

// the _all field
GET /_search?q=mary

+name:(mary john) +date:>2014-09-10 +(aggregations geo)
?q=%2Bname%3A(mary+john)+%2Bdate%3A%3E2014-09-10+%2B(aggregations+geo)
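
The percent-encoded forms above do not have to be written by hand. As a sketch, Python's standard `urllib.parse.quote_plus` produces the same strings; `lite_search_url` is a hypothetical helper, and keeping parentheses unescaped (`safe="()"`) is a choice made here only so the output matches the examples above.

```python
from urllib.parse import quote_plus


def lite_search_url(query, path="/_search"):
    """Percent-encode a query-string query and append it as the q parameter."""
    # "+" becomes %2B, ":" becomes %3A, ">" becomes %3E, spaces become "+".
    # Parentheses are left as-is to match the examples in the text.
    return f"{path}?q={quote_plus(query, safe='()')}"


simple = lite_search_url("+name:john +tweet:mary")
complex_query = lite_search_url(
    "+name:(mary john) +date:>2014-09-10 +(aggregations geo)"
)
print(simple)
print(complex_query)
```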

Query DSL

GET /_search
{
    "query": YOUR_QUERY_HERE
}

The query itself has the following structure:

{
    QUERY_NAME: {
        ARGUMENT: VALUE,
        ARGUMENT: VALUE,...
    }
}

To target a specific field:

{
    QUERY_NAME: {
        FIELD_NAME: {
            ARGUMENT: VALUE,
            ARGUMENT: VALUE,...
        }
    }
}
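
To make the two templates concrete, here is a minimal Python sketch that fills them in. The helpers `query_clause` and `search_body` are hypothetical names for this note; `match_all` and `match` are used as the example queries.

```python
import json


def query_clause(query_name, field=None, **arguments):
    """Fill in the QUERY_NAME / FIELD_NAME / ARGUMENT template as a dict."""
    if field is None:
        # First template: { QUERY_NAME: { ARGUMENT: VALUE, ... } }
        return {query_name: arguments}
    # Second template: { QUERY_NAME: { FIELD_NAME: { ARGUMENT: VALUE, ... } } }
    return {query_name: {field: arguments}}


def search_body(clause):
    """Wrap a query clause in the top-level "query" parameter."""
    return {"query": clause}


everything = search_body(query_clause("match_all"))
tweets = search_body(query_clause("match", field="tweet", query="elasticsearch"))
print(json.dumps(everything))
print(json.dumps(tweets))
```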

An example

{
    "bool": {
        "must": { "match":   { "email": "business opportunity" }},
        "should": [
            { "match":       { "starred": true }},
            { "bool": {
                "must":      { "match": { "folder": "inbox" }},
                "must_not":  { "match": { "spam": true }}
            }}
        ],
        "minimum_should_match": 1
    }
}
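
The clause above is only the query part; to send it, it has to be placed under the "query" key of the request body. A minimal Python sketch of that step follows, using only the standard `json` module. The variable names are illustrative, and actually sending the body to GET /_search is left out.

```python
import json

# The compound bool clause from the example, written as a Python dict.
bool_query = {
    "bool": {
        "must": {"match": {"email": "business opportunity"}},
        "should": [
            {"match": {"starred": True}},
            {
                "bool": {
                    "must": {"match": {"folder": "inbox"}},
                    "must_not": {"match": {"spam": True}},
                }
            },
        ],
        "minimum_should_match": 1,
    }
}

# Wrap the clause in the top-level "query" parameter of the request body.
request_body = {"query": bool_query}

payload = json.dumps(request_body, indent=4)
print(payload)
```

Because compound clauses such as bool accept other clauses, including further bool clauses, as their arguments, nested dicts map onto the Query DSL directly.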