3. Elasticsearch Complex Search

Latest recommended article published on 2023-08-28 19:23:33

__IProgrammer


Original link: https://www.elastic.co/guide/cn/elasticsearch/guide/cn/distributed-cluster.html

Complex Search

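Search for employees whose last name is Smith and who are older than 30. The age condition goes in a filter clause, which executes structured queries efficiently.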

curl -X GET "localhost:9200/megacorp/employee/_search" -H 'Content-Type: application/json' -d'
{
    "query" : {
        "bool": {
            "must": {
                "match" : {
                    "last_name" : "smith" 
                }
            },
            "filter": {
                "range" : {
                    "age" : { "gt" : 30 } 
                }
            }
        }
    }
}
'

The range filter in the query above finds documents whose age is greater than 30.
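The roles of the two clauses can be sketched in plain Python. This is a minimal in-memory illustration, not how Elasticsearch executes the query: the `employees` list holds only the two employee documents that appear in the search results later in this article, and `build_query` and `bool_search` are hypothetical helpers. The `must` clause selects documents by last name, while the `filter` clause only includes or excludes documents.

```python
import json

# Two employee documents taken from the search results shown in this article.
employees = [
    {"first_name": "John", "last_name": "Smith", "age": 25},
    {"first_name": "Jane", "last_name": "Smith", "age": 32},
]


def build_query(last_name, min_age):
    """Build the same request body as the curl example (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                "must": {"match": {"last_name": last_name}},
                "filter": {"range": {"age": {"gt": min_age}}},
            }
        }
    }


def bool_search(docs, query):
    """Toy evaluation of a bool query with one match and one range clause."""
    bool_clause = query["query"]["bool"]
    wanted = bool_clause["must"]["match"]["last_name"].lower()
    min_age = bool_clause["filter"]["range"]["age"]["gt"]
    return [
        doc for doc in docs
        if doc["last_name"].lower() == wanted  # must: the match clause
        and doc["age"] > min_age               # filter: a yes/no test only
    ]


query = build_query("smith", 30)
print(json.dumps(query, indent=2))  # body that could be passed to curl -d
results = bool_search(employees, query)
print([doc["first_name"] for doc in results])
```

With these two documents, only Jane Smith (age 32) passes both clauses; John Smith matches the last name but is excluded by the range filter.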

Relevance Score

Search for all employees who enjoy rock climbing:

curl -X GET "localhost:9200/megacorp/employee/_search" -H 'Content-Type: application/json' -d'
{
    "query" : {
        "match" : {
            "about" : "rock climbing"
        }
    }
}
'

As before, we use a match query, this time searching the about field for "rock climbing". It returns two matching documents:

{
   ...
   "hits": {
      "total":      2,
      "max_score":  0.16273327,
      "hits": [
         {
            ...
            "_score":         0.16273327, 
            "_source": {
               "first_name":  "John",
               "last_name":   "Smith",
               "age":         25,
               "about":       "I love to go rock climbing",
               "interests": [ "sports", "music" ]
            }
         },
         {
            ...
            "_score":         0.016878016, 
            "_source": {
               "first_name":  "Jane",
               "last_name":   "Smith",
               "age":         32,
               "about":       "I like to collect rock albums",
               "interests": [ "music" ]
            }
         }
      ]
   }
}

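By default, Elasticsearch sorts results by relevance score, that is, by how well each document matches the query. The top-scoring result is obvious: John Smith's about field clearly says "rock climbing".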

But why was Jane Smith returned as well? Because her about field mentions "rock". Since it contains only "rock" and not "climbing", her relevance score is lower than John's.

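This is a good example of how Elasticsearch searches full-text fields and returns the most relevant results first. The concept of relevance is central to Elasticsearch and sets it apart from traditional relational databases, where a record either matches or does not.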
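The ranking idea can be illustrated with a toy score. The sketch below is an assumption-laden simplification and not Elasticsearch's actual scoring formula: `toy_score` is a hypothetical function that simply returns the fraction of query terms found in a field. It uses the two about values from the result above.

```python
import re

# The about fields of the two documents returned above.
about = {
    "John Smith": "I love to go rock climbing",
    "Jane Smith": "I like to collect rock albums",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def toy_score(query, text):
    """Fraction of query terms found in the text (not Elasticsearch's formula)."""
    query_terms = tokenize(query)
    text_terms = set(tokenize(text))
    matched = sum(1 for term in query_terms if term in text_terms)
    return matched / len(query_terms)


scores = {name: toy_score("rock climbing", text) for name, text in about.items()}
# Keep only documents that match at least one term, best score first.
ranked = sorted(
    ((name, score) for name, score in scores.items() if score > 0),
    key=lambda item: item[1],
    reverse=True,
)
print(ranked)
```

Both documents are kept because each contains at least one query term, but John's field contains both terms and therefore ranks first, which mirrors the ordering in the real response.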

Phrase Search

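Finding individual words in a field is fine, but sometimes you want to match an exact sequence of words, that is, a phrase. For example, we may want a query that matches only employee records containing both "rock" and "climbing", with the two words adjacent in the form of the phrase "rock climbing".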

To do this, use a slight variation of the match query called match_phrase:

GET /megacorp/employee/_search
{
    "query" : {
        "match_phrase" : {
            "about" : "rock climbing"
        }
    }
}

As expected, only John Smith's document is returned.
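The difference between matching terms and matching a phrase can be sketched as follows. This is a simplified illustration under the assumption that a phrase matches when its tokens appear consecutively and in order; `matches_phrase` is a hypothetical helper, not part of any Elasticsearch client.

```python
import re

# The about fields of the two employee documents used in this article.
about = {
    "John Smith": "I love to go rock climbing",
    "Jane Smith": "I like to collect rock albums",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def matches_phrase(phrase, text):
    """True when the phrase tokens appear consecutively and in order."""
    phrase_terms = tokenize(phrase)
    text_terms = tokenize(text)
    width = len(phrase_terms)
    return any(
        text_terms[start:start + width] == phrase_terms
        for start in range(len(text_terms) - width + 1)
    )


phrase_hits = [
    name for name, text in about.items()
    if matches_phrase("rock climbing", text)
]
print(phrase_hits)
```

Jane's field contains "rock" but not followed by "climbing", so she drops out of the phrase results.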

Highlighted Search

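Many applications highlight snippets of text in each search result so that users can see why the document matched the query. Retrieving highlighted fragments is also easy in Elasticsearch.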

Add a highlight parameter to a match query. The example here searches the name field of a goods index:

curl "http://localhost:9200/marker/goods/_search?pretty=true" -H 'Content-Type: application/json' -d '{"query":{"match":{"name":"Apple"}},"highlight":{"fields":{"name":{}}}}'

The result is as follows:

"hits" : [ {
      "_index" : "meiduo",
      "_type" : "modelresult",
      "_id" : "5",
      "_score" : 0.3304931,
      "_source" : {
        "django_id" : "5",
        "name" : "Apple iPhone 8 Plus (A1864) 64GB 深空灰色 移动联通电信4G手机",
        "comments" : 0,
        "price" : "6688.00",
        "id" : 5,
        "default_image_url" : "http://image.meiduo.site:8888/group1/M00/00/02/CtM3BVrRa8iAZdz1AAFZsBqChgk2188464",
        "django_ct" : "goods.sku",
        "text" : "Apple iPhone 8 Plus (A1864) 64GB 深空灰色 移动联通电信4G手机\n5"
      },
      "highlight" : {
        "name" : [ "<em>Apple</em> iPhone 8 Plus (A1864) 64GB 深空灰色 移动联通电信4G手机" ]
      }
    }

The result contains an additional highlight section, in which the matched terms are wrapped in em tags.
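The effect of highlighting can be reproduced locally with a few lines of Python. This is only a sketch of the visible behavior, assuming simple word-level matching: the `highlight` function and its `pre_tag` and `post_tag` parameters are hypothetical names, and the input is a shortened version of the product name from the result above.

```python
import re


def highlight(text, query, pre_tag="<em>", post_tag="</em>"):
    """Wrap every word of text that matches a query term in the given tags."""
    terms = {term.lower() for term in re.findall(r"\w+", query)}

    def wrap(match):
        word = match.group(0)
        # Compare case-insensitively, but keep the original casing in the output.
        if word.lower() in terms:
            return f"{pre_tag}{word}{post_tag}"
        return word

    return re.sub(r"\w+", wrap, text)


name = "Apple iPhone 8 Plus (A1864) 64GB"
highlighted = highlight(name, "Apple")
print(highlighted)
```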

Analytics (Aggregations)

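Managers also need to run analytics on the employee directory. Elasticsearch has a feature called aggregations, which lets us generate fine-grained analytics over the data. It is similar to GROUP BY in SQL, but more powerful.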

Find the distribution of all prices. The aggregation is named all_price and groups documents by the price field:

curl -X GET "localhost:9200/marker/goods/_search?pretty=true" -H 'Content-Type: application/json' -d'
{
  "aggs": {
    "all_price": {
      "terms": { "field": "price" }
    }
  }
}
'

The result is as follows:

"aggregations" : {
    "all_price" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ {
        "key" : "3388.00",
        "doc_count" : 4
      }, {
        "key" : "3788.00",
        "doc_count" : 4
      }, {
        "key" : "7988.00",
        "doc_count" : 3
      }, {
        "key" : "6688.00",
        "doc_count" : 2
      }, {
        "key" : "11388.00",
        "doc_count" : 1
      }, {
        "key" : "11398.00",
        "doc_count" : 1
      }, {
        "key" : "6499.00",
        "doc_count" : 1
      } ]
    }
  }
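Conceptually, a terms aggregation counts how many documents share each value of a field. The sketch below imitates that with `collections.Counter`; it is an in-memory illustration only. The `prices` list is rebuilt from the bucket counts in the result above, and `terms_aggregation` is a hypothetical helper.

```python
from collections import Counter

# Price values rebuilt from the bucket counts in the result above.
prices = (
    ["3388.00"] * 4
    + ["3788.00"] * 4
    + ["7988.00"] * 3
    + ["6688.00"] * 2
    + ["11388.00", "11398.00", "6499.00"]
)


def terms_aggregation(values):
    """Group equal values into buckets, largest bucket first."""
    counts = Counter(values)
    return [
        {"key": key, "doc_count": count}
        for key, count in counts.most_common()
    ]


buckets = terms_aggregation(prices)
for bucket in buckets:
    print(bucket)
```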

You can also restrict which documents are aggregated. The following query groups by price only the goods whose name contains Apple.

curl -X GET "localhost:9200/market/goods/_search?pretty=true" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {
      "name": "Apple"
    }
  },
  "aggs": {
    "all_apple_price": {
      "terms": { "field": "price" }
    }
  }
}
'
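The interaction between the query and the aggregation can be sketched as a two-step process: first select the matching documents, then count values among those only. The goods below are hypothetical sample data made up for illustration, and the matching rule (a whole-word, case-insensitive comparison) is a simplifying assumption rather than Elasticsearch's analysis chain.

```python
from collections import Counter

# Hypothetical goods; the names and prices are made up for illustration.
goods = [
    {"name": "Apple phone A", "price": "6688.00"},
    {"name": "Apple laptop B", "price": "11388.00"},
    {"name": "Apple phone C", "price": "6688.00"},
    {"name": "Other phone D", "price": "3788.00"},
]


def name_matches(doc, term):
    """True when the term equals one of the words in the name field."""
    return term.lower() in doc["name"].lower().split()


# Step 1: the query narrows the document set.
matched = [doc for doc in goods if name_matches(doc, "Apple")]
# Step 2: the aggregation only sees the matched documents.
all_apple_price = Counter(doc["price"] for doc in matched).most_common()
print(all_apple_price)
```

The price of the non-matching item never appears in the buckets, because the aggregation runs over the query's result set.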

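Aggregations also support hierarchical roll-ups, meaning a further computation, such as an average, on each group. The following query searches for goods whose name contains Apple, groups them by cate, and computes the average price of each group.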

curl -X GET "localhost:9200/market/goods/_search?pretty=true" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {
      "name": "Apple"
    }
  },
  "aggs": {
    "all_cates": {
      "terms": { "field": "cate" },
      "aggs": {
        "avg_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}
'
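A nested aggregation of this kind amounts to grouping and then computing a metric per group. The sketch below shows that idea in memory. The goods and their `cate` values are hypothetical sample data, `avg_price_by_cate` is a hypothetical helper, and the sketch assumes price is stored as a number, since an average needs numeric values.

```python
from collections import defaultdict

# Hypothetical goods; names, categories, and prices are made up for illustration.
goods = [
    {"name": "Apple phone A", "cate": "phone", "price": 6000.0},
    {"name": "Apple phone B", "cate": "phone", "price": 8000.0},
    {"name": "Apple laptop A", "cate": "laptop", "price": 11000.0},
    {"name": "Other phone C", "cate": "phone", "price": 3000.0},
]


def avg_price_by_cate(docs, term):
    """Filter by a word in the name, group by cate, then average the prices."""
    groups = defaultdict(list)
    for doc in docs:
        if term.lower() in doc["name"].lower().split():
            groups[doc["cate"]].append(doc["price"])
    return {
        cate: {"doc_count": len(prices), "avg_price": sum(prices) / len(prices)}
        for cate, prices in groups.items()
    }


all_cates = avg_price_by_cate(goods, "Apple")
print(all_cates)
```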
