Elasticsearch Query DSL: Query Context and Filter Context

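Elasticsearch provides a comprehensive JSON-based Query DSL (Domain Specific Language) for defining queries. Its clauses run in one of two contexts, query context or filter context, and the two can be combined in a single request.

1. Query context

A query clause used in query context searches by relevance: it answers the question "how well does this document match this query clause?" The results list every matching document, sorted by relevance score. The score is computed by the query clauses running in query context and is reported as _score, which expresses how well a document matches relative to the other documents.

Query context is in effect whenever a query clause is passed to a query parameter, such as the query parameter of the search API. The following example runs in query context and returns all courses whose description contains the word science.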

GET /courses/_search
{
  "query": {
    "match": { 
      "course_description": "science" 
    }
  }
}

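2. Filter context

Filter context can be seen as a binary, 0/1 test. Where query context answers "how well does it match?", filter context simply answers "yes or no?".

Filter context is mostly used to filter structured data, for example range checks (such as a given date range) or status checks. Elasticsearch automatically caches frequently used filters, which improves query performance.

Filter context is in effect whenever a query clause is passed to a filter parameter, such as the filter or must_not parameters of a bool query, the filter parameter of a constant_score query, or the filter aggregation. The following clause runs in filter context and returns all course documents with at least 33 enrolled students (students_enrolled greater than or equal to 33).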

GET /courses/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": { "students_enrolled": { "gte": 33 }}
      }
    }
  }
}

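Note: the basic difference between the two contexts is that query context is tied to _score (the relevance score), whereas filter context is tied to a binary value (true or false).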
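
The contrast can be illustrated with a toy in-memory model in Python. This is only a sketch under stated assumptions: the three documents are invented for illustration, and the "score" is a plain term count standing in for Elasticsearch's real relevance scoring.

```python
# Toy model of the two contexts. The documents are made up for illustration,
# and the score is a simple term count, not Elasticsearch's actual scoring.
docs = [
    {"id": 1, "course_description": "intro to computer science", "students_enrolled": 40},
    {"id": 2, "course_description": "science of data and science of learning", "students_enrolled": 20},
    {"id": 3, "course_description": "basic accounting", "students_enrolled": 35},
]

def match_score(doc, field, term):
    """Query context: how well does the document match? Returns a number."""
    return float(doc[field].lower().split().count(term))

def range_filter(doc, field, gte):
    """Filter context: does the document match? Returns only True or False."""
    return doc[field] >= gte

# Query context: score every document, drop non-matches, rank by score.
scored = sorted(
    ((match_score(d, "course_description", "science"), d["id"]) for d in docs),
    reverse=True,
)
ranked_ids = [doc_id for score, doc_id in scored if score > 0]

# Filter context: keep or drop, with no ranking involved.
filtered_ids = [d["id"] for d in docs if range_filter(d, "students_enrolled", 33)]

print(ranked_ids)    # [2, 1]
print(filtered_ids)  # [1, 3]
```

The query-context result is an ordering, while the filter-context result is just a subset.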

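3. Query examples

This section works through several examples to deepen understanding. To let readers verify the results, some sample data is provided below, shown as it appears in search hits; readers can bulk-insert the _source documents into the courses index for testing.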

{
"_index" : "courses",
"_type" : "_doc",
"_id" : "7G4TN3ABnUeCEegtv7VW",
"_score" : 1.0,
"_source" : {
    "name" : "Marketing 101",
    "room" : "E4",
    "professor" : {
    "name" : "William Smith",
    "department" : "finance",
    "facutly_type" : "part-time",
    "email" : "wills@onuni.com"
    },
    "students_enrolled" : 18,
    "course_publish_date" : "2015-06-21",
    "course_description" : "Mkt 101 is a course from the business school on the introduction to marketing that teaches students the fundamentals of market analysis, customer retention and online advertisements"
}
},
{
"_index" : "courses",
"_type" : "_doc",
"_id" : "7W4TN3ABnUeCEegtv7VW",
"_score" : 1.0,
"_source" : {
    "name" : "Accounting 101",
    "room" : "E3",
    "professor" : {
    "name" : "Thomas Baszo",
    "department" : "finance",
    "facutly_type" : "part-time",
    "email" : "baszot@onuni.com"
    },
    "students_enrolled" : 27,
    "course_publish_date" : "2015-01-19",
    "course_description" : "Act 101 is a course from the business school on the introduction to accounting that teaches students how to read and compose basic financial statements"
}
},
{
"_index" : "courses",
"_type" : "_doc",
"_id" : "7m4TN3ABnUeCEegtv7VW",
"_score" : 1.0,
"_source" : {
    "name" : "Tax Accounting 200",
    "room" : "E7",
    "professor" : {
    "name" : "Thomas Baszo",
    "department" : "finance",
    "facutly_type" : "part-time",
    "email" : "baszot@onuni.com"
    },
    "students_enrolled" : 17,
    "course_publish_date" : "2016-06-15",
    "course_description" : "Tax Act 200 is an intermediate course covering various aspects of tax law"
}
},
{
"_index" : "courses",
"_type" : "_doc",
"_id" : "724UN3ABnUeCEegtkLUq",
"_score" : 1.0,
"_source" : {
    "name" : "Capital Markets 350",
    "room" : "E3",
    "professor" : {
    "name" : "Thomas Baszo",
    "department" : "finance",
    "facutly_type" : "part-time",
    "email" : "baszot@onuni.com"
    },
    "students_enrolled" : 13,
    "course_publish_date" : "2016-01-11",
    "course_description" : "This is an advanced course teaching crucial topics related to raising capital and bonds, shares and other long-term equity and debt financial instrucments"
}
}
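
Because the sample data is in search-hit form, it has to be reshaped before it can be sent to the bulk API, which expects newline-delimited JSON: an action line followed by a source line for each document. A minimal Python sketch of that conversion follows; the helper name to_bulk_body is hypothetical, the two hits are abbreviated copies of the data above, and nothing is sent to a cluster.

```python
import json

# Two abbreviated hits in the shape shown above (most fields trimmed for brevity).
hits = [
    {"_index": "courses", "_id": "7G4TN3ABnUeCEegtv7VW",
     "_source": {"name": "Marketing 101", "room": "E4", "students_enrolled": 18}},
    {"_index": "courses", "_id": "7W4TN3ABnUeCEegtv7VW",
     "_source": {"name": "Accounting 101", "room": "E3", "students_enrolled": 27}},
]

def to_bulk_body(hits):
    """Build an NDJSON body: one action line, then one source line, per document."""
    lines = []
    for hit in hits:
        lines.append(json.dumps({"index": {"_index": hit["_index"], "_id": hit["_id"]}}))
        lines.append(json.dumps(hit["_source"]))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"

bulk_body = to_bulk_body(hits)
print(bulk_body)
```

The resulting string could then be posted to the _bulk endpoint with the Content-Type header set to application/x-ndjson.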

1) Query context only

GET /courses/_search
{
  "query": {
    
    "match": { 
      "course_description": "science" 
    }
  }
}

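Each hit in the response includes _score, the document's relevance score. Note that none of the four sample documents above contains the word science, so against that data alone this query returns no hits.

2) Query context with a filter placeholder
A bool query combines several match clauses. The filter parameter, which denotes filter context, is left empty here.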

GET /courses/_search
{
  "query": { 
    "bool": { 
      "must": [
        { "match": { "professor.facutly_type": "part-time" }},
        { "match": { "professor.department": "finance" }}
      ],
      "filter": [ 
         
      ]
    }
  }
}

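Every clause inside must has to match, which makes must the equivalent of a logical AND.

3) Query context with a filter

This adds a filter condition on top of the previous query. The range filter keeps only the documents that satisfy the condition and drops the rest from the results, without affecting _score.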

GET /courses/_search
{
  "query": { 
    "bool": { 
      "must": [
        { "match": { "professor.facutly_type": "part-time" }},
        { "match": { "professor.department": "finance" }}
      ],
      "filter": [ 
         { "range":  { "students_enrolled": { "gte": 16 }}}
      ]
    }
  }
}
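
To see which sample documents survive, the query can be traced by hand in Python. This is a simplified sketch, not how Elasticsearch evaluates the request: the match clauses are approximated by exact string comparison, and the documents are abbreviated copies of the sample data.

```python
# Abbreviated copies of the four sample documents.
courses = [
    {"name": "Marketing 101", "students_enrolled": 18,
     "professor": {"department": "finance", "facutly_type": "part-time"}},
    {"name": "Accounting 101", "students_enrolled": 27,
     "professor": {"department": "finance", "facutly_type": "part-time"}},
    {"name": "Tax Accounting 200", "students_enrolled": 17,
     "professor": {"department": "finance", "facutly_type": "part-time"}},
    {"name": "Capital Markets 350", "students_enrolled": 13,
     "professor": {"department": "finance", "facutly_type": "part-time"}},
]

def matches(doc):
    prof = doc["professor"]
    # must: both clauses have to match (approximated here by equality).
    must_ok = prof["facutly_type"] == "part-time" and prof["department"] == "finance"
    # filter: keep only documents with at least 16 enrolled students.
    filter_ok = doc["students_enrolled"] >= 16
    return must_ok and filter_ok

matched = [c["name"] for c in courses if matches(c)]
print(matched)  # ['Marketing 101', 'Accounting 101', 'Tax Accounting 200']
```

Capital Markets 350 satisfies both must clauses but is dropped by the filter because only 13 students are enrolled.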

4) Using the must_not clause

The must_not clause removes documents that match its conditions from the results.

GET /courses/_search
{
  "query": { 
    "bool": { 
      "must": [
        { "match": { "professor.facutly_type": "part-time" }},
        { "match": { "professor.department": "finance" }}
      ],
      "must_not": [
        { "match": { "course_description": "business" }}
      ], 
      "filter": [ 
         { "range":  { "students_enrolled": { "gte": 16 }}}
      ]
    }
  }
}

must_not is the equivalent of a logical NOT: a document must not match any of its clauses.
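
The effect on the sample data can be traced with a small Python sketch. It makes some simplifying assumptions: descriptions are shortened, the match on business is approximated by a lowercase word lookup, and the two must clauses are omitted because all four sample documents satisfy them.

```python
# Abbreviated copies of the four sample documents (descriptions shortened).
courses = [
    {"name": "Marketing 101", "students_enrolled": 18,
     "course_description": "a course from the business school on the introduction to marketing"},
    {"name": "Accounting 101", "students_enrolled": 27,
     "course_description": "a course from the business school on the introduction to accounting"},
    {"name": "Tax Accounting 200", "students_enrolled": 17,
     "course_description": "an intermediate course covering various aspects of tax law"},
    {"name": "Capital Markets 350", "students_enrolled": 13,
     "course_description": "an advanced course teaching crucial topics related to raising capital"},
]

def keep(doc):
    # must_not: exclude any document whose description mentions "business".
    excluded = "business" in doc["course_description"].lower().split()
    # filter: keep only documents with at least 16 enrolled students.
    in_range = doc["students_enrolled"] >= 16
    return in_range and not excluded

remaining = [c["name"] for c in courses if keep(c)]
print(remaining)  # ['Tax Accounting 200']
```

Only Tax Accounting 200 remains: the two 101 courses mention the business school, and Capital Markets 350 fails the range filter.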

5) multi_match

Matching one search term against several fields:

GET /courses/_search
{
  "query": {
    "multi_match": {
      "query": "computer",
      "fields": ["name","professor.department"]
    }
  }
}
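
By default multi_match uses the best_fields type, which the Elasticsearch documentation describes as generating one match query per field and wrapping them in a dis_max query. The following Python sketch builds that roughly equivalent request body as a plain dict; the helper name expand_best_fields is hypothetical and no cluster is contacted.

```python
import json

def expand_best_fields(query, fields):
    """Return a dis_max body with one match clause per field (hypothetical helper)."""
    return {"dis_max": {"queries": [{"match": {field: query}} for field in fields]}}

body = {"query": expand_best_fields("computer", ["name", "professor.department"])}
print(json.dumps(body, indent=2))
```

Seen this way, a document matches if any of the listed fields matches the search term.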

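6) match_phrase

match_phrase requires the whole search phrase to match: the terms must appear together and in the same order. A partial phrase, or a phrase interrupted by other words, does not match.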

GET /courses/_search
{
  "query": {
    "match_phrase": {
      "course_description": "computer science introduction teaching"
    }
  }
}
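
The phrase rule can be approximated in a few lines of Python. This is a toy sketch that assumes whitespace tokenization and lowercasing; it ignores the analyzers and options that Elasticsearch applies, and the function name phrase_match is hypothetical.

```python
def phrase_match(text, phrase):
    """True if the phrase's tokens occur in the text contiguously and in order."""
    tokens = text.lower().split()
    wanted = phrase.lower().split()
    n = len(wanted)
    if n == 0:
        return False
    # Slide a window of n tokens over the text and compare it to the phrase.
    return any(tokens[i:i + n] == wanted for i in range(len(tokens) - n + 1))

description = "Tax Act 200 is an intermediate course covering various aspects of tax law"

print(phrase_match(description, "tax law"))       # True
print(phrase_match(description, "law tax"))       # False
print(phrase_match(description, "aspects law"))   # False
```

The second call fails because the order is reversed, and the third because other words sit between the two terms.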

7) match_phrase_prefix

match_phrase_prefix works like match_phrase, except that the last term of the search phrase is treated as a prefix.

GET /courses/_search
{
  "query": {
    "match_phrase_prefix": {
      "course_description": "computer science"
    }
  }
}
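
A toy Python version of the prefix rule is sketched below. It assumes whitespace tokenization and lowercasing and does not model how Elasticsearch expands the prefix internally; the function name phrase_prefix_match is hypothetical.

```python
def phrase_prefix_match(text, phrase):
    """True if the phrase matches in order, with the last term used as a prefix."""
    tokens = text.lower().split()
    wanted = phrase.lower().split()
    n = len(wanted)
    if n == 0:
        return False
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        # All terms but the last must match exactly; the last may be a prefix.
        if window[:-1] == wanted[:-1] and window[-1].startswith(wanted[-1]):
            return True
    return False

description = "Act 101 is a course from the business school on the introduction to accounting"

print(phrase_prefix_match(description, "introduction to acc"))  # True
print(phrase_prefix_match(description, "business sch"))         # True
print(phrase_prefix_match(description, "school bus"))           # False
```

The last call fails because only the final term may be incomplete, and the terms must still appear in the given order.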

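8) Range clause

gte means greater than or equal to, and lte means less than or equal to. The other options are gt (greater than) and lt (less than).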

GET /courses/_search
{
  "query": {
    "range": {
      "students_enrolled": {
        "gte": 20,
        "lte": 30
      }
    }
  }
}
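
The four bounds map directly onto comparison operators, which the following Python sketch makes explicit. The helper name in_range is hypothetical, and the enrollment figures are taken from the sample data above.

```python
import operator

# Each range option corresponds to one comparison operator.
OPS = {"gt": operator.gt, "gte": operator.ge, "lt": operator.lt, "lte": operator.le}

def in_range(value, bounds):
    """True if the value satisfies every bound, e.g. {"gte": 20, "lte": 30}."""
    return all(OPS[name](value, bound) for name, bound in bounds.items())

# students_enrolled values from the sample data.
enrolled = {
    "Marketing 101": 18,
    "Accounting 101": 27,
    "Tax Accounting 200": 17,
    "Capital Markets 350": 13,
}

in_band = [name for name, count in enrolled.items() if in_range(count, {"gte": 20, "lte": 30})]
print(in_band)  # ['Accounting 101']
```

Against the sample data, only Accounting 101 has between 20 and 30 enrolled students.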

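9) should

should clauses are generally used to favor the most relevant documents. In a bool query that also has a must clause, should clauses are optional and only raise the score of documents that match them, so removing minimum_should_match from the query below returns more documents; with minimum_should_match set to 1, only documents matching at least one should clause are returned.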

GET /courses/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"name":"101"}}
      ], 
      "must_not": [
        {"match": {"room": "e7"}}
      ],
      "should": [
        {
          "range": {
            "students_enrolled": {
              "gte": 10,
              "lte": 20
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}

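should is the equivalent of a logical OR. minimum_should_match is a parameter of the enclosing bool query and sets the minimum number of should clauses a document must match.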
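
The effect of minimum_should_match on the sample data can be traced with a toy Python model. This is a sketch under simplifying assumptions: the match clauses are approximated by word and string comparison, and ranking is reduced to "documents matching the should clause come first". The function name search is hypothetical.

```python
# name, room and students_enrolled from the four sample documents.
courses = [
    {"name": "Marketing 101", "room": "E4", "students_enrolled": 18},
    {"name": "Accounting 101", "room": "E3", "students_enrolled": 27},
    {"name": "Tax Accounting 200", "room": "E7", "students_enrolled": 17},
    {"name": "Capital Markets 350", "room": "E3", "students_enrolled": 13},
]

def search(minimum_should_match):
    results = []
    for c in courses:
        must = "101" in c["name"].lower().split()
        must_not = c["room"].lower() == "e7"
        # There is a single should clause, so the count is 0 or 1.
        should_hits = int(10 <= c["students_enrolled"] <= 20)
        if must and not must_not and should_hits >= minimum_should_match:
            results.append((should_hits, c["name"]))
    # A matching should clause raises the score, so those documents rank first.
    return [name for _, name in sorted(results, key=lambda r: -r[0])]

print(search(0))  # ['Marketing 101', 'Accounting 101']
print(search(1))  # ['Marketing 101']
```

With the threshold at 0, which is what a bool query with a must clause uses when minimum_should_match is omitted, both 101 courses come back and the should clause only affects their order. With the threshold at 1, Accounting 101 is dropped because its enrollment of 27 falls outside the range.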

4. Summary

This article covered the Elasticsearch Query DSL and used examples to illustrate query context, filter context, and how to combine the two.
