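ES12 - Term-level queries

1. Introduction to term-level queries

Full-text queries analyze the query string before executing, whereas term-level queries operate on the exact terms stored in the inverted index. They are typically used for structured data such as numbers, dates, and enums rather than full-text fields. They also let you craft low-level queries that bypass the analysis process.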
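
The difference between the two query families can be illustrated with a toy inverted index in Python. This is only a sketch: the `analyze` function (lowercase, split on whitespace) is a stand-in for a real analyzer, and the function names and sample documents are invented for illustration.

```python
from collections import defaultdict

def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()

def build_index(docs):
    """Map each analyzed term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index

def term_query(index, term):
    """Term-level lookup: the term is used exactly as given, without analysis."""
    return sorted(index.get(term, set()))

def match_query(index, text):
    """Full-text lookup: the query string is analyzed first, then each term is looked up."""
    hits = set()
    for term in analyze(text):
        hits |= index.get(term, set())
    return sorted(hits)

docs = {1: "Quick Brown Fox", 2: "quick start guide"}
index = build_index(docs)

print(term_query(index, "Quick"))   # [] because only the lowercased term "quick" is stored
print(term_query(index, "quick"))   # [1, 2]
print(match_query(index, "Quick"))  # [1, 2] because the query string is analyzed to "quick"
```

The term-level lookup misses on "Quick" because the stored term is "quick"; the full-text lookup finds it because the query goes through the same analysis as the documents.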

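2. term query

The term query searches for an exact term. It was covered in the previous chapter and is not repeated here.

3. terms query

The term query is useful for finding a single value, but often we want to search for several values at once. For that, use a single terms query (note the trailing s), which is to the term query what a plural noun is to its singular. A document matches if the field contains any of the listed terms.

The following query finds documents whose "title" contains any of the three terms "河北", "长生", or "碧桂园".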

GET telegraph/_search
{
  "query": {
    "terms": {
      "title": ["河北","长生","碧桂园"]
    }
  }
}

Query result

{
  "took": 7,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "telegraph",
        "_type": "msg",
        "_id": "A5etp2QBW8hrYY3zGJk7",
        "_score": 1,
        "_source": {
          "title": "碧桂园集团副主席杨惠妍",
          "content": "杨惠妍分别于7月10日、11日买入碧桂园1000万股、1500万股",
          "author": "小财注",
          "pubdate": "2018-07-17T16:12:55"
        }
      },
      {
        "_index": "telegraph",
        "_type": "msg",
        "_id": "Apetp2QBW8hrYY3zGJk7",
        "_score": 1,
        "_source": {
          "title": "长生生物再次跌停 三机构抛售近1000万元",
          "content": "长生生物再次一字跌停,报收19.89元,成交1432万元",
          "author": "长生生物",
          "pubdate": "2018-07-17T10:03:11"
        }
      },
      {
        "_index": "telegraph",
        "_type": "msg",
        "_id": "BJetp2QBW8hrYY3zGJk7",
        "_score": 1,
        "_source": {
          "title": "河北聚焦十大行业推进国际产能合作",
          "content": "河北省政府近日出台积极参与“一带一路”建设推进国际产能合作实施方案",
          "author": "财联社",
          "pubdate": "2018-07-17T14:14:55"
        }
      }
    ]
  }
}

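4. terms_set query

Finds documents that match one or more of the specified terms. The number of terms that must match is not fixed in the query: it is read per document from a minimum-should-match field, or computed by a script.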
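
Since this section has no example, here is a minimal Python sketch. It builds a terms_set request body using the `minimum_should_match_field` parameter and models the matching rule on in-memory documents. The field names `codes` and `required_matches`, the sample documents, and the helper functions are hypothetical, and the model assumes `codes` holds exact, unanalyzed values.

```python
import json

def terms_set_body(field, terms, min_match_field):
    """Build a terms_set request body; the required count is read from min_match_field."""
    return {
        "query": {
            "terms_set": {
                field: {
                    "terms": list(terms),
                    "minimum_should_match_field": min_match_field,
                }
            }
        }
    }

def terms_set_matches(doc, field, terms, min_match_field):
    """Toy model: count the distinct query terms present in the document's field."""
    values = doc.get(field, [])
    if not isinstance(values, list):
        values = [values]
    matched = len(set(values) & set(terms))
    return matched >= doc[min_match_field]

# Hypothetical documents: each one states how many terms must match.
docs = {
    1: {"codes": ["abc", "def", "ghi"], "required_matches": 2},
    2: {"codes": ["abc", "xyz"], "required_matches": 2},
    3: {"codes": ["def"], "required_matches": 1},
}
query_terms = ["abc", "def"]

body = terms_set_body("codes", query_terms, "required_matches")
print(json.dumps(body, indent=2))

hits = [doc_id for doc_id, doc in docs.items()
        if terms_set_matches(doc, "codes", query_terms, "required_matches")]
print(hits)  # [1, 3]
```

Document 2 contains only one of the query terms but requires two, so it is the only one left out.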

5. range query

The range query matches documents whose numeric, date, or string field falls within a given range.
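
The examples in this section use the inclusive bounds `gte` and `lte`; the range query also accepts the exclusive bounds `gt` and `lt`. The following Python sketch models how the four bounds combine. The `in_range` helper is hypothetical, and the ages mirror the numeric example later in this section.

```python
def in_range(value, gte=None, gt=None, lte=None, lt=None):
    """Toy model of range bounds: gte/lte are inclusive, gt/lt are exclusive."""
    if value is None:
        return False
    if gte is not None and value < gte:
        return False
    if gt is not None and value <= gt:
        return False
    if lte is not None and value > lte:
        return False
    if lt is not None and value >= lt:
        return False
    return True

# Document id -> age, as in the my_person example of this section.
ages = {1: 20, 2: 25, 3: 30, 4: 35}

inclusive = [doc_id for doc_id, age in ages.items() if in_range(age, gte=20, lte=30)]
exclusive = [doc_id for doc_id, age in ages.items() if in_range(age, gt=20, lt=30)]

print(inclusive)  # [1, 2, 3]
print(exclusive)  # [2]
```

With exclusive bounds the endpoints 20 and 30 drop out, leaving only the document with age 25.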

Range query on a date field

The following example finds documents whose publication time "pubdate" is between "2018-07-17T12:00:00" and "2018-07-17T16:30:00", inclusive.

GET telegraph/_search
{
  "query": {
    "range": {
      "pubdate": {
        "gte": "2018-07-17T12:00:00",
        "lte": "2018-07-17T16:30:00"
      }
    }
  }
}

Query result

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "telegraph",
        "_type": "msg",
        "_id": "AZetp2QBW8hrYY3zGJk7",
        "_score": 1,
        "_source": {
          "title": "周五召开董事会会议 审议及批准更新后的一季报",
          "content": "以审议及批准更新后的2018年第一季度报告",
          "author": "中兴通讯",
          "pubdate": "2018-07-17T12:33:11"
        }
      },
      {
        "_index": "telegraph",
        "_type": "msg",
        "_id": "A5etp2QBW8hrYY3zGJk7",
        "_score": 1,
        "_source": {
          "title": "碧桂园集团副主席杨惠妍",
          "content": "杨惠妍分别于7月10日、11日买入碧桂园1000万股、1500万股",
          "author": "小财注",
          "pubdate": "2018-07-17T16:12:55"
        }
      },
      {
        "_index": "telegraph",
        "_type": "msg",
        "_id": "BJetp2QBW8hrYY3zGJk7",
        "_score": 1,
        "_source": {
          "title": "河北聚焦十大行业推进国际产能合作",
          "content": "河北省政府近日出台积极参与“一带一路”建设推进国际产能合作实施方案",
          "author": "财联社",
          "pubdate": "2018-07-17T14:14:55"
        }
      }
    ]
  }
}

Range query on a numeric field

Create the index and add data

DELETE my_person

PUT my_person

PUT my_person/stu/1
{
  "name":"sean",
  "age":20
}

PUT my_person/stu/2
{
  "name":"sum",
  "age":25
}

PUT my_person/stu/3
{
  "name":"dean",
  "age":30
}

PUT my_person/stu/4
{
  "name":"kastel",
  "age":35
}

Find people whose "age" is between 20 and 30, inclusive

GET my_person/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 20,
        "lte": 30
      }
    }
  }
}

Query result

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "sum",
          "age": 25
        }
      },
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "sean",
          "age": 20
        }
      },
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "dean",
          "age": 30
        }
      }
    ]
  }
}

6. exists query

Finds documents in which the field contains at least one non-null value.

Create the index and add data

DELETE my_person

PUT my_person

PUT my_person/stu/1
{
  "name":"sean",
  "hobby":"running"
}

PUT my_person/stu/2
{
  "name":"Jhon",
  "hobby":""
}

PUT my_person/stu/3
{
  "name":"sum",
  "hobby":["swimming",null]
}

PUT my_person/stu/4
{
  "name":"lily",
  "hobby":[null,null]
}

PUT my_person/stu/5
{
  "name":"lucy"
}

Find documents whose "hobby" is not null

GET my_person/_search
{
  "query": {
    "exists":{
      "field":"hobby"
    }
  }
}

Query result

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "Jhon",
          "hobby": ""
        }
      },
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "sean",
          "hobby": "running"
        }
      },
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "sum",
          "hobby": [
            "swimming",
            null
          ]
        }
      }
    ]
  }
}

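Match notes:

  • "hobby":"running"------the value is non-null (matches)
  • "hobby":""------an empty string is not a null value (matches)
  • "hobby":["swimming",null]------the array contains a non-null value (matches)
  • "hobby":[null,null]------every value in the array is null (does not match)
  • "name":"lucy"------there is no hobby field (does not match)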
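
The rules listed above can be written as a small Python predicate. This is a toy model of the matching rule applied to the five sample documents, not how Elasticsearch evaluates the query internally; the function names are hypothetical.

```python
def has_value(value):
    """A value counts if it is not None; a list counts if any element counts."""
    if value is None:
        return False
    if isinstance(value, list):
        return any(has_value(item) for item in value)
    return True  # includes the empty string ""

def exists(doc, field):
    """Toy model of the exists rule for a single field."""
    return has_value(doc.get(field))

# The five documents indexed in this section.
docs = {
    1: {"name": "sean", "hobby": "running"},
    2: {"name": "Jhon", "hobby": ""},
    3: {"name": "sum", "hobby": ["swimming", None]},
    4: {"name": "lily", "hobby": [None, None]},
    5: {"name": "lucy"},
}

matching_ids = [doc_id for doc_id, doc in docs.items() if exists(doc, "hobby")]
print(matching_ids)  # [1, 2, 3]
```

The ids returned by the sketch are the same three documents that appear in the query result.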

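7. prefix query

Finds documents that contain a term starting with the given prefix. The following query finds documents whose "hobby" has a term starting with "sw". The results in this and the following sections include a document with id 6 (name "deak", hobby "swimming") whose indexing request is not shown above.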

GET my_person/_search
{
  "query": {
    "prefix": {
      "hobby": {
        "value": "sw"
      }
    }
  }
}

Query result

{
  "took": 11,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "6",
        "_score": 1,
        "_source": {
          "name": "deak",
          "hobby": "swimming"
        }
      },
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "sum",
          "hobby": [
            "swimming",
            null
          ]
        }
      }
    ]
  }
}

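8. wildcard query

The wildcard query matches terms against a pattern in which * stands for any sequence of characters (including none) and ? stands for any single character. The following query finds documents whose "hobby" matches "*ing". Note that a pattern beginning with a wildcard, as this one does, can be very slow on a large index because many terms have to be examined.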

GET my_person/_search
{
  "query": {
    "wildcard": {
      "hobby": {
        "value": "*ing"
      }
    }
  }
}

Query result

{
  "took": 27,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "6",
        "_score": 1,
        "_source": {
          "name": "deak",
          "hobby": "swimming"
        }
      },
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "sean",
          "hobby": "running"
        }
      },
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "sum",
          "hobby": [
            "swimming",
            null
          ]
        }
      }
    ]
  }
}
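
As a rough illustration of the pattern semantics, the Python sketch below translates a wildcard pattern into a regular expression, treating * as any sequence of characters and ? as exactly one character, and requires it to match the whole term. The helper names are hypothetical, and the sketch ignores escaping rules and performance.

```python
import re

def wildcard_to_regex(pattern):
    """Translate * to '.*' and ? to '.', escaping every other character."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def wildcard_match(pattern, term):
    """The pattern must cover the whole term, not just part of it."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None

terms = ["running", "swimming", "singer"]

print([t for t in terms if wildcard_match("*ing", t)])      # ['running', 'swimming']
print([t for t in terms if wildcard_match("sw?mming", t)])  # ['swimming']
```

"singer" contains "ing" but does not end with it, so "*ing" rejects it.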

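9. regexp query

The performance of a regexp query depends heavily on the regular expression chosen. Wide-matching expressions such as .* are very slow, as are lookaround expressions. Where possible, put a long literal prefix at the start of the pattern; wildcard matchers such as .*?+ mostly reduce performance. Most regular expression engines let a pattern match any part of a string, so you have to anchor it explicitly with ^ for the start or $ for the end. Lucene's patterns, which Elasticsearch uses, are always anchored: the pattern must match the entire term, and ^ and $ are not needed.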

Metacharacters (symbol, meaning, description, example):

  • . (match any character): the period "." represents any single character. Example: ab. matches "abc" and "ab1".
  • + (one or more): the plus sign "+" repeats the preceding shortest pattern one or more times. Example: a+b+ matches "aaabbb".
  • * (zero or more): the asterisk "*" matches the preceding shortest pattern zero or more times. Example: a*b* matches "aaabbb".
  • ? (zero or one): the question mark "?" makes the preceding shortest pattern optional; it matches zero or one times. Example: aaa?bbbb? matches "aaabbb".
  • {m}, {m,n} (min to max): curly brackets "{}" specify a minimum and, optionally, a maximum number of times the preceding shortest pattern can repeat. Example: a{3}b{3} and a{2,4}b{2,4} both match "aaabbb".
  • () (grouping): parentheses "()" form sub-patterns. Example: (ab)+ matches "ababab".
  • | (alternation): the pipe symbol "|" acts as an OR operator. Example: aabb|bbaa matches "aabb".
  • [] (character classes): ranges of potential characters are enclosed in square brackets "[]"; a leading ^ negates the class. Example: [abc] matches "a", "b", or "c".
  • ~ (complement): the shortest pattern that follows a tilde "~" is negated. For instance, ab~cd means: starts with a, followed by b, followed by a string of any length that is anything but c, ending with d. Example: for "abcdef", ab~df and a~(cb)def match, while ab~cdef and a~(bc)def do not.
  • <> (interval): numeric ranges are enclosed in angle brackets "<>". Example: foo<1-100> matches "foo80".
  • & (intersection): the ampersand "&" joins two patterns so that both have to match. Example: aaa.+&.+bbb matches "aaabbb".
  • @ (any string): the at sign "@" matches any string in its entirety. Example: @&~(foo.+) matches any string except those that begin with "foo" (followed by at least one more character).

The following query finds documents whose "hobby" field has a term matching the regular expression "sw.+"

GET my_person/_search
{
  "query": {
    "regexp":{
      "hobby":"sw.+"
    }
  }
}

Query result

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "6",
        "_score": 1,
        "_source": {
          "name": "deak",
          "hobby": "swimming"
        }
      },
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "sum",
          "hobby": [
            "swimming",
            null
          ]
        }
      }
    ]
  }
}
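
Because the pattern has to match the entire term, "sw.+" behaves like a whole-string match rather than a substring search. The Python sketch below contrasts the two using the standard `re` module. Python's regular expression syntax is not the same as Lucene's, so this only illustrates anchoring, and the term list is made up.

```python
import re

terms = ["swimming", "sw", "answers", "running"]
pattern = "sw.+"

# Whole-term match, analogous to an always-anchored pattern.
anchored = [t for t in terms if re.fullmatch(pattern, t)]

# Substring search, which is what many regex engines do by default.
unanchored = [t for t in terms if re.search(pattern, t)]

print(anchored)    # ['swimming']
print(unanchored)  # ['swimming', 'answers']
```

"answers" contains "swers", so an unanchored search accepts it, while the anchored match rejects it; "sw" fails both because ".+" needs at least one more character.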

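10. fuzzy query

The fuzzy query matches terms that are similar to the search term, where similarity is measured by Levenshtein edit distance. The following query runs a fuzzy search for "十大" on "title".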

GET telegraph/_search
{
  "query": {
    "fuzzy": {
      "title": "十大"
    }
  }
}

Query result

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.99277425,
    "hits": [
      {
        "_index": "telegraph",
        "_type": "msg",
        "_id": "BJetp2QBW8hrYY3zGJk7",
        "_score": 0.99277425,
        "_source": {
          "title": "河北聚焦十大行业推进国际产能合作",
          "content": "河北省政府近日出台积极参与“一带一路”建设推进国际产能合作实施方案",
          "author": "财联社",
          "pubdate": "2018-07-17T14:14:55"
        }
      }
    ]
  }
}
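
To make "edit distance" concrete, here is a minimal Python sketch of plain Levenshtein distance together with the length thresholds documented for `fuzziness: AUTO` (terms of 1 to 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two). The function names are hypothetical, and the sketch ignores other fuzzy options such as `prefix_length` and transpositions.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        prev = cur
    return prev[-1]

def auto_fuzziness(term):
    """Edits allowed under AUTO, based on the length of the search term."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2

def fuzzy_match(query, term):
    """Toy model: the indexed term matches if it is within the allowed distance."""
    return levenshtein(query, term) <= auto_fuzziness(query)

print(levenshtein("swiming", "swimming"))  # 1
print(fuzzy_match("swiming", "swimming"))  # True
print(fuzzy_match("十大", "十二"))          # False, two-character terms allow no edits
```

Under these thresholds a two-character search term such as "十大" only matches the identical term, which is consistent with the single hit returned above.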

11. ids query

Finds documents by a given list of document ids.

GET my_person/_search
{
  "query": {
    "ids": {
      "values": ["1","3","5"]
    }
  }
}

Query result

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "5",
        "_score": 1,
        "_source": {
          "name": "lucy"
        }
      },
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "sean",
          "hobby": "running"
        }
      },
      {
        "_index": "my_person",
        "_type": "stu",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "sum",
          "hobby": [
            "swimming",
            null
          ]
        }
      }
    ]
  }
}

 

Reposted from: https://my.oschina.net/u/3100849/blog/1858871
