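1. Introduction to term-level queries
Full-text queries analyze the query string before executing, whereas term-level queries operate on the exact terms stored in the inverted index. They are typically used for structured data such as numbers, dates, and enums rather than for full-text fields. They also let you craft low-level queries that skip the analysis process.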
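The difference between the two query families can be sketched with a toy inverted index in Python. This is only a simplified model, not how Elasticsearch is implemented: the analyzer here just lowercases and splits on whitespace, and the names `analyze`, `build_index`, `term_query`, and `match_query` are hypothetical helpers invented for the illustration.

```python
from collections import defaultdict

def analyze(text):
    """Toy analyzer (an assumption): lowercase, then split on whitespace."""
    return text.lower().split()

def build_index(docs):
    """Map each analyzed term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index

def term_query(index, term):
    """Term-level lookup: the query term is used exactly as given."""
    return set(index.get(term, set()))

def match_query(index, text):
    """Full-text lookup: analyze the query string first, then look up each term."""
    hits = set()
    for term in analyze(text):
        hits |= index.get(term, set())
    return hits

docs = {1: "Quick brown fox", 2: "Lazy dog"}
index = build_index(docs)

print(term_query(index, "Quick"))   # no hit: the index stores "quick"
print(term_query(index, "quick"))   # hits document 1
print(match_query(index, "Quick"))  # hits document 1 after analysis
```

In this model the term-level lookup for "Quick" finds nothing because only the lowercased term was indexed, while the full-text lookup succeeds because the query string goes through the same analyzer.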
2. term query
The term query searches for a single exact term. It was covered in the previous chapter and is not repeated here.
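3. terms query
The term query is useful for finding a single value, but we often want to search for several values at once. For that, use a single terms query (note the trailing s); terms is to term what a plural noun is to its singular.
The following query finds documents whose "title" contains any of the three terms "河北", "长生", or "碧桂园".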
GET telegraph/_search
{
"query": {
"terms": {
"title": ["河北","长生","碧桂园"]
}
}
}
Query result
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "telegraph",
"_type": "msg",
"_id": "A5etp2QBW8hrYY3zGJk7",
"_score": 1,
"_source": {
"title": "碧桂园集团副主席杨惠妍",
"content": "杨惠妍分别于7月10日、11日买入碧桂园1000万股、1500万股",
"author": "小财注",
"pubdate": "2018-07-17T16:12:55"
}
},
{
"_index": "telegraph",
"_type": "msg",
"_id": "Apetp2QBW8hrYY3zGJk7",
"_score": 1,
"_source": {
"title": "长生生物再次跌停 三机构抛售近1000万元",
"content": "长生生物再次一字跌停,报收19.89元,成交1432万元",
"author": "长生生物",
"pubdate": "2018-07-17T10:03:11"
}
},
{
"_index": "telegraph",
"_type": "msg",
"_id": "BJetp2QBW8hrYY3zGJk7",
"_score": 1,
"_source": {
"title": "河北聚焦十大行业推进国际产能合作",
"content": "河北省政府近日出台积极参与“一带一路”建设推进国际产能合作实施方案",
"author": "财联社",
"pubdate": "2018-07-17T14:14:55"
}
}
]
}
}
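4. terms_set query
Returns documents that match one or more of the specified terms. The number of terms that must match is not fixed in the query; it is read for each document from a minimum-should-match field or computed by a script.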
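Since this section has no example, the matching rule can be sketched in plain Python. This is a simplified model under stated assumptions, not Elasticsearch code: the documents, the field names `programming_languages` and `required_matches`, and the helper `terms_set_match` are all hypothetical, and only the "read the required count from a field" variant is modelled.

```python
def terms_set_match(doc, field, terms, minimum_should_match_field):
    """Return True if enough of the query terms appear in doc[field].

    The required number of matches is read from the document itself,
    which is the idea behind a minimum-should-match field.
    """
    values = doc.get(field, [])
    if not isinstance(values, list):
        values = [values]
    matched = len(set(terms) & set(values))
    required = doc[minimum_should_match_field]
    return matched >= required

# Hypothetical sample documents.
docs = [
    {"name": "Jane", "programming_languages": ["c++", "java"], "required_matches": 2},
    {"name": "Jason", "programming_languages": ["java", "php"], "required_matches": 3},
    {"name": "Ann", "programming_languages": ["python"], "required_matches": 1},
]

query_terms = ["c++", "java", "php"]
hits = [
    d["name"]
    for d in docs
    if terms_set_match(d, "programming_languages", query_terms, "required_matches")
]
print(hits)
```

Jane matches two of the three terms and needs two, so she is returned. Jason also matches two but needs three, and Ann matches none.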
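5. range query
The range query matches documents whose numeric, date, or string field falls within a given range.
Date range query
The following example finds documents whose "pubdate" lies between "2018-07-17T12:00:00" and "2018-07-17T16:30:00", inclusive.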
GET telegraph/_search
{
"query": {
"range": {
"pubdate": {
"gte": "2018-07-17T12:00:00",
"lte": "2018-07-17T16:30:00"
}
}
}
}
Query result
{
"took": 6,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "telegraph",
"_type": "msg",
"_id": "AZetp2QBW8hrYY3zGJk7",
"_score": 1,
"_source": {
"title": "周五召开董事会会议 审议及批准更新后的一季报",
"content": "以审议及批准更新后的2018年第一季度报告",
"author": "中兴通讯",
"pubdate": "2018-07-17T12:33:11"
}
},
{
"_index": "telegraph",
"_type": "msg",
"_id": "A5etp2QBW8hrYY3zGJk7",
"_score": 1,
"_source": {
"title": "碧桂园集团副主席杨惠妍",
"content": "杨惠妍分别于7月10日、11日买入碧桂园1000万股、1500万股",
"author": "小财注",
"pubdate": "2018-07-17T16:12:55"
}
},
{
"_index": "telegraph",
"_type": "msg",
"_id": "BJetp2QBW8hrYY3zGJk7",
"_score": 1,
"_source": {
"title": "河北聚焦十大行业推进国际产能合作",
"content": "河北省政府近日出台积极参与“一带一路”建设推进国际产能合作实施方案",
"author": "财联社",
"pubdate": "2018-07-17T14:14:55"
}
}
]
}
}
Numeric range query
Create a new index and add data
DELETE my_person
PUT my_person
PUT my_person/stu/1
{
"name":"sean",
"age":20
}
PUT my_person/stu/2
{
"name":"sum",
"age":25
}
PUT my_person/stu/3
{
"name":"dean",
"age":30
}
PUT my_person/stu/4
{
"name":"kastel",
"age":35
}
Find people whose "age" is between 20 and 30, inclusive
GET my_person/_search
{
"query": {
"range": {
"age": {
"gte": 20,
"lte": 30
}
}
}
}
Query result
{
"took": 6,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "my_person",
"_type": "stu",
"_id": "2",
"_score": 1,
"_source": {
"name": "sum",
"age": 25
}
},
{
"_index": "my_person",
"_type": "stu",
"_id": "1",
"_score": 1,
"_source": {
"name": "sean",
"age": 20
}
},
{
"_index": "my_person",
"_type": "stu",
"_id": "3",
"_score": 1,
"_source": {
"name": "dean",
"age": 30
}
}
]
}
}
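The examples above use gte and lte, which include both endpoints; the range query also accepts gt and lt for exclusive bounds. The effect of each bound can be sketched in Python on the same four people. The helper `in_range` and the `BOUNDS` table are hypothetical names for this illustration and only model the comparison, not the query itself.

```python
import operator

# Map each range parameter to the comparison it stands for.
BOUNDS = {
    "gt": operator.gt,   # greater than (exclusive)
    "gte": operator.ge,  # greater than or equal (inclusive)
    "lt": operator.lt,   # less than (exclusive)
    "lte": operator.le,  # less than or equal (inclusive)
}

def in_range(value, **bounds):
    """True if value satisfies every bound that was supplied."""
    return all(BOUNDS[name](value, limit) for name, limit in bounds.items())

ages = {"sean": 20, "sum": 25, "dean": 30, "kastel": 35}

inclusive = sorted(n for n, a in ages.items() if in_range(a, gte=20, lte=30))
exclusive = sorted(n for n, a in ages.items() if in_range(a, gt=20, lt=30))

print(inclusive)  # ['dean', 'sean', 'sum']
print(exclusive)  # ['sum']
```

With inclusive bounds the people aged exactly 20 and 30 are kept, which agrees with the three hits in the result above; with exclusive bounds only the 25-year-old remains.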
6. exists query
Returns documents in which the field has at least one non-null value.
Create the index and add data
DELETE my_person
PUT my_person
PUT my_person/stu/1
{
"name":"sean",
"hobby":"running"
}
PUT my_person/stu/2
{
"name":"Jhon",
"hobby":""
}
PUT my_person/stu/3
{
"name":"sum",
"hobby":["swimming",null]
}
PUT my_person/stu/4
{
"name":"lily",
"hobby":[null,null]
}
PUT my_person/stu/5
{
"name":"lucy"
}
Find documents whose "hobby" is not null
GET my_person/_search
{
"query": {
"exists":{
"field":"hobby"
}
}
}
Query result
{
"took": 12,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "my_person",
"_type": "stu",
"_id": "2",
"_score": 1,
"_source": {
"name": "Jhon",
"hobby": ""
}
},
{
"_index": "my_person",
"_type": "stu",
"_id": "1",
"_score": 1,
"_source": {
"name": "sean",
"hobby": "running"
}
},
{
"_index": "my_person",
"_type": "stu",
"_id": "3",
"_score": 1,
"_source": {
"name": "sum",
"hobby": [
"swimming",
null
]
}
}
]
}
}
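Matching rules:
- "hobby":"running" ------ the value is non-null (matches)
- "hobby":"" ------ an empty string is not a null value (matches)
- "hobby":["swimming",null] ------ the array contains a non-null value (matches)
- "hobby":[null,null] ------ every value in the array is null (does not match)
- "name":"lucy" ------ the document has no hobby field (does not match)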
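The matching rules above can be expressed as a small Python predicate and checked against the five sample documents. This is a sketch of the rule only, and `field_exists` is a hypothetical helper name, not part of any Elasticsearch client.

```python
def field_exists(doc, field):
    """True if doc[field] holds at least one non-null value."""
    if field not in doc:
        return False
    value = doc[field]
    # Treat a single value as a one-element list so both cases share one check.
    values = value if isinstance(value, list) else [value]
    return any(v is not None for v in values)

# The five documents indexed above, keyed by id (None stands for JSON null).
docs = {
    1: {"name": "sean", "hobby": "running"},
    2: {"name": "Jhon", "hobby": ""},
    3: {"name": "sum", "hobby": ["swimming", None]},
    4: {"name": "lily", "hobby": [None, None]},
    5: {"name": "lucy"},
}

matching_ids = [doc_id for doc_id, doc in docs.items() if field_exists(doc, "hobby")]
print(matching_ids)  # [1, 2, 3]
```

The empty string in document 2 passes because it is a value, just an empty one, while documents 4 and 5 are rejected.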
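7. prefix query
Returns documents whose field contains a term that starts with the given prefix. The following query finds documents whose "hobby" starts with "sw". The result includes a document with _id 6 (name "deak", hobby "swimming") that is not part of the sample data shown above and was presumably indexed separately.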
GET my_person/_search
{
"query": {
"prefix": {
"hobby": {
"value": "sw"
}
}
}
}
Query result
{
"took": 11,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "my_person",
"_type": "stu",
"_id": "6",
"_score": 1,
"_source": {
"name": "deak",
"hobby": "swimming"
}
},
{
"_index": "my_person",
"_type": "stu",
"_id": "3",
"_score": 1,
"_source": {
"name": "sum",
"hobby": [
"swimming",
null
]
}
}
]
}
}
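8. wildcard query
The wildcard query matches terms against a pattern in which * stands for any sequence of characters (including none) and ? stands for any single character. The following query finds documents whose "hobby" matches "*ing".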
GET my_person/_search
{
"query": {
"wildcard": {
"hobby": {
"value": "*ing"
}
}
}
}
Query result
{
"took": 27,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "my_person",
"_type": "stu",
"_id": "6",
"_score": 1,
"_source": {
"name": "deak",
"hobby": "swimming"
}
},
{
"_index": "my_person",
"_type": "stu",
"_id": "1",
"_score": 1,
"_source": {
"name": "sean",
"hobby": "running"
}
},
{
"_index": "my_person",
"_type": "stu",
"_id": "3",
"_score": 1,
"_source": {
"name": "sum",
"hobby": [
"swimming",
null
]
}
}
]
}
}
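The wildcard syntax can be illustrated by translating a pattern into a Python regular expression and matching it against a whole term. This is only a sketch of the pattern semantics, assuming that * means any sequence of characters and ? means exactly one character; `wildcard_to_regex` and `wildcard_match` are hypothetical helpers, and Elasticsearch does not evaluate wildcards this way internally.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any sequence of characters, possibly empty
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is a literal
    return re.compile("".join(parts), re.DOTALL)

def wildcard_match(pattern, term):
    """True if the pattern matches the entire term."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None

for term in ["swimming", "running", ""]:
    print(repr(term), wildcard_match("*ing", term))
```

Both "swimming" and "running" end in "ing" and match, while the empty string does not, which is consistent with the three hits above.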
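9. regexp query
The performance of a regexp query depends heavily on the regular expression chosen. Patterns that match everything, such as .*, are very slow, and so are lookaround expressions. If possible, put a long literal prefix before the variable part of the expression. Wildcard matchers such as .*?+ mostly lower performance. Most regular expression engines let you match any part of a string, so a pattern that must start at the beginning or finish at the end has to be anchored explicitly with ^ or $. Lucene's patterns, which Elasticsearch uses, are always anchored: the pattern must match the entire term, so ^ and $ are not needed.
The supported operators are: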
Operator | Meaning | Description | Example |
---|---|---|---|
. | Match any character | The period "." can be used to represent any single character. | ab. matches "abc" and "ab1" |
+ | One-or-more | The plus sign "+" can be used to repeat the preceding shortest pattern one or more times. | "aaabbb" matches a+b+ |
* | Zero-or-more | The asterisk "*" can be used to match the preceding shortest pattern zero or more times. | "aaabbb" matches a*b* |
? | Zero-or-one | The question mark "?" makes the preceding shortest pattern optional. It matches zero or one times. | "aaabbb" matches aaa?bbbb? |
{m}, {m,n} | Min-to-max | Curly brackets "{}" can be used to specify a minimum and (optionally) a maximum number of times the preceding shortest pattern can repeat. | "aaabbb" matches a{3}b{3} and a{2,4}b{2,4} |
() | Grouping | Parentheses "()" can be used to form sub-patterns. | "ababab" matches (ab)+ |
\| | Alternation | The pipe symbol "\|" acts as an OR operator. | "aabb" matches aabb\|bbaa |
[] | Character classes | Ranges of potential characters may be represented as character classes by enclosing them in square brackets "[]". A leading ^ negates the character class. | [abc] matches "a", "b", or "c" |
~ | Complement | The shortest pattern that follows a tilde "~" is negated. For example, "ab~cd" means: starts with a, followed by b, followed by a string of any length that is anything but c, ends with d. | "abcdef" matches ab~df and a~(cb)def, but not ab~cdef or a~(bc)def |
<> | Interval | The interval option enables the use of numeric ranges, enclosed by angle brackets "<>". | "foo80" matches foo<1-100> |
& | Intersection | The ampersand "&" joins two patterns in a way that both of them have to match. | "aaabbb" matches aaa.+&.+bbb |
@ | Any string | The at sign "@" matches any string in its entirety. | @&~(foo.+) matches any string except one that begins with "foo" followed by at least one more character |
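Lucene patterns are always anchored, meaning the pattern has to match the whole term rather than a part of it. The difference can be illustrated with Python's `re` module, where `re.search` behaves like an unanchored engine and `re.fullmatch` behaves like an anchored one. Python's regex dialect is not Lucene's (operators such as ~, <>, & and @ from the table do not exist in it), so this sketch only demonstrates the anchoring behaviour.

```python
import re

term = "swimming"

# Unanchored: the pattern may match anywhere inside the term.
unanchored = re.search("imm", term) is not None

# Anchored: the pattern must cover the term from start to end.
anchored = re.fullmatch("imm", term) is not None

# A pattern that does describe the whole term matches in anchored mode too.
whole = re.fullmatch("sw.+", term) is not None

print(unanchored, anchored, whole)  # True False True
```

Under anchored matching, a pattern such as "imm" would have to be written as ".*imm.*" to find "swimming".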
The following query finds documents whose "hobby" value matches the regular expression "sw.+"
GET my_person/_search
{
"query": {
"regexp":{
"hobby":"sw.+"
}
}
}
Query result
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "my_person",
"_type": "stu",
"_id": "6",
"_score": 1,
"_source": {
"name": "deak",
"hobby": "swimming"
}
},
{
"_index": "my_person",
"_type": "stu",
"_id": "3",
"_score": 1,
"_source": {
"name": "sum",
"hobby": [
"swimming",
null
]
}
}
]
}
}
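10. fuzzy query
The fuzzy query matches terms that are similar to the search term, where similarity is measured by edit distance. The following query runs a fuzzy search for "十大" on "title".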
GET telegraph/_search
{
"query": {
"fuzzy": {
"title": "十大"
}
}
}
Query result
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.99277425,
"hits": [
{
"_index": "telegraph",
"_type": "msg",
"_id": "BJetp2QBW8hrYY3zGJk7",
"_score": 0.99277425,
"_source": {
"title": "河北聚焦十大行业推进国际产能合作",
"content": "河北省政府近日出台积极参与“一带一路”建设推进国际产能合作实施方案",
"author": "财联社",
"pubdate": "2018-07-17T14:14:55"
}
}
]
}
}
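The edit-distance idea behind fuzzy matching can be sketched in Python. Two assumptions apply. First, the code below uses plain Levenshtein distance, where swapping two adjacent characters costs two edits; Elasticsearch by default also counts such a transposition as a single edit. Second, `auto_fuzziness` models the documented AUTO setting (no edits for terms of up to 2 characters, one edit for 3 to 5 characters, two edits for longer terms). All function names are hypothetical.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions, or substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]

def auto_fuzziness(term):
    """Allowed edits under the AUTO rule, based on the length of the query term."""
    n = len(term)
    if n <= 2:
        return 0
    if n <= 5:
        return 1
    return 2

def fuzzy_match(query_term, indexed_term):
    """True if the indexed term is within the allowed edit distance of the query term."""
    return levenshtein(query_term, indexed_term) <= auto_fuzziness(query_term)

print(fuzzy_match("swiming", "swimming"))  # one missing letter, two edits allowed
print(fuzzy_match("十大", "十二"))           # two-character term, no edits allowed
```

Under the AUTO rule a two-character term such as "十大" allows no edits at all, so for very short terms a fuzzy search would behave like an exact term lookup unless the fuzziness is set explicitly.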
11. ids query
Returns the documents whose IDs appear in the given list.
GET my_person/_search
{
"query": {
"ids": {
"values": ["1","3","5"]
}
}
}
Query result
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "my_person",
"_type": "stu",
"_id": "5",
"_score": 1,
"_source": {
"name": "lucy"
}
},
{
"_index": "my_person",
"_type": "stu",
"_id": "1",
"_score": 1,
"_source": {
"name": "sean",
"hobby": "running"
}
},
{
"_index": "my_person",
"_type": "stu",
"_id": "3",
"_score": 1,
"_source": {
"name": "sum",
"hobby": [
"swimming",
null
]
}
}
]
}
}