ElasticSearch-Text and Numeric Queries

xueshijun666

Last modified: 2022-10-09 20:43:54

Tags: elasticsearch, big data, search engine

First published: 2022-10-09 20:40:32

Article link: https://blog.csdn.net/xueshijun666/article/details/127233512

Using a term query

Using a terms query

Using a terms set query

Using a prefix query

Using a wildcard query

Using a regexp query

Using span queries

Using a match query

Using a query string query

Using a simple query string query

Using the range query

Using an IDs query

Using the function score query

Using the exists query

Using a pinned query (XPACK)

Using a term query


Term queries work with exact value matches and, generally, are very fast.
1.    We will execute a term query from the command line:
POST /mybooks/_search
{ "query": { "term": { "uuid": "33333" } } }

2.    To execute a term query as a filter, wrap it in the filter clause of a Boolean query. Filters skip score calculation, which speeds up execution when you only need to include or exclude documents:
POST /mybooks/_search
{ "query": { "bool": { "filter": { "term": { "uuid": "33333" } } } } }

If the score is not important, opt to use the term filter.
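
As a small illustration, the two request bodies above can be built as plain Python dictionaries. This is only a sketch: `term_query` and `as_filter` are hypothetical helper names, not functions of any Elasticsearch client, and nothing is sent to a cluster.

```python
import json

def term_query(field, value):
    """Build the body of a scoring term query."""
    return {"query": {"term": {field: value}}}

def as_filter(body):
    """Wrap an existing query in bool.filter so that no score is computed."""
    return {"query": {"bool": {"filter": body["query"]}}}

scored = term_query("uuid", "33333")
filtered = as_filter(scored)

# Print the body that would be posted to /mybooks/_search.
print(json.dumps(filtered))
```

The wrapped body has the same shape as the request in step 2, so the same helper can turn any scoring query into a non-scoring filter.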

Using a terms query

If you want to search for multiple terms, you can do so in two ways: either with a Boolean query or with a multi-term query.

POST /mybooks/_search
{"query": { "terms": { "uuid": [ "33333", "32222" ]}}}
==> SELECT * FROM mybooks WHERE uuid IN ('33333', '32222')
GET /my-index/document/_search
{ "query": {
    "terms": {
      "can_see_groups": {
        "index": "my-index", "type": "user",
        "id": "1bw71LaxSzSp_zV6NB_YGg", "path": "groups"
      } } } }
==> SELECT * FROM xxx WHERE can_see_groups IN (SELECT groups FROM user WHERE user_id = '1bw71LaxSzSp_zV6NB_YGg')

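Extra parameters control the query behavior:
• minimum_match/minimum_should_match: This controls how many terms must match for a document to be returned. Note that recent Elasticsearch releases no longer accept this parameter on the terms query; the terms set query described in the next section covers that use case: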
"terms": {
"color": ["red", "blue", "white"],
"minimum_should_match":2
}

Using a terms set query

A terms set query works like a terms query, except that the minimum number of terms that must match is read from a field in each document or computed by a script.
1.    We will define an item mapping for an item entity:
PUT /ch05-item
{ "mappings": {
    "properties": {
      "name": { "type": "keyword" },
      "labels": { "type": "keyword" },
      "match_number": { "type": "integer" }
    } } }
2.    We ingest some records via the following bulk command:
POST _bulk
{"index":{"_index":"ch05-item", "_id":"1"}}
{"name":"11111","labels":["one"],"match_number":2}
{"index":{"_index":"ch05-item", "_id":"2"}}
{"name":"22222","labels":["one", "two"],"match_number":2}
{"index":{"_index":"ch05-item", "_id":"3"}}
{"name":"33333","labels":["one", "two", "three"],"match_number":3}
{"index":{"_index":"ch05-item", "_id":"4"}}
{"name":"44444","labels":["one", "two", "four"],"match_number":3}


3.    We want to select all the items that have the "one", "two", and "three" labels with the number of matches defined in the match_number field. The following terms_set query will achieve this:
GET /ch05-item/_search
{ "query": {
    "terms_set": {
      "labels": {
        "terms": [ "one", "two", "three"],
        "minimum_should_match_field": "match_number"
      } } } }
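
To make the matching rule concrete, here is a minimal pure-Python sketch of how the terms_set query above selects documents. It reuses the four records from the bulk request; `terms_set_match` is a hypothetical helper written for illustration and ignores scoring entirely.

```python
# The four records ingested by the bulk command, keyed by _id.
docs = {
    "1": {"name": "11111", "labels": ["one"], "match_number": 2},
    "2": {"name": "22222", "labels": ["one", "two"], "match_number": 2},
    "3": {"name": "33333", "labels": ["one", "two", "three"], "match_number": 3},
    "4": {"name": "44444", "labels": ["one", "two", "four"], "match_number": 3},
}

def terms_set_match(doc, field, terms, minimum_should_match_field):
    """True when enough of `terms` occur in doc[field].

    The required count is read from the document itself,
    which is what minimum_should_match_field does.
    """
    matched = len(set(doc[field]) & set(terms))
    return matched >= doc[minimum_should_match_field]

hits = sorted(
    doc_id
    for doc_id, doc in docs.items()
    if terms_set_match(doc, "labels", ["one", "two", "three"], "match_number")
)
print(hits)  # ['2', '3']
```

Document 1 matches only one term but requires two, and document 4 matches two terms but requires three, so only documents 2 and 3 are selected.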

Using a prefix query

1.    We execute a prefix query from the command line, as follows:
POST /mybooks/_search
{ "query": { "prefix": { "uuid": "222" } } }

A suffix query can be converted into a prefix query by indexing a reversed copy of the field.

Example: match documents whose filename ends with the .jpg extension.

1.    We define reverse_analyzer to the index level and put this into the settings, as follows:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "reverse_analyzer": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": [ "lowercase", "reverse" ]
        } } } } }
2.    When we define the filename field, we use reverse_analyzer for its subfield, as follows:
  "filename": {
    "type": "keyword",
    "fields": {
      "rev": { "type": "text", "analyzer": "reverse_analyzer" } } }
3.    Now we can search with a prefix query. The prefix query does not analyze its input, so the search value must be written reversed by hand (.jpg becomes gpj.):
"query": { "prefix": { "filename.rev": "gpj." } }

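The suffix-to-prefix trick can be sketched in plain Python. The sketch assumes that the reverse analyzer emits a single lowercased, reversed token (keyword tokenizer plus the lowercase and reverse filters) and that the search value is reversed the same way; `reverse_analyze` and `has_suffix` are hypothetical names, and the filenames are made up.

```python
def reverse_analyze(text):
    """Mimic keyword tokenizer + lowercase + reverse: one reversed token."""
    return text.lower()[::-1]

def has_suffix(filename, suffix):
    """A suffix test expressed as a prefix test on the reversed token."""
    return reverse_analyze(filename).startswith(reverse_analyze(suffix))

print(reverse_analyze("Photo.JPG"))       # gpj.otohp
print(has_suffix("Photo.JPG", ".jpg"))    # True
print(has_suffix("notes.png", ".jpg"))    # False
```

A prefix lookup can use the sorted term dictionary directly, which is why reversing the field is usually preferred over a wildcard query with a leading *.
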
Using a wildcard query

• *: This means you need to match zero or more characters.
• ?: This means you need to match one character.

POST /mybooks/_search
{ "query": { "wildcard": { "uuid": "22?2*" } } }

To improve performance, it's suggested that you do not execute a wildcard query that starts with * or ?.
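
The two wildcard characters can be illustrated by translating a pattern into a regular expression. This is only a sketch of the matching semantics on a single term, not of how Elasticsearch executes the query; `wildcard_match` is a hypothetical helper and the sample values are made up.

```python
import re

def wildcard_to_regex(pattern):
    """Translate * (zero or more chars) and ? (exactly one char) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def wildcard_match(pattern, term):
    """The whole term must match the pattern, not just a part of it."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None

print(wildcard_match("22?2*", "22222"))  # True
print(wildcard_match("22?2*", "33333"))  # False
print(wildcard_match("22?2*", "222"))    # False: ? needs one character
```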

Using a regexp query

The parameters that are used to control this process are listed as follows:
• boost (the default is 1.0): This is the value used to boost the score of this query.
• flags: This is a list of one or more flags, separated by the pipe (|) delimiter. The available flags are listed as follows:
a) ALL: This enables all of the optional regexp syntaxes.
b) ANYSTRING: This enables any (@) string.
c) AUTOMATON: This enables named automata (<identifier>).
d) COMPLEMENT: This enables complements (~).
e) EMPTY: This enables empty language (#).
f) INTERSECTION: This enables intersections (&).
g) INTERVAL: This enables numerical intervals (<n-m>).
h) NONE: This enables no optional regexp syntax.


POST /mybooks/_search
{ "query": {
    "regexp": {
      "description": {
        "value": "j.*",
        "flags": "INTERSECTION|COMPLEMENT|EMPTY"
      } } } }

Using span queries

Span queries allow you to define several kinds of queries:
•   The exact phrase query.
•   The exact fragment query (for example, take off or give up).
•   Partial exact phrase with a slop (other tokens are allowed between the searched terms; for example, the man with a slop of 2 can also match the strong man, the old wise man, and more).

1.    The main element in span queries is span_term whose usage is similar to the term of the standard query. It is possible to aggregate more than one span_term value to formulate a span query.
2.    The span_first query defines a query in which the span_term value must match within the first tokens of the field; the end parameter sets the maximum end position allowed for the match:
POST /mybooks/_search
{ "query": {
    "span_first": {
      "match": { "span_term": {"description": "joe"}},
      "end": 5
    } } }
3.    The span_or query is used to define multiple values in a span query. This is very handy for simple synonym searches:
POST /mybooks/_search
{ "query": {
    "span_or": {
      "clauses": [
        { "span_term": { "description": "nice" } },
        { "span_term": { "description": "cool" } },
        { "span_term": { "description": "wonderful"}}
      ] } } }
The list of clauses is the core of the span_or query because it contains the span terms that should match.
4.    Similar to span_or, there is a span_multi query that wraps multi-term queries such as prefix, wildcard, and more:
POST /mybooks/_search
{ "query": {
    "span_multi": {
      "match": {
        "prefix": { "description": { "value": "jo" } }
      } } } }
5.    Span queries can be combined into a span_near query. This allows you to control the token sequence of the query, such as the ordering and the distance allowed between the terms (slop), as follows:
POST /mybooks/_search
{ "query": {
    "span_near": {
      "clauses": [
        { "span_term": { "description": "nice" } },
        { "span_term": { "description": "joe" } },
        { "span_term": { "description": "guy" } }
      ],
      "slop": 3, "in_order": false } } }
6.    For complex queries, skipping matching positional tokens is very important. This can be achieved with the span_not query:
POST /mybooks/_search
{ "query": {
    "span_not": {
      "include":{"span_term":{"description": "nice"}},
      "exclude": {
        "span_near": {
          "clauses": [
            { "span_term": { "description": "not" } },
            { "span_term": { "description": "nice" }}
          ],
          "slop": 1, "in_order": true
        } } } } }
The include section contains the span that must be matched, while exclude contains the span that must not be matched. The query matches occurrences of the term nice that are not part of the phrase "not nice". This can be very useful for excluding negative phrases!
7.    To search with a span query that is surrounded by other terms, we can use the span_containing query, as follows:
POST /mybooks/_search
{ "query": {
    "span_containing": {
      "little": {
        "span_term": { "description": "nice"}
      },
      "big": {
        "span_near": {
          "clauses": [
            { "span_term": { "description": "not" } },
            { "span_term": { "description": "guy" } }
          ],
          "slop": 5, "in_order": true
        } } } } }
The little section contains the span that must be matched. The big section contains the span that contains the little matches. In the preceding case, the matched expression will be similar to not * nice * guy.
8.    To search with a span query that is enclosed by other span terms, we can use the span_within query, as follows:
POST /mybooks/_search
{"query": {
    "span_within": {
      "little": {
        "span_term": { "description": "nice" }
      },
      "big": {
        "span_near": {
          "clauses": [
            { "span_term": { "description": "not" } },
            { "span_term": { "description": "guy" } }
          ],
          "slop": 5, "in_order": true
        } } } } }
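The little and big sections have the same meaning as in span_containing. The difference lies in what is returned: span_containing returns the big spans that contain a little match, while span_within returns the little spans that are enclosed in a big match.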
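
The positional rules behind span_first and span_near can be sketched over a list of token positions. This is a simplified model for single-term clauses only, with whitespace tokenization and a made-up sample sentence; `span_first` and `span_near` here are hypothetical Python helpers, not client calls.

```python
from itertools import product

def span_first(tokens, term, end):
    """True if `term` occurs in a span whose end position is <= `end`."""
    # A single-term span at position p covers [p, p + 1).
    return any(tok == term and pos + 1 <= end for pos, tok in enumerate(tokens))

def span_near(tokens, terms, slop, in_order):
    """True if every term occurs once within the allowed slop."""
    positions = [[i for i, tok in enumerate(tokens) if tok == t] for t in terms]
    for combo in product(*positions):
        if len(set(combo)) != len(combo):
            continue  # the same token cannot serve two clauses
        if in_order and list(combo) != sorted(combo):
            continue  # clauses must appear in the order given
        # Count the unmatched positions inside the window spanned by the terms.
        gaps = max(combo) - min(combo) + 1 - len(combo)
        if gaps <= slop:
            return True
    return False

# Made-up sample text, tokenized by whitespace for illustration only.
tokens = "joe is a nice guy".split()
print(span_first(tokens, "joe", 5))                                        # True
print(span_near(tokens, ["nice", "joe", "guy"], slop=3, in_order=False))   # True
print(span_near(tokens, ["nice", "joe", "guy"], slop=3, in_order=True))    # False
```

In the sample, two unmatched tokens (is, a) sit between joe and nice guy, so a slop of 3 is enough when ordering is ignored, while in_order rejects the match because joe precedes nice in the text.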

Using a match query


1.    The standard usage of a match query simply requires the field name and the query text. 
POST /mybooks/_search
{ "query": {
    "match": {
      "description": {
        "query": "nice guy", "operator": "and"
      } } } }
2.    If you need to execute the same query as a phrase query, the type changes from match to match_phrase:
POST /mybooks/_search
{ "query": { "match_phrase": { "description": "nice guy" } } }

3.    An extension of the previous query that is used in text completion or in search as you type functionality is match_phrase_prefix, as follows:
POST /mybooks/_search
{ "query": { "match_phrase_prefix": {"description": "nice gu" } } }
4.    A common requirement is searching for several fields with the same query call. The multi_match query provides this capability, as shown in the following example:
POST /mybooks/_search
{ "query": {
    "multi_match": {
      "fields": [ "description", "name" ],
      "query": "Bill", "operator": "and" } } }

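The difference between match with the and operator and match_phrase can be sketched as follows. The sketch assumes a very simplified analyzer (lowercase, then split on whitespace), which is not the real standard analyzer, and it ignores scoring; the helper names and sample sentences are made up.

```python
def analyze(text):
    """Simplified stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()

def match(doc_text, query_text, operator="or"):
    """All query tokens must occur for 'and', at least one for 'or'."""
    doc_tokens = set(analyze(doc_text))
    check = all if operator == "and" else any
    return check(tok in doc_tokens for tok in analyze(query_text))

def match_phrase(doc_text, query_text):
    """Query tokens must occur next to each other, in the same order."""
    doc_tokens = analyze(doc_text)
    query_tokens = analyze(query_text)
    n = len(query_tokens)
    return any(doc_tokens[i:i + n] == query_tokens
               for i in range(len(doc_tokens) - n + 1))

print(match("A nice old guy", "nice guy", operator="and"))  # True
print(match_phrase("A nice old guy", "nice guy"))           # False
print(match_phrase("Joe is a nice guy", "nice guy"))        # True
```
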
Using a query string query


POST /mybooks/_search
{ "query": {
    "query_string": {
      "query": """"nice guy" -description:not price:{ * TO 5 } """,
      "fields": [ "description^5" ],
      "default_operator": "and" } } }

Using a simple query string query


POST /mybooks/_search
{ "query": {
    "simple_query_string": {
      "query": """"nice guy" -not""",
      "fields": [ "description^5", "_all" ],
      "default_operator": "and" } } }

Using the range query

POST /mybooks/_search
{ "query": {
    "range": {
      "price": {
        "from": 3, "to": 6,
        "include_lower": true, "include_upper": false
      } } } }
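
The from/to/include_lower/include_upper parameters above express the same bounds as the gte/gt/lte/lt operators that the range query also accepts. The following sketch converts one form into the other and checks a value against the result; `modernize_range` and `in_range` are hypothetical helpers, and the defaults for the include flags (both true) are an assumption based on the usual behavior of these parameters.

```python
import operator

OPS = {"gte": operator.ge, "gt": operator.gt, "lte": operator.le, "lt": operator.lt}

def modernize_range(params):
    """Rewrite from/to/include_* into gte/gt/lte/lt bounds."""
    bounds = {}
    if "from" in params:
        bounds["gte" if params.get("include_lower", True) else "gt"] = params["from"]
    if "to" in params:
        bounds["lte" if params.get("include_upper", True) else "lt"] = params["to"]
    return bounds

def in_range(value, bounds):
    """True when the value satisfies every bound."""
    return all(OPS[name](value, limit) for name, limit in bounds.items())

bounds = modernize_range(
    {"from": 3, "to": 6, "include_lower": True, "include_upper": False}
)
print(bounds)               # {'gte': 3, 'lt': 6}
print(in_range(3, bounds))  # True: lower bound is inclusive
print(in_range(6, bounds))  # False: upper bound is exclusive
```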

Using an IDs query

POST /mybooks/_search
{ "query": { "ids": { "values": [ "1", "2", "3" ] } }}

Using the function score query

The function_score query allows us to define a function that controls the score of the documents that are returned by a query.

The common scenarios used for this query are listed as follows:
•   Creating a custom score function (for example, with the decay function)
•   Creating a custom boost factor, for example, based on another field (such as boosting a document by its distance from a point)
•   Creating a custom filter score function, for example, based on scripting Elasticsearch capabilities
•   Ordering the documents randomly

POST /mybooks/_search
{ "query": {
    "function_score": {
      "query": { "query_string":{ "query": "bill" } },
      "functions": [
        { "linear": {
            "position":{ "origin": "0", "scale": "20"}
          } } ],
      "score_mode": "multiply" } } }

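To give a feeling for what the linear function in the example computes, here is a sketch of the linear decay formula as described in the Elasticsearch documentation, assuming the default offset of 0 and the default decay of 0.5. The function name and the sample position values are illustrative only.

```python
def linear_decay(value, origin, scale, offset=0.0, decay=0.5):
    """Linear decay: 1.0 at the origin, `decay` at distance `scale`, then down to 0."""
    s = scale / (1.0 - decay)
    distance = max(0.0, abs(value - origin) - offset)
    return max((s - distance) / s, 0.0)

# With origin 0 and scale 20, as in the query above:
for position in (0, 10, 20, 40):
    print(position, linear_decay(position, origin=0, scale=20))
```

A document whose position equals the origin keeps its full score (factor 1.0), the factor falls to 0.5 at a distance of 20, and it reaches 0 at a distance of 40. The factor is then combined with the score of the inner query.
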
Using the exists query

Because Elasticsearch is schema-less, documents in the same index need not contain the same fields, so two kinds of checks are useful:
• Exists field: This is used to check whether a field exists in a document.
• Missing field: This is used to check whether a field is missing in a document.

1.    To search all of the documents in the mybooks index that have a field called description, the query will be as follows:
POST /mybooks/_search
{ "query": { "exists": { "field": "description" } } }

POST /mybooks/_search
{ "query": { "exists": { "field": "address.city" } } }

2.    There is no missing query, so to search all the documents that do not have a field called description, we wrap the exists query in a Boolean must_not clause; the query will be as follows:
POST /mybooks/_search
{ "query": { "bool": { "must_not": { "exists": { "field": "description" } } } } }
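
The exists check and its negation can be sketched on plain dictionaries. This is a simplified model: it treats a field as existing when the dotted path resolves to a value other than null or an empty array, and it does not handle arrays of objects. `field_exists` is a hypothetical helper and the sample documents are made up.

```python
def field_exists(doc, path):
    """True when the dotted `path` resolves to a non-null, non-empty value."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return False
        current = current[key]
    return current is not None and current != []

docs = [
    {"description": "nice guy", "address": {"city": "Rome"}},
    {"description": None},
    {"name": "no description here"},
]

with_description = [d for d in docs if field_exists(d, "description")]
without_description = [d for d in docs if not field_exists(d, "description")]
print(len(with_description), len(without_description))  # 1 2
```

The second list plays the role of the must_not clause: it is simply the complement of the exists check.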

Using a pinned query (XPACK)

POST /mybooks/_search
{ "query": {
    "pinned": {
      "ids": ["1","2","3"],
      "organic": { "term": { "description": "bill" } }
    } } }
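
A pinned query promotes the listed ids above the organic results, in the order in which the ids are given. The effect on the result order can be sketched as follows, assuming that every pinned id exists in the index; `pinned` is a hypothetical helper and the organic hit list is made up.

```python
def pinned(ids, organic_hits):
    """Place the pinned ids first, then the organic hits not already pinned."""
    pinned_set = set(ids)
    rest = [hit for hit in organic_hits if hit not in pinned_set]
    return list(ids) + rest

# Made-up organic ranking returned by the term query on description.
organic = ["7", "2", "9"]
print(pinned(["1", "2", "3"], organic))  # ['1', '2', '3', '7', '9']
```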
