Common Ways to Work with an Elasticsearch Index

Charles Yan

Article link: https://blog.csdn.net/weixin_42586723/article/details/104983189

Query String Search
Query DSL
Scroll Search
Query Filter (filtered search)
Full-Text Search
Phrase Search
Highlight Search
FieldSort (sorted search)
References

Query String Search

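Introduction

All search parameters are passed in the query string of the HTTP request.

When to use it

Query string search suits ad hoc requests sent from the command line to look something up quickly, for example with curl on Linux, or with a third-party tool such as Postman. It is not suited to building complex queries.

Examples

Example 1: search all products

Command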
```
    GET /ecommerce/product/_search
```

Example 2: search for products whose name contains yagao, sorted by price in descending order

Command


    GET /ecommerce/product/_search?q=name:yagao&sort=price:desc

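Query result (the sample lists the hits by ascending price with every _score equal to 1, so it appears to be the output of Example 1 rather than of the sorted query)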


    {
        "took": 2,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "failed": 0
        },
        "hits": {
            "total": 3,
            "max_score": 1,
            "hits": [
            {
                "_index": "ecommerce",
                "_type": "product",
                "_id": "2",
                "_score": 1,
                "_source": {
                "name": "jiajieshi yagao",
                "desc": "youxiao fangzhu",
                "price": 25,
                "producer": "jiajieshi producer",
                "tags": [
                    "fangzhu"
                ]
                }
            },
            {
                "_index": "ecommerce",
                "_type": "product",
                "_id": "1",
                "_score": 1,
                "_source": {
                "name": "gaolujie yagao",
                "desc": "gaoxiao meibai",
                "price": 30,
                "producer": "gaolujie producer",
                "tags": [
                    "meibai",
                    "fangzhu"
                ]
                }
            },
            {
                "_index": "ecommerce",
                "_type": "product",
                "_id": "3",
                "_score": 1,
                "_source": {
                "name": "zhonghua yagao",
                "desc": "caoben zhiwu",
                "price": 40,
                "producer": "zhonghua producer",
                "tags": [
                    "qingxin"
                ]
                }
            }
            ]
        }
    }

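Field reference

took: time the search took, in milliseconds
timed_out: whether the search timed out; here it did not
_shards: the index is split into 5 shards, so the search request is sent to every primary shard (or to one of its replica shards instead)
hits.total: number of matching documents, here 3
hits.max_score: the highest score among the hits; the score measures how relevant a document is to the search, and the more relevant the document, the higher its score
hits.hits: the full data of the documents that matched the search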
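
The fields described above can be read programmatically once the response is parsed as JSON. The following is a minimal sketch, assuming a trimmed copy of the sample response; the helper name `summarize` is hypothetical and not part of any Elasticsearch client.

```python
import json

# Trimmed copy of the sample response (only name and price kept in _source).
response = json.loads("""
{"took": 2, "timed_out": false,
 "_shards": {"total": 5, "successful": 5, "failed": 0},
 "hits": {"total": 3, "max_score": 1, "hits": [
   {"_id": "2", "_score": 1, "_source": {"name": "jiajieshi yagao", "price": 25}},
   {"_id": "1", "_score": 1, "_source": {"name": "gaolujie yagao", "price": 30}},
   {"_id": "3", "_score": 1, "_source": {"name": "zhonghua yagao", "price": 40}}]}}
""")

def summarize(resp):
    """Return (took_ms, total_hits, [(id, name, price), ...]) from a search response."""
    total = resp["hits"]["total"]
    # Elasticsearch 7 and later report total as {"value": N, "relation": ...}.
    if isinstance(total, dict):
        total = total["value"]
    rows = [(h["_id"], h["_source"]["name"], h["_source"]["price"])
            for h in resp["hits"]["hits"]]
    return resp["took"], total, rows

took, total, rows = summarize(response)
print(f"{total} hits in {took} ms")
for doc_id, name, price in rows:
    print(doc_id, name, price)
```

The `isinstance` check lets the same helper handle both the plain integer shown in this article and the object form used by newer releases.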

Other syntax

+ and -


    ## Search for documents whose test_field contains test
    GET /test_index/test_type/_search?q=test_field:test
    ## + means the term must be present
    GET /test_index/test_type/_search?q=+test_field:test
    ## - means the term must not be present
    GET /test_index/test_type/_search?q=-test_field:test
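
When these requests are sent from a script or with curl rather than typed into a console, the `q` value should be percent-encoded, because a literal `+` in a URL query string is commonly decoded as a space. A minimal sketch using only the Python standard library; the helper name `query_string_url` is hypothetical.

```python
from urllib.parse import quote

def query_string_url(index, doc_type, q, sort=None):
    """Build a _search path with the q (and optional sort) parameters percent-encoded."""
    # safe=":" keeps field:value readable; "+" becomes %2B so it survives URL decoding.
    url = f"/{index}/{doc_type}/_search?q={quote(q, safe=':')}"
    if sort:
        url += f"&sort={quote(sort, safe=':')}"
    return url

print(query_string_url("test_index", "test_type", "+test_field:test"))
print(query_string_url("ecommerce", "product", "name:yagao", sort="price:desc"))
```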

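_all metadata
1. Syntax
```
    ## Searches every field: a document matches if any of its fields contains the keyword
    GET /test_index/test_type/_search?q=test
```
2. How it works
  
  When a document with several fields is indexed, Elasticsearch automatically concatenates the values of all fields into one long string, stores it as the value of the _all field, and indexes it.
  
  If a later search does not name a field, it runs against the _all field, which contains the values of every field. Note that _all belongs to older releases: it was deprecated in Elasticsearch 6.0 and removed in 7.0.
3. Example
```
    ## "jack 26 jack@sina.com guangzhou" becomes the _all value of this document; it is then analyzed and added to the inverted index
    {
      "name": "jack",
      "age": 26,
      "email": "jack@sina.com",
      "address": "guangzhou"
    }
```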
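
The concatenation described above can be imitated in a few lines. This is only an illustration for a flat document, not Elasticsearch's actual implementation; `build_all_field` is a hypothetical helper, and the whitespace split stands in for real analysis.

```python
def build_all_field(doc):
    """Join every field value as a string, roughly what _all held for a flat document."""
    return " ".join(str(value) for value in doc.values())

doc = {"name": "jack", "age": 26, "email": "jack@sina.com", "address": "guangzhou"}
all_value = build_all_field(doc)
print(all_value)          # jack 26 jack@sina.com guangzhou
print(all_value.split())  # crude stand-in for the terms that would be indexed
```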

Query DSL

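Introduction

DSL: Domain-Specific Language

With this approach the search conditions are sent as a JSON HTTP request body, which makes it possible to express many kinds of complex queries.

Examples

Example 1: search all products

Command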


    GET /ecommerce/product/_search
    {
        "query": {
            "match_all": {}
        }
    }

Example 2: search for products whose name contains yagao, sorted by price in descending order

Command

    GET /ecommerce/product/_search
    {
        "query" : {
            "match" : {
                "name" : "yagao"
            }
        },
        "sort": [
            { "price": "desc" }
        ]
    }

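Example 3: paginated query

Command

    ## Page through the products: with 3 products in total and 1 product per page, page 2 returns the 2nd product. from is the offset of the first hit to return
    GET /ecommerce/product/_search
    {
        "query": { "match_all": {} },
        "from": 1,
        "size": 1
    }

Notes
1. In a real project, when a conditional query also needs pagination, there is no need to query the total count separately: Elasticsearch returns the total number of matching documents, which can be used directly.
2. from is zero-based and defaults to 0.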
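
The relationship between a 1-based page number and the zero-based `from` offset can be sketched as follows. The helper names `page_request` and `total_pages` are hypothetical; the request body mirrors the one shown above.

```python
def page_request(page, page_size, query=None):
    """Build a search body for a 1-based page number; from is a 0-based offset."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {
        "query": query or {"match_all": {}},
        "from": (page - 1) * page_size,
        "size": page_size,
    }

def total_pages(total_hits, page_size):
    """Number of pages needed, given the total reported in hits.total."""
    return -(-total_hits // page_size)  # ceiling division

print(page_request(2, 1))
print(total_pages(3, 1))
```

Calling `page_request(2, 1)` reproduces the body of Example 3, since page 2 with one product per page starts at offset 1.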

Example 4: return specific fields only

Command

    ## Return only the name and price fields
    GET /ecommerce/product/_search
    {
        "query": { "match_all": {} },
        "_source": ["name", "price"]
    }

Common syntax

boolQuery: combined search

Command

    GET /website/article/_search
    {
        "query": {
            "bool": {
                "must": [   // title must contain elasticsearch
                    {
                        "match": {
                            "title": "elasticsearch"
                        }
                    }
                ],
                "should": [  // content may or may not contain elasticsearch
                    {
                        "match": {
                            "content": "elasticsearch"
                        }
                    }
                ],
                "must_not": [ // author_id must not be 111
                    {
                        "match": {
                            "author_id": 111
                        }
                    }
                ]
            }
        }
    }
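
The way the three clause types combine can be illustrated with a small in-memory model. This is a simplified sketch, not how Elasticsearch scores documents: `contains` is a hypothetical stand-in for a match clause, and the returned count only imitates the idea that matching should clauses raise the score.

```python
def contains(field, word):
    """Simplified stand-in for a match clause: whole-word, case-insensitive."""
    return lambda doc: word.lower() in str(doc.get(field, "")).lower().split()

def bool_match(doc, must=(), should=(), must_not=()):
    """Return None if the doc is excluded, else the number of clauses it satisfied."""
    if not all(clause(doc) for clause in must):
        return None   # every must clause is required
    if any(clause(doc) for clause in must_not):
        return None   # any must_not hit excludes the doc
    # should clauses are optional here; each one that matches adds to the toy score
    return len(must) + sum(1 for clause in should if clause(doc))

clauses = dict(
    must=[contains("title", "elasticsearch")],
    should=[contains("content", "elasticsearch")],
    must_not=[contains("author_id", "111")],
)
docs = [
    {"title": "Elasticsearch basics", "content": "about elasticsearch", "author_id": 110},
    {"title": "Elasticsearch tips", "content": "misc notes", "author_id": 110},
    {"title": "Elasticsearch", "content": "elasticsearch", "author_id": 111},
    {"title": "Java", "content": "elasticsearch", "author_id": 110},
]
print([bool_match(d, **clauses) for d in docs])
```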

matchAllQuery: match every document

Command

    GET /_search
    {
        "query": {
            "match_all": {}
        }
    }

matchQuery: full-text match of a string on a single field

Command

    GET /_search
    {
        "query": {
            "match": {
                "title": "my elasticsearch article"
            }
        }
    }

multiMatchQuery: full-text match of a string on several fields

Command

    GET /test_index/test_type/_search
    {
        "query": {
            "multi_match": {
                "query": "test",  // the text to search for
                "fields": ["test_field", "test_field1"]  // the fields to search in
            }
        }
    }

rangeQuery: range search on a field

Command

    GET /company/employee/_search
    {
        "query": {
            "range" : {
                "age" : {
                    "gte" : 10, // greater than or equal to 10
                    "lte" : 20, // less than or equal to 20
                    "boost" : 2.0
                }
            }
        }
    }

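termQuery: exact match on a single term

Command

    ## Treats the value as an exact value (prerequisite: the field must be mapped as not analyzed, that is "index": "not_analyzed" on an old string field or type keyword on 5.x and later; only then can the term query find test hello)
    GET /test_index/test_type/_search
    {
        "query": {
            "term": {
                "test_field": "test hello"
            }
        }
    }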
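
Why the field must not be analyzed can be seen from a toy model of the index. The sketch below assumes a very rough analyzer (lowercase plus whitespace split); the real standard analyzer does more, and both function names are hypothetical.

```python
def analyze(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()

def term_matches(indexed_terms, value):
    """A term query compares the value, unanalyzed, against the indexed terms."""
    return value in indexed_terms

analyzed_terms = analyze("test hello")  # two separate terms
exact_terms = ["test hello"]            # not analyzed: the whole value is one term

print(term_matches(analyzed_terms, "test hello"))  # False
print(term_matches(exact_terms, "test hello"))     # True
print(term_matches(analyzed_terms, "test"))        # True
```

On the analyzed field the index only holds `test` and `hello`, so the two-word value never equals a single indexed term.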

termsQuery: exact match on several terms

Command

    ## Several search terms are given for the tag field
    GET /_search
    {
        "query": {
            "terms": {
                "tag": [ "search", "full_text", "nosql" ]
            }
        }
    }

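Scroll Search

Introduction
1. With scroll search you fetch one batch of data, then the next batch, and so on until all the data has been retrieved.
2. The first scroll request saves a snapshot of the data as it was at that moment, and every later batch is served from that snapshot; changes made to the data in the meantime are not visible to the user.
3. Sorting by _doc gives the best performance.
4. Every scroll request must also carry a scroll parameter that specifies how long the search context is kept alive; the window only needs to be long enough to process one batch before the next request arrives.
5. Scroll looks a lot like pagination, but the use cases differ: pagination serves results page by page to a user, while scroll retrieves data batch by batch for a system to process.

Command

    ## Fetch 3 documents per batch
    GET /test_index/test_type/_search?scroll=1m
    {
        "query": {
            "match_all": {}
        },
        "sort": [ "_doc" ],
        "size": 3
    }

    ## The response contains a _scroll_id; the next scroll request must send it back as scroll_id
    GET /_search/scroll
    {
        "scroll": "1m",
        "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAACxeFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAAsYBY0b25zVFlWWlRqR3ZJajlfc3BXejJ3AAAAAAAALF8WNG9uc1RZVlpUakd2SWo5X3NwV3oydwAAAAAAACxhFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAAsYhY0b25zVFlWWlRqR3ZJajlfc3BXejJ3"
    }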
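
The snapshot behaviour of scroll can be modelled without a cluster. The generator below is only an in-memory analogy, assuming the index is a plain Python list; `scroll_batches` is a hypothetical name and no Elasticsearch client is involved.

```python
def scroll_batches(index, size):
    """Yield batches of `size` documents from a snapshot taken at the first request."""
    snapshot = list(index)  # frozen view, playing the role of the scroll context
    for start in range(0, len(snapshot), size):
        yield snapshot[start:start + size]

index = [{"id": i} for i in range(1, 8)]
batches = scroll_batches(index, size=3)

first = next(batches)    # the snapshot is taken here
index.append({"id": 8})  # a later write, invisible to the running scroll
rest = list(batches)

print([d["id"] for d in first])
print([[d["id"] for d in batch] for batch in rest])
```

Document 8 never shows up in any batch because it was written after the snapshot was taken, which mirrors point 2 of the introduction.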

Query Filter (filtered search)

Command

    ## Search for products whose name contains yagao and whose price is greater than 25
    GET /ecommerce/product/_search
    {
        "query" : {
            "bool" : {
                "must" : {
                    "match" : {
                        "name" : "yagao"
                    }
                },
                "filter" : {
                    "range" : {
                        "price" : { "gt" : 25 }
                    }
                }
            }
        }
    }


    ## title must contain how to make millions, tag must not contain spam, tag may contain starred, and date must be on or after 2014-01-01 (bool clause only, to be placed under "query")
    {
        "bool": {
            "must":     { "match": { "title": "how to make millions" }},
            "must_not": { "match": { "tag":   "spam" }},
            "should": [
                { "match": { "tag": "starred" }}
            ],
            "filter": {
                "range": { "date": { "gte": "2014-01-01" }}
            }
        }
    }


    ## title must contain how to make millions, tag must not contain spam, tag may contain starred, date must be on or after 2014-01-01, price must be at most 29.99, and category must not be ebooks (bool clause only, to be placed under "query")
    {
        "bool": {
            "must":     { "match": { "title": "how to make millions" }},
            "must_not": { "match": { "tag":   "spam" }},
            "should": [
                { "match": { "tag": "starred" }}
            ],
            "filter": {
                "bool": {
                    "must": [
                        { "range": { "date": { "gte": "2014-01-01" }}},
                        { "range": { "price": { "lte": 29.99 }}}
                    ],
                    "must_not": [
                        { "term": { "category": "ebooks" }}
                    ]
                }
            }
        }
    }



    ## Search for employees aged 30 or older
    GET /company/employee/_search
    {
        "query": {
            "constant_score": {  // constant_score is the usual wrapper when a filter is used on its own
                "filter": {
                    "range": {
                        "age": {
                            "gte": 30
                        }
                    }
                }
            }
        }
    }
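
When request bodies are assembled in application code, the constant_score wrapper can be factored into a small helper. A minimal sketch that only builds the JSON body; `filter_only` and `age_at_least` are hypothetical names, and sending the request is left out.

```python
import json

def filter_only(filter_clause, boost=1.0):
    """Wrap a filter in constant_score so that every hit receives the same score."""
    return {"query": {"constant_score": {"filter": filter_clause, "boost": boost}}}

def age_at_least(min_age):
    """Range filter on the age field, as in the example above."""
    return {"range": {"age": {"gte": min_age}}}

body = filter_only(age_at_least(30))
print(json.dumps(body, indent=4))
```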

Full-Text Search

Command

    ## Search for products whose producer field contains "yagao producer"
    GET /ecommerce/product/_search
    {
        "query" : {
            "match" : {
                "producer" : "yagao producer"
            }
        }
    }

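Phrase Search

Concept
1. Phrase search is the counterpart of full-text search. Full-text search splits the search string into terms and looks each one up in the inverted index; a document is returned as long as it matches any one of the terms.
2. Phrase search requires the text of the given field to contain the search string exactly as entered; only then does the document count as a match and get returned.

Command

    GET /ecommerce/product/_search
    {
        "query" : {
            "match_phrase" : {
                "producer" : "yagao producer"
            }
        }
    }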
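
The difference between the two kinds of matching can be shown with a small model that works on whitespace-separated terms. It is a simplified sketch: real analysis, term positions, and scoring are richer, and the function names are hypothetical.

```python
def tokens(text):
    """Crude analysis: lowercase and split on whitespace."""
    return text.lower().split()

def match_any(field_text, query):
    """Full-text match with OR semantics: any single query term may match."""
    field_terms = set(tokens(field_text))
    return any(term in field_terms for term in tokens(query))

def match_phrase(field_text, query):
    """Phrase match: the query terms must appear consecutively and in order."""
    field, phrase = tokens(field_text), tokens(query)
    return any(field[i:i + len(phrase)] == phrase
               for i in range(len(field) - len(phrase) + 1))

print(match_any("gaolujie producer", "yagao producer"))          # True
print(match_phrase("gaolujie producer", "yagao producer"))       # False
print(match_phrase("special yagao producer", "yagao producer"))  # True
```

The producer value `gaolujie producer` is found by the full-text query because it shares the term `producer`, but it does not contain the phrase `yagao producer`.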

Highlight Search

Command

    GET /ecommerce/product/_search
    {
        "query" : {
            "match" : {
                "producer" : "producer"
            }
        },
        "highlight": {
            "fields" : {
                "producer" : {}
            }
        }
    }
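
In the response, each hit gains a highlight section in which the matched terms are wrapped in tags, `<em>` and `</em>` by default. The effect can be imitated locally as follows; this is an illustration of the output shape, not the highlighter Elasticsearch uses, and `highlight` is a hypothetical helper.

```python
import re

def highlight(text, term, pre_tag="<em>", post_tag="</em>"):
    """Wrap whole-word, case-insensitive occurrences of term in the given tags."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"{pre_tag}{m.group(0)}{post_tag}", text)

print(highlight("gaolujie producer", "producer"))
```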

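FieldSort (sorted search)

Definition

By default, results are sorted by _score in descending order, but in some cases a different sort order is needed.

Filter-only search: no relevance ordering

Command

    ## Find the documents whose author_id is 1
    GET /_search
    {
        "query" : {
            "bool" : {
                "filter" : {
                    "term" : {
                        "author_id" : 1
                    }
                }
            }
        }
    }

    ## As mentioned earlier, a filter used on its own is wrapped in constant_score
    GET /_search
    {
        "query" : {
            "constant_score" : {
                "filter" : {
                    "term" : {
                        "author_id" : 1
                    }
                }
            }
        }
    }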

Custom coarse sorting

Command

    GET /company/employee/_search
    {
        "query": {
            "constant_score": {
                "filter": {
                    "range": {
                        "age": {
                            "gte": 30
                        }
                    }
                }
            }
        },
        "sort": [
            {
                "join_date": {
                    "order": "asc"
                }
            }
        ]
    }

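Custom exact sorting

Definition

Sorting on a string field often gives inaccurate results: after analysis the field holds several terms, and sorting on those terms does not produce the order we want.

The usual solution is to index the string field twice: once analyzed, for searching, and once not analyzed, for sorting.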
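
The problem can be reproduced with a simplified model. It assumes that an analyzed field is sorted by a single one of its terms (the smallest for ascending order, the largest for descending order) and that analysis is a plain whitespace split; `sort_key_analyzed` is a hypothetical name.

```python
def sort_key_analyzed(title, descending=False):
    """Analyzed field: one term is picked as the key (smallest for asc, largest for desc)."""
    terms = title.lower().split()
    return max(terms) if descending else min(terms)

titles = ["second article", "first article", "third article"]

by_analyzed = sorted(titles, key=sort_key_analyzed)  # every key is "article"
by_raw = sorted(titles)                              # the whole string is the key

print(by_analyzed)
print(by_raw)
```

Because all three titles share the term `article`, the analyzed keys tie and the input order is kept, whereas sorting on the whole, unanalyzed string gives the expected alphabetical order.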

Sample commands for creating the data

    ## Excerpt of a static index mapping
    PUT /website
    {
        "mappings": {
            "article": {
                "properties": {
                    "title": {
                        "type": "text",  // analyzed, used for searching
                        "fields": {
                            "raw": {     // not analyzed, used for sorting
                                "type": "keyword"
                            }
                        },
                        "fielddata": true  // forward index (fielddata) on the analyzed field
                    },
                    "content": {
                        "type": "text"
                    },
                    "post_date": {
                        "type": "date"
                    },
                    "author_id": {
                        "type": "long"
                    }
                }
            }
        }
    }


    ## Index a single document
    PUT /website/article/1
    {
        "title": "second article",
        "content": "this is my second article",
        "post_date": "2017-01-01",
        "author_id": 110
    }

Sample command for the sorted search

    GET /website/article/_search
    {
        "query": {
            "match_all": {}
        },
        "sort": [
            {
                "title.raw": {  // sort on the non-analyzed sub-field created above
                    "order": "desc"
                }
            }
        ]
    }





    ## Search results (the sample lists the titles in ascending order, which is what "order": "asc" would return)
    {
        "took": 2,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "failed": 0
        },
        "hits": {
            "total": 3,
            "max_score": 1,
            "hits": [
                {
                    "_index": "website",
                    "_type": "article",
                    "_id": "2",
                    "_score": 1,
                    "_source": {
                        "title": "first article",
                        "content": "this is my first article",
                        "post_date": "2017-02-01",
                        "author_id": 110
                    }
                },
                {
                    "_index": "website",
                    "_type": "article",
                    "_id": "1",
                    "_score": 1,
                    "_source": {
                        "title": "second article",
                        "content": "this is my second article",
                        "post_date": "2017-01-01",
                        "author_id": 110
                    }
                },
                {
                    "_index": "website",
                    "_type": "article",
                    "_id": "3",
                    "_score": 1,
                    "_source": {
                        "title": "third article",
                        "content": "this is my third article",
                        "post_date": "2017-03-01",
                        "author_id": 110
                    }
                }
            ]
        }
    }