ES进阶语法

最新推荐文章于 2024-08-30 16:23:11 发布

喜欢学习的小棉袄

最新推荐文章于 2024-08-30 16:23:11 发布

阅读量575

点赞数

分类专栏：项目整理 Elasticsearch 文章标签： elasticsearch 大数据 java es 数据库

本文链接：https://blog.csdn.net/qq_42605968/article/details/106183395

版权

项目整理同时被 2 个专栏收录

21 篇文章 3 订阅

订阅专栏

Elasticsearch

4 篇文章 0 订阅

订阅专栏

在练习语法前，先导入官网中的数据，来进行各种的语法测试

导入数据

samples
在这里插入图片描述

Elasticsearch语法学习

官方文档上有详细的操作过程，就根据官方文档来进行操作

两种语法的编写形式（倾向于第一种，也叫作QueryDSL）
分页查询

GET /bank/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "account_number": "asc",
      "age": "desc"
    }
  ],
  "from": 0,
  "size": 10
}

查看部分字段

GET /bank/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "account_number": "asc",
      "age": "desc"
    }
  ],
  "from": 0,
  "size": 10,
  "_source": ["balance","firstname"]
}

match匹配，可以精确也可以模糊，按照max_score评分进行排序（倒排索引）

GET /bank/_search
{
  "query": {
    "match": {
      "address": "Mill"
    }
  },
  "sort": [
    {
      "account_number": "asc",
      "age": "desc"
    }
  ],
  "from": 0,
  "size": 10,
  "_source": ["balance","firstname"]
}

match_phrase短语匹配，查询时需要查的单元是完整的短语，不对短语进行分词，如果使用.keyword的话，那么就要求这个属性中只由这一个短语构成，否则就搜索不到

GET /bank/_search
{
  "query": {
    "match_phrase": {
      "address": "Mill Lane"
    }
  },
  "sort": [
    {
      "account_number": "asc",
      "age": "desc"
    }
  ],
  "from": 0,
  "size": 10
}

多字段匹配，给多个字段进行query

GET /bank/_search
{
  "query": {
    "multi_match": {
      "query": "Mill",
      "fields": ["address","city"]
    }
  }
}

bool复合查询，有must（必须满足），must_not（必须不能满足），should（可满足，可不满足）

GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "gender": "M"
          }
        },
        {
          "match": {
            "address": "mill"
          }
        }
      ]
    }
  }
}

filter结果过滤，前面的这些检索都可以使用filter来做到，不同的是，filter不会计算相关性得分，这些操作都可以组合在bool中

GET /bank/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "age": {
            "gte": 10,
            "lte": 30
          }
        }
      }
    }
  }
}

term查询，对于一些属性的精确查找使用term（match可以完成精确和非精确）

GET /bank/_search
{
  "query": {
    "term": {
      "age":"28"
    }
  }
}

aggreation聚合

# aggreation  address 包含mill的人的年龄分布和平均年龄
GET /bank/_search
{
  "query": {
    "match": {
      "address": "mill"
    }
  },
  "aggs": {
    "allAge": {
      "terms": {
        "field": "age",
        "size": 10
      }
    },
    "aveAge":{
      "avg": {
        "field": "age"
      }
    }
  },
  "size": 0
}

在aggreation中有很多字段，后边的size等于0是为了不出现检索数据，只出现聚合数据。聚合不仅仅和水平，也可以嵌套。

在ES6后不推荐使用Type，而是直接索引下就是数据（为了减少冲突，增加效率），映射就是每个字段的类型，比如keyword，address，long等，不用指定，ES会默认进行猜测。当然也可以在创建索引的时候给一些字段指定每个字段的映射规则

# 创建索引并制定映射关系
PUT /my-index
{
  "mappings": {
    "properties": {
      "age":{"type": "integer"},
      "email":{"type": "keyword"},
      "name":{"type": "text"}
    }
  }
}

添加新的字段，可以通过index为true来设置不被索引（相当于是冗余字段）

# 给已经创建的所有添加映射
PUT /my-index/_mapping
{
  "properties": {
    "employee-id": {
      "type": "keyword",
      "index": false
    }
  }
}

可以通过GET /my-index/_mapping来查看所有的映射

修改映射（没有update）只能进行数据迁移（创建一个新的索引，将以前的数据迁移过去）

PUT /new-index
{
  "mappings": {
     "properties" : {
        "account_number" : {
          "type" : "long"
        },
        "address" : {
          "type" : "keyword"
        },
        "age" : {
          "type" : "integer"
        },
        "balance" : {
          "type" : "long"
        },
        "city" : {
          "type" : "keyword"
        },
        "email" : {
          "type" : "keyword"
        },
        "employer" : {
          "type" : "text"
        },
        "firstname" : {
          "type" : "text"
        },
        "gender" : {
          "type" : "keyword"
        },
        "lastname" : {
          "type" : "text"
        },
        "state" : {
          "type" : "keyword"
        }
      }
  }
}

数据迁移，将老index，type为account的移动到new-index下

POST _reindex
{
  "source": {
    "index": "bank",
    "type": "account"
  },
  "dest": {
    "index": "new-index"
  }
}

在这里插入图片描述

分词器，默认使用标准分词器，可以使用此语法查看分词效果默认的分词器不能满足我们要的效果，因为对中文支持不好，因此需要安装分词器插件
在linux下安装分词器，进入ES的plugins下
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.4.2/elasticsearch-analysis-ik-7.4.2.zip
unzip elasticsearch-analysis-ik-7.4.2.zip

为所有用户添加所有权限

进入bin目录下查看插件是否安装成功，然后docker restart elasticsearch

测试ik分词器
使用ik的智能分词

使用最大单次组合

也可以自定义词库完成分词，需要在分词器里面进行配置。
我使用nginx服务器完成自定义分词的管理，在nginx的html/es目录下创fenci.txt，并将要规定的词写到文件中