ElasticSearch

水花反复横跳

Tags: elasticsearch, big data

Article link: https://blog.csdn.net/qq_43585580/article/details/125714287

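Chapter 1 Introduction to ElasticSearch

1.1 What is search

Search generally means retrieving the information related to a given keyword. It is a form of fuzzy (non-exact) matching.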

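1.2 Why relational databases are a poor fit for search

A fuzzy query such as LIKE '%keyword%' cannot use an ordinary index, so it falls back to a full table scan and performs poorly.
A relational database also cannot tokenize text or suggest related terms, so the results are often not what the user expected.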

1.3 Full-text search frameworks

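1.4 Full-text search relies on an inverted index

Full-text search originally meant taking a keyword and finding the passages of an article that match it. In application development it usually means taking a keyword and finding the matching records across an entire data store.
Implementing full-text search requires an inverted index, which maps each term to the documents that contain it.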
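
To make the idea concrete, here is a minimal sketch of an inverted index in Python. It is a toy under stated assumptions: the tokenizer simply lowercases and splits on whitespace, and the sample documents and function names are made up for illustration. Lucene's real data structures are far more elaborate.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Toy tokenizer (an assumption): lowercase and split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, keyword):
    """Look the keyword up directly instead of scanning every document."""
    return index.get(keyword.lower(), [])


# Hypothetical sample documents.
docs = {
    1: "red sea",
    2: "red flag",
    3: "blue sea",
}
index = build_inverted_index(docs)
print(search(index, "red"))  # [1, 2]
print(search(index, "sea"))  # [1, 3]
```

The lookup cost depends on the number of distinct terms rather than on the number of documents, which is why a keyword search does not need a full scan.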


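1.5 What is ElasticSearch

ElasticSearch is built on Lucene. It hides Lucene's complexity behind an easy-to-use RESTful API and a Java API (clients for other languages exist as well).
There is a well-known story about its origin. An unemployed programmer accompanied his wife to London, where she was taking a cooking course. While out of work he wanted to build her a recipe search engine, found Lucene too complicated to use directly, and wrote an open-source wrapper around it called Compass. He later found a job working on high-performance distributed systems, decided that Compass was not enough, and wrote ElasticSearch to turn Lucene into a distributed system.
ElasticSearch is a near-real-time distributed search and analytics engine. It is used for full-text search, structured search, and analytics.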

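1.6 Where ElasticSearch is used

1) Wikipedia (similar to Baidu Baike): full-text search, highlighting, and search suggestions. For example, searching for "toothpaste" returns the toothpaste entry.
2) The Guardian (a news site, similar to Sohu News): combines user behavior logs (clicks, views, favorites, comments) with social network data (opinions about an article) and feeds the analysis back to each article's author, so that authors can see how the public responded to their work.
3) Stack Overflow (a programming Q&A site): users post questions and error messages, and others discuss and answer them. Full-text search finds related questions and answers, for example by pasting an error message into the search box.
4) GitHub (open-source code hosting): searches hundreds of billions of lines of code.
5) In China: site search (e-commerce, recruitment, portals, and so on), search inside IT systems (OA, CRM, ERP, and so on), and data analytics (one of the most popular uses of ES).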

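1.7 Features of ElasticSearch

ES is distributed: data written to ES is stored across the cluster as shards and replicas.
Benefit: a single machine has limited storage, and sharding extends the total capacity.

Suppose every machine has a 500 GB disk and you need to store 600 GB of data.
You can cut the 600 GB into smaller pieces and spread them across machines.
Sharding also raises the I/O capacity available for reads and writes:
without sharding, all reads and writes hit one machine and use only that machine's I/O;
with sharding, the I/O of N machines is used.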

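Built-in sharding
ES splits the data of an index into multiple shards (P0 to P2 in the original figure). Together the shards make up the complete data set, and they can be placed on different nodes of the cluster. As data grows, shards can be spread over more machines, which balances load and scales the cluster horizontally.
Built-in clustering
A single ES instance already forms a cluster, which makes it easy to scale out. To add capacity, install ES on another machine and start it; based on its configuration the new node discovers the existing cluster and joins it automatically.
Built-in indexing
In MySQL and other databases you create indexes by hand. ES indexes documents automatically as they are inserted.
Near real time
There is a short delay (about 1 second) between writing a document and being able to search for it.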
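
As a rough illustration of how documents end up on different shards, the sketch below routes each document id to a shard by hashing it and taking the remainder modulo the number of primary shards. This is a simplified form of the routing rule described in the Elasticsearch documentation. The hash used here (zlib.crc32) is a stand-in chosen because it is deterministic; it is not the hash function ES itself uses, and the function names are hypothetical.

```python
import zlib
from collections import Counter


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Simplified routing: hash(routing key) modulo the number of primary shards."""
    # Stand-in hash (an assumption): ES uses its own hash function, not CRC32.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def distribute(doc_ids, num_primary_shards):
    """Count how many documents land on each shard."""
    return Counter(pick_shard(doc_id, num_primary_shards) for doc_id in doc_ids)


# 1000 hypothetical document ids spread over 3 primary shards (P0 to P2).
placement = distribute([str(i) for i in range(1000)], 3)
print(sorted(placement.items()))
```

Because the shard is derived from the document id, the same id always maps to the same shard, so a later read or update knows where to look.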

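1.8 Core concepts of ElasticSearch

How ElasticSearch concepts map to a relational database:

Relational database (e.g. MySQL)	Non-relational database (ElasticSearch)
Database	Index
Table	Type (since 7.0 the type name is fixed and defaults to _doc)
Row	Document (JSON)
Column	Field
Schema (constraints)	Mapping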

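1.9 How ElasticSearch stores and searches data

1) Index (for example, blog): the structure that holds the data, comparable to a table. Every searchable record lives in an index.
2) Mapping: the configuration that describes how data is stored in the index, including each field's data type, whether it is stored, and whether it is analyzed (tokenized).
3) Document: a single data record stored in an index.
4) Type: historically, one index could hold several kinds of documents, each labeled with a type. In 7.x an index has a single type, _doc.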

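1.10 REST

REST is a design style that favors concise, well-structured URLs: the URL names a resource, and the HTTP method (GET, PUT, POST, DELETE, HEAD) names the action to perform on it.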

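Chapter 2 Installing ElasticSearch

2.1 Downloading the package

ElasticSearch download page: https://www.elastic.co/cn/downloads/elasticsearch

2.2 Installing ElasticSearch

2.2.1 Unpacking and installing ElasticSearch

1) Extract elasticsearch-7.8.0-linux-x86_64.tar.gz into /opt/module and rename the directory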

[chenyunde@hadoop102 ~]$ tar -zxvf /opt/software/elasticsearch-7.8.0-linux-x86_64.tar.gz -C /opt/module/

[chenyunde@hadoop102 ~]$ mv /opt/module/elasticsearch-7.8.0 /opt/module/elasticsearch

2) Create a data directory under /opt/module/elasticsearch

[chenyunde@hadoop102 elasticsearch]$ mkdir data

3) Edit the configuration file /opt/module/elasticsearch/config/elasticsearch.yml

[chenyunde@hadoop102 config]$ pwd

/opt/module/elasticsearch/config

[chenyunde@hadoop102 config]$ vi elasticsearch.yml

#-----------------------Cluster-----------------------
cluster.name: myes
#-----------------------Node-----------------------
node.name: node102
#-----------------------Paths-----------------------
path.data: /opt/module/elasticsearch/data
#-----------------------Memory-----------------------
bootstrap.memory_lock: false
#-----------------------Network-----------------------
network.host: hadoop102 
#-----------------------Discovery-----------------------
discovery.seed_hosts: ["hadoop102", "hadoop103","hadoop104"]
cluster.initial_master_nodes: ["node102", "node103","node104"]

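(1) cluster.name
Nodes form a cluster only if every node's elasticsearch.yml has the same cluster.name; once started, they join together automatically. If the setting is left out, the default is elasticsearch (my-application is only the commented-out example in the shipped file).
(2) node.name can be anything, but it must be unique within the cluster.
(3) A modified line must not begin with a space, and the colon must be followed by a space (YAML syntax).
Distribute the directory to hadoop103 and hadoop104, then adjust the configuration on each node: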

[chenyunde@hadoop102 module]$ xsync elasticsearch/

# On hadoop103, change the following
node.name: node103
network.host: hadoop103

# On hadoop104, change the following
node.name: node104
network.host: hadoop104

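2.2.2 Tuning Linux kernel parameters

1) Switch to the root user and edit limits.conf
[root@hadoop102 elasticsearch]# vi /etc/security/limits.conf
Add the following:

* soft nofile 65536
* hard nofile 131072
* soft nproc 2048
* hard nproc 4096

Note: do not drop the "*". The nofile lines raise the number of files a user may have open at once, and the nproc lines raise the number of processes and threads a user may run.
2) As root, edit sysctl.conf
[root@hadoop102 elasticsearch]# vi /etc/sysctl.conf
Add this setting:
vm.max_map_count=655360
Apply the same Linux settings on the other nodes.
Then reboot Linux so that the new settings take effect (sysctl -p and a fresh login also apply them, but a reboot is the simplest way to be sure).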
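
After the reboot it can be worth checking that the values actually took effect. The sketch below is one way to do that in Python. The thresholds are an assumption based on my reading of the 7.x bootstrap checks (at least 65535 open files, 4096 threads, and vm.max_map_count of 262144); verify them against the documentation for your version. The function names are hypothetical.

```python
import resource

# Assumed minimums, taken from the Elasticsearch 7.x bootstrap checks as I understand them.
MIN_MAX_MAP_COUNT = 262144
MIN_NOFILE = 65535
MIN_NPROC = 4096


def find_problems(max_map_count, nofile_soft, nproc_soft):
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    if max_map_count < MIN_MAX_MAP_COUNT:
        problems.append(f"vm.max_map_count={max_map_count} is below {MIN_MAX_MAP_COUNT}")
    if nofile_soft < MIN_NOFILE:
        problems.append(f"soft nofile={nofile_soft} is below {MIN_NOFILE}")
    if nproc_soft < MIN_NPROC:
        problems.append(f"soft nproc={nproc_soft} is below {MIN_NPROC}")
    return problems


def _limit(value):
    # RLIM_INFINITY means "unlimited", so treat it as larger than any threshold.
    return float("inf") if value == resource.RLIM_INFINITY else value


def read_current_settings():
    """Read the live values on a Linux host (run as the user that starts Elasticsearch)."""
    with open("/proc/sys/vm/max_map_count") as f:
        max_map_count = int(f.read().strip())
    nofile_soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
    nproc_soft, _ = resource.getrlimit(resource.RLIMIT_NPROC)
    return max_map_count, _limit(nofile_soft), _limit(nproc_soft)


# The values configured in this section: the soft nproc of 2048 is flagged.
print(find_problems(655360, 65536, 2048))
```

On a real node, `find_problems(*read_current_settings())` checks the live values. If the soft nproc limit is reported as too low at startup, raising it to 4096 in limits.conf is the usual fix.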

2.3 Starting ElasticSearch

2.3.1 Starting a single node

[chenyunde@hadoop102 config]$ /opt/module/elasticsearch/bin/elasticsearch

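2.3.2 Cluster start/stop script

#!/bin/bash
# Start or stop Elasticsearch on every node over ssh.
if (($# != 1))
then
        echo "Usage: $0 start|stop"
        exit 1
fi

if [ "$1" = start ]
then
        # Run in the background and detach from the ssh session.
        cmd="nohup /opt/module/elasticsearch/bin/elasticsearch > /dev/null 2>&1 &"
elif [ "$1" = stop ]
then
        # Find the Elasticsearch JVM by its main class name and kill it.
        cmd="ps -ef | grep Elasticsearch | grep -v grep | awk '{print \$2}' | xargs kill"
else
        # Unknown argument: stop here instead of running an empty command on each host.
        echo "Usage: $0 start|stop"
        exit 1
fi

for i in hadoop102 hadoop103 hadoop104
do
        echo "--------------$i-----------------"
        ssh "$i" "$cmd"
        sleep 3s
done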

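2.3.3 Verifying startup

Open http://hadoop102:9200 in a browser. A JSON response that contains the node name and the cluster name means the node is up.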
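
The same check can be scripted. The sketch below uses only the Python standard library to request the root endpoint of a node and compare the reported cluster name with the one configured earlier (myes). The helper names are hypothetical, and the network call is left in a comment because it needs a reachable cluster.

```python
import json
from urllib.request import urlopen


def root_url(host: str, port: int = 9200) -> str:
    """Build the URL of a node's root endpoint (9200 is the default HTTP port)."""
    return f"http://{host}:{port}/"


def fetch_cluster_info(host: str, port: int = 9200, timeout: float = 5.0) -> dict:
    """GET the root endpoint, which answers with node name, cluster name and version."""
    with urlopen(root_url(host, port), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def is_expected_cluster(info: dict, cluster_name: str) -> bool:
    """True if the node reports the cluster name we configured."""
    return info.get("cluster_name") == cluster_name


# Usage on a machine that can reach the cluster:
#   info = fetch_cluster_info("hadoop102")
#   print(info["name"], is_expected_cluster(info, "myes"))
```

To confirm that all three nodes joined the same cluster, GET /_cat/nodes?v (covered in section 3.2) should list node102, node103 and node104.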

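2.4 Installing Kibana

Single view: connecting to any node in the cluster gives the same read and write capabilities.
Kibana is a client for exploring, visualizing, and sharing ES data, so installing it on any one node is enough.
Upload the Kibana archive to the chosen node and extract it: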

[chenyunde@hadoop102 ~]$ tar -zxvf /opt/software/kibana-7.8.0-linux-x86_64.tar.gz -C /opt/module/

[chenyunde@hadoop102 ~]$ mv /opt/module/kibana-7.8.0-linux-x86_64 /opt/module/kibana

Edit the configuration so that Kibana connects to Elasticsearch

[chenyunde@hadoop102 kibana]$ vi config/kibana.yml

server.host: "hadoop102"
elasticsearch.hosts: ["http://hadoop102:9200"]
i18n.locale: "zh-CN"

Start Kibana

[chenyunde@hadoop102 kibana]$ bin/kibana

Open http://hadoop102:5601 in a browser

Chapter 3 Basic usage of ElasticSearch

3.1 Data types

ES offers a rich set of field data types; see:
https://www.elastic.co/guide/en/elasticsearch/reference/7.8/mapping-types.html
Only the core types are listed here:
String: text (analyzed), keyword (not analyzed)
Numeric: long, integer, short, byte, double, float, half_float, scaled_float
Date: date
Boolean: boolean

3.2 Checking cluster status

# List the nodes
GET /_cat/nodes?v

# Check cluster health
GET /_cat/health?v

# List the available _cat endpoints
GET /_cat

# List all indices
GET /_cat/indices

3.3 Index operations

3.3.1 Creating an index

# Create an index manually, supplying the mapping at creation time
PUT stu1
{
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword"
      },
      "name": {
        "type": "text"
      },
      "hobby": {
        "properties": {
          "name": {
            "type": "text"
          },
          "years":{
            "type": "integer"
          }
        }
      }
    }
  }
}

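# Automatic creation: indexing a document into an index that does not exist creates the index,
# and ES infers the mapping from the types of the values in the document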
POST /index3/_doc/1
{
  "name":"jack",
  "age": 20
}
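
The inference rules can be approximated in a few lines of Python. This is a simplified sketch of the default dynamic mapping behavior as I understand it (whole numbers become long, floating-point numbers become float, booleans become boolean, strings become text with a keyword sub-field); it ignores date detection, arrays, and custom dynamic templates. The function names are hypothetical.

```python
def infer_field_type(value):
    """Approximate the mapping ES would generate for one JSON value."""
    # bool must be tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "float"}
    if isinstance(value, str):
        # Strings are indexed as analyzed text plus an exact-value keyword sub-field.
        return {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
        }
    if isinstance(value, dict):
        return {"properties": infer_mapping(value)}
    raise TypeError(f"value type not covered by this sketch: {type(value).__name__}")


def infer_mapping(doc):
    """Infer a properties block for a whole document."""
    return {field: infer_field_type(value) for field, value in doc.items()}


mapping = infer_mapping({"name": "jack", "age": 20})
print(mapping["age"])           # {'type': 'long'}
print(mapping["name"]["type"])  # text
```

Running GET /index3/_mapping after the insert above shows what ES actually generated, which is the authoritative answer.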

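3.3.2 Inspecting an index

# List all indices
GET /_cat/indices

# Show summary information for one index
GET /_cat/indices/.kibana_1

# Show an index's metadata (aliases, mappings, settings)
GET /index1

# Show an index's mapping (its "table structure")
GET /.kibana_1/_mapping

# Show an index's aliases
GET /.kibana_1/_alias

3.3.3 Deleting an index

# Delete an index
DELETE /index1

3.3.4 Checking whether an index exists

# 404 Not Found means the index does not exist; 200 means it exists
HEAD /stu1

3.3.5 Modifying an index

An existing mapping can only be extended with new fields, for example: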

PUT /stu1/_mapping
{
  "properties" : {
    "sex":{
      "type":"keyword"
    }
  }
}

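The type of a field that already exists in the mapping cannot be changed. To change it, create a new index with the desired mapping and migrate the data with _reindex.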

PUT stu3
{
   "mappings" : {
      "properties" : {
        "id" : {
          "type" : "keyword"
        },
        "name" : {
          "type" : "text"
        },
        "gender" : {
          "type" : "keyword"
        }
      }
    }
}

POST _reindex
{
  "source": {
    "index": "stu1",
    "_source": ["id","name"]
  },
  "dest": {
    "index": "stu3"
  }
}

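3.4 Document operations

3.4.1 Reading

# Search the whole index (returns the first 10 hits by default)
GET /stu1/_search

# Get a single document by id
GET /stu1/_doc/2

3.4.2 Creating

# Create a document with an explicit id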
POST /stu1/_doc/2
{
  "id": "1002",
  "name":"jack",
  "hobby":[
      {
        "name" : "跳",
        "years" : 3
      },
      {
        "name" : "rap",
        "years" : 3
      }
    ]
}

PUT /stu1/_doc/3
{
  "id": "1002",
  "name":"jack1",
  "hobby":[
      {
        "name" : "跳",
        "years" : 3
      },
      {
        "name" : "rap",
        "years" : 3
      }
    ]
}

# Create a document with an auto-generated id
POST /stu1/_doc
{
  "id": "1002",
  "name":"jack",
  "hobby":[
      {
        "name" : "跳",
        "years" : 3
      },
      {
        "name" : "rap",
        "years" : 3
      }
    ]
}

3.4.3 Updating

# Full replacement: the stored document is replaced by the request body
POST /stu1/_doc/1
{
  "id": "1002",
  "name":"jack"
  
}

PUT /stu1/_doc/1
{
  "name":"jack1"
  
}

# Partial update: only the listed fields are changed
POST /stu1/_update/2
{
  "doc":{
    "hobby":[
      {
        "name" : "跳",
        "years" : 5
      }
    ]
  }
}
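
The difference between the two kinds of update is easy to get wrong, so here is a small in-memory sketch of the semantics in Python. It assumes the behavior described in the Elasticsearch documentation for partial updates (objects are merged field by field, arrays are replaced as a whole); nothing here talks to a cluster, and the function names are hypothetical.

```python
import copy


def full_replace(stored, new_doc):
    """PUT/POST /index/_doc/id: the stored document is discarded and replaced."""
    return copy.deepcopy(new_doc)


def partial_update(stored, partial):
    """POST /index/_update/id with "doc": merge the given fields into the stored document."""
    merged = copy.deepcopy(stored)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged recursively.
            merged[key] = partial_update(merged[key], value)
        else:
            # Scalars and arrays overwrite the old value entirely.
            merged[key] = copy.deepcopy(value)
    return merged


stored = {
    "id": "1002",
    "name": "jack",
    "hobby": [{"name": "跳", "years": 3}, {"name": "rap", "years": 3}],
}

# Full replacement drops every field that is not in the new body.
print(full_replace(stored, {"name": "jack1"}))
# Partial update keeps id and name, but the hobby array is replaced as a whole.
print(partial_update(stored, {"hobby": [{"name": "跳", "years": 5}]}))
```

Note that after the partial update above the rap entry is gone: the new array replaces the old one instead of being merged into it.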

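3.4.4 Deleting

# Delete a document
DELETE /stu1/_doc/1

3.4.5 Checking existence

# Check whether a document exists
HEAD /stu1/_doc/2

3.5 Analysis (tokenization)

3.5.1 Default analyzer

# text fields are analyzed; keyword fields are not
# The default (standard) analyzer splits on word boundaries and lowercases the terms, which suits English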
GET /_analyze
{
  "text":"I am a teacher!"
}


# Chinese text is split into single characters
GET /_analyze
{
  "text":"我是中国人"
}

The analyzers built into ES handle Chinese poorly, so the IK analyzer is used instead.

3.5.2 Installing the IK analyzer

[chenyunde@hadoop102 elasticsearch]$ cd plugins/
[chenyunde@hadoop102 plugins]$ mkdir ik
[chenyunde@hadoop102 plugins]$ cd ik
[chenyunde@hadoop102 ik]$ unzip /opt/software/elasticsearch-analysis-ik-7.8.0.zip

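Distribute the plugin to all nodes

[chenyunde@hadoop102 plugins]$ xsync ik

Restart Elasticsearch

3.5.3 Testing the IK analyzer

IK provides two analyzers, ik_smart and ik_max_word.

# ik_smart: coarse-grained segmentation. Tokens do not overlap, so the characters in the output add up to the characters in the input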
GET /_analyze
{
  "text":"我是中国人",
  "analyzer": "ik_smart"
}


# ik_max_word: finest-grained segmentation. Tokens may overlap, so the output can contain more characters than the input
GET /_analyze
{
  "text":"我是中国人",
  "analyzer": "ik_max_word"
}

# IK has no NLP (natural language processing): it segments by dictionary and does not understand meaning
GET /_analyze
{
  "text":"我喜欢洗屁股眼儿",
  "analyzer": "ik_max_word"
}

3.5.4 Extending the dictionary

[chenyunde@hadoop102 config]$ pwd

/opt/module/elasticsearch/plugins/ik/config

[chenyunde@hadoop102 config]$ vim myword.dic

Add your own words to the file, one per line, then save it

[chenyunde@hadoop102 config]$ vim IKAnalyzer.cfg.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
        <comment>IK Analyzer extension configuration</comment>
        <!-- Configure your own extension dictionaries here -->
        <entry key="ext_dict">myword.dic</entry>
</properties>

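Then distribute the dictionary file and the configuration file to all nodes and restart ES.

Chapter 4 Querying with the ElasticSearch DSL

4.1 Preparing data

4.1.1 Creating the test index

# Create the index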
PUT /test
{
    "mappings" : {
        "properties" : {
          "empid" : {
            "type" : "long"
          },
          "age" : {
            "type" : "long"
          },
          "balance" : {
            "type" : "double"
          },
          "name" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
           "gender" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "hobby" : {
            "type" : "text",
            "analyzer":"ik_max_word",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
  }

4.1.2 Bulk-loading data

# Load the data:
POST /test/_bulk
{"index":{"_id":"1"}}
{"empid":1001,"age":20,"balance":2000,"name":"李三","gender":"男","hobby":"吃饭睡觉"}
{"index":{"_id":"2"}}
{"empid":1002,"age":30,"balance":2600,"name":"李小三","gender":"男","hobby":"吃粑粑睡觉"}
{"index":{"_id":"3"}}
{"empid":1003,"age":35,"balance":2900,"name":"张伟","gender":"女","hobby":"吃,睡觉"}
{"index":{"_id":"4"}}
{"empid":1004,"age":40,"balance":2600,"name":"张伟大","gender":"男","hobby":"打篮球睡觉"}
{"index":{"_id":"5"}}
{"empid":1005,"age":23,"balance":2900,"name":"大张伟","gender":"女","hobby":"打乒乓球睡觉"}
{"index":{"_id":"6"}}
{"empid":1006,"age":26,"balance":2700,"name":"张大喂","gender":"男","hobby":"打排球睡觉"}
{"index":{"_id":"7"}}
{"empid":1007,"age":29,"balance":3000,"name":"王五","gender":"女","hobby":"打牌睡觉"}
{"index":{"_id":"8"}}
{"empid":1008,"age":28,"balance":3000,"name":"王武","gender":"男","hobby":"打桥牌"}
{"index":{"_id":"9"}}
{"empid":1009,"age":32,"balance":32000,"name":"王小五","gender":"男","hobby":"喝酒,吃烧烤"}
{"index":{"_id":"10"}}
{"empid":1010,"age":37,"balance":3600,"name":"赵六","gender":"男","hobby":"吃饭喝酒"}
{"index":{"_id":"11"}}
{"empid":1011,"age":39,"balance":3500,"name":"张小燕","gender":"女","hobby":"逛街,购物,买"}
{"index":{"_id":"12"}}
{"empid":1012,"age":42,"balance":3500,"name":"李三","gender":"男","hobby":"逛酒吧,购物"}
{"index":{"_id":"13"}}
{"empid":1013,"age":42,"balance":3400,"name":"李球","gender":"男","hobby":"体育场,购物"}
{"index":{"_id":"14"}}
{"empid":1014,"age":22,"balance":3400,"name":"李健身","gender":"男","hobby":"体育场,购物"}
{"index":{"_id":"15"}}
{"empid":1015,"age":22,"balance":3400,"name":"Nick","gender":"男","hobby":"坐飞机,购物"}

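4.2 Overview of DSL syntax

Keyword	Meaning	SQL analogy
query	the query clause	select
bool	combines several conditions	select xxx from xxx where age=20 and gender=male
must	must match	=
must_not	must not match	!=
should	preferably matches; documents that match score higher
filter	filter condition (does not affect the score)	where
term	exact term match	gender=male
match	full-text search (the query text is analyzed first)
fuzzy	fuzzy match based on edit distance
from	offset of the first hit to return	offset x
size	number of hits to return	limit x
_source	fields to return	select field
multi_match	matches the query text against several fields
match_phrase	phrase match: the analyzed terms must appear next to each other and in the same order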
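
To show how these keywords fit together in one request body, the following Python sketch assembles a search body as plain dictionaries and prints it as JSON. It only builds the JSON; sending it to ES is left out. The helper names (bool_query, search_body) are hypothetical and not part of any client library.

```python
import json


def bool_query(must=None, must_not=None, should=None, filters=None):
    """Build a bool query, leaving out the clause types that were not given."""
    clauses = {"must": must, "must_not": must_not, "should": should, "filter": filters}
    return {"bool": {name: value for name, value in clauses.items() if value}}


def search_body(query, offset=0, size=10, source=None):
    """Wrap a query with paging (from/size) and an optional _source field list."""
    body = {"query": query, "from": offset, "size": size}
    if source is not None:
        body["_source"] = source
    return body


body = search_body(
    bool_query(
        must=[{"match": {"hobby": "购物"}}],
        filters=[{"term": {"gender.keyword": "男"}}],
    ),
    size=5,
    source=["empid", "name"],
)
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The printed JSON can be pasted under GET /test/_search in the Kibana console.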

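4.3 DSL queries

4.3.1 Two ways to query

List all employees, sorted by age in descending order:

# Option 1: URI search; GET /index/_search?param1=value1&param2=value2
# Search the whole index, sorted by age in descending order
# Drawback: the length of a URL is limited
GET /test/_search?q=*&sort=age:desc


# Option 2: DSL (domain-specific language); GET /index/_search
#                                           { request body }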
GET /test/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ]
}

4.3.2 Basic query

Search the whole index, sort by age descending and then by balance descending, and return only empid, age, and balance for the first 5 hits

GET /test/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    },
    {
      "balance": {
        "order": "desc"
      }
    }
  ],
  "from": 0,
  "size": 5,
  "_source": ["empid","age","balance"]
}

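4.3.3 Full-text search

Find employees whose hobby matches 吃饭睡觉
The query text is first analyzed with the same analyzer as the hobby field. The resulting terms (吃饭, 睡觉) are then looked up in the field's inverted index, and a document matches if it contains any of them.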

GET /test/_search
{
  "query": {
    "match": {
      "hobby": "吃饭睡觉"
    }
  }
}

Find employees whose balance is 2000

# Only text fields are analyzed, so match on a numeric field behaves as an exact match
GET /test/_search
{
  "query": {
    "match": {
      "balance": 2000
    }
  }
}

# Using match on a non-text field is not recommended
# match is meant for full-text search on text fields
# For exact values, use term
GET /test/_search
{
  "query": {
    "term": {
      "balance": 2000
    }
  }
}

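Find employees whose hobby is "吃饭睡觉"
Here the value must not be split into independent terms. Querying the hobby.keyword sub-field compares against the whole stored value, while match_phrase on hobby requires the analyzed terms to appear next to each other and in the same order.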

GET /test/_search
{
  "query": {
    "match": {
      "hobby.keyword": "吃饭睡觉"
    }
  }
}

GET /test/_search
{
  "query": {
    "match_phrase": {
      "hobby": "吃饭睡觉"
    }
  }
}

4.3.4 Multi-field match

Find employees whose name or hobby contains 球

GET /test/_search
{
  "query": {
   "multi_match": {
     "query": "球",
     "fields": ["name","hobby"]
   }
  }
}

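4.3.5 Combining conditions

Find male employees who like shopping
A bool query can contain must (must match), must_not (must not match), filter (must match, not scored), and should (preferably matches)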

GET /test/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "gender": {
              "value": "男"
            }
          }
        },
        {
          "match": {
            "hobby": "购物"
          }
        }
      ]
    }
  }
}

Find male employees who like shopping and do not go to bars

GET /test/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "gender": {
              "value": "男"
            }
          }
        },
        {
          "match": {
            "hobby": "购物"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "hobby": "酒吧"
          }
        }
      ]
    }
  }
}

Find male employees who like shopping and do not go to bars, preferably aged 20 to 30

GET /test/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "gender": {
              "value": "男"
            }
          }
        },
        {
          "match": {
            "hobby": "购物"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "hobby": "酒吧"
          }
        }
      ],
      "should": [
        {
          "range": {
            "age": {
              "gte": 20,
              "lte": 30
            }
          }
        }
      ]
    }
  }
}

Find male employees who like shopping and do not go to bars, preferably aged 20 to 30, excluding anyone aged 40 or over

GET /test/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "gender": {
              "value": "男"
            }
          }
        },
        {
          "match": {
            "hobby": "购物"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "hobby": "酒吧"
          }
        }
      ],
      "should": [
        {
          "range": {
            "age": {
              "gte": 20,
              "lte": 30
            }
          }
        }
      ],
      "filter": {
        "range": {
          "age": {
            "lt": 40
          }
        }
      }
    }
  }
}
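
To see how the four clause types interact, here is an in-memory imitation of the last query in Python, run against a few of the sample employees. It is only a sketch of the semantics: substring tests stand in for analyzed match queries, and the score is a toy count of matching must and should clauses, not the relevance score ES computes. All function names are hypothetical.

```python
# A subset of the sample data loaded in section 4.1.2.
EMPLOYEES = [
    {"empid": 1011, "age": 39, "gender": "女", "hobby": "逛街,购物,买"},
    {"empid": 1012, "age": 42, "gender": "男", "hobby": "逛酒吧,购物"},
    {"empid": 1013, "age": 42, "gender": "男", "hobby": "体育场,购物"},
    {"empid": 1014, "age": 22, "gender": "男", "hobby": "体育场,购物"},
    {"empid": 1015, "age": 22, "gender": "男", "hobby": "坐飞机,购物"},
]


def equals(field, value):
    return lambda doc: doc[field] == value


def contains(field, text):
    # Stand-in for a match query: a plain substring test, no analyzer involved.
    return lambda doc: text in doc[field]


def in_range(field, gte=None, lte=None, lt=None):
    def check(doc):
        v = doc[field]
        return ((gte is None or v >= gte)
                and (lte is None or v <= lte)
                and (lt is None or v < lt))
    return check


def run_bool(docs, must=(), must_not=(), should=(), filters=()):
    """Return (score, empid) pairs, best score first."""
    hits = []
    for doc in docs:
        if not all(p(doc) for p in must):
            continue                      # every must clause is required
        if any(p(doc) for p in must_not):
            continue                      # any must_not match excludes the document
        if not all(p(doc) for p in filters):
            continue                      # filter is required but adds nothing to the score
        score = len(must) + sum(1 for p in should if p(doc))
        hits.append((score, doc["empid"]))
    return sorted(hits, key=lambda h: (-h[0], h[1]))


must = [equals("gender", "男"), contains("hobby", "购物")]
must_not = [contains("hobby", "酒吧")]
should = [in_range("age", gte=20, lte=30)]

print(run_bool(EMPLOYEES, must, must_not, should, [in_range("age", lt=40)]))
# [(3, 1014), (3, 1015)]
print(run_bool(EMPLOYEES, must, must_not, should))
# [(3, 1014), (3, 1015), (2, 1013)]
```

The second call drops the filter: employee 1013 (aged 42) comes back, but ranks last because the should clause does not match.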

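4.3.6 Fuzzy matching

Find Nick with a misspelled search term (fuzzy matches terms within a small edit distance of the query term)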

GET /test/_search
{
  "query": {
    "fuzzy": {
      "name": "Dick"
    }
  }
}
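
The reason "Dick" finds "Nick" is that the two terms differ by a single character edit. The sketch below computes the plain Levenshtein edit distance in Python and mirrors the AUTO fuzziness rule as documented for Elasticsearch (terms of 1 to 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two). It is an illustration, not the matching code ES runs, and it does not treat a transposition of two adjacent characters as a single edit.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def auto_fuzziness(term: str) -> int:
    """Allowed number of edits under the documented AUTO rule."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


query_term = "Dick"
indexed_term = "nick"  # the standard analyzer lowercases "Nick" at index time
distance = edit_distance(query_term, indexed_term)
print(distance, distance <= auto_fuzziness(query_term))  # 1 True
```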

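4.4 Aggregations

4.4.1 Aggregation syntax

The key can be written as aggregations or aggs.
"aggregations" : 
{
    -- aggregation_name: a name you choose for this aggregation's result
    "<aggregation_name>" : 
    {
        -- the aggregation type, comparable to sum, avg, min, max in SQL; terms behaves like GROUP BY with a count
        "<aggregation_type>" :
        {
            -- the aggregation body, for example which field to aggregate on
            <aggregation_body>
        }
        -- optional metadata attached to the aggregation and returned unchanged with its result
        [,"meta" : {  [<meta_data_body>] } ]?

        -- sub-aggregations, computed within each bucket of the current aggregation
        [,"aggregations" : { [<sub_aggregation>]+ } ]?
    }
    [,"<aggregation_name_2>" : { ... } ]*
}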

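4.4.2 Aggregation exercises

By default, text fields cannot be used in aggregations; use the keyword sub-field instead.
Count the male and female employees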

GET /test/_search
{
  "aggs": {
    "gendercount": {
      "terms": {
        "field": "gender.keyword",
        "size": 2
      }
    }
  }
}

Count the male and female employees who like shopping

GET /test/_search
{
  "query": {
    "match": {
      "hobby": "购物"
    }
  }, 
  "aggs": {
    "gendercount": {
      "terms": {
        "field": "gender.keyword",
        "size": 2
      }
    }
  }
}

Count the male and female employees who like shopping, and compute the average age of all of them:

GET /test/_search
{
  "query": {
    "match": {
      "hobby": "购物"
    }
  }, 
  "aggs": {
    "gendercount": {
      "terms": {
        "field": "gender.keyword",
        "size": 2
      }
    },
    "avgage":{
      "avg": {
        "field": "age"
      }
    }
  }
}

Count the male and female employees who like shopping, and compute the average age for each gender

GET /test/_search
{
  "query": {
    "match": {
      "hobby": "购物"
    }
  },
  "aggs": {
    "gendercount": {
      "terms": {
        "field": "gender.keyword",
        "size": 2
      },
      "aggs": {
        "avgage": {
          "avg": {
            "field": "age"
          }
        }
      }
    }
  }
}
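
What the nested aggregation computes can be reproduced in memory. The Python sketch below groups a few of the sample employees by gender and averages the age inside each group, which is the same shape as a terms aggregation with an avg sub-aggregation. The substring test stands in for the match query, and the function name is hypothetical.

```python
from collections import defaultdict

# A subset of the sample data from section 4.1.2 (1010 does not like shopping).
EMPLOYEES = [
    {"empid": 1010, "age": 37, "gender": "男", "hobby": "吃饭喝酒"},
    {"empid": 1011, "age": 39, "gender": "女", "hobby": "逛街,购物,买"},
    {"empid": 1012, "age": 42, "gender": "男", "hobby": "逛酒吧,购物"},
    {"empid": 1013, "age": 42, "gender": "男", "hobby": "体育场,购物"},
    {"empid": 1014, "age": 22, "gender": "男", "hobby": "体育场,购物"},
    {"empid": 1015, "age": 22, "gender": "男", "hobby": "坐飞机,购物"},
]


def terms_with_avg(docs, bucket_field, metric_field):
    """Group docs by bucket_field, then average metric_field inside each bucket."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[bucket_field]].append(doc[metric_field])
    result = [
        {"key": key, "doc_count": len(values), "avgage": sum(values) / len(values)}
        for key, values in buckets.items()
    ]
    # Like the terms aggregation, list the largest buckets first.
    return sorted(result, key=lambda bucket: -bucket["doc_count"])


# The query part: keep only employees whose hobby mentions shopping.
shoppers = [doc for doc in EMPLOYEES if "购物" in doc["hobby"]]
for bucket in terms_with_avg(shoppers, "gender", "age"):
    print(bucket)
# {'key': '男', 'doc_count': 4, 'avgage': 32.0}
# {'key': '女', 'doc_count': 1, 'avgage': 39.0}
```

The query narrows the document set first, and the aggregation then runs only over the documents that matched.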

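Chapter 5 ElasticSearch index aliases

5.1 About aliases

An index alias works like a shortcut or symbolic link. It can point to one or more indices, and it can be used in any API that expects an index name.
Aliases give a great deal of flexibility. They let you:
1) group several indices together
2) create a view over a subset of an index
3) switch from one index to another on a running cluster with no downtime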

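5.2 What aliases are for

5.2.1 Grouping multiple indices

5.2.2 Creating a view over a subset

5.2.3 Switching indices seamlessly

5.3 Alias operations

5.3.1 Creating an alias

1) Declare it when creating the index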

PUT movie_index
{  
  "aliases": {
    "movie1": {},
    "movie2": {}
  }, 
  "mappings": {
      "properties": {
        "id":{
          "type": "long"
        },
        "name":{
          "type": "text",
          "analyzer": "ik_smart"
        }
      }
  }
}

2) Add an alias to an existing index

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "movie_index",
        "alias": "movie3"
      }
    }
  ]
}

5.3.2 Listing aliases

GET _alias
GET _cat/aliases
GET movie_index/_alias

5.3.3 Removing an alias

POST _aliases
{
  "actions": [
    {
      "remove": {
        "index": "movie_index",
        "alias": "movie3"
      }
    }
  ]
}

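5.3.4 Changing an alias

An alias cannot be renamed in place. Add the new alias and remove the old one in a single _aliases request; the actions in one request are applied atomically.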

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "movie_index",
        "alias": "movie4"
      }
    },
    {
      "remove": {
        "index": "movie_index",
        "alias": "movie2"
      }
    }
  ]
}

5.3.5 Creating a subset view

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "test",
        "alias": "manindex",
        "filter": {
          "term": {
            "gender": "男"
          }
        }
      }
    }
  ]
}

5.3.6 Seamless alias switching

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "movie_index",
        "alias": "movie4"
      }
    },
    {
      "remove": {
        "index": "movie_index",
        "alias": "movie2"
      }
    }
  ]
}
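
The request above adds and removes aliases on a single index. A switch between two indices has the same shape, with the remove action naming the old index and the add action naming the new one. The Python sketch below builds such a body; the alias and index names (movie, movie_index_v1, movie_index_v2) are hypothetical and do not appear elsewhere in this article, and the helper is not part of any client library.

```python
import json


def switch_alias_actions(alias, old_index, new_index):
    """Build the body of POST _aliases that moves an alias to another index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names: clients keep querying "movie" while the data moves to v2.
body = switch_alias_actions("movie", "movie_index_v1", "movie_index_v2")
print(json.dumps(body, indent=2))
```

Clients that only ever use the alias name do not need to change when the underlying index is swapped, for example after a reindex into a new mapping.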

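Chapter 6 ElasticSearch templates

6.1 About templates

An index template is, as the name suggests, a mold for creating indices. It defines rules for the mappings and settings of indices that serve a particular business need, so that the indices created from it are predictably consistent.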

6.2 Template operations

6.2.1 Viewing templates

GET /_cat/templates

GET _template/template_movie

6.2.2 Creating a template

PUT _template/template_movie
{
  "index_patterns": ["movie*"],
  "aliases" : { 
    "{index}-query": {},
    "movie-query":{}
  },
  "mappings": { 
      "properties": {
        "id": {
          "type": "keyword"
        },
        "movie_name": {
          "type": "text",
          "analyzer": "ik_smart"
        }
      }
    }
}
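
A template takes effect when a new index is created whose name matches one of its index_patterns. The Python sketch below imitates that matching step and the expansion of the {index} placeholder in alias names. It uses fnmatch from the standard library as a stand-in for ES's wildcard matching (fnmatch also understands ? and [], which goes beyond the simple * used here), and the index names are hypothetical.

```python
from fnmatch import fnmatchcase

# The relevant parts of the template defined above.
TEMPLATE = {
    "index_patterns": ["movie*"],
    "aliases": {"{index}-query": {}, "movie-query": {}},
}


def template_applies(template, index_name):
    """True if the new index name matches any of the template's patterns."""
    return any(fnmatchcase(index_name, pattern) for pattern in template["index_patterns"])


def resolved_aliases(template, index_name):
    """Alias names the new index would receive, with {index} replaced by its name."""
    return sorted(alias.replace("{index}", index_name) for alias in template["aliases"])


print(template_applies(TEMPLATE, "movie_2020"))  # True
print(template_applies(TEMPLATE, "film_2020"))   # False
print(resolved_aliases(TEMPLATE, "movie_2020"))  # ['movie-query', 'movie_2020-query']
```

Every index created from the template shares the movie-query alias, so a search on movie-query covers all of them, while the per-index alias addresses a single one.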