[Elasticsearch - Self-Study Notes]

Latest recommended article published on 2024-06-24 10:33:44

你怎么胖嘟嘟的

Tags: elasticsearch, notes, big data

Article link: https://blog.csdn.net/m0_73058114/article/details/136474675

Elasticsearch

1. A distributed, RESTful, free and open-source search engine

elastic: scalable and flexible; search: to query.

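2. Data falls into three broad categories: structured, unstructured, and semi-structured

2.1. Structured data

Structured data is organized and managed with a fixed schema, usually a two-dimensional table.

Examples: MySQL and Oracle. The data can be queried with SQL, and indexes can be added to speed up queries.

Advantage: easy to query.

Disadvantage: the schema is hard to extend.

2.2. Unstructured data

Data that cannot be represented as a two-dimensional table. It spans many dimensions and comes in large volumes, and is usually stored in NoSQL databases such as Redis (key-value) or MongoDB (document-oriented).

Examples: server logs, communication records, work documents, reports, videos, images.

Disadvantage: the volume of data to store and query is very large, and processing it requires specialists and statistical models.

2.3. Semi-structured data

Content and structure are mixed together; XML and HTML documents are examples. This kind of data is also usually stored in non-relational databases such as Redis.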

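3. The Elastic Stack consists of Elasticsearch (stores and searches the collected data), Kibana (visualizes the data), and Beats and Logstash (collect and ship the data). It is also known as ELK, after Elasticsearch, Logstash, and Kibana.

It can reliably and securely take data from any source, in any format, and then search, analyze, and visualize it in real time.

ES is an open-source, highly scalable, distributed full-text search engine and the core of the ELK stack. It stores and retrieves data in near real time, and it scales well: a cluster can grow to hundreds of servers and handle petabytes of data.

Full-text search can be thought of as site-wide search. On a blog site, for example, users search by popular terms or keywords, and the site returns a list of all matching articles. Doing this with SQL in a traditional database is inefficient, and even with SQL tuning the performance remains poor, so it does not hold up in production. This is where a full-text search engine is needed.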

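3.1. Differences between ES and Solr/SolrCloud

Feature	Solr/SolrCloud	Elasticsearch
Community and developers	Apache Software Foundation and community support	A single commercial entity and its employees
Node discovery	Apache ZooKeeper, mature and battle-tested in a large number of projects	Zen, built into Elasticsearch itself; needs dedicated master nodes for split-brain protection
Shard placement	Static in nature; migrating shards requires manual work. Starting with Solr 7, the AutoScaling API allows some dynamic actions	Dynamic; shards can be moved on demand depending on the cluster state
Caches	Global, invalidated with every segment change	Per segment, better suited to dynamically changing data
Analytics engine performance	Well suited to static fields with exact calculations	Accuracy of the results depends on data placement
Full-text search features	Lucene-based language analysis, multiple suggesters, spell checking, rich highlighting support	Lucene-based language analysis, a single suggest API implementation, highlighting with rescoring
DevOps support	Not fully there yet	Very good APIs
Non-flat data handling	Nested documents and parent-child support	Natural support through nested and object types, allowing almost unlimited nesting and parent-child support
Query DSL	JSON (limited), XML (limited), or URL parameters	JSON
Index/collection leader control	Leader placement control and leader rebalancing can even out the load on the nodes	Not possible
Machine learning	Built in, on top of streaming aggregations, focused on logistic regression, plus a learning-to-rank contrib module	Commercial feature, focused on anomalies and outliers and on time-series data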

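3.2. Why choose ES?

3.2.1. Easy to use: one download and one command start everything

3.2.2. Besides text search, it can also handle analytical queries

3.2.3. It meets the needs of distributed environments: distributed indexing, good scalability, and good performance

3.2.4. It dominates open-source log management use cases

3.2.5. It exposes more key metrics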

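4. Getting started with ES

4.1. Download the software

ES website: https://www.elastic.co/cn/

ES downloads: https://www.elastic.co/cn/downloads/past-releases#elasticsearch

4.2. Port 9300 is used for communication between the nodes of an ES cluster; port 9200 is the HTTP port for RESTful access, for example from a browser.

To start ES, run bin/elasticsearch in a terminal window.

Once ES is running, open http://localhost:9200. If the following response appears, the startup succeeded: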

{
  "name" : "MacBook-Air.local",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "5DTqNfehQk2kXns6pQpFzw",
  "version" : {
    "number" : "7.8.0",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "757314695644ea9a1dc2fecd26d1a43856725e65",
    "build_date" : "2020-06-14T19:35:50.234439Z",
    "build_snapshot" : false,
    "lucene_version" : "8.5.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

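4.3. Troubleshooting

4.3.1. ES is written in Java, and ES 7.8 needs JDK 1.8 or later. The installation package bundles its own JDK. If JAVA_HOME is set, ES uses that system JDK; otherwise it uses the bundled one. Using the system-configured JDK is generally recommended.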

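4.3.2. If the window closes immediately after double-clicking the startup script, run it from a terminal in its directory to see the error. If the error is about insufficient memory, edit the config/jvm.options file:

#Xms represents the initial size of total heap space, set here to 1 GB.
#Setting it to the same value as -Xmx avoids the JVM resizing the heap after each garbage collection.
#Xmx represents the maximum size of total heap space, set here to 1 GB.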

-Xms1g
-Xmx1g

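4.3.3. ES requests and responses both use JSON

4.3.4. Every request to ES must use a standard HTTP method. A browser's address bar and plain HTML forms can only send GET and POST, so the PUT and DELETE requests below are sent from an API client (APIfox).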

5. Data format

5.1. ES is a document-oriented database: one record is one document.

ES -> MySQL
Index -> Database
Type -> Table
Document -> Row
Field -> Column

5.2. Forward index and inverted index

Forward index
id			content
-------------------
1001		my name is zhang san
1002		my name is li si


Inverted index
keyword			id
---------------------
name				1001, 1002
zhang				1001
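
The step from the forward index to the inverted index can be sketched in a few lines of Java. This is only an illustration: it splits on whitespace, whereas ES runs a configurable analyzer, and the class and method names are made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

class InvertedIndexSketch {
    // Builds term -> sorted set of document ids by splitting each text on whitespace.
    static Map<String, SortedSet<Integer>> build(Map<Integer, String> forwardIndex) {
        Map<String, SortedSet<Integer>> inverted = new TreeMap<>();
        for (Map.Entry<Integer, String> doc : forwardIndex.entrySet()) {
            for (String term : doc.getValue().toLowerCase().split("\\s+")) {
                inverted.computeIfAbsent(term, t -> new TreeSet<>()).add(doc.getKey());
            }
        }
        return inverted;
    }

    public static void main(String[] args) {
        // The two documents from the forward index table.
        Map<Integer, String> forward = new LinkedHashMap<>();
        forward.put(1001, "my name is zhang san");
        forward.put(1002, "my name is li si");

        Map<String, SortedSet<Integer>> inverted = build(forward);
        System.out.println("name  -> " + inverted.get("name"));   // name  -> [1001, 1002]
        System.out.println("zhang -> " + inverted.get("zhang"));  // zhang -> [1001]
    }
}
```

A search for "zhang" then becomes a single map lookup instead of a scan over every document's content.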

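//The concept of Type is being phased out: in ES 6.x an index can contain only one type
//In ES 7.x types are deprecated, and every index uses the single default type _doc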

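6. Basic ES operations

6.1. Create an index

Before working with data in ES you need an index (comparable to creating a database in a relational database).

In APIfox, send a PUT request to the ES server:
//http://127.0.0.1:9200 : the ES server address
//shopping : the index name
http://127.0.0.1:9200/shopping


//Response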
{
    "acknowledged": true,
    "shards_acknowledged": true,
    "index": "shopping"
}

//Sending the same PUT request again reports that the index already exists
{
    "error": {
        "root_cause": [
            {
                "type": "resource_already_exists_exception",
                "reason": "index [shopping/bNlj8fcKTbyLOdOCowOSvw] already exists",
                "index_uuid": "bNlj8fcKTbyLOdOCowOSvw",
                "index": "shopping"
            }
        ],
        "type": "resource_already_exists_exception",
        "reason": "index [shopping/bNlj8fcKTbyLOdOCowOSvw] already exists",
        "index_uuid": "bNlj8fcKTbyLOdOCowOSvw",
        "index": "shopping"
    },
    "status": 400
}

6.2. Get index information

In APIfox, send a GET request to the ES server:
//http://127.0.0.1:9200 : the ES server address
//shopping : the index whose information is returned
http://127.0.0.1:9200/shopping

//Response
{
    "shopping": {
        "aliases": {},
        "mappings": {},
        "settings": {
            "index": {
                "creation_date": "1699861966435",
                "number_of_shards": "1",
                "number_of_replicas": "1",
                "uuid": "bNlj8fcKTbyLOdOCowOSvw",
                "version": {
                    "created": "7080099"
                },
                "provided_name": "shopping"
            }
        }
    }
}


//To list all indices in ES, also use a GET request
http://127.0.0.1:9200/_cat/indices?v

//Response
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   shopping bNlj8fcKTbyLOdOCowOSvw   1   1          0            0       208b           208b

6.3. Delete an index

In APIfox, send a DELETE request to the ES server:
//http://127.0.0.1:9200 : the ES server address
//shopping : the index to delete
http://127.0.0.1:9200/shopping

//Response
{
    "acknowledged": true
}
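
The three index requests above (PUT, GET, DELETE on the same URL) can also be sent from code instead of APIfox. The sketch below uses the JDK's own java.net.http client, so it assumes Java 11 or later. The index name shopping_demo is hypothetical, chosen so the example does not delete the shopping index used in these notes, and the class name is made up.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class IndexApiSketch {
    // Builds a body-less request for <baseUrl>/<index> with the given HTTP method.
    static HttpRequest indexRequest(String baseUrl, String index, String method) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index))
                .method(method, HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumes an ES node is listening on 127.0.0.1:9200, as in the notes.
        HttpClient client = HttpClient.newHttpClient();
        String base = "http://127.0.0.1:9200";
        // Create the index, read it back, then delete it again.
        for (String method : new String[] {"PUT", "GET", "DELETE"}) {
            HttpResponse<String> response = client.send(
                    indexRequest(base, "shopping_demo", method),
                    HttpResponse.BodyHandlers.ofString());
            System.out.println(method + " -> HTTP " + response.statusCode());
            System.out.println(response.body());
        }
    }
}
```

Only the HTTP method changes between the three calls, which is the RESTful convention the section relies on.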

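7. Document operations

7.1. Create a document

A document is comparable to a row in a database table. Documents are added in JSON format.

In APIfox, send a POST request to the ES server
//_doc : add a document to the index
http://127.0.0.1:9200/shopping/_doc

Request body
{
  "title":"小米手机",
  "category":"小米",
  "images":"http://www.gulixueyuan.com/xm.jpg",
  "price":3999.00
}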


//Without a request body, the request fails
{
    "error": {
        "root_cause": [
            {
                "type": "parse_exception",
                "reason": "request body is required"
            }
        ],
        "type": "parse_exception",
        "reason": "request body is required"
    },
    "status": 400
}

//Response to a correct request
{
    "_index": "shopping",
    "_type": "_doc",
    "_id": "3GLHx4sBqo4WFUWVgkxS",
    "_version": 1,
    "result": "created",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "_seq_no": 0,
    "_primary_term": 1
}

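7.1.1. When ES generates the ID, the document cannot be created with PUT, only with POST. The response contains "_id": "3GLHx4sBqo4WFUWVgkxS", the unique identifier of the document, similar to a primary key. Sending the same request several times produces a different ID, and a new document, each time, so the request is not idempotent. PUT must be idempotent, which is why only POST is allowed here.

7.1.2. The generated identifiers are hard to remember. To choose the identifier yourself, append the ID after _doc in the request path:

In APIfox, send a POST request to the ES server
//_doc : add a document to the index
//1001 : the custom document id
http://127.0.0.1:9200/shopping/_doc/1001

Request body
{
  "title":"小米手机",
  "category":"小米",
  "images":"http://www.gulixueyuan.com/xm.jpg",
  "price":3999.00
}

//Response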
{
    "_index": "shopping",
    "_type": "_doc",
    "_id": "1001",
    "_version": 1,
    "result": "created",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "_seq_no": 1,
    "_primary_term": 1
}

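7.1.3. With a custom ID, repeated requests always write to the same document, so the request is idempotent and PUT can be used as well. The _create endpoint only creates: if the ID already exists, it returns a 409 version conflict instead of overwriting the document.

In APIfox, send a PUT or POST request to the ES server
//_create : create the document only if the id does not exist yet
http://127.0.0.1:9200/shopping/_create/1001

7.2. Query a document

In APIfox, send a GET request to the ES server
//_doc : read a document from the given index
http://127.0.0.1:9200/shopping/_doc/1001

//Response; "found": true means the document exists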
{
    "_index": "shopping",
    "_type": "_doc",
    "_id": "1001",
    "_version": 1,
    "_seq_no": 1,
    "_primary_term": 1,
    "found": true,
    "_source": {
        "title": "小米手机",
        "category": "小米",
        "images": "http://www.gulixueyuan.com/xm.jpg",
        "price": 3999.00
    }
}

//Querying an id that does not exist returns "found": false
{
    "_index": "shopping",
    "_type": "_doc",
    "_id": "1002",
    "found": false
}

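//To query all documents, send a GET request
http://127.0.0.1:9200/shopping/_search

7.3. Updates: full update & partial update

7.3.1. Full update: however many times the request is sent, the document is completely overwritten with the request body, so PUT can be used

In APIfox, send a PUT request to the ES server
//_doc : the document in the given index
http://127.0.0.1:9200/shopping/_doc/1001

//Request body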
{
	"title":"华为手机",
    "category":"华为",
    "images":"http://www.gulixueyuan.com/xm.jpg",
    "price":3999.00
}

//Response; result is "updated"
{
    "_index": "shopping",
    "_type": "_doc",
    "_id": "1001",
    "_version": 2,
    "result": "updated",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "_seq_no": 7,
    "_primary_term": 1
}

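7.3.2. Partial update

A partial update changes only the fields listed under "doc". It goes through the _update endpoint, which accepts POST but not PUT.

In APIfox, send a POST request to the ES server
//_doc : creates a document, or fully overwrites it if it already exists; _update : partial update
http://127.0.0.1:9200/shopping/_update/1001

//Request body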
{
    "doc":{
        "title":"三星手机"
    }
}

//Response
{
    "_index": "shopping",
    "_type": "_doc",
    "_id": "1002",
    "_version": 2,
    "result": "updated",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "_seq_no": 10,
    "_primary_term": 1
}

//Response when the document is queried again
{
    "_index": "shopping",
    "_type": "_doc",
    "_id": "1002",
    "_version": 2,
    "_seq_no": 10,
    "_primary_term": 1,
    "found": true,
    "_source": {
        "title": "三星手机",
        "category": "小米",
        "images": "http://www.gulixueyuan.com/xm.jpg",
        "price": 3999.0
    }
}

7.4. Delete a document

//In APIfox, send a DELETE request to the ES server
http://127.0.0.1:9200/shopping/_doc/1001

//Response
{
    "_index": "shopping",
    "_type": "_doc",
    "_id": "1001",
    "_version": 5,
    "result": "deleted",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "_seq_no": 11,
    "_primary_term": 1
}

//Deleting the same id again returns "result": "not_found"
{
    "_index": "shopping",
    "_type": "_doc",
    "_id": "1001",
    "_version": 1,
    "result": "not_found",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "_seq_no": 12,
    "_primary_term": 1
}

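8. Multi-condition queries

8.1. Passing query conditions in the request body

//In APIfox, send a GET request
http://127.0.0.1:9200/shopping/_search?q=category:小米

Non-ASCII text in the URL is easily garbled, so the recommended approach is to put the query conditions in the request body:

//In APIfox, send a GET request
http://127.0.0.1:9200/shopping/_search

//Request body
{
    //the query clause
    "query":{
        //match : full-text match
        "match":{
            //field and search text
            "title":"三星"
        }
    }
}

//Request body for a match-all query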
{
  "query":{
    "match_all":{
      
    }
  }
}

//Request body for a paged query
{
  "query":{
    "match_all":{
      
    }
  },
  "from":0,
  "size":2
}
//To fetch a given page, set from to (page number - 1) * page size
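
The paging rule above is plain arithmetic. A minimal Java sketch, assuming pages are numbered from 1 (the helper name is made up for this example):

```java
class PageOffset {
    // from = (page - 1) * size, with pages numbered from 1.
    static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        return (page - 1) * size;
    }

    public static void main(String[] args) {
        int size = 2;
        for (int page = 1; page <= 3; page++) {
            // Prints the "from"/"size" pair to put into the request body for each page.
            System.out.println("page " + page + " -> \"from\": " + from(page, size)
                    + ", \"size\": " + size);
        }
    }
}
```

With a page size of 2, page 3 therefore starts at "from": 4.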

//Restrict the fields returned in _source
{
  "query":{
    "match_all":{
      
    }
  },
  "from":0,
  "size":2,
  "_source":["title"]
}

//Sort the results
{
  "query":{
    "match_all":{
      
    }
  },
  "from":0,
  "size":2,
  "_source":["title"],
  "sort":{
    "price":{
      "order":"desc"  //"asc" for ascending, "desc" for descending
    }
  }
}

8.2. Combining several conditions

//All conditions must hold: find 小米 phones priced at 5999
{
    "query" : {
        "bool" : {
            "must" : [
                {
                    "match" : {
                        "category" : "小米"
                    }
                },{
                    "match" :{
                        "price" : 5999
                    }
                }
            ]
        }
    }
}
//must means every clause has to match, like AND


//At least one of the conditions must match (should, like OR)
{
    "query" : {
        "bool" : {
            "should" : [
                {
                    "match" : {
                        "category" : "小米"
                    }
                },{
                    "match" :{
                        "category" : "华为"
                    }
                }
            ]
        }
    }
}


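//Range query: category is 小米 or 华为, and price is greater than 5000
//Once a bool query contains a filter (or must) clause, should clauses become optional by default,
//so minimum_should_match is set to 1 to keep the category condition
{
    "query" : {
        "bool" : {
            "should" : [
                {
                    "match" : {
                        "category" : "小米"
                    }
                },{
                    "match" :{
                        "category" : "华为"
                    }
                }
            ],
            "minimum_should_match" : 1,
            "filter" : {
                "range" : {
                    "price" : {
                        "gt" : 5000
                    }
                }
            }
        }
    }
}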
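
How must, should, and filter combine can be modeled with plain predicates. The sketch below is a simplified model and not how ES evaluates queries internally: it ignores scoring and only decides whether a document matches. The Product class, the ASCII category names (standing in for 小米, 华为, and 三星), and the prices are illustrative assumptions. In ES, minimum_should_match defaults to 1 only when the bool query has no must or filter clause; otherwise it defaults to 0.

```java
import java.util.List;
import java.util.function.Predicate;

class BoolQuerySketch {
    // A tiny stand-in for a product document.
    static final class Product {
        final String category;
        final double price;
        Product(String category, double price) {
            this.category = category;
            this.price = price;
        }
    }

    // True when the document satisfies every must and filter clause and
    // at least minimumShouldMatch of the should clauses.
    static boolean matches(Product doc,
                           List<Predicate<Product>> must,
                           List<Predicate<Product>> should,
                           List<Predicate<Product>> filter,
                           int minimumShouldMatch) {
        boolean required = must.stream().allMatch(p -> p.test(doc))
                && filter.stream().allMatch(p -> p.test(doc));
        long shouldHits = should.stream().filter(p -> p.test(doc)).count();
        return required && shouldHits >= minimumShouldMatch;
    }

    public static void main(String[] args) {
        List<Predicate<Product>> should = List.of(
                d -> d.category.equals("xiaomi"),
                d -> d.category.equals("huawei"));
        List<Predicate<Product>> filter = List.of(d -> d.price > 5000);

        Product xiaomi = new Product("xiaomi", 5999);
        Product samsung = new Product("samsung", 6999);

        System.out.println(matches(xiaomi, List.of(), should, filter, 1));   // true
        System.out.println(matches(samsung, List.of(), should, filter, 1));  // false
        System.out.println(matches(samsung, List.of(), should, filter, 0));  // true
    }
}
```

The last line shows why the threshold matters: with a threshold of 0, any product above 5000 matches regardless of its category.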

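8.2.1. Full-text search & exact matching & highlighting

When ES stores a document, it analyzes the text, splits it into terms, and saves the terms in the inverted index. As a result, a query containing only part of the text can still find the document. This kind of retrieval is called full-text search.

ES analyzes the query text in the same way and matches the resulting terms against the inverted index.

//Searching with only part of the category value still finds the document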
{
    "query" : {
        "match" : {
            "category" : "为"
        }
    }
}


//Exact match (match_phrase): the query terms must appear together and in the same order
{
    "query" : {
        "match_phrase" : {
            "category" : "小华"
        }
    }
}

//Highlight the matched text in the results
{
    "query" : {
        "match":{
            "category" : "小米"
        }
    },
    "highlight" : {
        "fields" : {
            "category" : {}
        }
    }
}
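
The difference between match and match_phrase in the examples above can be reproduced with a small model. It assumes one token per character, which is how the default standard analyzer treats Chinese text when no Chinese analyzer plugin is installed. The class and method names are made up, and the strings are written as Unicode escapes only so the file compiles under any source encoding.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MatchSketch {
    // Assumption: one token per character (standard analyzer behaviour for CJK text).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        text.codePoints().forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    // match: at least one query token occurs in the field.
    static boolean match(String field, String query) {
        return !Collections.disjoint(tokenize(field), tokenize(query));
    }

    // match_phrase: all query tokens occur next to each other, in the same order.
    static boolean matchPhrase(String field, String query) {
        return Collections.indexOfSubList(tokenize(field), tokenize(query)) >= 0;
    }

    public static void main(String[] args) {
        String xiaomi = "\u5C0F\u7C73";  // the category value for Xiaomi
        String huawei = "\u534E\u4E3A";  // the category value for Huawei
        String query = "\u5C0F\u534E";   // first character of each, as in the match_phrase example

        System.out.println(match(xiaomi, query));        // true
        System.out.println(match(huawei, query));        // true
        System.out.println(matchPhrase(xiaomi, query));  // false
        System.out.println(matchPhrase(huawei, query));  // false
    }
}
```

Under this model a match query for that text finds both categories, while match_phrase finds neither, because the two characters never appear side by side in one value.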

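8.2.2. Grouping results and computing statistics

//Group by the price field
{
    "aggs" : { //aggregations
        "price_group" : { //aggregation name, chosen freely
            "terms" : {  //group by distinct values
                "field" : "price"   //field to group by
            }
        }
    },
    "size" : 0  //do not return the matching documents themselves
}

//Compute the average
{
    "aggs" : { //aggregations
        "price_avg" : { //aggregation name, chosen freely
            "avg" : {  //average
                "field" : "price"   //field to average
            }
        }
    },
    "size" : 0  //do not return the matching documents themselves
}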
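
What the two aggregations compute can be shown on an in-memory list with Java streams. This is only an analogy: the prices below are hypothetical, the names are made up, and a real terms aggregation orders its buckets by document count by default, which the sketch does not reproduce.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class AggSketch {
    // Like a terms aggregation: one bucket per distinct value with its document count.
    static Map<Double, Long> termsByPrice(List<Double> prices) {
        return prices.stream()
                .collect(Collectors.groupingBy(p -> p, TreeMap::new, Collectors.counting()));
    }

    // Like an avg aggregation: arithmetic mean of the field over the matching documents.
    static double avgPrice(List<Double> prices) {
        return prices.stream().mapToDouble(Double::doubleValue).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Hypothetical prices, not taken from the index in these notes.
        List<Double> prices = List.of(3999.0, 3999.0, 5999.0);
        System.out.println(termsByPrice(prices));  // {3999.0=2, 5999.0=1}
        System.out.println(avgPrice(prices));
    }
}
```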

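9. Mappings

Some ES queries match on individual terms of a field, while others only match the complete value.

What decides whether a field is analyzed into terms or not?

In MySQL, the columns of a table, their types, and their lengths make up the table's structure. ES has a similar concept, called a mapping.

//1. First create an index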
PUT http://localhost:9200/user

//response
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "user"
}

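//2. Define the structure (mapping) of user
PUT http://localhost:9200/user/_mapping

//body
{
    "properties" : {
        "name" : {
            "type" : "text", //text fields are analyzed into terms
            "index" : true    //the field is indexed and can be searched
        },
        "sex" : {
            "type" : "keyword", //keyword fields are not analyzed; only the complete value matches
            "index" : true    //the field is indexed and can be searched
        },
        "tel" : {
            "type" : "keyword", //keyword fields are not analyzed; only the complete value matches
            "index" : false    //the field is not indexed and cannot be searched
        }
    }
}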

//response
{
  "acknowledged": true
}

//3. Read back the mapping of user
GET http://localhost:9200/user/_mapping

//response
{
    "user": {
        "mappings": {
            "properties": {
                "name": {
                    "type": "text"
                },
                "sex": {
                    "type": "keyword"
                },
                "tel": {
                    "type": "keyword",
                    "index": false
                }
            }
        }
    }
}

//4. Run queries
GET http://127.0.0.1:9200/user/_search
//4.1. Query by name
//body
{
    "query":{
        "match":{
            "name":"张三"
        }
    }
}
//"张", "三", and "张三" all find the document, because the field type is text and the value is analyzed into terms

//4.2. Query by sex
//body
{
    "query":{
        "match":{
            "sex":"男的"
        }
    }
}
//Only the complete value finds the document, because the field type is keyword and the value is not analyzed

//4.3. Query by phone number
//body
{
    "query":{
        "match":{
            "tel":"1111"
        }
    }
}

//response
{
    "error": {
        "root_cause": [
            {
                "type": "query_shard_exception",
                "reason": "failed to create query: Cannot search on field [tel] since it is not indexed.",
                "index_uuid": "aCS_ON0TR1GxNZNVlwX4pA",
                "index": "user"
            }
        ],
        "type": "search_phase_execution_exception",
        "reason": "all shards failed",
        "phase": "query",
        "grouped": true,
        "failed_shards": [
            {
                "shard": 0,
                "index": "user",
                "node": "zhbdoDsoRBKoxxqJbwT-Cg",
                "reason": {
                    "type": "query_shard_exception",
                    "reason": "failed to create query: Cannot search on field [tel] since it is not indexed.",
                    "index_uuid": "aCS_ON0TR1GxNZNVlwX4pA",
                    "index": "user",
                    "caused_by": {
                        "type": "illegal_argument_exception",
                        "reason": "Cannot search on field [tel] since it is not indexed."
                    }
                }
            }
        ]
    },
    "status": 400
}

//tel cannot be used in a query: with "index": false no index is built for the field

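10. Java API operations

Elasticsearch is written in Java, so the ES service can also be accessed through its Java API.

10.1. Create a new Maven project

//Create a new Maven project in IDEA and add the following dependencies
<!--dependencies-->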
    <dependencies>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>7.8.0</version>
        </dependency>
        <!--ES client-->
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>7.8.0</version>
        </dependency>
        <!--log4j 2.x, required by ES-->
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>2.8.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.8.2</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>2.9.9</version>
        </dependency>
        <!--JUnit for unit tests-->
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
            <scope>test</scope>
        </dependency>
    </dependencies>

10.2、客户端对象

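Create a package com.td.es.test and a class ESTest_client that creates the Elasticsearch client object. The client from earlier versions is deprecated and will be removed in a future release, so the high-level REST client, RestHighLevelClient, is used here.

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

public class ESTest_client {
    public static void main(String[] args) throws Exception{

        //Create the ES client
        RestHighLevelClient esClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        //Close the client
        esClient.close();
    }
}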

10.3. Create an index

import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;

public class ESTest_index_create {
    public static void main(String[] args) throws Exception{

        //Create the ES client
        RestHighLevelClient esClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        //Create the index
        CreateIndexRequest request = new CreateIndexRequest("newuser");
        CreateIndexResponse response = esClient.indices().create(request , RequestOptions.DEFAULT);
        boolean acknowledged = response.isAcknowledged();
        System.out.println("Index created: "+ acknowledged);

        //Close the client
        esClient.close();
    }
}

//Note that index names must be lowercase, otherwise the request fails
{"error":{"root_cause":[{"type":"invalid_index_name_exception","reason":"Invalid index name [newUser], must be lowercase","index_uuid":"_na_","index":"newUser"}],"type":"invalid_index_name_exception","reason":"Invalid index name [newUser], must be lowercase","index_uuid":"_na_","index":"newUser"},"status":400}

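10.4. Get an index

import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.GetIndexRequest;
import org.elasticsearch.client.indices.GetIndexResponse;

public class ESTest_index_search {
    public static void main(String[] args) throws Exception{

        //Create the ES client
        RestHighLevelClient esClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        //Get the index
        GetIndexRequest request = new GetIndexRequest("newuser");
        GetIndexResponse getIndexResponse = esClient.indices().get(request, RequestOptions.DEFAULT);

        System.out.println(getIndexResponse.getAliases());
        System.out.println(getIndexResponse.getMappings());
        System.out.println(getIndexResponse.getSettings());

        //Close the client
        esClient.close();
    }
}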

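10.5. Delete an index

import org.apache.http.HttpHost;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

public class ESTest_index_delete {
    public static void main(String[] args) throws Exception{

        //Create the ES client
        RestHighLevelClient esClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        //Delete the index
        DeleteIndexRequest request = new DeleteIndexRequest("newuser");
        AcknowledgedResponse response = esClient.indices().delete(request, RequestOptions.DEFAULT);

        System.out.println(response.isAcknowledged());

        //Close the client
        esClient.close();
    }
}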

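10.6. The document class

import java.io.Serializable;

public class NewUser implements Serializable {
    private String name;
    private String sex;
    private String tel;
    //getters and setters
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getSex() { return sex; }
    public void setSex(String sex) { this.sex = sex; }
    public String getTel() { return tel; }
    public void setTel(String tel) { this.tel = tel; }
}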

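10.7. Add a document

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;

public class ESTest_doc_create {
    public static void main(String[] args) throws Exception{

        //Create the ES client
        RestHighLevelClient esClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        //Create the index request
        IndexRequest request = new IndexRequest();
        //Set the target index and the document id
        request.index("newuser").id("1001");
        //Create the user object
        NewUser user = new NewUser();
        user.setName("zhangsan");
        user.setSex("男");
        user.setTel("1111");
        //Data sent to ES must be JSON
        ObjectMapper mapper = new ObjectMapper();
        String userJson = mapper.writeValueAsString(user);
        request.source(userJson , XContentType.JSON);
        //Send the request and get the response
        IndexResponse response = esClient.index(request, RequestOptions.DEFAULT);

        System.out.println(response.getResult());

        //Close the client
        esClient.close();
    }
}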

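10.8. Update a doc

import org.apache.http.HttpHost;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.action.update.UpdateResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;

public class ESTest_doc_update {
    public static void main(String[] args) throws Exception{

        //Create the ES client
        RestHighLevelClient esClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        //Create the update request
        UpdateRequest request = new UpdateRequest();
        //Select the doc to update
        request.index("newuser").id("1002");
        //Fields to change (partial update)
        request.doc(XContentType.JSON,"sex","女");
        //Send the request and get the response
        UpdateResponse response = esClient.update(request, RequestOptions.DEFAULT);

        System.out.println(response.getResult());

        //Close the client
        esClient.close();
    }
}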

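10.9. Get a doc

import org.apache.http.HttpHost;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

public class ESTest_doc_search {
    public static void main(String[] args) throws Exception{

        //Create the ES client
        RestHighLevelClient esClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        //Create the request
        GetRequest request = new GetRequest();
        //Select the doc to read
        request.index("newuser").id("1002");
        //Send the request and get the response
        GetResponse response = esClient.get(request, RequestOptions.DEFAULT);
        System.out.println(response.getSourceAsString());

        //Close the client
        esClient.close();
    }
}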

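10.10. Delete a doc

import org.apache.http.HttpHost;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

public class ESTest_doc_delete {
    public static void main(String[] args) throws Exception{

        //Create the ES client
        RestHighLevelClient esClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        //Create the request
        DeleteRequest request = new DeleteRequest();
        //Select the doc to delete
        request.index("newuser").id("1002");
        //Send the request and get the response
        DeleteResponse response = esClient.delete(request, RequestOptions.DEFAULT);

        System.out.println(response.getResult());

        //Close the client
        esClient.close();
    }
}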

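11. Advanced Java API usage

11.1. Add docs in bulk

import java.io.IOException;
import org.apache.http.HttpHost;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;

public class ESTest_doc_batch_create {
    public static void main(String[] args) throws IOException {
        //1. Create the ES connection
        RestHighLevelClient esClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        //2. Create the bulk request
        BulkRequest request = new BulkRequest();
        //Add the individual requests to the bulk request
        request.add(new IndexRequest().index("newuser").id("1001").source(XContentType.JSON , "name" , "张三"));
        request.add(new IndexRequest().index("newuser").id("1002").source(XContentType.JSON , "name" , "李四"));
        request.add(new IndexRequest().index("newuser").id("1003").source(XContentType.JSON , "name" , "王五"));

        //3. Send the request and get the response
        BulkResponse response = esClient.bulk(request, RequestOptions.DEFAULT);

        //4. Print the result
        System.out.println("Time taken: "+response.getTook());
        System.out.println("Documents added: "+response.getItems().length);

        //Close the connection
        esClient.close();
    }
}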

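11.2. Delete docs in bulk

import java.io.IOException;
import org.apache.http.HttpHost;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

public class ESTest_doc_batch_delete {
    public static void main(String[] args) throws IOException {
        //1. Create the ES connection
        RestHighLevelClient esClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        //2. Create the bulk delete request
        BulkRequest request = new BulkRequest();
        //Add the individual requests to the bulk request
        request.add(new DeleteRequest().index("newuser").id("1001"));
        request.add(new DeleteRequest().index("newuser").id("1002"));
        request.add(new DeleteRequest().index("newuser").id("1003"));

        //3. Send the request and get the response
        BulkResponse response = esClient.bulk(request, RequestOptions.DEFAULT);

        //4. Print the result
        System.out.println("Time taken: "+response.getTook());
        System.out.println("Documents deleted: "+response.getItems().length);

        //Close the connection
        esClient.close();
    }
}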

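11.3. Match-all query

import java.io.IOException;
import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class ESTest_doc_matchAll {
    public static void main(String[] args) throws IOException {
        //1. Create the ES connection
        RestHighLevelClient esClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );
        //2. Create the request
        SearchRequest request = new SearchRequest();
        request.indices("newuser");
        //3. Add the query
        request.source(new SearchSourceBuilder().query(QueryBuilders.matchAllQuery()));
        //4. Get the response
        SearchResponse response = esClient.search(request, RequestOptions.DEFAULT);

        SearchHits hits = response.getHits();
        System.out.println("Total hits: "+hits.getTotalHits());
        System.out.println("Time taken: "+response.getTook());
        //Print the documents that were found
        for (SearchHit hit:hits) {
            System.out.println(hit.getSourceAsString());
        }

        //Close the connection
        esClient.close();
    }
}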

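11.4. Term query

import java.io.IOException;
import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class ESTest_doc_termQuery {
    public static void main(String[] args) throws IOException {
        //1. Create the ES connection
        RestHighLevelClient esClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );
        //2. Create the request
        SearchRequest request = new SearchRequest();
        request.indices("newuser");
        //3. Add the query
        request.source(new SearchSourceBuilder().query(QueryBuilders.termQuery("sex","女")));
        //4. Get the response
        SearchResponse response = esClient.search(request, RequestOptions.DEFAULT);
        //Print the documents that were found
        SearchHits hits = response.getHits();
        System.out.println("Time taken: "+response.getTook());
        System.out.println("Total hits: "+hits.getTotalHits());
        for(SearchHit hit : hits){
            System.out.println(hit);
        }
        //Close the connection
        esClient.close();
    }
}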

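11.5. Paged query + sorted results + source filtering

import java.io.IOException;
import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.sort.SortOrder;

public class ESTest_doc_pageQuery {
    public static void main(String[] args) throws IOException {
        //1. Create the ES connection
        RestHighLevelClient esClient = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost",9200,"http")));
        //2. Create the request
        SearchRequest request = new SearchRequest();
        request.indices("newuser");
        //3. Add the query
        SearchSourceBuilder condition = new SearchSourceBuilder().query(QueryBuilders.matchAllQuery());
        //Paging
        condition.from(0);
        condition.size(2);
        //Sort order; tel.keyword assumes tel was mapped dynamically (text with a keyword sub-field),
        //because sorting directly on a text field is rejected by default
        condition.sort("tel.keyword",SortOrder.DESC);
        //Source filtering: fields to include and fields to exclude
        String[] includes = {};
        String[] excludes = {"name"};
        condition.fetchSource(includes , excludes);
        request.source(condition);
        //4. Get the response
        SearchResponse response = esClient.search(request, RequestOptions.DEFAULT);
        //Print the documents that were found
        SearchHits hits = response.getHits();
        System.out.println("Time taken: "+response.getTook());
        System.out.println("Total hits: "+hits.getTotalHits());
        for(SearchHit hit : hits){
            System.out.println(hit);
        }
        //Close the connection
        esClient.close();
    }
}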

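11.6. Bool (combined) query

import java.io.IOException;
import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class ESTest_doc_combination_query {
    public static void main(String[] args) throws IOException {
        //Create the ES connection
        RestHighLevelClient esClient = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost",9200,"http")));

        //Create the request
        SearchRequest request = new SearchRequest();
        request.indices("newuser");
        //Build the conditions
        SearchSourceBuilder condition = new SearchSourceBuilder();
        BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();

        //must: the clause has to match
        // boolQueryBuilder.must(QueryBuilders.matchQuery("sex","男"));
        // boolQueryBuilder.must(QueryBuilders.matchQuery("tel","1111"));
        //mustNot: the clause must not match
        // boolQueryBuilder.mustNot(QueryBuilders.matchQuery("tel","1111"));
        //should: like OR, either clause may match
        boolQueryBuilder.should(QueryBuilders.matchQuery("tel","1111"));
        boolQueryBuilder.should(QueryBuilders.matchQuery("tel","2222"));
        condition.query(boolQueryBuilder);

        //Send the request
        request.source(condition);
        SearchResponse response = esClient.search(request, RequestOptions.DEFAULT);

        //Print the documents that were found
        SearchHits hits = response.getHits();
        System.out.println("Time taken: "+response.getTook());
        System.out.println("Total hits: "+hits.getTotalHits());
        for(SearchHit hit : hits){
            System.out.println(hit);
        }
        //Close the connection
        esClient.close();

    }
}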

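11.7. Range query

import java.io.IOException;
import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.RangeQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class ESTest_doc_range_query {
    public static void main(String[] args) throws IOException {
        //Create the connection
        RestHighLevelClient esClient = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));

        //Create the request
        SearchRequest request = new SearchRequest();
        request.indices("newuser");

        //Build the condition
        SearchSourceBuilder condition = new SearchSourceBuilder();
        RangeQueryBuilder rangedQuery = QueryBuilders.rangeQuery("age");

        //Range: greater than or equal to 22, less than 40
        rangedQuery.gte(22);
        rangedQuery.lt(40);
        condition.query(rangedQuery);

        request.source(condition);
        //Send the request and get the response
        SearchResponse response = esClient.search(request, RequestOptions.DEFAULT);
        //Print the documents that were found
        SearchHits hits = response.getHits();
        System.out.println("Time taken: "+response.getTook());
        System.out.println("Total hits: "+hits.getTotalHits());
        for(SearchHit hit : hits){
            System.out.println(hit);
        }
        //Close the connection
        esClient.close();
    }
}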

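11.8. Fuzzy query

import java.io.IOException;
import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.unit.Fuzziness;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class ESTest_doc_fuzzy_query {
    public static void main(String[] args) throws IOException {
        //Create the connection
        RestHighLevelClient esClient = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost" , 9200 , "http")));

        //Create the request
        SearchRequest request = new SearchRequest();
        request.indices("newuser");

        //Build the condition
        SearchSourceBuilder condition = new SearchSourceBuilder();
        //Fuzzy query for zhangsan, allowing up to two characters of difference
        condition.query(QueryBuilders.fuzzyQuery("name","zhangsan").fuzziness(Fuzziness.TWO));

        request.source(condition);
        //Send the request and get the response
        SearchResponse response = esClient.search(request, RequestOptions.DEFAULT);

        //Print the documents that were found
        SearchHits hits = response.getHits();
        System.out.println("Time taken: "+response.getTook());
        System.out.println("Total hits: "+hits.getTotalHits());
        for(SearchHit hit : hits){
            System.out.println(hit);
        }
        //Close the connection
        esClient.close();

    }
}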
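
The "up to two characters of difference" in the fuzzy query is an edit distance. The sketch below computes the classic Levenshtein distance to show which terms would fall within Fuzziness.TWO. It is a simplification: by default ES also counts a swap of two adjacent characters as a single edit, which is not modeled here, and the class and method names are made up.

```java
class EditDistanceSketch {
    // Levenshtein distance: minimum number of single-character inserts,
    // deletes, and substitutions needed to turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            int[] curr = new int[b.length() + 1];
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = curr;
        }
        return prev[b.length()];
    }

    // True when the indexed term is within maxEdits of the query term.
    static boolean fuzzyMatches(String term, String query, int maxEdits) {
        return distance(term, query) <= maxEdits;
    }

    public static void main(String[] args) {
        System.out.println(fuzzyMatches("zhangsan1", "zhangsan", 2));    // true, 1 edit
        System.out.println(fuzzyMatches("zhangsan123", "zhangsan", 2));  // false, 3 edits
    }
}
```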

11.9、分组查询

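import java.io.IOException;

import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.aggregations.AggregationBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class ESTest_doc_agg_query {
    public static void main(String[] args) throws IOException {
        // Create the connection
        RestHighLevelClient esClient = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));

        // Create the request
        SearchRequest request = new SearchRequest();
        request.indices("newuser");

        // Add the aggregation
        SearchSourceBuilder builder = new SearchSourceBuilder();
        // Maximum age
        // AggregationBuilder aggregationBuilder = AggregationBuilders.max("maxAge").field("age");
        // Group by age
        AggregationBuilder aggregationBuilder = AggregationBuilders.terms("ageGroup").field("age");
        builder.aggregation(aggregationBuilder);

        request.source(builder);
        // Send the request and get the response
        SearchResponse response = esClient.search(request, RequestOptions.DEFAULT);
        // Print the returned data
        SearchHits hits = response.getHits();
        System.out.println("Time taken: " + response.getTook());
        System.out.println("Total hits: " + hits.getTotalHits());
        for (SearchHit hit : hits) {
            System.out.println(hit);
        }
        // Print the aggregation buckets (the hits alone do not contain the grouping result)
        Terms ageGroup = response.getAggregations().get("ageGroup");
        for (Terms.Bucket bucket : ageGroup.getBuckets()) {
            System.out.println(bucket.getKeyAsString() + ": " + bucket.getDocCount());
        }
        // Close the connection
        esClient.close();
    }
}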
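Conceptually, a terms aggregation on `age` returns one bucket per distinct age together with the number of documents in that bucket. The sketch below reproduces that idea in plain Java on an in-memory list, purely as an illustration; the class name `TermsAggSketch` and the sample ages are invented, and a real terms aggregation differs in details (for example, it returns only the top buckets, ordered by document count by default).

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: what "group by age and count" means, without Elasticsearch.
final class TermsAggSketch {

    // Returns age -> number of documents with that age, sorted by age.
    static Map<Integer, Long> countByAge(List<Integer> ages) {
        Map<Integer, Long> buckets = new TreeMap<>();
        for (int age : ages) {
            buckets.merge(age, 1L, Long::sum); // create the bucket or increment its count
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Hypothetical ages of the documents in the index
        Map<Integer, Long> buckets = countByAge(List.of(30, 20, 30, 40, 30));
        buckets.forEach((age, count) -> System.out.println(age + ": " + count));
    }
}
```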

12、Elasticsearch environment

12.1、Key concepts

12.1.1、Single node & cluster

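A single Elasticsearch server has a maximum load it can handle. Beyond that threshold its performance drops sharply or it becomes unavailable, so production deployments generally run on a cluster of servers.

Besides load capacity, a single server has other problems:

A single machine has limited storage capacity
A single server is a single point of failure, so high availability is not possible
A single server has limited capacity for concurrent requests

A cluster presents multiple nodes as a single service.

There is no upper limit on the number of nodes in a cluster; two or more nodes already count as a cluster. For performance and high availability, clusters usually have three or more nodes.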
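One common way to see why three nodes is the usual minimum is majority voting: a cluster that needs a majority of its master-eligible nodes to agree can keep working only while a majority is still alive. This reasoning is not spelled out in the notes above; the sketch below is an added illustration that assumes every node is master-eligible, and the class name `QuorumSketch` is invented.

```java
// Illustrative sketch only: majority (quorum) arithmetic for a cluster of n master-eligible nodes.
final class QuorumSketch {

    // Smallest number of nodes that forms a strict majority of n.
    static int quorum(int nodes) {
        return nodes / 2 + 1;
    }

    // How many nodes can fail while a majority still remains.
    static int tolerableFailures(int nodes) {
        return nodes - quorum(nodes);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " nodes: quorum=" + quorum(n)
                    + ", tolerable failures=" + tolerableFailures(n));
        }
    }
}
```

Under this model two nodes tolerate no failure at all, while three nodes tolerate one, which matches the recommendation of three or more nodes.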

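12.1.2、Cluster

A cluster is one or more server nodes organized together that jointly hold all of the data and together provide indexing and search. An Elasticsearch cluster is identified by a unique name, which defaults to "elasticsearch". The name matters because a node can only join a cluster by specifying that cluster's name.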

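12.1.3、Node

A cluster contains many servers, and a node is one of them. As part of the cluster, it stores data and takes part in the cluster's indexing and search.

A node is also identified by a name. In early Elasticsearch releases the default was the name of a random Marvel Comics character, assigned at startup; in 7.x, the version used in these notes, the default is the machine's host name. The name matters for administration, because you use it to work out which servers on the network correspond to which nodes in the Elasticsearch cluster.

A node joins a specific cluster by configuring the cluster name. By default every node is set to join a cluster named "elasticsearch". This means that if you start several nodes on your network, and they can discover one another, they will automatically form and join a cluster named "elasticsearch".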

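12.2、Windows cluster

12.2.1、Deploying the cluster

Create an elasticsearch-cluster folder and place three copies of the Elasticsearch installation inside it (because this ES installation has been used before, clear its data and logs directories before copying).

In the cluster folder, edit each node's config/elasticsearch.yml configuration file.

node-1001 node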

# Configuration for node 1
# Cluster name; must be identical on every node
cluster.name: my-application
# Node name; must be unique within the cluster
node.name: node-1
node.master: true     # eligible to be elected master
node.data: true     # acts as a data node
# IP address
network.host: localhost
# HTTP port
http.port: 9201
# TCP port used for communication between nodes
transport.tcp.port: 9301
#discovery.seed_hosts: ["localhost:9301","localhost:9302","localhost:9303"]
#discovery.zen.fd.ping_timeout: 1m
#discovery.zen.fd.ping_retries: 5
# List of nodes in the cluster that may be elected master
#cluster.initial_master_nodes: ["node-1001","node-1002","node-1003"]
# CORS configuration
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"

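Check the cluster status
- http://localhost:9201/_cluster/health
When 1001 is used as the node's port, startup fails with java.net.BindException
- The reason is that on Linux only the root user may bind port numbers below 1024; other users cannot

node-1002 node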

# Configuration for node 2
# Cluster name; must be identical on every node
cluster.name: my-application
# Node name; must be unique within the cluster
node.name: node-2
node.master: true     # eligible to be elected master
node.data: true     # acts as a data node
# IP address
network.host: localhost
# HTTP port
http.port: 9202
# TCP port used for communication between nodes
transport.tcp.port: 9302
# Discovery module: seed hosts used to find the master node
discovery.seed_hosts: ["localhost:9301"]
discovery.zen.fd.ping_timeout: 1m
discovery.zen.fd.ping_retries: 5
# List of nodes in the cluster that may be elected master
#cluster.initial_master_nodes: ["node-1001","node-1002","node-1003"]
# CORS configuration
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"

node-1003 node

# Configuration for node 3
# Cluster name; must be identical on every node
cluster.name: my-application
# Node name; must be unique within the cluster
node.name: node-3
node.master: true     # eligible to be elected master
node.data: true     # acts as a data node
# IP address
network.host: localhost
# HTTP port
http.port: 9203
# TCP port used for communication between nodes
transport.tcp.port: 9303
# Discovery module: seed hosts used to find the master node
discovery.seed_hosts: ["localhost:9301","localhost:9302"]
discovery.zen.fd.ping_timeout: 1m
discovery.zen.fd.ping_retries: 5
# List of nodes in the cluster that may be elected master
#cluster.initial_master_nodes: ["node-1001","node-1002","node-1003"]
# CORS configuration
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"
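The three files follow a simple pattern: node N listens on HTTP port 9200+N and transport port 9300+N, and lists the transport addresses of the lower-numbered nodes as its seed hosts. The sketch below generates the varying part of each file from that pattern. It is only an illustration of the layout used in these notes; the class `NodeConfigSketch` and its `render` method are invented, and the CORS and commented-out lines are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: renders the per-node settings used in the three files above.
final class NodeConfigSketch {

    // index is 1, 2 or 3 in these notes; ports are derived from it.
    static String render(int index) {
        // Seed hosts: transport addresses of all lower-numbered nodes.
        List<String> seeds = new ArrayList<>();
        for (int i = 1; i < index; i++) {
            seeds.add("\"localhost:" + (9300 + i) + "\"");
        }
        StringBuilder yml = new StringBuilder();
        yml.append("cluster.name: my-application\n");
        yml.append("node.name: node-").append(index).append('\n');
        yml.append("node.master: true\n");
        yml.append("node.data: true\n");
        yml.append("network.host: localhost\n");
        yml.append("http.port: ").append(9200 + index).append('\n');
        yml.append("transport.tcp.port: ").append(9300 + index).append('\n');
        if (!seeds.isEmpty()) {
            // The first node has nobody to contact yet, so it gets no seed list.
            yml.append("discovery.seed_hosts: [").append(String.join(",", seeds)).append("]\n");
        }
        return yml.toString();
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 3; i++) {
            System.out.println("# node " + i);
            System.out.println(render(i));
        }
    }
}
```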

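Before starting, delete everything in each node's data and logs directories (if present). Then double-click bin/elasticsearch.bat in each node to start it; once started, each node automatically joins the cluster with the configured name.

Query the cluster status

GET		http://localhost:9201/_cluster/health
// Response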
{
    "cluster_name": "my-application",
    "status": "green",
    "timed_out": false,
    "number_of_nodes": 3,
    "number_of_data_nodes": 3,
    "active_primary_shards": 0,
    "active_shards": 0,
    "relocating_shards": 0,
    "initializing_shards": 0,
    "unassigned_shards": 0,
    "delayed_unassigned_shards": 0,
    "number_of_pending_tasks": 0,
    "number_of_in_flight_fetch": 0,
    "task_max_waiting_in_queue_millis": 0,
    "active_shards_percent_as_number": 100.0
}
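When checking this response from a script, the two fields that matter most here are `status` and `number_of_nodes`. The sketch below pulls them out of the response body with regular expressions so that it needs no JSON library; that is a shortcut chosen for the example, not a recommended way to parse JSON in general. The class `ClusterHealthSketch` and its methods are invented names, and the body is hard-coded rather than fetched over HTTP.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: reads two fields from a _cluster/health response body.
final class ClusterHealthSketch {

    private static final Pattern STATUS = Pattern.compile("\"status\"\\s*:\\s*\"(\\w+)\"");
    private static final Pattern NODES = Pattern.compile("\"number_of_nodes\"\\s*:\\s*(\\d+)");

    // Returns the status string ("green", "yellow", "red"), or null if the field is missing.
    static String status(String json) {
        Matcher m = STATUS.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    // Returns the node count, or -1 if the field is missing.
    static int nodeCount(String json) {
        Matcher m = NODES.matcher(json);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    // True when the cluster is green and has the expected number of nodes.
    static boolean isHealthy(String json, int expectedNodes) {
        return "green".equals(status(json)) && nodeCount(json) == expectedNodes;
    }

    public static void main(String[] args) {
        // Shortened copy of the response shown above
        String body = "{ \"cluster_name\": \"my-application\", \"status\": \"green\", "
                + "\"timed_out\": false, \"number_of_nodes\": 3, \"number_of_data_nodes\": 3 }";
        System.out.println("status = " + status(body));
        System.out.println("nodes  = " + nodeCount(body));
        System.out.println("healthy with 3 nodes expected: " + isHealthy(body, 3));
    }
}
```

In practice the body would come from an HTTP GET against http://localhost:9201/_cluster/health.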

12.3、Linux single node

12.3.1、Download

Download URL: https://www.elastic.co/cn/downloads/past-releases/elasticsearch-7-8-0

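12.3.2、Installation

1、Extract the archive

# 1. Upload the downloaded archive to the server
# 2. Extract it
tar -zxvf elasticsearch-7.8.0-linux-x86_64.tar.gz -C /opt/module
# 3. Rename the extracted directory
mv /opt/module/elasticsearch-7.8.0 /opt/module/es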

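2、Create a user

For security reasons Elasticsearch refuses to run as root, so create a new user while logged in as root.

# Add the es user
useradd es
# Set a password for the es user
passwd es
# If something went wrong, delete the user and add it again
userdel -r es
# Make es the owner of the installation directory
chown -R es:es /opt/module/es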

3、Edit the configuration files

Edit /opt/module/es/config/elasticsearch.yml

# Add the following settings
cluster.name: elasticsearch
node.name: node-1
network.host: 0.0.0.0
http.port: 9200
cluster.initial_master_nodes: ["node-1"]

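Edit /etc/security/limits.conf

# Append the following at the end of the file
# Limit on the number of files each process may open
es soft nofile 65536
es hard nofile 65536

Edit /etc/security/limits.d/20-nproc.conf

# Append the following at the end of the file
# Limit on the number of files each process may open
es soft nofile 65536
es hard nofile 65536
# OS-level limit on the number of processes each user may create
* hard nproc 4096
# * stands for all Linux user names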

Edit /etc/sysctl.conf

# Add the following to the file
# Number of VMAs (virtual memory areas) a process may own; the kernel default is 65530
vm.max_map_count=655360

Reload the settings

sysctl -p
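The reason for raising `vm.max_map_count` is that Elasticsearch 7.x checks the value at startup and expects at least 262144 when it is bound to a non-loopback address. As a small sanity check after `sysctl -p`, the sketch below reads the current value from /proc and compares it with that minimum. The class name `MaxMapCountSketch` is invented, and the check only mirrors the documented threshold; it is not the code Elasticsearch itself runs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch only: verifies that vm.max_map_count is high enough for Elasticsearch.
final class MaxMapCountSketch {

    // Minimum value expected by the Elasticsearch bootstrap check.
    static final long REQUIRED_MIN = 262_144L;

    static boolean meetsMinimum(long current) {
        return current >= REQUIRED_MIN;
    }

    public static void main(String[] args) throws IOException {
        Path procFile = Paths.get("/proc/sys/vm/max_map_count");
        if (!Files.isReadable(procFile)) {
            System.out.println("Cannot read " + procFile + " (not a Linux host?)");
            return;
        }
        long current = Long.parseLong(new String(Files.readAllBytes(procFile)).trim());
        System.out.println("vm.max_map_count = " + current
                + (meetsMinimum(current) ? " (ok)" : " (too low, needs >= " + REQUIRED_MIN + ")"));
    }
}
```

The value 655360 configured above passes this check, while the kernel default of 65530 does not.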

12.3.3、Starting Elasticsearch

Start as the es user

cd /opt/module/es/
# Start in the foreground
bin/elasticsearch
# Start in the background
bin/elasticsearch -d

# Starting directly as root fails with
java.lang.RuntimeException:can not run elasticsearch as root
# root cannot be used; switch to the es user created earlier and start again
su es
# This may still fail with
Exception in thread "main" org.elasticsearch.bootstrap.BootstrapException: java.nio.file.AccessDeniedException: /opt/module/es/config/elasticsearch.keystore
Likely root cause : java.nio.file.AccessDeniedException: /opt/module/es/config/elasticsearch.keystore
# Cause: files and directories generated during the earlier start attempt are not owned by the es user
# Switch back to root and run chown -R es:es /opt/module/es once more
su root
chown -R es:es /opt/module/es
su es
# Then start again

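12.4、Linux cluster

Download, installation, and user creation are the same as for the single node.

3、Edit the configuration files

Edit /opt/module/es/config/elasticsearch.yml and distribute it to the other machines

# Add the following settings
# Cluster name
cluster.name: cluster-es
# Node name; must be different on every node
node.name: node-1
# Host name or IP address; must be different on every node
network.host: linux1
# Whether the node is eligible to be elected master
node.master: true
# Whether the node is a data node
node.data: true
http.port: 9200
# The head plugin needs these two settings enabled
http.cors.allow-origin: "*"
http.cors.enabled: true
http.max_content_length: 200mb
# New in ES 7.x: needed to elect a master when bootstrapping a new cluster
cluster.initial_master_nodes: ["node-1"]
# New in ES 7.x: node discovery
discovery.seed_hosts: ["linux1:9300","linux2:9300","linux3:9300"]
gateway.recover_after_nodes: 2
network.tcp.keep_alive: true
network.tcp.no_delay: true
transport.tcp.compress: true
# Number of shard rebalancing tasks allowed to run concurrently across the cluster; default is 2
cluster.routing.allocation.cluster_concurrent_rebalance: 16
# Number of concurrent recoveries per node when nodes are added or removed or shards are rebalanced; default is 2
cluster.routing.allocation.node_concurrent_recoveries: 16
# Number of concurrent primary recoveries per node during initial recovery; default is 4
cluster.routing.allocation.node_initial_primaries_recoveries: 16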