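1. Installing and Getting Started with Elasticsearch: Clusters, Inverted Indexes, Index Operations, Document Operations, Optimistic Locking for Documents, Analyzers, and Custom Analyzers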

Author: _又菜又爱学

Article link: https://blog.csdn.net/qq_39200831/article/details/116036485

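1. Elasticsearch Overview

Elasticsearch is a distributed, free and open source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured data. Elasticsearch is built on Apache Lucene and was first released in 2010 by Elasticsearch N.V. (now known as Elastic). Known for its simple REST APIs, distributed nature, speed, and scalability, Elasticsearch is the central component of the Elastic Stack, a set of free and open source tools for data ingestion, enrichment, storage, analysis, and visualization. The Elastic Stack is commonly referred to as the ELK Stack (after Elasticsearch, Logstash, and Kibana). It now also includes a rich collection of lightweight data shipping agents, collectively known as Beats, for sending data to Elasticsearch.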

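2. What Is Elasticsearch Used For?

Elasticsearch is fast and scalable and can index many types of content, which means it can be used for a number of use cases:

Application search
Website search
Enterprise search
Logging and log analytics
Infrastructure metrics and container monitoring
Application performance monitoring
Geospatial data analysis and visualization
Security analytics
Business analytics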

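3. How Does Elasticsearch Work?

Raw data flows into Elasticsearch from a variety of sources, including logs, system metrics, and web applications. Data ingestion is the process by which this raw data is parsed, normalized, and enriched before it is indexed in Elasticsearch. Once the data is indexed, users can run complex queries against it and use aggregations to retrieve complex summaries of it. In Kibana, users can build powerful visualizations of their data, share dashboards, and manage the Elastic Stack.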

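4. What Is an Elasticsearch Index?

An Elasticsearch index is a collection of documents that are related to each other. Elasticsearch stores data as JSON documents. Each document correlates a set of keys (names of fields or properties) with their corresponding values (strings, numbers, Booleans, dates, arrays of values, geolocations, or other types of data).

Elasticsearch uses a data structure called an inverted index, which is designed to allow very fast full-text searches. An inverted index lists every unique word that appears in any document and identifies all of the documents each word occurs in.

During the indexing process, Elasticsearch stores documents and builds an inverted index so that the document data is searchable in near real time. Indexing is initiated through the index API, with which you can add a JSON document to a specific index or update one in it.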

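5. Why Use Elasticsearch?

**Elasticsearch is fast.** Because Elasticsearch is built on top of Lucene, it excels at full-text search. Elasticsearch is also a near real-time search platform, meaning the latency between indexing a document and that document becoming searchable is very short, typically one second. As a result, Elasticsearch is well suited to time-sensitive use cases such as security analytics and infrastructure monitoring.
**Elasticsearch is distributed by nature.** The documents stored in Elasticsearch are distributed across containers known as shards, which can be replicated to provide redundant copies of the data in case of hardware failure. The distributed nature of Elasticsearch allows it to scale out to hundreds (or even thousands) of servers and handle petabytes of data.
**Elasticsearch comes with a wide set of features.** In addition to speed, scalability, and resiliency, Elasticsearch has a number of powerful built-in features, such as data rollups and index lifecycle management, that make storing and searching data more efficient.
**The Elastic Stack simplifies data ingestion, visualization, and reporting.** Integration with Beats and Logstash makes it easy to process data before indexing it into Elasticsearch. Kibana provides real-time visualization of Elasticsearch data as well as UIs for quickly accessing application performance monitoring (APM), logs, and infrastructure metrics data.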

6. Which Programming Languages Does Elasticsearch Support?

Elasticsearch supports a variety of programming languages. Official clients are currently available for:

Java
JavaScript (Node.js)
Go
.NET (C#)
PHP
Perl
Python
Ruby

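7. Core ES Terminology

ES -> database
index -> table
type -> logical type of a table (deprecated in 7.x, where every index uses the single type _doc)
document -> row
field -> column
mapping -> table schema definition
NRT -> near real time
node -> each individual server
shard and replica -> data sharding and backup; shard = primary shard, replica = replica shard (a backup copy of a shard)

Cluster-related terms:

Shard: an index is split into several parts that are placed on different nodes. With 3 nodes, for example, the data held by all 3 nodes added together makes up the complete index. Spreading the shards over three nodes scales the index horizontally and increases throughput.
Replica: a backup copy of a shard.

Explanation: without a cluster, a single node might hold only 1 TB of data, and it may not be able to cope when the data volume or the request load is large.

With a cluster, for example three shards on three nodes handling the requests, the capacity grows to 3 TB. When a primary shard fails, its replica shard (the backup copy) takes over.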
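
How a document ends up on a particular shard can be sketched in a few lines. The sketch below assumes the simplified routing rule shard = hash(routing key) % number of primary shards, where the routing key defaults to the document ID. The function names are made up for illustration, and `zlib.crc32` is only a stand-in for the Murmur3 hash that Elasticsearch uses, so the shard numbers will not match a real cluster.

```python
import zlib


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key (by default the document ID) to a primary shard number."""
    # Stand-in hash: Elasticsearch uses Murmur3, not CRC32.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def distribute(doc_ids, num_primary_shards):
    """Group document IDs by the shard they would be routed to."""
    shards = {n: [] for n in range(num_primary_shards)}
    for doc_id in doc_ids:
        shards[pick_shard(doc_id, num_primary_shards)].append(doc_id)
    return shards


if __name__ == "__main__":
    # Nine documents spread over three primary shards.
    for shard, ids in distribute([str(i) for i in range(1, 10)], 3).items():
        print(f"shard {shard}: {ids}")
```

Because the shard number depends on the number of primary shards, that number is fixed when the index is created.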

8. Inverted Index

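Forward index: in an unprocessed data store, the document ID usually serves as the index key and the document content as the record.

Inverted index: by contrast, the words (or terms) serve as the index keys and the IDs of the documents that contain them serve as the records.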
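
The difference between the two structures can be shown with a small sketch. It assumes a naive tokenizer (lowercase, split on anything that is not a letter or digit) and three made-up documents. It illustrates the data structure only and says nothing about how Lucene stores it.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Naive tokenizer: lowercase, then keep runs of ASCII letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(forward_index):
    """Turn {doc_id: text} into {term: set of doc_ids}."""
    inverted = defaultdict(set)
    for doc_id, text in forward_index.items():
        for term in tokenize(text):
            inverted[term].add(doc_id)
    return inverted


def search(inverted, query):
    """Return the IDs of the documents that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(inverted.get(terms[0], set()))
    for term in terms[1:]:
        result &= inverted.get(term, set())
    return result


# Forward index: document ID -> content (sample data for illustration).
docs = {
    1: "imooc is very good",
    2: "imooc is fashion",
    3: "Elasticsearch is fast",
}
index = build_inverted_index(docs)
print(sorted(index["imooc"]))
print(sorted(search(index, "imooc good")))
```

A lookup in the inverted index goes straight from a term to the matching document IDs, without scanning the content of every document.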

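9. Installing Elasticsearch

Download page: https://www.elastic.co/cn/downloads/elasticsearch

Choose the version you need; the Linux build is used here.

After the download finishes, upload the archive to the server. Here it is placed in the /home/software directory.

//Extract the archive
tar -zxvf elasticsearch-7.12.0-linux-x86_64.tar.gz

//List the directory contents
[root@xing software]# ls
elasticsearch-7.12.0  elasticsearch-7.12.0-linux-x86_64.tar.gz

//Move it to /usr/local
mv elasticsearch-7.12.0 /usr/local/

Go to /usr/local and edit the configuration files

//Change to /usr/local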
[root@xing software]# cd /usr/local/

[root@xing local]# ll
drwxr-xr-x  9 root root 4096 3月  18 14:21 elasticsearch-7.12.0

[root@xing local]# cd elasticsearch-7.12.0/

[root@xing elasticsearch-7.12.0]# ll
总用量 576
drwxr-xr-x  2 root root   4096 3月  18 14:21 bin
drwxr-xr-x  3 root root   4096 4月  21 13:54 config
drwxr-xr-x  9 root root   4096 3月  18 14:21 jdk
drwxr-xr-x  3 root root   4096 3月  18 14:21 lib
-rw-r--r--  1 root root   3860 3月  18 14:15 LICENSE.txt
drwxr-xr-x  2 root root   4096 3月  18 14:19 logs
drwxr-xr-x 60 root root   4096 3月  18 14:22 modules
-rw-r--r--  1 root root 545323 3月  18 14:19 NOTICE.txt
drwxr-xr-x  2 root root   4096 3月  18 14:19 plugins
-rw-r--r--  1 root root   7263 3月  18 14:14 README.asciidoc

//Create a data directory to store the data
[root@xing elasticsearch-7.12.0]# mkdir data

//Go to the config directory and edit the configuration file
[root@xing elasticsearch-7.12.0]# cd config/

[root@xing config]# ll
总用量 40
-rw-rw---- 1 root root  2739 3月  18 14:15 elasticsearch.yml
-rw-rw---- 1 root root  3110 3月  18 14:15 jvm.options
drwxr-x--- 2 root root  4096 3月  18 14:19 jvm.options.d
-rw-rw---- 1 root root 18612 3月  18 14:19 log4j2.properties
-rw-rw---- 1 root root   473 3月  18 14:19 role_mapping.yml
-rw-rw---- 1 root root   197 3月  18 14:19 roles.yml
-rw-rw---- 1 root root     0 3月  18 14:19 users
-rw-rw---- 1 root root     0 3月  18 14:19 users_roles

[root@xing config]# vim elasticsearch.yml

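In elasticsearch.yml, change the following settings:

//1. Set the cluster name
cluster.name: imooc-elasticsearch

//2. Set the node name
node.name: es-node1

//3. Set where index data is stored
path.data: /usr/local/elasticsearch-7.12.0/data

//4. Set where logs are stored
path.logs: /usr/local/elasticsearch-7.12.0/logs

//5. Set the address to bind to; 0.0.0.0 binds all network interfaces (this server's IP is 123.57.129.206)
network.host: 0.0.0.0

//6. Set the initial master-eligible nodes
cluster.initial_master_nodes: ["es-node1"]

Edit jvm.options, the JVM configuration file

//Set the minimum and maximum JVM heap size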
-Xms128m
-Xmx128m

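Note: Elasticsearch refuses to start as the root user, so a new user must be created to run it.

Create the user:

//Check the current user
[root@xing config]# whoami
root

//Create a user and give it ownership of the ES directory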
[root@xing config]# useradd esuser

[root@xing config]# pwd
/usr/local/elasticsearch-7.12.0/config

[root@xing config]# cd ..

//Grant ownership
[root@xing elasticsearch-7.12.0]# chown -R esuser /usr/local/elasticsearch-7.12.0/

[root@xing elasticsearch-7.12.0]# ll
总用量 580
drwxr-xr-x  2 esuser root   4096 3月  18 14:21 bin
drwxr-xr-x  3 esuser root   4096 4月  21 15:38 config
drwxr-xr-x  2 esuser root   4096 4月  21 13:58 data
drwxr-xr-x  9 esuser root   4096 3月  18 14:21 jdk
drwxr-xr-x  3 esuser root   4096 3月  18 14:21 lib
-rw-r--r--  1 esuser root   3860 3月  18 14:15 LICENSE.txt
drwxr-xr-x  2 esuser root   4096 3月  18 14:19 logs
drwxr-xr-x 60 esuser root   4096 3月  18 14:22 modules
-rw-r--r--  1 esuser root 545323 3月  18 14:19 NOTICE.txt
drwxr-xr-x  2 esuser root   4096 3月  18 14:19 plugins
-rw-r--r--  1 esuser root   7263 3月  18 14:14 README.asciidoc

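[root@xing elasticsearch-7.12.0]# clear

Go to the bin/ directory. Elasticsearch has to be started as esuser; the transcript first shows what happens when it is started as root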

[root@xing elasticsearch-7.12.0]# cd bin/
[root@xing bin]# ll
总用量 21112
-rwxr-xr-x 1 esuser root     2896 3月  18 14:15 elasticsearch
-rwxr-xr-x 1 esuser root      501 3月  18 14:19 elasticsearch-certgen
-rwxr-xr-x 1 esuser root      493 3月  18 14:19 elasticsearch-certutil
-rwxr-xr-x 1 esuser root      996 3月  18 14:15 elasticsearch-cli
-rwxr-xr-x 1 esuser root      443 3月  18 14:19 elasticsearch-croneval
-rwxr-xr-x 1 esuser root     4856 3月  18 14:15 elasticsearch-env
-rwxr-xr-x 1 esuser root     1828 3月  18 14:15 elasticsearch-env-from-file
-rwxr-xr-x 1 esuser root      184 3月  18 14:15 elasticsearch-keystore
-rwxr-xr-x 1 esuser root      450 3月  18 14:19 elasticsearch-migrate
-rwxr-xr-x 1 esuser root      126 3月  18 14:15 elasticsearch-node
-rwxr-xr-x 1 esuser root      172 3月  18 14:15 elasticsearch-plugin
-rwxr-xr-x 1 esuser root      441 3月  18 14:19 elasticsearch-saml-metadata
-rwxr-xr-x 1 esuser root      448 3月  18 14:19 elasticsearch-setup-passwords
-rwxr-xr-x 1 esuser root      118 3月  18 14:15 elasticsearch-shard
-rwxr-xr-x 1 esuser root      483 3月  18 14:19 elasticsearch-sql-cli
-rwxr-xr-x 1 esuser root 21529276 3月  18 14:19 elasticsearch-sql-cli-7.12.0.jar
-rwxr-xr-x 1 esuser root      436 3月  18 14:19 elasticsearch-syskeygen
-rwxr-xr-x 1 esuser root      436 3月  18 14:19 elasticsearch-users
-rwxr-xr-x 1 esuser root      356 3月  18 14:19 x-pack-env
-rwxr-xr-x 1 esuser root      364 3月  18 14:19 x-pack-security-env
-rwxr-xr-x 1 esuser root      363 3月  18 14:19 x-pack-watcher-env

[root@xing bin]# ./elasticsearch
warning: usage of JAVA_HOME is deprecated, use ES_JAVA_HOME
Future versions of Elasticsearch will require Java 11; your Java version from [/www/wwwroot/xyl/jdk/java-se-8u41-ri/jre] does not meet this requirement. Consider switching to a distribution of Elasticsearch with a bundled JDK. If you are already using a distribution with a bundled JDK, ensure the JAVA_HOME environment variable is not set.
warning: usage of JAVA_HOME is deprecated, use ES_JAVA_HOME
Future versions of Elasticsearch will require Java 11; your Java version from [/www/wwwroot/xyl/jdk/java-se-8u41-ri/jre] does not meet this requirement. Consider switching to a distribution of Elasticsearch with a bundled JDK. If you are already using a distribution with a bundled JDK, ensure the JAVA_HOME environment variable is not set.
[2021-04-21T16:20:41,307][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [es-node1] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.RuntimeException: can not run elasticsearch as root

Note: starting as root throws an exception: RuntimeException: can not run elasticsearch as root

//Switch to esuser
[root@xing bin]# su esuser

[esuser@xing bin]$ ./elasticsearch
warning: usage of JAVA_HOME is deprecated, use ES_JAVA_HOME
Future versions of Elasticsearch will require Java 11; your Java version from [/www/wwwroot/xyl/jdk/java-se-8u41-ri/jre] does not meet this requirement. Consider switching to a distribution of Elasticsearch with a bundled JDK. If you are already using a distribution with a bundled JDK, ensure the JAVA_HOME environment variable is not set.
warning: usage of JAVA_HOME is deprecated, use ES_JAVA_HOME
Future versions of Elasticsearch will require Java 11; your Java version from [/www/wwwroot/xyl/jdk/java-se-8u41-ri/jre] does not meet this requirement. Consider switching to a distribution of Elasticsearch with a bundled JDK. If you are already using a distribution with a bundled JDK, ensure the JAVA_HOME environment variable is not set.
Exception in thread "main" java.nio.file.AccessDeniedException: /usr/local/elasticsearch-7.12.0/config/elasticsearch.keystore

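Note: the error here is java.nio.file.AccessDeniedException: /usr/local/elasticsearch-7.12.0/config/elasticsearch.keystore. The keystore was most likely created by the failed start as root, after the first chown, so it still belongs to root. Switch back to root and grant ownership again.

[esuser@xing bin]$ su root
Password: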
[root@xing bin]# pwd
/usr/local/elasticsearch-7.12.0/bin
[root@xing bin]# cd ..
[root@xing elasticsearch-7.12.0]# pwd
/usr/local/elasticsearch-7.12.0
[root@xing elasticsearch-7.12.0]# chown -R esuser:esuser /usr/local/elasticsearch-7.12.0
[root@xing elasticsearch-7.12.0]# ll
总用量 580
drwxr-xr-x  2 esuser esuser   4096 3月  18 14:21 bin
drwxr-xr-x  3 esuser esuser   4096 4月  21 16:20 config
drwxr-xr-x  2 esuser esuser   4096 4月  21 13:58 data
drwxr-xr-x  9 esuser esuser   4096 3月  18 14:21 jdk
drwxr-xr-x  3 esuser esuser   4096 3月  18 14:21 lib
-rw-r--r--  1 esuser esuser   3860 3月  18 14:15 LICENSE.txt
drwxr-xr-x  2 esuser esuser   4096 4月  21 16:20 logs
drwxr-xr-x 60 esuser esuser   4096 3月  18 14:22 modules
-rw-r--r--  1 esuser esuser 545323 3月  18 14:19 NOTICE.txt
drwxr-xr-x  2 esuser esuser   4096 3月  18 14:19 plugins
-rw-r--r--  1 esuser esuser   7263 3月  18 14:14 README.asciidoc

Switch to esuser again and start elasticsearch

[root@xing elasticsearch-7.12.0]# su esuser
[esuser@xing elasticsearch-7.12.0]$ ll
总用量 580
drwxr-xr-x  2 esuser esuser   4096 3月  18 14:21 bin
drwxr-xr-x  3 esuser esuser   4096 4月  21 16:20 config
drwxr-xr-x  2 esuser esuser   4096 4月  21 13:58 data
drwxr-xr-x  9 esuser esuser   4096 3月  18 14:21 jdk
drwxr-xr-x  3 esuser esuser   4096 3月  18 14:21 lib
-rw-r--r--  1 esuser esuser   3860 3月  18 14:15 LICENSE.txt
drwxr-xr-x  2 esuser esuser   4096 4月  21 16:20 logs
drwxr-xr-x 60 esuser esuser   4096 3月  18 14:22 modules
-rw-r--r--  1 esuser esuser 545323 3月  18 14:19 NOTICE.txt
drwxr-xr-x  2 esuser esuser   4096 3月  18 14:19 plugins
-rw-r--r--  1 esuser esuser   7263 3月  18 14:14 README.asciidoc
[esuser@xing elasticsearch-7.12.0]$ cd bin/
[esuser@xing bin]$ ./elasticsearch
warning: usage of JAVA_HOME is deprecated, use ES_JAVA_HOME
Future versions of Elasticsearch will require Java 11; your Java version from [/www/wwwroot/xyl/jdk/java-se-8u41-ri/jre] does not meet this requirement. Consider switching to a distribution of Elasticsearch with a bundled JDK. If you are already using a distribution with a bundled JDK, ensure the JAVA_HOME environment variable is not set.
warning: usage of JAVA_HOME is deprecated, use ES_JAVA_HOME
Future versions of Elasticsearch will require Java 11; your Java version from [/www/wwwroot/xyl/jdk/java-se-8u41-ri/jre] does not meet this requirement. Consider switching to a distribution of Elasticsearch with a bundled JDK. If you are already using a distribution with a bundled JDK, ensure the JAVA_HOME environment variable is not set.
[2021-04-21T16:31:03,715][INFO ][o.e.n.Node               ] [es-node1] version[7.12.0], pid[19550], build[default/tar/78722783c38caa25a70982b5b042074cde5d3b3a/2021-03-18T06:17:15.410153305Z], OS[Linux/3.10.0-1160.11.1.el7.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_41/25.40-b25]
[2021-04-21T16:31:03,716][INFO ][o.e.n.Node               ] [es-node1] JVM home [/www/wwwroot/xyl/jdk/java-se-8u41-ri/jre], using bundled JDK [false]
[2021-04-21T16:31:03,716][INFO ][o.e.n.Node               ] [es-node1] JVM arguments [-Xshare:auto, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dio.netty.allocator.numDirectArenas=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.locale.providers=SPI,JRE, -Xms128m, -Xmx128m, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Djava.io.tmpdir=/tmp/elasticsearch-4313313928995394595, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -Xloggc:logs/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=32, -XX:GCLogFileSize=64m, -XX:MaxDirectMemorySize=67108864, -Des.path.home=/usr/local/elasticsearch-7.12.0, -Des.path.conf=/usr/local/elasticsearch-7.12.0/config, -Des.distribution.flavor=default, -Des.distribution.type=tar, -Des.bundled_jdk=true]

The following error is reported:

ERROR: [1] bootstrap checks failed. You must address the points described in the following [1] lines before starting Elasticsearch.

bootstrap check failure [1] of [1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

ERROR: Elasticsearch did not exit normally - check the logs at /usr/local/elasticsearch-7.12.0/logs/imooc-elasticsearch.log

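Fix this as root in two files. First, raise the open-file and process limits in /etc/security/limits.conf; the vm.max_map_count value named in the error is set in /etc/sysctl.conf afterwards.

[root@xing bin]# vim /etc/security/limits.conf

//Append the following settings to the end of the file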
* soft nofile 65535
* hard nofile 131072
* soft nproc 2048
* hard nproc 4096

//Save and quit
:wq!
//Quit without saving
:q!

Edit /etc/sysctl.conf:

[root@xing bin]# vim /etc/sysctl.conf

//Add the following setting
vm.max_map_count=262145

//Reload the configuration so that it takes effect
[root@xing bin]# sysctl -p
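
To confirm that the new value is active before starting Elasticsearch again, the current kernel setting can be compared with the minimum named in the bootstrap check. The following is a minimal sketch with made-up helper names; it assumes a Linux host where the value is exposed at /proc/sys/vm/max_map_count.

```python
from pathlib import Path

# Minimum quoted in the bootstrap check error message.
REQUIRED_MAX_MAP_COUNT = 262144


def max_map_count_ok(current: int, required: int = REQUIRED_MAX_MAP_COUNT) -> bool:
    """True when the current value satisfies the bootstrap check."""
    return current >= required


def read_max_map_count(path: str = "/proc/sys/vm/max_map_count"):
    """Return the kernel's current value, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    value = read_max_map_count()
    if value is None:
        print("could not read vm.max_map_count on this system")
    elif max_map_count_ok(value):
        print(f"vm.max_map_count = {value}: high enough")
    else:
        print(f"vm.max_map_count = {value}: increase to at least {REQUIRED_MAX_MAP_COUNT}")
```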

Then switch back to esuser and start elasticsearch again

[esuser@xing bin]$ ./elasticsearch

//You should see output like the following
[2021-04-21T16:58:53,139][INFO ][o.e.t.TransportService   ] [es-node1] publish_address {172.30.96.138:9300}, bound_addresses {[::]:9300}

[2021-04-21T16:58:54,710][INFO ][o.e.h.AbstractHttpServerTransport] [es-node1] publish_address {172.30.96.138:9200}, bound_addresses {[::]:9200}

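To stop Elasticsearch, press Ctrl+C.

Note: port 9200 must be open on the server so that Elasticsearch can be reached at <server IP>:9200. If it cannot be reached, check that the firewall port is open and that a JDK is available.

Example of checking the firewall ports:

//List the ports open in the firewall
firewall-cmd --list-all

//Syntax: firewall-cmd --zone=public --add-port=<port>/tcp --permanent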
[root@xing ~]# firewall-cmd --zone=public --add-port=9200/tcp --permanent
success
[root@xing ~]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: eth0
  sources: 
  services: dhcpv6-client ssh
  ports: 20/tcp 21/tcp 22/tcp 80/tcp 8888/tcp 39000-40000/tcp 3306/tcp
  protocols: 
  masquerade: no
  forward-ports: 
  source-ports: 
  icmp-blocks: 
  rich rules: 
  //9200 is not listed yet: a --permanent rule only takes effect after the firewall is reloaded
firewall-cmd --reload
[root@xing ~]# firewall-cmd --reload
success
[root@xing ~]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: eth0
  sources: 
  services: dhcpv6-client ssh
  ports: 20/tcp 21/tcp 22/tcp 80/tcp 8888/tcp 39000-40000/tcp 3306/tcp 9200/tcp
  protocols: 
  masquerade: no
  forward-ports: 
  source-ports: 
  icmp-blocks: 
  rich rules: 
	
[root@xing ~]#

Open http://123.57.129.206:9200/ in a browser on your own computer to check that it can be reached.

10. The elasticsearch-head Plugin

GitHub: https://github.com/mobz/elasticsearch-head

Running with built in server

git clone https://github.com/mobz/elasticsearch-head.git
cd elasticsearch-head
npm install
npm run start
open http://localhost:9100/

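If npm fails with the following error:

Failed at the phantomjs-prebuilt@2.1.16 install script.

you can work around it by running npm install phantomjs-prebuilt@2.1.16 --ignore-scripts

When elasticsearch-head then tries to connect to Elasticsearch, the connection fails. The cause is the browser's cross-origin (CORS) restriction, and the fix is to edit elasticsearch.yml

//Add the following under the Network section
//Enable cross-origin requests
http.cors.enabled: true
//Origins allowed to make cross-origin requests; * means all
http.cors.allow-origin: "*"

Restart Elasticsearch, and the connection succeeds.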

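11. Cluster Health

Send a GET request to http://123.57.129.206:9200/_cluster/health/ to check the health of the cluster.
To delete an index, send a DELETE request (for example from Postman) to http://123.57.129.206:9200/<index name>.
For example, DELETE http://123.57.129.206:9200/index-demo; a response containing "acknowledged": true means the index was deleted.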
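
The health check can also be scripted. The sketch below uses only the Python standard library. The two function names are made up for illustration, and the sample payload is hand-written rather than captured from a real cluster. The status meanings are the usual green / yellow / red semantics of the cluster health API.

```python
import json
import urllib.request


def fetch_cluster_health(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/_cluster/health and decode the JSON response."""
    url = f"{base_url.rstrip('/')}/_cluster/health"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def describe_health(health: dict) -> str:
    """Summarize a cluster health response in one line."""
    meanings = {
        "green": "all primary and replica shards are allocated",
        "yellow": "all primary shards are allocated, but some replicas are not",
        "red": "at least one primary shard is not allocated",
    }
    status = health.get("status", "unknown")
    name = health.get("cluster_name", "?")
    return f"{name}: {status} ({meanings.get(status, 'unrecognized status')})"


# Hand-written sample in the shape of a health response (not real output).
sample = {"cluster_name": "imooc-elasticsearch", "status": "yellow", "number_of_nodes": 1}
print(describe_health(sample))

# Against a running node (address taken from this article):
# print(describe_health(fetch_cluster_health("http://123.57.129.206:9200")))
```

A single-node cluster typically reports yellow as soon as an index asks for replicas, because a replica cannot be placed on the same node as its primary.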

Creating an index: send the request from Postman

Listing indices

GET _cat/indices?v

Deleting an index

DELETE /index_test

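12. Index Mappings

0. Indexing and analysis concepts

index: defaults to true. If it is set to false, the field is not indexed and therefore cannot be searched.

Quoting the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-index.html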

index

The index option controls whether field values are indexed. It accepts true or false and defaults to true. Fields that are not indexed are not queryable.

1. Creating mappings together with the index

PUT /index_str 

{ 
  "mappings": { 
      "properties": { 
        "realname": { 
          "type": "text", 
          "index": true 
        },
        "username": { 
          "type": "keyword", 
          "index": false 
        } 
      } 
  } 
}

2. Viewing the analysis result

GET /index_str/_analyze

{ 
    "field": "realname", 
    "text": "imooc is good" 
} 


3. Trying to change the type of an existing field fails with an error
POST /index_str/_mapping
{
  "properties": {
    "realname": {
      "type": "long"
    }
  }
}

4. Adding mappings to an existing index

POST /index_str/_mapping

{
  "properties": {
    "id": {
      "type": "long"
    },
    "age": {
      "type": "integer"
    }
  }
}
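
The rule behind steps 3 and 4 is that new fields can be added to an existing mapping, but the type of a field that already exists cannot be changed. The sketch below is a toy model of that rule and not Elasticsearch's actual merge logic, which rejects further kinds of change as well. `merge_mapping` is a made-up name, and the field names come from the index_str example.

```python
def merge_mapping(existing: dict, update: dict) -> dict:
    """Return the merged properties, refusing to change an existing field's type."""
    merged = dict(existing)
    for field, definition in update.items():
        current = existing.get(field)
        if current is not None and current.get("type") != definition.get("type"):
            raise ValueError(
                f"cannot change field [{field}] from type "
                f"[{current.get('type')}] to [{definition.get('type')}]"
            )
        merged[field] = definition
    return merged


# Properties of index_str as created in step 1.
properties = {
    "realname": {"type": "text", "index": True},
    "username": {"type": "keyword", "index": False},
}

# Step 4: adding new fields is accepted.
properties = merge_mapping(properties, {"id": {"type": "long"}, "age": {"type": "integer"}})
print(sorted(properties))

# Step 3: changing the type of an existing field is rejected.
try:
    merge_mapping(properties, {"realname": {"type": "long"}})
except ValueError as error:
    print(error)
```

When a field really needs a different type, the usual route is to create a new index with the desired mapping and copy the data into it.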

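13. Basic Document Operations

1. Adding documents

POST /my_doc/_doc/1 -> {index name}/_doc/{document ID} (the ID is the document's _id inside ES, not the id field of the record itself, which may come from a database, for example)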

{ 
    "id": 1001, 
    "name": "imooc-1", 
    "desc": "imooc is very good, 慕课网非常牛！", 
    "create_date": "2019-12-24" 
}

{ 
    "id": 1002, 
    "name": "imooc-2", 
    "desc": "imooc is fashion, 慕课网非常时尚！", 
    "create_date": "2019-12-25" 
}

{ 
    "id": 1003, 
    "name": "imooc-3", 
    "desc": "imooc is niubility, 慕课网很好很强大！", 
    "create_date": "2019-12-26" 
}

{ 
    "id": 1004, 
    "name": "imooc-4", 
    "desc": "imooc is good~！", 
    "create_date": "2019-12-27" 
}

{ 
  "id": 1005, 
  "name": "imooc-5", 
  "desc": "慕课网 is 强大！", 
  "create_date": "2019-12-28" 
}

{ 
    "id": 1006, 
    "name": "imooc-6", 
    "desc": "慕课是一个强大网站！", 
    "create_date": "2019-12-29" 
}

{ 
    "id": 1007, 
    "name": "imooc-7", 
    "desc": "慕课网是很牛网站！", 
    "create_date": "2019-12-30" 
}

{ 
    "id": 1008, 
    "name": "imooc-8", 
    "desc": "慕课网是很好看！", 
    "create_date": "2019-12-31"
}
{ 
    "id": 1009, 
    "name": "imooc-9", 
    "desc": "在慕课网学习很久了", 
    "create_date": "2020-01-01"
}

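2. Deleting a document

DELETE /my_doc/_doc/1

Note: a document is not removed from disk the moment it is deleted. It is only marked as deleted and stays on disk; as the index keeps growing, Elasticsearch cleans up the documents marked as deleted and removes them from disk.

3. Updating a document

Partial update:

POST /my_doc/_update/1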

{ 
  "doc": { 
  "name": "慕课" 
  } 
}

Full replacement:

PUT /my_doc/_doc/1 

{ 
  "id": 1001, 
  "name": "imooc-1", 
  "desc": "imooc is very good, 慕课网非常牛！", 
  "create_date": "2019-12-24" 
}

Note: _version changes after every modification.
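
The difference between the two kinds of update can be modelled on a plain dictionary. This is a simplified sketch with made-up function names: it merges only top-level fields and always bumps the version, whereas Elasticsearch also merges nested objects and can skip an update that changes nothing.

```python
import copy


def partial_update(stored: dict, doc: dict) -> dict:
    """Merge the given fields into the existing _source (top-level fields only)."""
    source = copy.deepcopy(stored["_source"])
    source.update(doc)
    return {"_version": stored["_version"] + 1, "_source": source}


def full_replace(stored: dict, new_source: dict) -> dict:
    """Discard the old _source and store the new one as a whole."""
    return {"_version": stored["_version"] + 1, "_source": copy.deepcopy(new_source)}


stored = {
    "_version": 1,
    "_source": {"id": 1001, "name": "imooc-1", "create_date": "2019-12-24"},
}

patched = partial_update(stored, {"name": "慕课"})
print(patched)  # id and create_date are kept, name is changed

replaced = full_replace(patched, {"name": "imooc-1"})
print(replaced)  # only the fields sent in the request remain
```

With a full replacement, any field left out of the request body is lost, so the whole document has to be sent every time.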

4. Querying documents

Regular queries

GET /my_doc/_doc/1

GET /my_doc/_search

Query result

{
    "_index": "my_doc",
    "_type": "_doc",
    "_id": "1",
    "_version": 4,
    "_seq_no": 10,
    "_primary_term": 1,
    "found": true,
    "_source": {
        "id": 1,
        "name": "imooc-1",
        "desc": "我爱学习",
        "create_date": "2020-01-02"
    }
}

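Metadata

_index: the index the document belongs to; think of it as a table in a database.
_type: the type the document belongs to; newer versions use _doc.
_id: the unique identifier of the document, similar to the primary key of a database table. It can be generated automatically or specified manually.
_score: the relevance of the document to the query; the higher the score, the better the document matches what the user searched for.
_version: the version number.
_source: the document data, in JSON format.

Customizing the result set

GET /my_doc/_doc/1?_source=id,name

GET /my_doc/_search?_source=id,name

Checking whether a document exists

HEAD /my_doc/_doc/1

Note: if the document exists, the response status code is 200; if it is not found, the status code is 404.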

14. Optimistic Concurrency Control for Documents: if_seq_no and if_primary_term

1. Observing the behavior

Insert new data

POST /my_doc/_doc 

{ "id": 1010, 
  "name": "imooc-1010", 
  "desc": "imoocimooc！", 
  "create_date": "2019-12-24" 
}
# _version is now 1

Update the data

POST /my_doc/_update/{_id}

{ 
  "doc": { 
  		"name": "慕课" 
  } 
}
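# _version is now 2

Simulate two clients operating on the same document, both sending the same _seq_no and _primary_term values. The first request succeeds and changes the document's _seq_no, so the second request is rejected with a version conflict.

# Operation 1
POST /my_doc/_update/{_id}?if_seq_no={value}&if_primary_term={value}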
{ 
	"doc": { 
		"name": "慕课1" 
	} 
}


# Operation 2
POST /my_doc/_update/{_id}?if_seq_no={value}&if_primary_term={value}
{ 
	"doc": { 
		"name": "慕课2" 
	} 
}

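2. Version metadata

_seq_no: the sequence number of the document; it serves the same purpose as _version (think of it as a student number: letting the homeroom teacher of each class assign numbers to the students is more efficient and more convenient than having the school's registrar assign them)
_primary_term: the term of the primary shard that handled the write; it is incremented each time a new primary is promoted (in the analogy, it corresponds to the class)

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/optimistic-concurrency-control.html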
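
The compare-and-set behavior of if_seq_no and if_primary_term can be simulated in memory. The sketch below is a toy model under loud assumptions: the class and exception names are made up, and the sequence number is kept per document and grows by one per write, whereas in Elasticsearch sequence numbers are assigned per shard.

```python
class VersionConflict(Exception):
    """Raised when the caller's if_seq_no / if_primary_term are out of date."""


class StoredDocument:
    def __init__(self, source: dict):
        self.source = dict(source)
        self.seq_no = 0        # toy counter, one per write to this document
        self.primary_term = 1  # stays at 1: no primary failover is simulated
        self.version = 1

    def update(self, doc: dict, if_seq_no: int, if_primary_term: int) -> None:
        """Apply the update only if the caller saw the latest state."""
        if (if_seq_no, if_primary_term) != (self.seq_no, self.primary_term):
            raise VersionConflict(
                f"expected seq_no={if_seq_no}, primary_term={if_primary_term}; "
                f"current is seq_no={self.seq_no}, primary_term={self.primary_term}"
            )
        self.source.update(doc)
        self.seq_no += 1
        self.version += 1


document = StoredDocument({"id": 1010, "name": "imooc-1010"})

# Both clients read the document and remember the same (seq_no, primary_term).
seen_by_client_1 = (document.seq_no, document.primary_term)
seen_by_client_2 = (document.seq_no, document.primary_term)

document.update({"name": "慕课1"}, *seen_by_client_1)  # first write succeeds
try:
    document.update({"name": "慕课2"}, *seen_by_client_2)  # stale values
except VersionConflict as conflict:
    print("client 2 rejected:", conflict)

# Client 2 reads the document again and retries with the fresh values.
document.update({"name": "慕课2"}, document.seq_no, document.primary_term)
print(document.source["name"], document.version)
```

The losing client does not overwrite the other client's change silently; it has to read the document again and decide whether its update still applies.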

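15. Analysis and Built-in Analyzers

What is analysis?

Analysis is the process of converting text into individual words (tokens). By default, ES only tokenizes English text in a meaningful way. Chinese is not supported out of the box, and every Chinese character is split off as a separate token.

English example: I study in imooc.com
Chinese example: 我在慕课网学习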

POST /_analyze 
{ 
  "analyzer": "standard", 
  "text": "text文本" 
}

POST /my_doc/_analyze 
{ 
  "analyzer": "standard", 
  "field": "name", 
  "text": "text文本" 
}

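Built-in ES analyzers

standard: the default analyzer; splits the text into words and converts uppercase to lowercase.
simple: splits on non-letter characters and converts uppercase to lowercase.
whitespace: splits on whitespace and leaves the case unchanged.
stop: removes words that carry little meaning, such as the / a / an / is ...
keyword: performs no tokenization; the whole text is treated as a single keyword.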

{ 
	"analyzer": "standard", 
	"text": "My name is Peter Parker,I am a Super Hero. I don't like the Criminals." 
}
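
The behavior of four of these analyzers can be approximated in a few lines. The functions below are rough imitations written for illustration, not the Elasticsearch implementations: they handle ASCII letters only, and the stop-word set holds just the examples named above. The standard analyzer is left out because its word-boundary rules are more involved.

```python
import re

# Only the examples listed above; the real English stop-word list is longer.
STOP_WORDS = {"a", "an", "the", "is"}


def whitespace_analyzer(text):
    """Split on whitespace and leave the case unchanged."""
    return text.split()


def simple_analyzer(text):
    """Split on anything that is not a letter and lowercase the tokens."""
    return [token.lower() for token in re.findall(r"[A-Za-z]+", text)]


def stop_analyzer(text):
    """Like simple_analyzer, but drop the stop words."""
    return [token for token in simple_analyzer(text) if token not in STOP_WORDS]


def keyword_analyzer(text):
    """No tokenization: the whole text is a single token."""
    return [text]


text = "My name is Peter Parker,I am a Super Hero. I don't like the Criminals."
for analyzer in (whitespace_analyzer, simple_analyzer, stop_analyzer, keyword_analyzer):
    print(f"{analyzer.__name__}: {analyzer(text)}")
```

Running the same sentence through POST /_analyze with each analyzer name gives the real token lists to compare against.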

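16. Setting Up the IK Chinese Analyzer

The IK Chinese analyzer

GitHub: https://github.com/medcl/elasticsearch-analysis-ik

Download the release from the project page, upload it to /home/software on the server, extract it with unzip, and then restart Elasticsearch.

Extracting with unzip: unzip xxx.zip -d <target directory>/plugins/ik

[root@xing software]# unzip elasticsearch-analysis-ik-7.12.0.zip -d /usr/local/elasticsearch-7.12.0/plugins/ik

Note: the version of the IK analyzer should match the version of Elasticsearch.

IK provides two analyzers: ik_max_word and ik_smart

Testing Chinese analysis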

POST /_analyze 
{ 
  "analyzer": "ik_max_word", 
  "text": "上下班车流量很大" 
}

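What is the difference between ik_max_word and ik_smart?

ik_max_word: splits the text at the finest granularity. For example, it splits “中华人民共和国国歌” into “中华人民共和国,中华人民,中华,华人,人民共和国,人民,人,民,共和国,共和,和,国国,国歌”, exhausting all possible combinations; suitable for Term Query.
ik_smart: splits the text at the coarsest granularity. For example, it splits “中华人民共和国国歌” into “中华人民共和国,国歌”; suitable for Phrase queries.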
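
The idea of fine-grained versus coarse-grained splitting can be illustrated with a tiny dictionary-based segmenter. This is not the IK algorithm: the dictionary is hand-picked for the example sentence, `all_matches` simply lists every dictionary word found at any position, and `longest_match` is plain forward maximum matching.

```python
# Hand-picked dictionary for this one example (an assumption, not IK's word list).
DICTIONARY = {
    "中华人民共和国", "中华人民", "中华", "华人",
    "人民共和国", "人民", "共和国", "共和", "国歌",
}


def all_matches(text, dictionary):
    """Fine-grained: every dictionary word found at any position, longest first."""
    found = []
    for start in range(len(text)):
        for end in range(len(text), start, -1):
            if text[start:end] in dictionary:
                found.append(text[start:end])
    return found


def longest_match(text, dictionary):
    """Coarse-grained: take the longest dictionary word at each position."""
    tokens = []
    start = 0
    while start < len(text):
        for end in range(len(text), start, -1):
            # Fall back to a single character when no dictionary word matches.
            if text[start:end] in dictionary or end == start + 1:
                tokens.append(text[start:end])
                start = end
                break
    return tokens


sentence = "中华人民共和国国歌"
print(all_matches(sentence, DICTIONARY))
print(longest_match(sentence, DICTIONARY))
```

With this dictionary the coarse variant yields the same two tokens quoted for ik_smart, while the fine variant yields a subset of the ik_max_word list, since it never emits words that are missing from the dictionary.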

17. Custom Chinese Dictionary

1. Under {es}/plugins/ik/config, create:

vim custom.dic

2. Add the following content:

慕课网 

骚年

3. Configure the custom extension dictionary

As root, open plugins/ik/config/IKAnalyzer.cfg.xml and set the location of the custom dictionary file

<entry key="ext_dict">custom.dic</entry>

4. Restart Elasticsearch
