dsg_19 elasticsearch

Author: 码仆的逆袭

Last modified: 2022-01-27 21:46:12

Category: Database. Tags: elasticsearch, search engine, lucene

First published: 2021-03-28 15:37:20

Article link: https://blog.csdn.net/qq_34191426/article/details/113620012

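Characteristics of retrieval in ordinary databases (oracle, mysql)

Full table scans: when the table holds many rows, retrieval becomes very slow
No tokenized search on the search terms: for example, searching for 生化机 cannot find 生化危机
Indexes do not help fuzzy queries (for example, a LIKE pattern with a leading wildcard cannot use a B-tree index)
Forward index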

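Inverted index

Inverted index: records, for each term, the IDs of the documents that contain it, how often the term occurs, the term's positions within each document, and similar information

Document ID: used to fetch the document quickly
Term frequency: the number of times the term appears in the document, used for scoring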
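
The structure described above can be sketched in a few lines of Python. This is only an illustration: it splits text on whitespace, whereas Lucene uses configurable analyzers, and the sample documents and the function names `build_inverted_index` and `search` are made up for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: [positions of the term in that document]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Naive tokenization: lowercase and split on whitespace.
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return index


def search(index, term):
    """Return (doc_id, term_frequency) pairs, highest frequency first."""
    postings = index.get(term.lower(), {})
    hits = [(doc_id, len(positions)) for doc_id, positions in postings.items()]
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical sample documents, keyed by document ID.
docs = {
    1: "resident evil is a horror game",
    2: "evil genius is a strategy game about an evil mastermind",
    3: "a cooking game",
}

index = build_inverted_index(docs)
print(search(index, "evil"))  # documents containing "evil", ranked by frequency
print(index["game"])          # positions of "game" in each document
```

Looking up a term is a single dictionary access, no matter how many documents exist. That is the reason an inverted index avoids the full scan described earlier.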

lucene

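Lucene is a jar package that bundles ready-made code for building inverted indexes and for searching, including the various algorithms involved. When developing in Java, we add lucene.jar to the project and program against the Lucene API. With Lucene we can index our existing data, and Lucene organizes the index data structures on the local disk for us.

Advantages of Lucene:
- Easy to extend
- High performance (based on the inverted index)
Disadvantages of Lucene:
- Limited to Java development
- Steep learning curve
- No support for horizontal scaling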

elasticsearch

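elasticsearch wraps Lucene and adds clustering. Compared with Lucene, it has the following characteristics:

Distributed, can scale horizontally
Provides a Restful interface that can be called from any language

Search engine technology ranking:

Elasticsearch: open-source distributed search engine
Splunk: commercial product
Solr: Apache's open-source search engine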

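Core elasticsearch concepts compared with a traditional database

MySQL	Elasticsearch	Description
Table	Index	An index (`index`) is a collection of documents, similar to a database table (`table`)
Row	Document	A document (`Document`) is a single record, similar to a row (`Row`) in a database; every document is in `JSON` format
Column	Field	A field (`Field`) is a field of the `JSON` document, similar to a column (`Column`) in a database
Schema	Mapping	A `Mapping` is the set of constraints on the documents in an index, such as field types; similar to a database table structure (`Schema`)
SQL	DSL	`DSL` is the `JSON`-style request language provided by `elasticsearch`, used to operate `elasticsearch` and perform `CRUD`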
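
To make the table concrete, here is a small sketch of what a mapping, a document, and a DSL query look like as JSON bodies. The index name `products`, the field names, and the helper `unmapped_fields` are hypothetical and chosen only for illustration. The `text`, `keyword`, and `float` field types and the `match` query are standard elasticsearch features.

```python
import json

# Hypothetical index name (the counterpart of a table name).
index_name = "products"

# Mapping: the counterpart of a table schema (field names and types).
mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},     # analyzed for full-text search
            "brand": {"type": "keyword"},  # stored as an exact value
            "price": {"type": "float"},
        }
    }
}

# Document: one JSON object, the counterpart of a row.
document = {"title": "Resident Evil", "brand": "Capcom", "price": 39.9}

# DSL: a JSON query, the counterpart of a SQL SELECT with a WHERE clause.
query = {"query": {"match": {"title": "evil"}}}


def unmapped_fields(doc, mapping_body):
    """Return the document fields that have no entry in the mapping."""
    known = mapping_body["mappings"]["properties"]
    return sorted(field for field in doc if field not in known)


print(json.dumps(query))
print(unmapped_fields(document, mapping))
```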

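elasticsearch compared with ordinary relational databases

MySQL: good at transactional operations, can guarantee data safety and consistency
Elasticsearch: good at searching, analyzing, and computing over massive amounts of data

Installing elasticsearch

elasticsearch download page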

#1. Download the elasticsearch-7.10.2-linux-x86_64.tar.gz package, upload it to the linux host, and extract it
[root@Centos101 elasticsearch]# tar -zvxf elasticsearch-7.10.2-linux-x86_64.tar.gz

#2. Create the data and logs directories
[root@Centos101 elasticsearch]# mkdir data
[root@Centos101 elasticsearch]# mkdir logs

#3. Edit the configuration file
[root@Centos101 config]# vi elasticsearch.yml
#Uncomment the following properties and change their values
#3-1. Cluster name
cluster.name: my-application
#3-2. Node name
node.name: node-1
#3-3. Directories for data and logs
path.data: /software/elasticsearch/data
path.logs: /software/elasticsearch/logs
#3-4. Bind address; with 0.0.0.0 the node can be reached from any machine
network.host: 0.0.0.0
#3-5. HTTP port
http.port: 9200
#3-6. Node names of the initial master-eligible nodes used to bootstrap the cluster
cluster.initial_master_nodes: ["node-1"]
#3-7. CORS configuration
http.cors.enabled: true
http.cors.allow-origin: "*"
http.max_content_length: 200mb
#3-8. Seed hosts for node discovery
discovery.seed_hosts: ["192.168.113.100:9300","192.168.113.101:9300","192.168.113.102:9300"]
gateway.recover_after_nodes: 2
network.tcp.keep_alive: true
network.tcp.no_delay: true
transport.tcp.compress: true
#3-9. Number of shard rebalancing tasks that may run concurrently in the cluster, default 2
cluster.routing.allocation.cluster_concurrent_rebalance: 16
#3-10. Number of concurrent shard recoveries per node when nodes are added or removed or shards are rebalanced, default 2
cluster.routing.allocation.node_concurrent_recoveries: 16
#3-11. Number of concurrent primary shard recoveries per node during initial recovery, default 4
cluster.routing.allocation.node_initial_primaries_recoveries: 16

#4. Start
[root@Centos101 bin]# ./elasticsearch

#5. Errors
#a.
Exception in thread "main" java.lang.RuntimeException: don't run elasticsearch as root.
#Create a user, give the new user ownership of the es directory, then switch to that user and run es as that user
[root@Centos101 config]# adduser es
[root@Centos101 software]# chown -R es:es elasticsearch

#b.
Caused by: org.elasticsearch.ElasticsearchException: X-Pack is not supported and Machine Learning is not available for [linux-x86]; you can use the other X-Pack features (unsupported) by setting xpack.ml.enabled: false in elasticsearch.yml
#Edit the configuration file and set the xpack.ml.enabled property to false
[root@Centos101 config]# vi elasticsearch.yml
xpack.ml.enabled: false

#c.
ERROR: [4] bootstrap checks failed
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65535]
#Fix: edit the system file limits.conf and append the lines below; the value to use is given by "increase to at least [65535]"
[root@Centos101 config]# vi /etc/security/limits.conf
es soft nofile 65536
es hard nofile 65536

#d.
[2]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
#Fix: edit the system file sysctl.conf and append the line below; the value to use is given by "increase to at least [262144]"
[root@Centos101 bin]# vi /etc/sysctl.conf
vm.max_map_count = 262144
[root@Centos101 bin]# sysctl -p
vm.max_map_count = 262144

#e.
[3]: JVM is using the client VM [Java HotSpot(TM) Client VM] but should be using a server VM for the best performance
#Fix: edit the file /jre/lib/i386/jvm.cfg inside the jre and move -server KNOWN above -client IF_SERVER_CLASS -server
[root@Centos101 i386]# vi jvm.cfg
-server KNOWN
-client IF_SERVER_CLASS -server
-minimal KNOWN

#f.
[4]: system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
#Fix: edit elasticsearch.yml, set bootstrap.memory_lock to false, and on the next line add bootstrap.system_call_filter set to false
[root@Centos101 config]# vi elasticsearch.yml
bootstrap.memory_lock: false
bootstrap.system_call_filter: false

#g.
[5]: max number of threads [3853] for user [se] is too low, increase to at least [4096]
#Fix: edit the system file limits.conf and append the lines below; the value to use is given by "increase to at least [4096]"
[root@Centos101 config]# vi /etc/security/limits.conf
es soft nproc 4096
es hard nproc 4096

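#6. Reload the system configuration (the limits.conf changes apply the next time the es user logs in)
[root@Centos101 config]# sysctl -p

#7. Start as the es user
[es@Centos101 bin]$ ./elasticsearch &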
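
The three numeric bootstrap checks above can be verified before starting es. The sketch below is not part of elasticsearch. The function names are hypothetical, the thresholds are copied from the error messages above, and `read_limits` assumes a Linux host where `/proc/sys/vm/max_map_count` and the Python `resource` module are available.

```python
# Thresholds quoted from the bootstrap-check error messages.
MIN_NOFILE = 65535
MIN_MAX_MAP_COUNT = 262144
MIN_NPROC = 4096


def failed_checks(nofile, max_map_count, nproc):
    """Return one message for every limit that is below its threshold."""
    problems = []
    if nofile < MIN_NOFILE:
        problems.append(f"max file descriptors [{nofile}] is below {MIN_NOFILE}")
    if max_map_count < MIN_MAX_MAP_COUNT:
        problems.append(f"vm.max_map_count [{max_map_count}] is below {MIN_MAX_MAP_COUNT}")
    if nproc < MIN_NPROC:
        problems.append(f"max number of threads [{nproc}] is below {MIN_NPROC}")
    return problems


def read_limits():
    """Read the current soft limits of this process (Linux only)."""
    import resource  # Unix-only module, imported here so the rest works anywhere

    def soft(limit):
        value = resource.getrlimit(limit)[0]
        # An unlimited value is reported as RLIM_INFINITY; treat it as infinite.
        return float("inf") if value == resource.RLIM_INFINITY else value

    with open("/proc/sys/vm/max_map_count") as handle:
        max_map_count = int(handle.read())
    return soft(resource.RLIMIT_NOFILE), max_map_count, soft(resource.RLIMIT_NPROC)


# Values taken from the error messages above, then the values after the fixes.
print(failed_checks(4096, 65530, 3853))
print(failed_checks(65536, 262144, 4096))
```

Run as the es user, `failed_checks(*read_limits())` reports the limits that user actually gets. The values can differ from root's because limits.conf is applied per user.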

Access from a browser: 192.168.113.101:9200
Response fields:

{
  "name" : "node name",
  "cluster_name" : "cluster name",
  "cluster_uuid" : "IN_xW_yWSUmnaezDxhrZIg",
  "version" : {
    "number" : "es version number",
    "build_flavor" : "default",
    "build_type" : "zip",
    "build_hash" : "747e1cc71def077253878a59143c1f785afa92b9",
    "build_date" : "2021-01-13T00:42:12.435326Z",
    "build_snapshot" : false,
    "lucene_version" : "8.7.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

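Tuning es runtime memory

The es heap size is configured through the parameters in the jvm.options file

#The es default is 1g, which is enough for the most basic service; in real production use, raise the heap to about 50% of the machine's memory, preferably without exceeding 32G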
-Xms1g
-Xmx1g
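
The sizing rule above (about half of the machine's memory, staying under 32G) can be written as a small helper. This is a sketch only. The function name `jvm_heap_options` is hypothetical, and the default cap of 31 is an assumption chosen to stay just under the 32G mentioned above.

```python
def jvm_heap_options(total_ram_gb, cap_gb=31):
    """Return the -Xms/-Xmx lines for jvm.options: half of RAM, capped at cap_gb."""
    # Never go below 1g, which is the es default.
    heap_gb = max(1, min(int(total_ram_gb // 2), cap_gb))
    # Initial and maximum heap are set to the same value, as in the default file.
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


for ram_gb in (2, 16, 128):
    print(ram_gb, jvm_heap_options(ram_gb))
```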

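Accessing es with postman

Create an index

[Postman screenshot]

Get information about an index

[Postman screenshot]

List all indices in es

[Postman screenshot]

Delete an index

[Postman screenshot]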
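
The four index operations above map to plain HTTP requests. Since the original Postman requests are not visible here, the following is a sketch of the standard elasticsearch 7.x REST calls, built with Python's standard library. The index name `shopping` and the helper `es_request` are hypothetical; the host and port are the ones used in the install section. The requests are only built and printed, not sent.

```python
import json
import urllib.request

BASE_URL = "http://192.168.113.101:9200"  # host and port from the install section
INDEX = "shopping"                        # hypothetical index name


def es_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the es REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


create_index = es_request("PUT", f"/{INDEX}")         # create an index
get_index = es_request("GET", f"/{INDEX}")            # get information about an index
list_indices = es_request("GET", "/_cat/indices?v")   # list all indices
delete_index = es_request("DELETE", f"/{INDEX}")      # delete an index

for request in (create_index, get_index, list_indices, delete_index):
    print(request.get_method(), request.full_url)

# To send a request to a running node:
# with urllib.request.urlopen(create_index) as response:
#     print(response.read().decode("utf-8"))
```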

Add data

ID generated automatically by es

[Postman screenshot]

Custom ID

[Postman screenshot]
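
The two ways of adding a document differ only in the HTTP method and path. The sketch below uses the standard elasticsearch 7.x `_doc` endpoints. The index name `shopping`, the ID `1001`, the document fields, and the helper `es_request` are hypothetical, and the requests are built but not sent.

```python
import json
import urllib.request

BASE_URL = "http://192.168.113.101:9200"  # host and port from the install section
INDEX = "shopping"                        # hypothetical index name


def es_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the es REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


# Hypothetical document body.
document = {"title": "Resident Evil", "brand": "Capcom", "price": 39.9}

# POST to _doc without an ID: es generates the document ID.
add_with_auto_id = es_request("POST", f"/{INDEX}/_doc", document)

# PUT to _doc/<id>: the caller chooses the document ID.
add_with_custom_id = es_request("PUT", f"/{INDEX}/_doc/1001", document)

for request in (add_with_auto_id, add_with_custom_id):
    print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```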

Query data by ID

[Postman screenshot]

Random query

[Postman screenshot]
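
Fetching by ID uses the `_doc` endpoint. The "random query" is interpreted here, as an assumption, as a query without an ID, that is, a `_search` over the whole index, which matches all documents when no query body is given. The index name `shopping`, the ID `1001`, and the helper `es_request` are hypothetical, and the requests are built but not sent.

```python
import json
import urllib.request

BASE_URL = "http://192.168.113.101:9200"  # host and port from the install section
INDEX = "shopping"                        # hypothetical index name


def es_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the es REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


# Fetch a single document by its ID.
get_by_id = es_request("GET", f"/{INDEX}/_doc/1001")

# Search the index without specifying an ID (assumed meaning of "random query").
search_all = es_request("GET", f"/{INDEX}/_search")

for request in (get_by_id, search_all):
    print(request.get_method(), request.full_url)
```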

Full overwrite update

[Postman screenshot]

Partial update

[Postman screenshot]

Delete data by ID

[Postman screenshot]
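
The difference between the two kinds of update shows up in the request itself. A full overwrite sends the whole document to `_doc/<id>`, while a partial update sends only the changed fields, wrapped in `doc`, to `_update/<id>`. The sketch below uses the standard elasticsearch 7.x endpoints. The index name `shopping`, the ID `1001`, the field values, and the helper `es_request` are hypothetical, and the requests are built but not sent.

```python
import json
import urllib.request

BASE_URL = "http://192.168.113.101:9200"  # host and port from the install section
INDEX = "shopping"                        # hypothetical index name


def es_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the es REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


# Full overwrite: the stored document is replaced by this body.
overwrite = es_request(
    "PUT", f"/{INDEX}/_doc/1001",
    {"title": "Resident Evil 2", "brand": "Capcom", "price": 49.9},
)

# Partial update: only the fields listed under "doc" are changed.
partial_update = es_request("POST", f"/{INDEX}/_update/1001", {"doc": {"price": 29.9}})

# Delete the document by its ID.
delete_by_id = es_request("DELETE", f"/{INDEX}/_doc/1001")

for request in (overwrite, partial_update, delete_by_id):
    body = request.data.decode("utf-8") if request.data else ""
    print(request.get_method(), request.full_url, body)
```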
