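Searching massive data sets with a relational database alone does not work well. Even if you query with multiple threads, query speed still cannot keep up as the data volume grows, and a relational database is especially helpless when querying large volumes of similar data. This is where full-text search comes in. I will not repeat the basic definition of ES (Elasticsearch) here; this post focuses on installing a multi-node cluster. The detailed steps follow: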
I. Installation environment
(1) JDK 8
(2) CentOS 6 (CentOS 7 is more common; installing on CentOS 6 raises a few problems, but they can all be solved)
(3) elasticsearch-6.2.4.tar.gz and elasticsearch-analysis-ik-6.2.4.zip
(4) elasticsearch-head
(5) Three nodes:
192.168.11.78
192.168.11.79
192.168.11.80
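II. Installation steps
Overall approach:
Install ES on one node first (78), then scp the installed elasticsearch directory to the other nodes and adjust their configuration files.
(1) Create an installation user bbk on Linux and grant it privileges, because ES must be installed and started as a regular user; otherwise it reports an error
# ES must be started as a non-root user, so create a bbk user: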
useradd bbk
# Set a password for the bbk user:
echo 123456 | passwd --stdin bbk
# Add bbk to sudoers
echo "bbk ALL = (root) NOPASSWD:ALL" | tee /etc/sudoers.d/bbk
chmod 0440 /etc/sudoers.d/bbk
mkdir /{bigdata,data}
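(2) Upload or download the ES package into /bigdata, extract it there, and hand the directories over to bbk (assuming these commands are run as root)
cd /bigdata
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.2.4.tar.gz
tar -zxvf elasticsearch-6.2.4.tar.gz
chown -R bbk:bbk /bigdata /data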
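(3) Edit the configuration file
vi /bigdata/elasticsearch-6.2.4/config/elasticsearch.yml
# Cluster name; nodes configured with the same cluster name join the same cluster
cluster.name: my-es
# Node name; must be unique within the cluster
node.name: node-1
# Data directory
path.data: /data/es/data
# Log directory (optional)
path.logs: /data/es/logs
# IP address that ES binds to
network.host: 192.168.11.78
# Hosts contacted over unicast for discovery when a node starts
discovery.zen.ping.unicast.hosts: ["node-1", "node-2", "node-3"]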
[Note] The host names "node-1", "node-2" and "node-3" must be resolvable on every node, so add them to /etc/hosts on each machine.
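The /etc/hosts entries mentioned in the note can be added with a small helper. This is only a sketch: the function name add_es_hosts is made up for illustration, and the pairing of node-1/node-2/node-3 with 192.168.11.78/79/80 is an assumption inferred from the node list above (only node-1 = 192.168.11.78 is confirmed by the configuration).

```shell
#!/usr/bin/env bash
# Append node-name mappings to a hosts file, skipping names already present.
# ASSUMPTION: the name-to-IP pairing below is inferred from the node list,
# adjust it to match your machines.
add_es_hosts() {
    local hosts_file="$1"
    local entry name
    for entry in "192.168.11.78 node-1" "192.168.11.79 node-2" "192.168.11.80 node-3"; do
        name="${entry#* }"   # strip the IP, keep the host name
        # -w matches the name as a whole word, so node-1 does not match node-10
        if ! grep -qw -- "$name" "$hosts_file" 2>/dev/null; then
            echo "$entry" >> "$hosts_file"
        fi
    done
}
```

Run it as root on every node, for example `add_es_hosts /etc/hosts`. Running it a second time changes nothing, which makes it safe to keep in a setup script.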
The complete configuration file looks like this:
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: my-es
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /data/es/data
#
# Path to log files:
#
path.logs: /data/es/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 192.168.11.78
#
# Set a custom port for HTTP:
#
#http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.zen.ping.unicast.hosts: ["host1", "host2"]
discovery.zen.ping.unicast.hosts: ["node-1","node-2","node-3"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
#discovery.zen.minimum_master_nodes:
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"
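(4) Starting and stopping the cluster
Start: /bigdata/elasticsearch-6.2.4/bin/elasticsearch -d
Stop: find the process ID with ps -ef | grep elasticsearch, then force-kill it with kill -9 <pid> (a plain kill <pid> lets ES shut down cleanly and is worth trying first)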
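The stop step can be made a little gentler. The sketch below assumes ES was started with a pid file, e.g. `bin/elasticsearch -d -p /data/es/es.pid` (the `-p` option is part of the ES launcher; the pid file path is an assumption). The function name stop_by_pidfile and the 30-second default are illustrative choices, not something from the original setup.

```shell
#!/usr/bin/env bash
# Stop the process recorded in a pid file: send SIGTERM first and only
# fall back to SIGKILL (kill -9) if it is still alive after a timeout.
stop_by_pidfile() {
    local pid_file="$1"
    local timeout="${2:-30}"   # seconds to wait before escalating (assumed default)
    local pid waited=0
    pid="$(cat "$pid_file")" || return 1
    kill "$pid" 2>/dev/null || return 0   # nothing to stop, process already gone
    while kill -0 "$pid" 2>/dev/null; do  # kill -0 only tests whether the pid exists
        if [ "$waited" -ge "$timeout" ]; then
            kill -9 "$pid" 2>/dev/null    # last resort, same as the manual step above
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

Usage: `stop_by_pidfile /data/es/es.pid 60`. SIGTERM gives ES the chance to close its shards properly, so kill -9 is kept only as a fallback.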
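III. Common startup errors
If you think the installation is finished at this point, it has only just begun. Check /data/es/logs/my-es.log for the specific errors.
Problem 1: warning message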
unable to install syscall filter:
java.lang.UnsupportedOperationException: seccomp unavailable: requires kernel 3.5+ with CONFIG_SECCOMP and CONFIG_SECCOMP_FILTER compiled in
at org.elasticsearch.bootstrap.Seccomp.linuxImpl(Seccomp.java:349) ~[elasticsearch-5.0.0.jar:5.0.0]
at org.elasticsearch.bootstrap.Seccomp.init(Seccomp.java:630) ~[elasticsearch-5.0.0.jar:5.0.0]
This prints a long stack trace, but it is only a warning.
Solution: use CentOS 7, where this problem does not occur.
Problem 2: ERROR: bootstrap checks failed
max file descriptors [4096] for elasticsearch process likely too low, increase to at least [65536]
max number of threads [1024] for user [lishang] likely too low, increase to at least [2048]
Solution: switch to root and edit limits.conf
vi /etc/security/limits.conf
Add entries like the following:
* soft nofile 65536
* hard nofile 131072
* soft nproc 2048
* hard nproc 4096
Problem 3: max number of threads [1024] for user [lish] likely too low, increase to at least [2048]
Solution: switch to root and edit the configuration file under the limits.d directory.
vi /etc/security/limits.d/90-nproc.conf
Change the following line:
* soft nproc 1024
# to (2048, or 4096 if a higher limit is required)
* soft nproc 2048
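Problem 4: max virtual memory areas vm.max_map_count [65530] likely too low, increase to at least [262144]
Solution: switch to root and edit sysctl.conf
vi /etc/sysctl.conf
Add the following setting:
vm.max_map_count=655360
Then run:
sysctl -p
Note: do both steps together. Editing the file alone changes nothing until sysctl -p reloads it.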
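Problem 5: max file descriptors [4096] for elasticsearch process likely too low, increase to at least [65536]
Solution: switch to root and add the following two lines to limits.conf
Command: vi /etc/security/limits.conf
* hard nofile 65536
* soft nofile 65536
Then switch back to the ES user (log in again so that the new limits apply).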
Problem 6: max number of threads [1024] for user [huanlv] is too low, increase to at least [4096]
vi /etc/security/limits.d/90-nproc.conf
Change the soft nproc value to 4096
Problem 7: system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
Append this line to the end of config/elasticsearch.yml
bootstrap.system_call_filter: false
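Most of the problems above come down to three limits, so they can be checked before starting ES. The following is a rough preflight sketch; the function names meets_min and es_preflight are made up for illustration, and the thresholds are taken from the error messages above (using the stricter nproc value of 4096 from Problem 6).

```shell
#!/usr/bin/env bash
# Succeed when an actual limit satisfies the required minimum.
# "unlimited" (as printed by ulimit) always passes.
meets_min() {
    local actual="$1" required="$2"
    [ "$actual" = "unlimited" ] && return 0
    [ "$actual" -ge "$required" ]
}

# Print one OK/LOW line per limit for the current user and
# return non-zero if any limit is too low.
es_preflight() {
    local name actual required rc=0
    while read -r name actual required; do
        if meets_min "$actual" "$required"; then
            echo "OK   $name=$actual (need >= $required)"
        else
            echo "LOW  $name=$actual (need >= $required)"
            rc=1
        fi
    done <<EOF
nofile $(ulimit -n) 65536
nproc $(ulimit -u) 4096
vm.max_map_count $(cat /proc/sys/vm/max_map_count) 262144
EOF
    return "$rc"
}
```

Run `es_preflight` as the ES user (bbk here), not as root, because the nofile and nproc limits are applied per user at login.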
Once the problems above are resolved, ES can be started.
Verify with:
curl -XGET 'http://192.168.11.78:9200/_cluster/health?pretty'
If the request returns the cluster health information, this node has been installed successfully.
IV. Copy ES to the other nodes
scp -r elasticsearch-6.2.4/ node-2:$PWD
scp -r elasticsearch-6.2.4/ node-3:$PWD
On each of the other nodes, edit the ES configuration; the settings that must change are node.name and network.host
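The per-node edit can be scripted instead of done by hand in vi. This is a minimal sketch: set_node_identity is a hypothetical helper name, and it assumes that node.name and network.host each appear once as uncommented lines, as in the file shown earlier.

```shell
#!/usr/bin/env bash
# Rewrite node.name and network.host in an elasticsearch.yml copy.
# Only uncommented lines starting with those keys are touched.
set_node_identity() {
    local yml="$1" name="$2" host="$3"
    local tmp
    tmp="$(mktemp)" || return 1
    # Write to a temp file first, then copy the content back so the
    # original file keeps its owner and permissions.
    sed -e "s/^node\.name:.*/node.name: ${name}/" \
        -e "s/^network\.host:.*/network.host: ${host}/" \
        "$yml" > "$tmp" && cat "$tmp" > "$yml"
    rm -f "$tmp"
}
```

For example, on the second machine: `set_node_identity /bigdata/elasticsearch-6.2.4/config/elasticsearch.yml node-2 192.168.11.79` (the IP for node-2 is assumed from the node list).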
If the other nodes also start successfully, the cluster health output reports all three nodes.
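To check the node count without reading the output by eye, the number_of_nodes field of the _cluster/health response can be extracted in a script. The sketch below avoids jq and uses only tr and sed; the function names health_number and cluster_has_nodes are illustrative.

```shell
#!/usr/bin/env bash
# Extract a numeric field from _cluster/health JSON (compact or ?pretty).
health_number() {
    local json="$1" field="$2"
    # Remove spaces and newlines so both output styles look the same,
    # then capture the digits that follow "field":
    printf '%s' "$json" | tr -d ' \n' \
        | sed -n "s/.*\"${field}\":\([0-9][0-9]*\).*/\1/p"
}

# Succeed when the cluster reachable at the given URL reports the expected node count.
cluster_has_nodes() {
    local url="$1" expected="$2"
    local json
    json="$(curl -s "${url}/_cluster/health")" || return 1
    [ "$(health_number "$json" number_of_nodes)" = "$expected" ]
}
```

Usage: `cluster_has_nodes http://192.168.11.78:9200 3 && echo "all three nodes joined"`.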