Official download page: https://www.elastic.co/cn/downloads/
1、Elasticsearch Background and Introduction
ELK is an acronym for three components:
Elasticsearch (ES): searches, analyzes, and stores data; it is also a NoSQL store, similar to Redis/HBase/…
Logstash: collects log files as a dynamic data collection pipeline; Filebeat can also collect logs (and is more lightweight than Flume)
Kibana: data visualization
Search in a typical MIS system is very limited:
queries are all point-to-point with no tokenization, so they amount to little more than exact lookups
Forward index: maps doc_id to doc_content
doc_id  doc_content
1       Ruozedata provides big data training
2       Spark is a kind of distributed compute engine
3       There are many big data training offerings
Inverted index: maps word to doc_id
word               doc_id
Ruozedata          1
provides           1
big data training  1,3
Spark              2
a kind of          2
distributed        2
compute engine     2
many               3
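The inverted index above can be sketched in a few lines of shell. This is a toy illustration using a bash 4+ associative array (the function name is made up here); it is not how Elasticsearch/Lucene actually stores its index:

```shell
#!/usr/bin/env bash
# Toy inverted index: map each word to a comma-separated list of doc_ids.
declare -A inv                     # associative array, requires bash 4+
index_doc() {                      # usage: index_doc <doc_id> <word...>
  local id=$1 w; shift
  for w in "$@"; do
    if [[ -z "${inv[$w]}" ]]; then inv[$w]=$id; else inv[$w]+=",$id"; fi
  done
}
index_doc 1 Ruozedata provides "big data training"
index_doc 2 Spark distributed "compute engine"
index_doc 3 "big data training" many
echo "${inv[big data training]}"   # docs containing "big data training" -> 1,3
```

A lookup is then a single map access; Lucene's real postings lists additionally store term frequencies and positions so results can be scored.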
Core concepts:
NRT: Near Realtime; there is only a short delay between indexing a document and it becoming searchable
Cluster: made up of 1..n Nodes
Node: a single Elasticsearch instance
Index: analogous to a Database
Type: analogous to a Table
Document: analogous to a Row
Field: analogous to a Column
2、Elasticsearch Installation
Prerequisites:
JDK 8+
CentOS 7
We use version 6.6.2 here, with JDK 1.8 on CentOS 6.5 (this requires extra configuration; using CentOS 7 directly avoids most of the problems below)
# Extract to your own app directory
[hadoop@hadoop001 soft]$ tar -zxvf elasticsearch-6.6.2.tar.gz -C ../app/
[hadoop@hadoop001 elasticsearch-6.6.2]$ mkdir data logs
[hadoop@hadoop001 config]$ pwd
/home/hadoop/app/elasticsearch-6.6.2/config
[hadoop@hadoop001 config]$ vi elasticsearch.yml
cluster.name: ruozedata-es-cluster
node.name: ruozedata-es-node1
path.data: /home/hadoop/app/elasticsearch-6.6.2/data
path.logs: /home/hadoop/app/elasticsearch-6.6.2/logs
network.host: 0.0.0.0
[hadoop@hadoop001 config]$ cd ../bin/
[hadoop@hadoop001 bin]$ rm *.bat   # the Windows batch scripts are not needed on Linux
3、Problems During Startup
// If you start as root you will also get "can not run elasticsearch as root"; in that case create a dedicated user, e.g. useradd elkuser and chown -R elkuser:elkuser elasticsearch-6.6.2. We use the hadoop user here, so this error does not occur.
ERROR: [4] bootstrap checks failed
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
// Raise the open-file limit (nofile); changing Linux system parameters requires root
# Run as root; the asterisk means the setting applies to all users
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 131072" >> /etc/security/limits.conf
[root@hadoop001 ~]# echo "* soft nofile 65536" >>/etc/security/limits.conf
[root@hadoop001 ~]# echo "* hard nofile 131072" >>/etc/security/limits.conf
[2]: max number of threads [1024] for user [hadoop] is too low, increase to at least [4096]
// Raise the process limit (nproc) for the user that runs Elasticsearch (hadoop here; substitute your own user, e.g. elkuser)
echo "elkuser soft nproc 4096" >> /etc/security/limits.conf
echo "elkuser hard nproc 4096" >> /etc/security/limits.conf
[root@hadoop001 ~]# echo "hadoop soft nproc 4096" >>/etc/security/limits.conf
[root@hadoop001 ~]# echo "hadoop hard nproc 4096" >>/etc/security/limits.conf
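After logging in again as the affected user, the new limits can be read back with `ulimit`. This is a generic verification step, not specific to Elasticsearch; the values printed depend entirely on your own session:

```shell
# Print the effective per-session limits; compare them against the
# limits.conf entries added above.
ulimit -n   # open files (nofile)
ulimit -u   # max user processes (nproc)
```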
[3]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
sysctl -w vm.max_map_count=262144                    # takes effect immediately, but only until reboot
echo "vm.max_map_count=262144" >> /etc/sysctl.conf   # persists across reboots
sysctl -p
[root@hadoop001 ~]# sysctl -w vm.max_map_count=262144
vm.max_map_count = 262144
[root@hadoop001 ~]# echo "vm.max_map_count=262144" >> /etc/sysctl.conf
[root@hadoop001 ~]# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
vm.nr_hugepages = 4
vm.max_map_count = 262144
[root@hadoop001 ~]#
# This makes the change permanent. The nofile/nproc changes above only take effect for new login sessions (the reboot done below also covers this)
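The live kernel value can be read back at any time; on this box it should show 262144 after the change above (reading /proc is a generic Linux check, independent of Elasticsearch, and the value on your machine may differ):

```shell
# The current vm.max_map_count as seen by the running kernel:
cat /proc/sys/vm/max_map_count
```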
[4]: system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
Two ways to fix this: 1. upgrade CentOS to 7.x; 2. disable the system call filters
// Edit the Elasticsearch config file
[hadoop@hadoop001 config]$ vi elasticsearch.yml
# Lock the memory on startup:
#
bootstrap.memory_lock: false
bootstrap.system_call_filter: false
[root@hadoop001 tmp]# reboot
[root@hadoop001 ~]# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 63715
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 63715
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
[root@hadoop001 ~]#
4、Elasticsearch Startup
A look at the startup script shows that passing the -d flag makes it start in the background
[hadoop@hadoop001 bin]$ vi elasticsearch
#!/bin/bash
# CONTROLLING STARTUP:
#
# This script relies on a few environment variables to determine startup
# behavior, those variables are:
#
# ES_PATH_CONF -- Path to config directory
# ES_JAVA_OPTS -- External Java Opts on top of the defaults set
#
# Optionally, exact memory values can be set using the `ES_JAVA_OPTS`. Note that
# the Xms and Xmx lines in the JVM options file must be commented out. Example
# values are "512m", and "10g".
#
# ES_JAVA_OPTS="-Xms8g -Xmx8g" ./bin/elasticsearch
source "`dirname "$0"`"/elasticsearch-env
ES_JVM_OPTIONS="$ES_PATH_CONF"/jvm.options
JVM_OPTIONS=`"$JAVA" -cp "$ES_CLASSPATH" org.elasticsearch.tools.launchers.JvmOptionsParser "$ES_JVM_OPTIONS"`
ES_JAVA_OPTS="${JVM_OPTIONS//\$\{ES_TMPDIR\}/$ES_TMPDIR} $ES_JAVA_OPTS"
cd "$ES_HOME"
# manual parsing to find out, if process should be detached
if ! echo $* | grep -E '(^-d |-d$| -d |--daemonize$|--daemonize )' > /dev/null; then
exec \
"$JAVA" \
$ES_JAVA_OPTS \
-Des.path.home="$ES_HOME" \
-Des.path.conf="$ES_PATH_CONF" \
-Des.distribution.flavor="$ES_DISTRIBUTION_FLAVOR" \
-Des.distribution.type="$ES_DISTRIBUTION_TYPE" \
-cp "$ES_CLASSPATH" \
org.elasticsearch.bootstrap.Elasticsearch \
"$@"
else
exec \
"$JAVA" \
$ES_JAVA_OPTS \
-Des.path.home="$ES_HOME" \
-Des.path.conf="$ES_PATH_CONF" \
-Des.distribution.flavor="$ES_DISTRIBUTION_FLAVOR" \
-Des.distribution.type="$ES_DISTRIBUTION_TYPE" \
-cp "$ES_CLASSPATH" \
org.elasticsearch.bootstrap.Elasticsearch \
"$@" \
<&- &
retval=$?
pid=$!
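The grep in the script above is what detects -d / --daemonize and decides between foreground and detached startup. A minimal standalone sketch of that same check (the function name is made up here):

```shell
# Mirror of the startup script's daemon detection: match -d or --daemonize
# anywhere in the argument list.
is_daemonized() {
  echo "$*" | grep -E '(^-d |-d$| -d |--daemonize$|--daemonize )' > /dev/null
}
is_daemonized -d && echo "would detach (background)"
is_daemonized -E cluster.name=demo || echo "stays in foreground"
```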
Start it from the ES bin directory:
[hadoop@hadoop001 bin]$ ./elasticsearch
You can also start it with settings passed directly on the command line:
bin/elasticsearch -E cluster.name=ruozedata-es-cluster -E node.name=ruozedata-es-node1 -E path.data=ruozedata-es-node1-data
// Command-line settings take precedence over the config file; in production, editing the config file is more common
# When simulating a cluster on a single VM, the second node you start must use a different node.name and path.data
bin/elasticsearch -E cluster.name=ruozedata-es-cluster -E node.name=ruozedata-es-node2 -E path.data=ruozedata-es-node2-data
If a browser UI cannot connect, the cause is that cross-origin (CORS) access is not configured; also edit the config file:
# elasticsearch.yml (note the space before true)
http.cors.enabled: true
http.cors.allow-origin: "*"
There is a precedence question here as well; in production it is still more common to edit elasticsearch.yml directly
Background startup command:
[hadoop@hadoop001 bin]$ ./elasticsearch -d
[hadoop@hadoop001 bin]$ ps -ef |grep elasticsearch
hadoop 3217 1 99 04:34 pts/1 00:00:20 /usr/java/jdk1.8.0_45/bin/java -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.io.tmpdir=/tmp/elasticsearch-7405814871228322880 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=data -XX:ErrorFile=logs/hs_err_pid%p.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:logs/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=32 -XX:GCLogFileSize=64m -Des.path.home=/home/hadoop/app/elasticsearch-6.6.2 -Des.path.conf=/home/hadoop/app/elasticsearch-6.6.2/config -Des.distribution.flavor=default -Des.distribution.type=tar -cp /home/hadoop/app/elasticsearch-6.6.2/lib/* org.elasticsearch.bootstrap.Elasticsearch -d
hadoop 3236 3217 0 04:34 pts/1 00:00:00 /home/hadoop/app/elasticsearch-6.6.2/modules/x-pack-ml/platform/linux-x86_64/bin/controller
hadoop 3267 2520 0 04:34 pts/1 00:00:00 grep elasticsearch
Now append the following paths to the UI address to see node and cluster status:
/_cat/nodes              node status
/_cluster/health?pretty  cluster health
// As long as the network is reachable, nodes join the same cluster by matching cluster.name
http://hadoop001:9200/_cat/nodes
http://hadoop001:9200/_cluster/health?pretty
Clone another terminal window to test
ES listens on port 9200 by default
[hadoop@hadoop001 bin]$ curl -XGET hadoop001:9200
{
"name" : "ruozedata-es-node1",
"cluster_name" : "ruozedata-es-cluster",
"cluster_uuid" : "QJXuB1DBSau5PPKV4L8dow",
"version" : {
"number" : "6.6.2",
"build_flavor" : "default",
"build_type" : "tar",
"build_hash" : "3bd3e59",
"build_date" : "2019-03-06T15:16:26.864148Z",
"build_snapshot" : false,
"lucene_version" : "7.6.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
[hadoop@hadoop001 bin]$
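As a small parsing sketch, the version number can be pulled out of that response with plain grep. The response body is hardcoded here for illustration; in practice it would come from `curl -s hadoop001:9200` against the running node:

```shell
# Extract the "number" field from the cluster info JSON with grep alone.
resp='{ "name" : "ruozedata-es-node1", "version" : { "number" : "6.6.2" } }'
echo "$resp" | grep -o '"number" *: *"[^"]*"' | grep -o '[0-9][0-9.]*'
```

This prints 6.6.2 for the response above; for anything beyond a quick check, a real JSON tool such as jq is the safer choice.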