Elastic Stack (ELK 6.6.2): Installation and Usage

MrWangf

Article link: https://blog.csdn.net/MrWangf/article/details/105178092

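1. Introduction

ELK is short for Elasticsearch, Logstash, and Kibana. These three are the core components of the stack, but not the whole of it.

Elasticsearch is a real-time full-text search and analytics engine that covers three jobs: searching, analyzing, and storing data. It is a scalable distributed system that exposes REST and Java APIs for efficient search, and it is built on top of the Apache Lucene search library.

Logstash is officially positioned as a data shipper. Compared with the positioning of Beats, the word "lightweight" is missing: Logstash uses more resources and in exchange offers richer processing. It plays the ETL (Extract-Transform-Load) role, that is, it extracts data from a source, transforms it, and loads it into a destination. Importing data into an ES cluster is a typical job for Logstash.

Kibana is a web-based UI for searching, analyzing, and visualizing data stored in Elasticsearch. It retrieves data through Elasticsearch's REST interface, lets users build custom dashboards over their own data, and lets them query and filter that data in ad hoc ways.

[Figure: Elastic Stack architecture]
[Figure: open-source and commercial components]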

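2. Elastic Stack installation

2.1 Elasticsearch: installation, configuration, and common startup problems

Download
https://www.elastic.co/cn/downloads/
Every version of every Elastic product can be downloaded from this page.
Install and start

Notes:
1. Requires JDK 1.8 or later.
2. Start in the background: /elasticsearch-6.X.X/bin/elasticsearch -d
3. Elasticsearch cannot be started as the root user.

Configuration file
Edit elasticsearch.yml in the config directory of the unpacked archive.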

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
# Cluster name
cluster.name: es-cluster
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
# Name of this node; each node needs a different name
node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
# Data directory
path.data: /home/elasticsearchdata/data
#
# Path to log files:
# Log directory
path.logs: /home/elasticsearchdata/log
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
# Lock the process memory so that ES memory is never swapped out; frequent swapping drives up IOPS.
bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
# IP address of this server
network.host: 192.168.23.10
#
# Set a custom port for HTTP:
# HTTP port
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
# All nodes in the cluster
discovery.zen.ping.unicast.hosts: ["192.168.23.10", "192.168.23.11", "192.168.23.12"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
# Depends on the number of master-eligible nodes; avoids split brain (nodes / 2 + 1)
discovery.zen.minimum_master_nodes: 2
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
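
The quorum rule quoted in the Discovery section of this file (total number of master-eligible nodes / 2 + 1, with integer division) can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name is hypothetical and nothing here talks to Elasticsearch.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum size from the config comment: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


# Print the setting for a few cluster sizes.
for nodes in (1, 2, 3, 5):
    print(nodes, "master-eligible nodes ->", minimum_master_nodes(nodes))
```

For the three-node cluster configured above this gives 2, which matches the discovery.zen.minimum_master_nodes value in the file.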

Common startup problems

Problem 1: Exception in thread "main" SettingsException[Failed to load settings from [elasticsearch.yml]]; nested: ElasticsearchParseException[malformed, expected settings to start with 'object', instead was [VALUE_STRING]];
The configuration file failed to parse. YAML requires a space between the colon and the value, and a missing one is easy to overlook.

node.name:node-1  # wrong
node.name: node-1 # correct
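
Because the missing space is hard to spot by eye, a small script can scan the file for it. The sketch below is a heuristic written for flat key: value files like this one, not a YAML parser; the function name and the sample text are made up for illustration.

```python
import re

# A key directly followed by ':' and a non-space character, e.g. "node.name:node-1".
_MISSING_SPACE = re.compile(r"^\s*[\w.\-]+:[^\s]")


def find_missing_spaces(text: str) -> list:
    """Return 1-based line numbers where 'key:value' lacks the space after ':'."""
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if _MISSING_SPACE.match(line):
            bad.append(lineno)
    return bad


sample = (
    "cluster.name: es-cluster\n"
    "node.name:node-1\n"
    "# comment:ignored\n"
    "http.port: 9200\n"
)
print(find_missing_spaces(sample))  # [2]
```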

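Problem 2: org.elasticsearch.bootstrap.StartupException: java.lang.RuntimeException: can not run elasticsearch as root.
For security reasons Elasticsearch refuses to run as root. Create a dedicated user and group, give them ownership of the installation, and start as that user.

# Create a dedicated group and user, then hand over the installation directory
    groupadd es
    useradd es -g es
    passwd es    # set the password interactively; useradd -p expects an already-hashed value
    chown -R es:es  elasticsearch-6.X.X
# Switch to that user and start again
    su es
    ./elasticsearch-6.X.X/bin/elasticsearch -d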

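Problem 3: OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000085330000, 2060255232, 0) failed; error='Cannot a …'(errno=12);
The heap configured for ES is larger than the memory the system can provide, which is common on virtual machines. Lower the heap size in ES's jvm.options file.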

## JVM configuration
 
################################################################
## IMPORTANT: JVM heap size
################################################################
##
## You should always set the min and max JVM heap
## size to the same value. For example, to set
## the heap to 4 GB, set:
##
## -Xms4g
## -Xmx4g
-Xms3g
-Xmx3g
##
## See https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
## for more information
##
################################################################
 
# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space
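
As a rough aid for choosing the two values, the sketch below applies the guidance quoted in the configuration comments: give the heap about half of the available memory and set -Xms and -Xmx to the same value. The 31 GB cap is an assumption based on the commonly cited advice to stay below roughly 32 GB; the function name is hypothetical.

```python
def heap_flags(total_ram_mb: int, cap_mb: int = 31 * 1024) -> list:
    """Return matching -Xms/-Xmx flags sized at half of RAM, limited to cap_mb."""
    if total_ram_mb <= 0:
        raise ValueError("total_ram_mb must be positive")
    heap_mb = min(total_ram_mb // 2, cap_mb)  # cap_mb default is an assumption
    return [f"-Xms{heap_mb}m", f"-Xmx{heap_mb}m"]


print(heap_flags(4096))    # a 4 GB virtual machine
print(heap_flags(131072))  # a 128 GB server hits the cap
```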

Problem 4: ERROR: [4] bootstrap checks failed
The bootstrap checks did not pass, which is common on virtual machines. The full error output:

[2020-03-20T12:35:47,152][INFO ][o.e.b.BootstrapChecks    ] [SUcoFrg] bound or publishing to a non-loopback address, enforcing bootstrap checks
ERROR: [4] bootstrap checks failed
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
[2]: max number of threads [3802] for user [elsearch] is too low, increase to at least [4096]
[3]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[4]: memory locking requested for elasticsearch process but memory is not locked
[2020-03-20T12:35:47,170][INFO ][o.e.n.Node               ] [SUcoFrg] stopping ...
[2020-03-20T12:35:47,316][INFO ][o.e.n.Node               ] [SUcoFrg] stopped
[2020-03-20T12:35:47,316][INFO ][o.e.n.Node               ] [SUcoFrg] closing ...
[2020-03-20T12:35:47,336][INFO ][o.e.n.Node               ] [SUcoFrg] closed

Fix:
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
[2]: max number of threads [3802] for user [elsearch] is too low, increase to at least [4096]
[4]: memory locking requested for elasticsearch process but memory is not locked
Step 1: edit the security limits file

# Edit the limits as root
su root
vi /etc/security/limits.conf

# End of file
*	soft	nproc         4096
*	hard	nproc         4096
es	hard	nofile        65536 
es	soft	nofile        65536
* 	hard	memlock		unlimited
*	soft	memlock 	unlimited


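# Notes: es is the user name; * can be used instead to match all users
# nofile 	maximum number of open files
# nproc 	maximum number of processes (threads)
# memlock 	maximum locked-in-memory address space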

[3]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

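Step 2: edit the kernel parameters file

vi /etc/sysctl.conf

Note: append vm.max_map_count = 655360 at the end of the file. Any value that meets the 262144 minimum from the error message works.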

# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
vm.max_map_count = 655360

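These system settings do not all take effect immediately. Run sysctl -p as root to load the new /etc/sysctl.conf values without rebooting. The limits.conf changes apply only to new login sessions, so log the es user out and back in (or reboot) before restarting Elasticsearch.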
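
To confirm that the new values are visible to the user that starts Elasticsearch, the limits can be read back and compared with the minimums from the error messages. This is a minimal sketch for Linux only; the function names are hypothetical, and the thresholds are the ones printed in the bootstrap check output above.

```python
# Minimums taken from the bootstrap check messages.
REQUIRED = {"nofile": 65536, "nproc": 4096, "max_map_count": 262144}


def failing_checks(current: dict) -> list:
    """Return the names whose current value is below the required minimum."""
    return sorted(
        name for name, minimum in REQUIRED.items()
        if current.get(name, 0) < minimum
    )


def read_current() -> dict:
    """Read this process's soft limits and vm.max_map_count (Linux only)."""
    import resource  # Unix-only module, imported lazily

    def soft(limit):
        value = resource.getrlimit(limit)[0]
        # An unlimited setting is reported as RLIM_INFINITY.
        return float("inf") if value == resource.RLIM_INFINITY else value

    values = {
        "nofile": soft(resource.RLIMIT_NOFILE),
        "nproc": soft(resource.RLIMIT_NPROC),
    }
    with open("/proc/sys/vm/max_map_count") as fh:
        values["max_map_count"] = int(fh.read())
    return values


# The values reported in the error log fail all three numeric checks.
print(failing_checks({"nofile": 4096, "nproc": 3802, "max_map_count": 65530}))
```

Run as the es user in a fresh login session, failing_checks(read_current()) should return an empty list once both files have been applied.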

Problem 5: org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: failed to obtain node locks, tried [[/home/elasticsearch-6.X.X/data/elasticsearch]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?
A previous Elasticsearch process is still running and holds the node lock. Kill it and start again.

# Find the process
ps -ef | grep elastic
kill -9 <pid>
# Start again
su es
/elasticsearch-6.X.X/bin/elasticsearch -d

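2.2 Kibana: installation, configuration, and feature overview

Install and start

Kibana also needs to run in the background: nohup bin/kibana &
To stop Kibana, kill its process. Any of these three commands finds the process ID (5601 is the port Kibana listens on):
netstat -ntulp | grep 5601
lsof -i:5601
ps -ef | grep node

Configuration file
Edit kibana.yml in the config directory of the unpacked archive.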

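# Change the defaults to the following:

# Host and port Kibana listens on
server.port: 5601
server.host: "192.168.23.10"
# Point at the ES instance installed earlier. Configuring only the master node's address is enough; if the whole cluster is listed and a node goes down, Kibana runs into problems too.
# ES was bound to network.host 192.168.23.10 earlier, so that address is used here rather than 127.0.0.1.
elasticsearch.hosts: ["http://192.168.23.10:9200"]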
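
Before starting Kibana it can help to confirm that the Elasticsearch URL answers. The sketch below assumes the node is reachable at http://192.168.23.10:9200 (the network.host and http.port values from the Elasticsearch configuration earlier) and that no authentication is required; the helper names are hypothetical, while /_cluster/health is the standard cluster health endpoint.

```python
import json
from urllib.request import urlopen


def http_get_json(url: str, timeout: float = 5.0) -> dict:
    """Fetch a URL and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def cluster_status(base_url: str, fetch=http_get_json) -> str:
    """Return the cluster status ('green', 'yellow' or 'red')."""
    health = fetch(base_url.rstrip("/") + "/_cluster/health")
    return health["status"]
```

Calling cluster_status("http://192.168.23.10:9200") against a running node returns the status string. The fetch parameter exists only so the function can be exercised without a live cluster.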

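Common features

Discover: search and browse data
Visualize: build charts
Dashboard: build dashboards that combine the charts made in Visualize
Timelion: advanced visual analysis of time-series data, driven by a simple query language
Dev Tools: developer console for talking to ES (create, read, update, delete); used often
Monitoring: monitor cluster and index status; used often
Management: settings
Index size, mappings, and similar information are under Management > Index Management.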

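2.3 Logstash: installation, configuration, and common problems

Install and start

Note:
The core of Logstash is the logstash.conf configuration file, which defines the data source (input), the filtering and transformation rules (filter), and the destination (output).

Caution: the -f option accepts a directory, in which case every configuration file in it is loaded. Keep two points in mind:
1. Write -f conf/, not -f conf/*. With the wildcard only the first configuration file is run.
2. Set a marker field (usually type) in the input of every configuration file, so that filter and output blocks can tell the data sources apart by type. Without it, data from different sources gets mixed up on import.
Reason: although Logstash can load several configuration files from one directory, it still starts a single pipeline, and the files are merged into one large configuration.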
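
The effect of the merge can be illustrated with a toy model. This is not Logstash code: it only mimics the behaviour described above, where every event in the single merged pipeline is offered to every output block unless the block is guarded by a type condition. The event ids and the "person" type are made up.

```python
def run_merged_pipeline(events, outputs):
    """Toy model of one merged pipeline.

    `outputs` maps an index name to the type it accepts, or to None when
    the output block has no `if [type] == ...` guard.
    """
    written = {index: [] for index in outputs}
    for event in events:
        for index, wanted_type in outputs.items():
            if wanted_type is None or event["type"] == wanted_type:
                written[index].append(event["id"])
    return written


events = [{"id": 1, "type": "company"}, {"id": 2, "type": "person"}]

# Without guards both indices receive both events.
print(run_merged_pipeline(events, {"company": None, "person": None}))
# With guards each index receives only its own events.
print(run_merged_pipeline(events, {"company": "company", "person": "person"}))
```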

Configuration file (XXX-logstash.conf)

A simple example that imports database rows into ES over JDBC:

input {
    # stdin { }
    jdbc {
        jdbc_connection_string => "jdbc:mysql://ip:3306/database"
        jdbc_user => "wf"
        jdbc_password => "wf@123"
        # The JDBC driver jar for the database must be uploaded to the server
        jdbc_driver_library => "/usr/local/logstash-6.3.2/mysql-connector-java-5.1.46.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        jdbc_paging_enabled => "true"
        statement => "SELECT * FROM tableName t"
        # statement_filepath => "/usr/local/logstash-5.6.10/mysql2es.sql"
        # Cron-style schedule: minute hour day-of-month month day-of-week
        #schedule => "* * * * *"
        type => "indexname"
    }
}

#filter {}

output {
    if [type] == "indexname" {
        stdout {
            codec => json_lines
        }

        elasticsearch {
            hosts => ["http://localhost:9200"]
            index => "indexname"
            document_type => "typename"
            #document_id => "%{id}"
        }
    }
}

Example: importing data from txt files into ES

input {
    file {
        # Files to read; glob patterns across directory levels are supported, e.g. /**/file*.csv
        path => ["/home/ELK6/logstash-6.3.1/config/hdfs2es/data/company/*.txt"]
        # How often to check the files for changes (seconds)
        stat_interval => 1
        # Where to start reading a newly discovered file; the default is end
        start_position => "beginning"
        # File in which the current read positions are recorded
        sincedb_path => "/home/ELK6/logstash-6.3.1/config/hdfs2es/since_db.txt"
        # How often the read positions are written out (seconds)
        sincedb_write_interval => 5
        # Character encoding of the data files
        codec => plain {
            charset => "UTF-8"
        }
        type => "indexname"
    }
}

filter {
    if [type] == "indexname" {
        ruby {
            # Column names of the text data, in order
            init => "@kname=['id','social_no','com_code','org_code','tax_code','com_name']"
            # Split each line on tabs and pair the values with the column names
            code => 'event.append(LogStash::Event.new(Hash[@kname.zip(event.get("message").split("\t"))]))'
            # Drop the fields that are added by default
            remove_field => ["host", "tags", "path", "message"]
        }
    }
}

output {
    if [type] == "indexname" {
        elasticsearch {
            # User credentials configured in Kibana
            user => "es"
            password => "es"
            # Addresses of the ES cluster nodes
            hosts => ["192.168.31.194:9200","192.168.31.195:9200"]
            index => "company"
            document_type => "company"
        }
    }
}
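
The ruby filter in this example pairs each tab-separated value with a column name. The same step, written in Python for illustration only (Logstash itself runs the Ruby one-liner), looks like this. As with Ruby's zip, columns that have no value become None and surplus values are dropped; the sample row is made up.

```python
COLUMNS = ["id", "social_no", "com_code", "org_code", "tax_code", "com_name"]


def parse_line(message: str, columns=COLUMNS) -> dict:
    """Pair tab-separated values with column names, padding with None."""
    values = message.rstrip("\n").split("\t")
    return {
        name: (values[i] if i < len(values) else None)
        for i, name in enumerate(columns)
    }


row = parse_line("1\tS001\tC001\tO001\tT001\tAcme Ltd")
print(row["id"], row["com_name"])  # 1 Acme Ltd
```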

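Example: importing data from another ES cluster

No configuration was given for this case. Logstash provides an elasticsearch input plugin that reads documents from an existing cluster, which is the usual starting point.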

The powerful filter stage

input {
    stdin { }
}

filter {
    # grok builds on regular expressions and ships many reusable patterns; it turns unstructured text into structured fields.
    grok {
        match => {
            "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"
        }
    }
    # mutate adds, removes, modifies, or replaces fields on the structured event.
    mutate {
        remove_field => ["message","ecs","input","tags"]
        convert => { "duration" => "integer" }
    }

    date {
        match => [ "log_timestamp", "yyyy-MMM-dd HH:mm:ss" ]
        locale => "cn"
    }
}

output {

    stdout {
        codec => json_lines  # print each event to the console as a JSON line
    }

    elasticsearch {
        hosts => ["http://localhost:9200"]
        # One index per day, named like nginx-7.3.0-2019.08.12
        index => "nginx-%{[@metadata][version]}-%{+YYYY.MM.dd}"
    }
}

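How to read the grok rule: %{...} marks a grok pattern reference; IP, WORD, URIPATHPARAM, and NUMBER are patterns that ship with grok (each one is a regular expression); client, method, and the other names after the colon are the field names in the structured result.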

[Figure: console input and the resulting structured output]
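
To make the grok rule more concrete, here is a rough Python equivalent using named groups. The sub-patterns are simplified stand-ins for grok's IP, WORD, URIPATHPARAM, and NUMBER definitions, not copies of them, and the sample log line is an assumed input in the shape the pattern expects.

```python
import re

LOG_LINE = re.compile(
    r"(?P<client>\d{1,3}(?:\.\d{1,3}){3}) "  # simplified IPv4 stand-in for %{IP}
    r"(?P<method>\w+) "                       # %{WORD}
    r"(?P<request>\S+) "                      # simplified stand-in for %{URIPATHPARAM}
    r"(?P<bytes>\d+(?:\.\d+)?) "              # simplified stand-in for %{NUMBER}
    r"(?P<duration>\d+(?:\.\d+)?)"
)


def parse(message: str):
    """Return the named fields as a dict, or None when the line does not match."""
    match = LOG_LINE.search(message)  # grok patterns are unanchored as well
    return match.groupdict() if match else None


print(parse("55.3.244.1 GET /index.html 15824 0.043"))
```

Every captured value is a string, which is why the configuration follows grok with a mutate convert step.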

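Common problems

Problem 1: Logstash stops at waiting for input, or exits as soon as input arrives, so the logstash-input-jdbc plugin never runs.
Fix: the input section of jdbc.conf contained a stdin{} block, which reads from the console. When Logstash runs in the background it hangs waiting on that input or fails. Comment the block out and restart.
Problem 2: Logstash could not be started because there is already another instance … The full log: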

Sending Logstash logs to /usr/local/logstash/logstash-6.3.2/logs which is now configured via log4j2.properties
[2020-03-28T12:23:45,931][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-03-28T12:23:46,088][FATAL][logstash.runner          ] Logstash could not be started because there is already another instance using the configured data directory.  If you wish to run multiple instances, you must change the "path.data" setting.
[2020-03-28T12:23:46,130][ERROR][org.logstash.Logstash    ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exit

Fix: usually an old Logstash process has not been killed. If the error persists after killing it, look up the path.data setting in config/logstash.yml:

# ------------ Data path ------------------
#
# Which directory should be used by logstash and its plugins
# for any persistent needs. Defaults to LOGSTASH_HOME/data
#
# path.data:

Go to that directory (LOGSTASH_HOME/data when path.data is left unset) and delete the .lock file.

data$ ls -alh
total 20K
drwxr-sr-x  4 tingshuo staff 4.0K Nov 20 11:42 .
drwxr-sr-x 14 tingshuo staff 4.0K Nov 20 11:42 ..
drwxr-sr-x  2 tingshuo staff 4.0K Nov 20 11:42 dead_letter_queue
-rw-r--r--  1 tingshuo staff    0 Nov 20 11:42 .lock
drwxr-sr-x  2 tingshuo staff 4.0K Nov 20 11:42 queue
-rw-r--r--  1 tingshuo staff   36 Nov 20 11:42 uuid
data$ rm .lock

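Problem 3: Expected one of #, input, filter, output at line 1, column 1 (byte 1) after…

# Check that the configuration file parses (-t is short for --config.test_and_exit)
logstash -f  config/logstash-simple-dev.conf -t

If repeated checks show no syntax error in the configuration file, the cause is very likely the file encoding: configuration files must be saved as UTF-8 without BOM, not UTF-8 with BOM.
To convert many files at once, I set the default encoding in Notepad++ to UTF-8 without BOM and then opened the configuration files in batches (at most 20 at a time is advisable); Notepad++ converts them in the process.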
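
The same batch conversion can be scripted instead of done in an editor. This is a minimal sketch that removes a leading UTF-8 BOM from every *.conf file in a directory; the function names and the *.conf pattern are assumptions, and the files are rewritten in place, so keep a backup.

```python
import codecs
from pathlib import Path


def strip_bom(path: Path) -> bool:
    """Remove a leading UTF-8 BOM in place; return True if the file changed."""
    data = path.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    path.write_bytes(data[len(codecs.BOM_UTF8):])
    return True


def strip_bom_in_dir(directory: str, pattern: str = "*.conf") -> list:
    """Apply strip_bom to every matching file; return the names that were fixed."""
    return sorted(p.name for p in Path(directory).glob(pattern) if strip_bom(p))
```

Calling strip_bom_in_dir("config") returns the list of files that were rewritten; running it a second time returns an empty list.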

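At this point the core Elastic Stack components are installed.

3. Supplement

3.1 Offline installation of the IK analyzer

Why install the IK analyzer?
The default analyzer that ships with ES handles Chinese poorly. It splits the keyword "湖北省武汉市" into single characters (湖 / 北 / 省 / 武 / 汉 / 市), which gives poor search quality and performance.
IK provides two analyzers: ik_max_word and ik_smart.
ik_max_word: splits the text at the finest granularity, producing as many terms as possible.
ik_smart: splits the text at the coarsest granularity; a span that has already been emitted as a term is not reused by another term.
Taking "湖北省武汉市" again, the two analyzers produce:
ik_max_word: 湖北省武汉市 / 湖北 / 湖北省 / 武汉 / 武汉市
ik_smart: 湖北省武汉市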
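
Once the plugin is installed (steps below), the two analyzers can be compared through Elasticsearch's _analyze API. The sketch only builds the HTTP request; the helper name and the http://localhost:9200 address are assumptions, and sending the request requires a running node with the IK plugin loaded.

```python
import json
from urllib.request import Request


def analyze_request(base_url: str, analyzer: str, text: str) -> Request:
    """Build a POST request for the _analyze API with the given analyzer."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return Request(
        base_url.rstrip("/") + "/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = analyze_request("http://localhost:9200", "ik_smart", "湖北省武汉市")
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen returns a JSON document whose tokens list holds one entry per produced term, so running it once with ik_max_word and once with ik_smart shows the difference directly.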

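(1) Installing the IK analyzer is simple. Download the zip package for your Elasticsearch version; the plugin version has to match the Elasticsearch version (6.6.2 in this article), otherwise the plugin is rejected at startup.

Download URL: https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.5.1/elasticsearch-analysis-ik-6.2.2.zip
(2) Unpack it into the plugins directory under the Elasticsearch installation directory:
/home/ELK6/elasticsearch-6.6.2/plugins/elasticsearch-ik
(3) Fix the directory ownership again and start ES in the background

chown -R es:es elasticsearch-6.6.2/

./elasticsearch -d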
