Elastic Stack--介绍及架构部署：ElasticSearch、Kibana、Filebeat的RPM包部署安装及基础使用

本文链接：https://blog.csdn.net/lerp020321/article/details/141536401

前言：本博客仅作记录学习使用，部分图片出自网络，如有侵犯您的权益，请联系删除

学习B站博主教程笔记：

最新版适合自学的ElasticStack全套视频（Elk零基础入门到精通教程）Linux运维必备—ElasticSearch+Logstash+Kibana精讲_哔哩哔哩_bilibilihttps://www.bilibili.com/video/BV1VMW3e6Ezk/?spm_id_from=333.1007.tianma.1-1-1.click&vd_source=e539f90574cdb0bc2bc30a8b5cb3fc00

一、Elastic Stack在企业中的常见架构Elastic — The Search AI Company | Elastic

1、架构图

2、Elastic Stack分布式日志系统概述

Elastic Stack，包括（也称ELK Stack）

2.1、ElasticsearcElasticsearch、Kibana、beatsh和Logstashh

简称为ES，ES是一个开源的高扩展的分布式全文搜索引擎，是整个Elastic Stack技术栈的核心。它可以近乎实时的存储、检索数据；本身扩展性很好，可以扩展到上百台服务器，处理PB级别的数据

2.2、Kibana

是一个免费且开放的用户界面，能够让您对Elasticsearch数据进行可视化，并让您在Elastic Stack中进行导航。可以进行各种操作，从跟踪查询负载，到理解请求如何流经整个应用，都能轻松完成

2.3、Beats

一个免费且开放的平台，集合了多种单一用途数据采集器，他们从成百上千或成千上万台机器和系统向Logstash或Elasticsearch发送数据

2.4、Logstash

免费且开放的服务端数据处理管道，能够从多个来源采集数据，转换数据，然后将数据发送到我们的“存储库”中

3、Elastic Stack企业级"EFK"架构图解

数据流走向：源数据层(nginx，tomcat) ---> 数据采集层(filebeat) ---> 数据存储层(ElasticSearch)

4、Elastic Stack企业级"ELK"架构图解

数据流走向：源数据层(nginx，tomcat) ---> 数据采集层(Logstash) ---> 数据存储层(ElasticSearch)

5、Elastic Stack企业级"ELFK"架构图解

数据流走向：源数据层(nginx，tomcat) ---> 数据采集层(filebeat) ---> 转换层(Logstash) ---> 数据存储层(ElasticSearch)

6、Elastic Stack企业级"ELFK"+"kafka"架构图解

数据流走向:源数据(nginx,tomcat) ---> 数据采集(filebeat) ---> 数据缓存层(kafka) --->转换层(Logstash) ---> 数据存储层(ElasticSearch)

7、Elastic Stack企业级"FLFK" + "Kafka"架构演变

二、ElasticSearch和Solr的抉择

1、ElasticSearch和Lucene的关系

Lucene的优缺点：

优点：可以被认为是迄今为止最先进，性能最好的，功能最全的搜索引擎库（框架）

缺点：

只能再java项目中使用，并且要以jar包的方式直接集成在项目中；
使用很复杂，需要深入了解检索的相关知识来创建索引和搜索索引代码；
不支持集群环境，索引数据不同步（不支持大型项目）；
扩展性差，索引库和应用所在同一个服务器，当索引数据过大时，效率逐渐降低；

值得注意的是，上述的Lucene框架中的缺点，ElasticSearch全部都能解决

Elasticsearch是一个实时的分布式搜索和分析引擎。它可以帮助你用前所未有的速度去处理大规模数据。

ES可以用于全文搜索，结构化搜索以及分析，当然你也可以将这三者进行组合

2、ElasticSearch和Solr如何选择

Solr是Apache Lucene项目的开源企业搜索平台。其主要功能包括全文检索、命中标示、分面搜索、动态聚类、数据库集成，以及富文本的处理。

Solr是高度可扩展的，并提供了分布式搜索和索引复制。Solr是最流行的企业级搜索引擎

ElasticSearch与Solr的比较：

Solr利用zookeeper进行分布式管理，而ES自身带有分布式协调管理功能;
Solr支持更多格式（JSON、XML、CSV）的数据，而ES仅支持JSON文件格式;
Solr官方提供的功能更多，而ES本身更注重于核心功能，高级功能多有第三方插件提供;
Solr在"传统搜索”(已有数据)中表现好于ES，但在处理“实时搜索"(实时建立索引)应用时效率明显低于ES
Solr是传统搜索应用的有力解决方案，但Elasticsearch更适用于新兴的实时搜索应用。

三、集群基础环境初始化

1、准备虚拟机

IP地址	主机名	CPU配置	内存配置	磁盘配置	角色说明
10.0.0.2	Master	2core	4G	20+	ES node
10.0.0.3	Node1	2core	4G	20+	ES node
10.0.0.4	Node2	2core	4G	20+	ES node

 cat >> /etc/hosts << 'EOF'
 192.168.1.10 master
 192.168.1.11 node1
 192.168.1.12 node2
 EOF

2、修改软件源

 wget -O /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo

3、修改sshd服务优化

 sed -ri 's@^#UseDNS yes@UseDNS no@g' /etc/ssh/sshd_config
 sed -ri 's#^GSSAPIAuthentication yes#GSSAPIAuthentication no#g' /etc/ssh/sshd_config
 grep ^UseDNS /etc/ssh/sshd_config
 grep ^GSSAPIAuthentication /etc/ssh/sshd_config

4、关闭防火墙

 systemctl disable --now firewalld && systemctl is-enabled firewalld
 systemctl status firewalld

5、禁用selinux

 sed -ri 's#(SELINUX=)enforcing#\1disabled#' /etc/selinux/config
 grep ^SELINUX= /etc/selinux/config
 setenforce 0
 getenforce

6、配置集群免密登录及同步脚本

 # (1)修改主机列表
 cat >> /etc/hosts << 'EOF'
 192.168.1.10 master
 192.168.1.11 node1
 192.168.1.12 node2
 EOF
 
 # (2)master节点上生成密钥对
 ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa -q
 
 # (3)master配置所有集群节点的免密登录
 ssh-copy-id master
 ssh-copy-id node1
 ssh-copy-id node2
 
 # (4)链接测试
 ssh 'master'
 ssh 'node1'
 ssh 'node2'
 
 # (5)为所有节点安装rsync数据同步工具
 yum -y install rsync
 
 # (6)编写同步脚本
 cat > /usr/local/sbin/data_rsync.sh << 'EOF'
 #!/bin/bash
 
 if [ $# -ne 1 ];then
     echo "Usage: $0 /path/to/file(绝对路径)"
     exit
 fi
 
 # 判断文件是否存在
 if [ ! -e $1 ];then
     echo "[ $1 ] dir or file not find!"
     exit
 fi
 
 # 获取父路径
 fullpath=`dirname $1`
 
 # 获取子路径
 basename=`basename $1`
 
 # 进入到父路径
 cd $fullpath
 
 rsync -az $basename `whoami`@master:$fullpath
 rsync -az $basename `whoami`@node1:$fullpath
 rsync -az $basename `whoami`@node2:$fullpath
 EOF
 
 chmod +x /usr/local/sbin/data_rsync.sh

7、集群时间同步

 systemctl start chronyd
 systemctl enable chronyd

四、ElasticSearch单点部署

以下在master节点上进行单点部署操作：

1、下载指定的ES版本

 官网：https://www.elastic.co/cn/
 下载：https://www.elastic.co/cn/downloads/elasticsearch
 文档：https://www.elastic.co/guide/index.html
 图形化界面地址：https://github.com/mobz/elasticsearch-head

 wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.3-x86_64.rpm

2、单点部署elasticsearch

 # (1)安装服务
 yum -y localinstall elasticsearch-7.17.3-x86_64.rpm
 
 # (2)启动服务
 systemctl start  elasticsearch.service
 
 # (3)查看服务端口;9200为集群外部提供的端口,提供httpd协议;9300为集群内部通信
 [root@master ~]# ss -ntl
 ...
 LISTEN   0      128       [::ffff:127.0.0.1]:9200              [::]:*   
 LISTEN   0      128                    [::1]:9200              [::]:*   
 LISTEN   0      128       [::ffff:127.0.0.1]:9300              [::]:*   
 LISTEN   0      128                    [::1]:9300              [::]:* 
 ...
 # (4)修改配置文件，将node1与node2集群加入
 vim /etc/elasticsearch/elasticsearch.yml
 ...
 cluster.name: master-elk    
 node.name: master
 path.data: /var/lib/elasticsearch
 path.logs: /var/log/elasticsearch
 network.host: 0.0.0.0
 discovery.seed_hosts: ["master"]
 
 相关参数说明：
     cluster.name:集群名称，若不指定，则默认是"elasticsearch",日志文件的前缀也是集群名称
     node.name:指定节点的名称，可以自定义，推荐使用当前的主机名，要求集群唯一
     path.data:数据路径
     path.logs:日志路径
     network.host:ES服务监听的IP地址
     discovery.seed_hosts:服务发现的主机列表，对于单点部署而言，主机列表和"network.host"字段配置相同即可
     
 # (5)重启服务
 systemctl restart elasticsearch
 
 # 查看日志文件:记录了请求过程等
 tail -100f /var/log/elasticsearch/master-elk.log

五、ElasticSearch分布式集群部署

首先在剩下两个节点中同样安装好elasticsearch：

 scp elasticsearch-7.17.3-x86_64.rpm node1:~
 scp elasticsearch-7.17.3-x86_64.rpm node2:~

1、master修改配置文件

 vim /etc/elasticsearch/elasticsearch.yml
 ...
 cluster.name: <集群名称>    
 node.name: <节点名称>
 path.data: /var/lib/elasticsearch
 path.logs: /var/log/elasticsearch
 network.host: 0.0.0.0
 discovery.seed_hosts: ["192.168.1.10","192.168.1.11","192.168.1.12"]
 cluster.initial_master_nodes: ["192.168.1.10","192.168.1.11","192.168.1.12"]

2、同步配置文件到集群的其他节点

 #(1)运行同步脚本，将配置文件同步到node1与node2节点
 data_rsync.sh /etc/elasticsearch/elasticsearch.yml
 
 #(2)同样的在node1与node2修改此配置文件，只需将node.name节点名称修改即可，例如：
 node1中:
 ..
 node.name: node1
 
 node中:
 node.name: node2
 
 # (3)检查配置文件是否正确:
 egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml

3、所有节点删除之前的临时数据

 rm -rf /var/{lib,log}/elasticsearch/* /tmp/*
 ll /var/{lib,log}/elasticsearch/ /tmp/

4、所有节点启动服务

 #(1)所有节点启动服务
 systemctl start elasticsearch
 
 #(2)启动过程中建议查看日志
 tail -100f /var/log/elasticsearch/master-elk.log

5、验证集群是否正常

 [root@master ~]# curl 192.168.1.10:9200/_cat/nodes
 192.168.1.12 34 95 61 2.15 0.61 0.34 cdfhilmrstw - node2
 192.168.1.11 10 94 29 0.92 0.32 0.15 cdfhilmrstw * node1
 192.168.1.10 15 96 37 1.00 0.30 0.14 cdfhilmrstw - master

六、部署Kibana服务

1、本地安装Kibana

 # 再次给出链接搜索下载地址:
 https://www.elastic.co/cn/downloads/past-releases#kibana

 # 在任意一节点安装即可，本例在node2节点安装
 wget https://artifacts.elastic.co/downloads/kibana/kibana-7.17.3-x86_64.rpm
 yum -y localinstall kibana-7.17.3-x86_64.rpm

2、修改kibana的配置文件

 vim /etc/kibana/kibana.yml
 ...
 server.host: 0.0.0.0
 server.name: "elk-server"
 elasticsearch.hosts: ["http://192.168.1.10:9200","http://192.168.1.11:9200","http://192.168.1.12:9200"]
 i18n.locale: "zh-CN"

3、启动kibana服务

 systemctl enable --now kibana
 systemctl status kibana

4、浏览器访问：

示例：菜单 ---> 堆栈监测

七、filebeat环境部署及基础使用

1、部署filebeat环境

 # 本例部署在node1节点中
 wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.17.3-x86_64.rpm
 yum -y localinstall filebeat-7.17.3-x86_64.rpm

2、修改filebeat的配置文件

监控屏幕标准输入：

 (1)编写测试的配置文件
 mkdir /etc/filebeat/config
 cat > /etc/filebeat/config/01-stdin-to-console.yml << 'EOF'
 filebeat.inputs:    # 指定输入的类型
 - type: stdin       # 指定输入的类型为"stdin",表示标准输入
 output.console:     # 指定输出的类型
   pretty: true      # 打印漂亮的格式
 EOF
 
 (2)运行filebeat实例
 filebeat -e -c /etc/filebeat/config/01-stdin-to-console.yml
 
 (3)测试
 111                 # 在屏幕上的标准输入
 {
 ...                 # 可以看到收集了相关信息，并且每三十秒检查一次
   "message": "111",
   "input": {
     "type": "stdin"
   },
   "host": {
     "name": "node1"
   }
 }

3、Input的log类型

filebeat默认按行收集，只有当文件中输入一行且换行后，才会收集到

监控指定日志文件：

 # (1)编写配置文件
 cat > /etc/filebeat/config/02-log-to-console.yml << 'EOF'
 filebeat.inputs:
 - type: log
   paths:
     - /tmp/test.log
 
 output.console:
   pretty: true
 EOF
 
 # (2)运行filebeat实例
 filebeat -e -c /etc/filebeat/config/02-log-to-console.yml
 
 # (3)测试1
 [root@node1 ~]# echo 111 >> /tmp/test.log
 # 另一个终端查看:
 {
   ...
   "message": "111",     # 收集到了我们/tmp/test.log中的日志
   "input": {
     "type": "log"
   },
   ...
 }
 # 测试2:
 [root@node1 ~]# echo -n 222 >> /tmp/test.log    # 不带换行符追加
 [root@node1 ~]# echo -n 3333 >> /tmp/test.log
 # 另一个终端发现没有收集到
 [root@node1 ~]# echo  >> /tmp/test.log
 # 将换行符追加到日志文件后，再次查看:（收集到日志了）
   },
   "log": {
     "offset": 4,    # 记录文件偏移量，通过此可实现，中断续读
     "file": {
       "path": "/tmp/test.log"
     }
   },
   "message": "2223333"
 }

文件记录文件：通过修改文件中的offset偏移量即可恢复从offset处重新读取

 [root@node1 ~]# cat /var/lib/filebeat/registry/filebeat/log.json

4、Input的通配符案例

配置多个log输入：

 cat > 03-logs-to-console.yml << 'EOF'
 filebeat.inputs:
 - type: log
   paths:
     - /tmp/test.log
     - /tmp/*.txt
 
 filebeat.inputs:
 - type: log
   paths:
     - /tmp/test/*/*.log         # 通配符的使用
     
 output.console:
   pretty: true
 EOF

5、将数据写入es

 # 编写配置文件
 cat > 05-log-to-es.yml << 'EOF'
 filebeat.inputs:
 - type: log
   enabled: true
   paths:
     - /tmp/test.log
     - /tmp/*.txt
   tags: ["linux","容器运维"]
   fields:
     school: "xx市xx县"
     class: "linux80"
 
 - type: log
   enabled: true
   paths:
     - /tmp/test/*/*.log
   tags: ["python","云原生开发"]
   fields:
     name: "boy"
     hobby: "抖音"
   fields_under_root: true
 
 output.elasticsearch:
   hosts: ["http://192.168.1.10:9200","http://192.168.1.11:9200","http://192.168.1.12:9200"]
 EOF
 
 # 清空之前的配置记录
 rm -rf /var/lib/filebeat/*
 # 运行测试
 filebeat -e -c /etc/filebeat/config/05-log-to-es.yml

此时，可以在浏览器的kibana中看到："菜单" ---> "Stack Management" ---> "索引管理"：

创建一个索引模式：可以根据自己想要查询的收集类型、日期等创建自己的索引；

在"菜单" ---> "Discover"中即可查看到收集到的日志：

其中还可根据字段进行查询；日志以JSON格式查看等等...

当然，再次向日志文件中追加字段监控后，前端页面也会实时的进行更新：

6、自定义es索引名称

编写配置文件：

 # 与上一样，省略;追加以下配置
 ...
 
 # 禁用索引生命周期管理
 setup.ilm.enabled: false
 # 设置索引模板的名称
 setup.template.name: "cluster-elk"
 # 设置索引模板的匹配模式
 setup.template.pattern: "cluster-elk*"

测试：

 rm -rf /var/lib/filebeat/*
 filebeat -e -c /etc/filebeat/config/06-log-to-es.yml

7、多个不同的索引写入

编写配置文件：

 output.elasticsearch:
   hosts: ["http://192.168.1.10:9200","http://192.168.1.11:9200","http://192.168.1.12:9200"]
   # index: "cluster-elk-%{+yyyy.MM.dd}"
   indices:
     - index: "cluster-elk-linux-%{+yyyy.MM.dd}"
       # 匹配指定字段包含的内容
       when.contains:
         tags: "linux"
     - index: "cluster-elk-python-%{[+yyyy.MM.dd]}"
       when.contains:
         tags: "python" 
  
 # 禁用索引生命周期管理,自定义索引才能生效
 setup.ilm.enabled: false
 # 设置索引模板的名称
 setup.template.name: "cluster-elk"
 # 设置索引模板的匹配模式
 setup.template.pattern: "cluster-elk*"

测试：

 rm -rf /var/lib/filebeat/*
 filebeat -e -c /etc/filebeat/config/07-log-to-es.yml

8、ES的分片和副本及filebeat配置

整体架构图如下：

 ...
 # 禁用索引生命周期管理
 setup.ilm.enabled: false
 # 设置索引模板的名称
 setup.template.name: "cluster-elk"
 # 设置索引模板的匹配模式
 setup.template.pattern: "cluster-elk*"
 # 覆盖已有的索引模板
 setup.template.overwrite: false
 # 配置索引模板
 setup.template.settings:
   index.number_of_shards: 3         # 设置分片数量
   index.number_of_replicas: 0       # 设置副本数量，要求小于集群数量

浏览器可视化界面测试：

 rm -rf /var/lib/filebeat/*
 filebeat -e -c /etc/filebeat/config/08-log-to-es.yml