ELK Log Analysis and Processing

small white poplar

Last modified 2024-02-01 00:55:39

First published 2023-04-14 20:16:01

Article link: https://blog.csdn.net/weixin_55000003/article/details/130151113

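ELK is an open-source log analysis system

ELK is an acronym for three open-source projects: Elasticsearch, Logstash, and Kibana. A fourth component, Filebeat, was added later. It is a lightweight log collection agent that uses few resources, which makes it suitable for gathering logs on each server and forwarding them to Logstash. The official documentation also recommends it.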

Official documentation

Filebeat:
https://www.elastic.co/cn/products/beats/filebeat
https://www.elastic.co/guide/en/beats/filebeat/5.6/index.html

Logstash:
https://www.elastic.co/cn/products/logstash
https://www.elastic.co/guide/en/logstash/5.6/index.html

Kibana:
https://www.elastic.co/cn/products/kibana
https://www.elastic.co/guide/en/kibana/5.5/index.html

Elasticsearch:
https://www.elastic.co/cn/products/elasticsearch
https://www.elastic.co/guide/en/elasticsearch/reference/5.6/index.html

Elasticsearch Chinese community:
https://elasticsearch.cn/

Concepts

Elasticsearch: log search and storage
Logstash: log collection, parsing, and processing
Kibana: visualization
Elasticsearch: a search server based on Lucene

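Elasticsearch is an open-source, distributed, highly scalable, near-real-time search and data analytics engine written in Java, with a RESTful interface. Under the hood it builds on the open-source Apache Lucene search library, which answers queries through inverted indexes.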
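To make the inverted index idea concrete, here is a minimal sketch in Python. It is not how Lucene is implemented; the function names and the sample documents are made up for illustration, and tokenizing is reduced to lowercasing and splitting on whitespace.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the sorted list of document ids containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical sample documents, used only for illustration.
docs = {
    1: "GET /index.html 200",
    2: "GET /missing.html 404",
    3: "POST /login 200",
}
index = build_inverted_index(docs)
print(search(index, "get 200"))
```

The lookup goes from term to documents rather than scanning every document, which is what keeps full-text queries fast as the data grows.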

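How data is organized and stored: terminology

Index, Type, Document, Field (the smallest unit in ES)

An index contains types, a type contains documents, and a document contains fields. A value is located by naming the index, then the type, then the document, then the field.

node, cluster

A node is a host running one ES server; a cluster is made up of multiple nodes.

shards, replicas

Shards: an index is split into many small pieces that are stored on different nodes. Because the data is split this finely, losing a single node would make the whole index unreadable, so each shard also needs replicas.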
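The relationship between shards, replicas, and nodes can be sketched in a few lines of Python. This is a simplified model under stated assumptions: `zlib.crc32` stands in for the real routing hash, the round-robin placement is invented for the example, and the function names are hypothetical. The only part taken from Elasticsearch is the general shape of routing, a hash of the routing value modulo the number of primary shards.

```python
import zlib


def pick_shard(doc_id, number_of_primary_shards):
    """Choose a primary shard from a hash of the document id.

    zlib.crc32 is a stand-in hash for this sketch, not the one Elasticsearch uses.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards


def place_shards(nodes, number_of_shards):
    """Put each primary on one node and its single replica on the next node."""
    layout = {}
    for shard in range(number_of_shards):
        primary = nodes[shard % len(nodes)]
        replica = nodes[(shard + 1) % len(nodes)]
        layout[shard] = {"primary": primary, "replica": replica}
    return layout


def readable_shards(layout, failed_node):
    """List the shards that still have at least one copy after a node fails."""
    return [
        shard
        for shard, copies in layout.items()
        if copies["primary"] != failed_node or copies["replica"] != failed_node
    ]


nodes = ["es-0001", "es-0002", "es-0003"]
layout = place_shards(nodes, number_of_shards=5)
print(pick_shard("1", 5))
print(readable_shards(layout, failed_node="es-0002"))
```

With one replica per shard placed on a different node, every shard keeps a surviving copy when any single node fails, which is the point the text makes about replicas.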

Single-node installation

[root@es-0001 ~]# vim /etc/hosts
192.168.1.21	es-0001
192.168.1.22	es-0002
192.168.1.23	es-0003
192.168.1.24	es-0004
192.168.1.25	es-0005
[root@es-0001 ~]# yum install -y java-1.8.0-openjdk elasticsearch
[root@es-0001 ~]# vim /etc/elasticsearch/elasticsearch.yml
55:  network.host: 0.0.0.0  # by default only 127.0.0.1 is allowed to connect, so this must be changed
[root@es-0001 ~]# systemctl enable --now elasticsearch
[root@es-0001 ~]# curl http://127.0.0.1:9200/
{
  "name" : "War Eagle",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.3.4",
    "build_hash" : "e455fd0c13dceca8dbbdbb1665d068ae55dabe3f",
    "build_timestamp" : "2016-06-30T11:24:31Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.0"
  },
  "tagline" : "You Know, for Search"
}

Cluster installation

cluster.name: <cluster name; must be identical on every node>
node.name: <this host's hostname; must differ from every other node>
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
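discovery.zen.ping.unicast.hosts: ["es-0001", "es-0002"]  # list two seed nodes to avoid a single point of failure that would stop other nodes from finding the cluster; after a full restart, start these two first

Checking cluster status (fixed syntax)

Shows the cluster's name, status, and number of nodes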

 curl http://127.0.0.1:9200/_cluster/health?pretty
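As a small illustration of reading that response, the sketch below extracts the three values mentioned above from a parsed health document. The field names `cluster_name`, `status`, and `number_of_nodes` are the ones the health API returns; the sample dictionary is hand-written for this example and is not captured output, and the cluster name in it is hypothetical.

```python
def summarize_health(health):
    """Pick out the cluster name, status, and node count from a health response."""
    return "{cluster_name}: status={status}, nodes={number_of_nodes}".format(**health)


def needs_attention(health):
    """Report anything other than green.

    green: all shards allocated; yellow: some replicas unallocated;
    red: at least one primary shard unallocated.
    """
    return health["status"] != "green"


# Hypothetical response body, hand-written for this sketch.
sample = {"cluster_name": "my-es", "status": "green", "number_of_nodes": 5}
print(summarize_health(sample))
print(needs_attention(sample))
```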

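Cluster management

Management through the API
Management through plugins (essentially web pages)

The head plugin

Install Apache on es-0001 and deploy the head plugin

It displays the topology of the ES cluster and supports index-level and node-level operations.

It provides a set of cluster query APIs and returns the results as JSON and as tables.

It provides shortcut menus that show the various states of the cluster.

Use an ELB to map port 8080 and publish the web service on es-0001 to the internet (the httpd service adds a layer of authentication)
Access authorization on es-0001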

[root@es-0001 ~]# yum install -y httpd
[root@es-0001 ~]# systemctl enable --now httpd
[root@es-0001 ~]# tar zxf head.tar.gz -C /var/www/html

[root@es-0001 ~]# vim /etc/httpd/conf/httpd.conf
# append to the end of the configuration file
ProxyRequests off
ProxyPass /es/ http://127.0.0.1:9200/
ProxyPassReverse /es/ http://127.0.0.1:9200/
<Location ~ "^/es(-head)?/">
    Options None
    AuthType Basic
    AuthName "Elasticsearch Admin"
    AuthUserFile "/var/www/webauth"
    Require valid-user
</Location>
[root@es-0001 ~]# htpasswd -cm /var/www/webauth admin
New password: 
Re-type new password: 
Adding password for user admin
[root@es-0001 ~]# vim /etc/elasticsearch/elasticsearch.yml
# append to the end of the configuration file
http.cors.enabled : true
http.cors.allow-origin : "*"
http.cors.allow-methods : OPTIONS, HEAD, GET, POST, PUT, DELETE
http.cors.allow-headers : X-Requested-With,X-Auth-Token,Content-Type,Content-Length
[root@es-0001 ~]# systemctl restart elasticsearch httpd

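Access the ES cluster through the web plugin

Basic management through the API

The three parts of an HTTP request

Request line, message headers, request body

Request line: Method Request-URI HTTP-Version

HTTP request methods

Common methods: GET, POST, HEAD

Other methods: OPTIONS, PUT, DELETE, TRACE, CONNECT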
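The three parts of a request can be seen by assembling one by hand. The following is a minimal sketch with a hypothetical helper name; it only builds the text and sends nothing over the network.

```python
def build_http_request(method, path, headers, body=""):
    """Assemble the request line, message headers, and request body as text."""
    request_line = f"{method} {path} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    # A blank line separates the headers from the body.
    return "\r\n".join([request_line, *header_lines, "", body])


raw = build_http_request("GET", "/_cat/master?v", {"Host": "127.0.0.1:9200"})
print(raw)
```

A GET request usually has an empty body, so the text ends right after the blank line; a PUT or POST would carry its JSON document there.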

curl

-X  request method
-H  custom request header

Querying cluster state

# list the supported keywords
[root@es-0001 ~]# curl -XGET http://127.0.0.1:9200/_cat/
# query a specific item
[root@es-0001 ~]# curl -XGET http://127.0.0.1:9200/_cat/master
# show detailed output with ?v
[root@es-0001 ~]# curl -XGET http://127.0.0.1:9200/_cat/master?v
# show help with ?help
[root@es-0001 ~]# curl -XGET http://127.0.0.1:9200/_cat/master?help

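Creating an index

Specify the index name, the number of shards, and the number of replicas.
Create the index with the PUT method, then verify it with the head plugin.

Adding data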
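The article does not show the request itself, so the sketch below builds one possible form of it with Python's standard library. The index name `tedu` comes from the later examples; the shard and replica counts (5 and 1) and the helper name are assumptions chosen for illustration. The request is only constructed, not sent.

```python
import json
import urllib.request


def build_create_index_request(base_url, index, shards, replicas):
    """Build a PUT request whose body carries the shard and replica settings."""
    settings = {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(settings).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_create_index_request("http://127.0.0.1:9200", "tedu", shards=5, replicas=1)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it against a running cluster: urllib.request.urlopen(request)
```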

[root@es-0001 ~]# curl -XPUT -H "Content-Type: application/json" \
                    http://127.0.0.1:9200/tedu/teacher/1 -d '{
                      "职业": "诗人",
                      "名字": "李白",
                      "称号": "诗仙",
                      "年代": "唐"
                  }'

Querying data

[root@es-0001 ~]# curl -XGET http://127.0.0.1:9200/tedu/teacher/_search?pretty
[root@es-0001 ~]# curl -XGET http://127.0.0.1:9200/tedu/teacher/1?pretty

Deleting data

# delete one document
[root@es-0001 ~]# curl -XDELETE http://127.0.0.1:9200/tedu/teacher/1
# delete the index
[root@es-0001 ~]# curl -XDELETE http://127.0.0.1:9200/tedu

Importing data

[root@ecs-proxy ~]# gunzip logs.jsonl.gz 
[root@ecs-proxy ~]# curl -XPOST -H "Content-Type: application/json" http://192.168.1.21:9200/_bulk --data-binary @logs.jsonl
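The `_bulk` endpoint expects newline-delimited JSON: an action line followed by the document it applies to, repeated, with a final newline. The contents of `logs.jsonl` are not shown in the article, so the sketch below only illustrates the format. The helper name is hypothetical, the first document is the one from the earlier example, and the second is invented for the demonstration. The `_type` field matches the typed URLs used in this article; newer Elasticsearch versions no longer use types.

```python
import json


def to_bulk_body(index, doc_type, docs):
    """Serialize documents as NDJSON: one action line, then one source line."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


docs = [
    (1, {"职业": "诗人", "名字": "李白", "称号": "诗仙", "年代": "唐"}),
    (2, {"职业": "诗人", "名字": "杜甫", "称号": "诗圣", "年代": "唐"}),  # hypothetical
]
body = to_bulk_body("tedu", "teacher", docs)
print(body, end="")
```

This is also why the curl command uses `--data-binary`: plain `-d` strips newlines from the file, which would break the line-oriented format.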

Installing Kibana

Visualizing the data

yum install -y kibana

vim /etc/kibana/kibana.yml
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.hosts: ["http://xxx:9200","http://xxx:9200"]
i18n.locale: "zh-CN"

systemctl restart kibana

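Open Kibana, click Management, and create an index pattern based on the timestamp field.

In Visualize, build a pie chart on the index pattern created earlier: choose Split Slices, pick the Terms aggregation, then click the play button to start analyzing the data.

The cost of nested rings grows exponentially, so do not nest too many or the machine may hang; about three rings is enough. You can reorder the rings and run the analysis again, or switch an analysis off.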

Logstash

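A tool for collecting data, processing it (converting it to JSON), and forwarding it

Settings are written as key-value pairs of the form key => "value"

Requires a Java runtime

Pipeline stages

input: collects data
filter: processes data
output: emits data

The configuration file below already includes the settings needed once Filebeat is installed, as described later.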

yum install -y java-1.8.0-openjdk-devel logstash
 
ln -s /etc/logstash /usr/share/logstash/config
vim /etc/logstash/conf.d/my.conf
input {
  # stdin {}
  beats {
    port => 5044
  }
}
filter {
  grok {
    # the pattern files ship with ready-made regular expressions; search them if unsure
    match => { "message" => "%{HTTPD_COMMONLOG}" }
  }
}

output {
  stdout { codec => "rubydebug" }  # rubydebug is already the default
  elasticsearch {
    hosts => ["http://x.x.x.1:9200","http://x.x.x.2:9200"]
    index => "namelog-%{+YYYY.MM.dd}"
  }
}
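To show what the grok filter does to a log line, here is a rough Python equivalent. The regular expression is a simplified approximation written for this sketch, not the actual grok pattern, and the sample log line is made up. The field names follow the ones I believe the legacy Apache patterns produce, and `_grokparsefailure` is the tag grok adds to events that do not match.

```python
import re

# Simplified stand-in for the Apache common log pattern; an assumption of this sketch.
COMMON_LOG = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-)'
)


def parse_line(message):
    """Turn one access-log line into a dict of named fields, like a grok match."""
    match = COMMON_LOG.match(message)
    if match is None:
        # Mirror grok's behavior of tagging events it cannot parse.
        return {"message": message, "tags": ["_grokparsefailure"]}
    return {"message": message, **match.groupdict()}


# Hypothetical log line for illustration.
line = '192.168.1.254 - - [14/Apr/2023:20:16:01 +0800] "GET /index.html HTTP/1.1" 200 1024'
event = parse_line(line)
print(event["clientip"], event["verb"], event["request"], event["response"])
```

Once the line is split into fields such as `clientip` and `response`, Kibana can aggregate on them, which is not possible while everything sits in a single `message` string.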

Plugins

# see the official site for how to use each plugin
/usr/share/logstash/bin/logstash-plugin list

Filebeat

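Logstash depends on a Java runtime, so installing it on every web server consumes too many resources. Manually copying logs over to feed Logstash's input is not practical either.

Filebeat fills this gap as a lightweight relay: it runs on the web server and passes the logs to Logstash over the network.

If the logs are already in JSON format, Logstash can be skipped and the logs sent directly to Elasticsearch.

Installation and configuration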

yum install -y filebeat
systemctl start filebeat
vim /etc/filebeat/filebeat.yml

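24: enabled: true  # enable the log input
28: - /var/log/httpd/access_log  # path of the log file to collect
148: # comment out this line; by default events are sent to Elasticsearch
150: # comment out this line; by default events are sent to Elasticsearch
161: output.logstash:  # enable the Logstash output
163: hosts: ["x.x.x.x:5044"]
179: # comment out this line; it collects system-related information
180: # comment out this line; it collects system-related information
181: # comment out this line; it collects system-related information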

systemctl restart httpd filebeat

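The configuration above only suits collecting a single kind of log. With many kinds of logs (database logs, web logs, and so on), Logstash has to process each kind differently. To do that, tag the logs when Filebeat collects them, then use if conditions on the tag in Logstash to send each kind through its own filter and to decide whether to write it to Elasticsearch.

Multiple log types with Filebeat and Logstash

1. Filebeat: edit the existing configuration to add a tag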

vim /etc/filebeat/filebeat.yml

45: fields:
      logtype: http_log

systemctl restart filebeat

2. Logstash

vim /etc/logstash/conf.d/my.conf

input {
  # stdin {}
  beats {
    port => 5044
  }
}
filter {
  if [fields][logtype] == "http_log" {
    grok {
      # the pattern files ship with ready-made regular expressions; search them if unsure
      match => { "message" => "%{HTTPD_COMMONLOG}" }
    }
  }
}

output {
  stdout { codec => "rubydebug" }  # rubydebug is already the default
  if [fields][logtype] == "http_log" {
    elasticsearch {
      hosts => ["http://x.x.x.1:9200","http://x.x.x.2:9200"]
      index => "namelog-%{+YYYY.MM.dd}"
    }
  }
}
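The routing logic of this configuration can be restated as a short Python sketch. It is only a model of the conditionals, with hypothetical function names: every event goes to stdout, and only events tagged `http_log` by Filebeat also go to Elasticsearch, into an index named after the event's date.

```python
from datetime import datetime, timezone


def route(event):
    """Return the outputs an event is sent to, based on fields.logtype."""
    outputs = ["stdout"]  # every event is printed, as in the config above
    logtype = event.get("fields", {}).get("logtype")
    if logtype == "http_log":
        outputs.append("elasticsearch")
    return outputs


def index_name(timestamp):
    """Build the daily index name that namelog-%{+YYYY.MM.dd} expands to."""
    return "namelog-" + timestamp.strftime("%Y.%m.%d")


web_event = {"message": "...", "fields": {"logtype": "http_log"}}
other_event = {"message": "..."}
print(route(web_event))
print(route(other_event))
print(index_name(datetime(2023, 4, 14, tzinfo=timezone.utc)))
```

Supporting another log type would mean adding one more `fields.logtype` value in Filebeat and a matching branch in both the filter and output sections.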