Setting up ES + Kibana + Logstash + Filebeat for Log Collection and Analysis

Goal:

Use elasticsearch + kibana + logstash + filebeat to build a service that supports log collection, parsing, visualization, and search.

Architecture:

10.6.14.77  es, kibana, logstash  (all three default to localhost; running them on the same server needs no further changes)

logstash runs multiple instances on the single server, one per service, to keep each service's log collection isolated

filebeat (runs on every server whose logs need to be collected)

supervisor manages the corresponding services on each server

Versions:

es 6.2.3 (underlying log data store)

kibana 6.2.3 (web visualization service)

logstash 6.2.3 (log parsing and classification)

filebeat 6.3.2 (log collection on each server)

 

Prerequisites:

Java 8 must be installed

Elastic Stack official site: https://www.elastic.co/products/

 

elasticsearch:

1.unzip elasticsearch-6.2.3.zip

2. Start: ./bin/elasticsearch

   Background start: ./bin/elasticsearch -d

   Check that it started: curl localhost:9200

3. If you see the error: can not run elasticsearch as root

   Cause: a deliberate security restriction. Since Elasticsearch can accept and execute user-supplied scripts, it is recommended to create a dedicated user to run it.

   Fix: create an elsearch group and elsearch user

   groupadd elsearch

   useradd elsearch -g elsearch -p elasticsearch

   chown -R elsearch:elsearch elasticsearch

4. If you see the error: max virtual memory areas vm.max_map_count [65530] is too low

   sudo sysctl -w vm.max_map_count=262144
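   hint: sysctl -w takes effect immediately but does not survive a reboot; to make it permanent (assuming root):

   echo "vm.max_map_count=262144" >> /etc/sysctl.conf

   sysctl -p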

   Check the startup info locally: curl localhost:9200
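   A healthy node answers with a small JSON document along these lines (name and cluster_uuid are generated, so yours will differ):

   {
     "name" : "node-1",
     "cluster_name" : "elasticsearch",
     "version" : {
       "number" : "6.2.3",
       ...
     },
     "tagline" : "You Know, for Search"
   }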

5. Allow external access:

config/elasticsearch.yml

Change to: network.host: 0.0.0.0

    

filebeat:

Download: https://www.elastic.co/cn/downloads/beats/filebeat

1. Edit the filebeat.yml configuration

hint: by default filebeat.yml outputs data to elasticsearch; comment that out and enable the logstash output

filebeat.inputs: # input sources
- type: log
  # Change to true to enable this input configuration.
  enabled: true
  paths:
    - /data/log/* # log path to read
  tail_files: true # start reading from the end of the file; otherwise the whole file is read on startup

output.logstash: # send events to logstash
  hosts: ["10.6.14.77:5044"]

# hint: comment out the elasticsearch output

2. Test from the command line:

./filebeat -e -c filebeat.yml
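filebeat also ships test subcommands that can validate the config and the connection to the configured output before starting for real:

./filebeat test config -c filebeat.yml

./filebeat test output -c filebeat.yml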

 

logstash:

Download: https://www.elastic.co/cn/downloads/logstash

1.unzip logstash-6.2.3.zip

2. Test from the command line:

./bin/logstash -e "input {stdin{}} output {stdout{}}"

Typed input on the command line is echoed back as events on stdout
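Typing a line such as hello should print an event roughly like the following (stdout{} defaults to the rubydebug codec; host and timestamp will differ):

{
      "@version" => "1",
          "host" => "your-hostname",
    "@timestamp" => 2019-03-26T06:50:01.000Z,
       "message" => "hello"
}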

3. Test startup with a config file:

-e: pass the pipeline configuration inline; useful for quick tests
-f: point at a pipeline configuration file; for production use

Sample config file:

input { stdin { } }

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  stdout { codec => rubydebug }
}
./bin/logstash -f test.conf
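A config file can also be syntax-checked without starting the pipeline:

./bin/logstash -f test.conf --config.test_and_exit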

4. Production configuration

input {
    beats {
        port => "5044"
    }
} # input listens on port 5044, filebeat's default logstash port

input {
	beats {
		add_field => {"log_type" => "pisces"} # 对于不同服务的filebeat打来的log,添加不同的log_type
		port => 5044
	}
	beats {
		add_field => {"log_type" => "aries"}
		port => 5043
	}
	beats {
		add_field => {"log_type" => "aquarius"}
		port => 5045
	}
}
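Each service's filebeat then has to point at the port of its own input; a minimal sketch of, say, the aries server's filebeat.yml output section (host and port follow the input block above):

output.logstash:
  hosts: ["10.6.14.77:5043"] # the aries input listens on 5043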
filter {
	if "pisces" in [tags]{
		grok {
			match => { "message" => "\[%{WORD:info_level} %{DATESTAMP:timestamp} %{WORD:temp}:%{NUMBER:temp}\] %{NUMBER:status} %{WORD:method} %{URIPATHPARAM:request} \(%{IP:ip}\) %{GREEDYDATA:C}"
			}
		} # parsing rule; the uppercase names inside grok are pattern keywords from the grok pattern library
		mutate {
			split => ["request", "?"] # 将grok中匹配的”request“按”?“分隔
			add_field => {
				"uri" => "%{[request][0]}"
			}
			add_field => {
				"param" => "%{[request][1]}"
			}
            convert => ["C", "float"] # 将入库数据类型改为float,方便后续的kibana统计
		}

		kv { # split the "param" field added above on field_split and keep only the keys listed in include_keys
			source => "param"
			field_split => "&?"
			include_keys => ["subject", "paper_id", "session_key"]
			target => "kv"
		}
		date { # use the timestamp from the service log itself (otherwise @timestamp is the time the event entered es)
			match => ["timestamp", "dd-MM-yyyy HH:mm:ss"]
			timezone => "+08:00"
		}
        if "_grokparsefailure" in [tags] { # 抛弃掉不符合grok规则的日志
			drop {}
		}
        if ([uri] =~ "^\/klx") { # 抛弃掉某些无用的日志
			drop {}
		}
		if ([uri] == "/api/student/check_vip" or [uri] == "/api/correction/practice" or [uri] == "/api/student/recent_paper") {
			drop {} # drop certain useless logs
		}
		mutate {
			add_field => {"par" => "%{kv}"}
		}
		json {
			source => "par"
			remove_field => ["par"]
		}
	}
	if [log_type] == "aries" {
		grok {
			match => { "message" => "\[%{WORD:info_level} %{DATESTAMP:timestamp} %{WORD:temp}:%{NUMBER:temp}\] %{NUMBER:status} %{WORD:method} %{URIPATHPARAM:request} \(%{IP:ip}\) %{GREEDYDATA:C}"
			}
		}
		mutate {
			split => ["request", "?"]
			add_field => {
				"uri" => "%{[request][0]}"
			}
			add_field => {
				"param" => "%{[request][1]}"
			}
		}

		kv {
			source => "param"
			field_split => "&?"
			include_keys => ["subject", "paper_id", "username", "group_id", "role"]
			target => "kv"
		}
		mutate {
			add_field => {"par" => "%{kv}"}
		}
 		json {
			source => "par"
			remove_field => ["par"]
		}
		date {
			match => ["timestamp", "dd-MM-yyyy HH:mm:ss"]
			timezone => "+08:00"
		}
		if "_grokparsefailure" in [tags]{
			drop {}
		}
        	if ([uri] =~ "^\/klx") {
			drop {}
		}
		if ([uri] == "/analysis_authority") {
			drop {}
		}
		if ([uri] =~ "\/class_compare$") {mutate {replace => {"uri" => "/subject_analysis/class_compare"}}} # 将某些符合的uri替换掉
		if ([uri] =~ "\/subject_compare$") {mutate {replace => {"uri" => "/subject_analysis/subject_compare"}}}
		if ([uri] =~ "\/result_report$") {mutate {replace => {"uri" => "/subject_analysis/result_report"}}}
		if ([role] == "%E5%AD%A6%E7%A7%91%E8%80%81%E5%B8%88") {mutate {replace => {"role" => "teacher"}}} # 中文码替换
		if ([role] == "%E5%AD%A6%E7%A7%91%E7%BB%84%E9%95%BF") {mutate {replace => {"role" => "subject_leader"}}}
		if ([role] == "%E7%8F%AD%E4%B8%BB%E4%BB%BB") {mutate {replace => {"role" => "class_manager"}}}
		if ([role] == "%E5%B9%B4%E7%BA%A7%E4%B8%BB%E4%BB%BB") {mutate {replace => {"role" => "grade_manager"}}}
	}
        if "gardener" in [tags] {
		grok {
			match => { "message" => "\[%{WORD:info_level} %{DATESTAMP:timestamp} %{WORD:temp}:%{NUMBER:temp}\] %{NUMBER:status} %{WORD:method} %{URIPATHPARAM:request} \(%{IP:ip}\) %{GREEDYDATA:C}"
			}
		}
		mutate {
			split => ["request", "?"]
			add_field => {
				"uri" => "%{[request][0]}"
			}
			add_field => {
				"param" => "%{[request][1]}"
			}
		}
		kv {
			source => "param"
			field_split => "&?"
			include_keys => ["session_key", "group_id", "subject", "paper_id", "username"]
			target => "kv"
		}
		mutate {
			add_field => {"par" => "%{kv}"}
		}
		json {
			source => "par"
			remove_field => ["par"]
		}
		date {
			match => ["timestamp", "dd-MM-yy HH:mm:ss"]
			timezone => "+08:00"
		}
		if "_grokparsefailure" in [tags] {
			drop {}
		}
	}

	mutate {
		remove_field => ["request", "param", "beat", "input", "offset", "timestamp", "ip", "source", "prospector", "temp", "kv"]
	}
	translate { # map each parsed uri to a human-readable uri_type
		field => "[uri]"
		destination => "[uri_type]"
		dictionary => {
			"/analysis/exam_list" => "学情列表"
			"/analysis_v2/general" => "总体分析"
			"/analysis_v2/report" => "成绩报表"
			"/analysis_v2/title" => "加载单科学情"
			"/analysis_v2/item_detail" => "讲评单题统计"
			"/analysis_v2/paper" => "试卷讲评"
			"/analysis_v2/report_card_download" => "下载成绩单"
			"/my_students/student_answer" => "查看学生成绩"
			"/analysis_v2/result" => "考试分析"
			"/learning_tracking/generic" => "学情追踪基本学情"
			"/learning_tracking/students_score" => "学情追踪学生成绩"
			"/analysis_v2/check_explain" => "选中讲解"
			"/analysis_v2/paper_setting" => "单科学情自定义"
			"/analysis_v2/report_search" => "成绩单搜索"
			"/union_exam_analysis/statement_list" => "联考报告列表"
			"/analysis_v2/aim_item" => "举一反三"
			"/analysis_v2/report_download" => "单科学情报表下载"
			"/analysis_v2/wrong_item_download" => "单科学情错题号下载"
			"/subject_analysis/class_compare" => "多科学情班级对比"
			"/subject_analysis/subject_compare" => "多科学情学科对比"
			"/subject_analysis/result_report" => "多科学情成绩报表"
			"/subject_analysis/result_report_data" => "多科学情其他报表"

			"/api/student/paper_list" => "单科考试列表"
			"/api/student/analysis_report" => "学情报告"
			"/api/correction/subjects" => "错题本首页"
			"/api/student/situation_analysis" => "考情分析"
			"/api/student/subject_wrong_knowledge" => "报告错题本按知识点"
			"/api/student/subject_wrong_paper" => "报告错题本按考试"
			"/api/student/items_analysis" => "报告试题详情"
			"/api/correction/correct" => "提交订正"
			"/api/correction/subject_book" => "错题本列表"
			"/api/student/statement_list" => "多科考试列表"
			"/api/correction/items_list" => "错题本试题列表"
			"/api/student/depth_analysis" => "报告深度分析"
			"/api/student/SW_subject" => "报告优劣势学科"
			"/api/student/report_trend" => "报告成绩变化趋势"
			"/api/correction/image_upload" => "上传订正图片"
			"/api/student/item_grasp" => "报告试题标记掌握"
			"/api/student/pdf_render" => "导出报告错题本"
			"/api/correction/pdf_render" => "导出错题本"
			"/api/correction/note" => "错题本笔记"

			"/correction/list" => "错题本列表"
			"/correction/general" => "单科错题总览"
			"/correction/detail" => "单科单学生错题详情"
			"/correction/urge" => "催订正"
			"/analysis/paper_list" => "学情列表"
			"/analysis/paper_info" => "单科学情总览"
			"/analysis/paper_basic" => "单科学情基本信息"
			"/analysis/paper_special_student" => "单科学情关注学生"
			"/analysis/paper_items" => "单科学情逐题分析"
			"/analysis/paper_students" => "单科学情成绩单"
			"/analysis/paper_detail" => "单科试题详情"
		}
	}
}

output { # route events into different es indices by tags / log_type
	if "pisces" in [tags]{
	    elasticsearch {
			hosts => ["10.8.12.71:9200"]
			index => "pisces.log"
		}
	}
    if "gardener" in [tags]{
		elasticsearch {
			hosts => ["10.8.12.71:9200"]
			index => "gardener.log"
		}
	}
	if [log_type] == "aries" {
		elasticsearch {
			hosts => ["10.8.12.71:9200"]
			index => "aries.log"
		}
	}
    if [log_type] == "aquarius" {
        elasticsearch {
            hosts => ["10.8.12.71:9200"]
            index => "aquarius.log"
        }
    }

	stdout {}
}

grok pattern reference: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns

grok debugging tool: https://grokdebug.herokuapp.com/

logstash filter plugin docs (date filter): https://www.elastic.co/guide/en/logstash/current/plugins-filters-date.html

The grok rules above parse a log line into the following structure before it is stored in es:

log:[I 26-03-2019 14:50:01 web:1971] 200 GET /api/student/situation_analysis?session_key=88d4faeacb0e97d24982405cf2e788b7&source=paper&subject=math&paper_id=5c9820af3eaeefc905030afa&t=1553583001380 (113.91.43.156) 40.97ms

{
             "C" => "40.97ms",
          "host" => "M7-10-6-12-27-14-77",
       "message" => "[I 26-03-2019 14:50:01 web:1971] 200 GET /api/student/situation_analysis?session_key=88d4faeacb0e97d24982405cf2e788b7&source=paper&subject=math&paper_id=5c9820af3eaeefc905030afa&t=1553583001380 (113.91.43.156) 40.97ms",
          "temp" => [
        [0] "web",
        [1] "1971"
    ],
        "method" => "GET",
       "request" => [
        [0] "/api/student/situation_analysis",
        [1] "session_key=88d4faeacb0e97d24982405cf2e788b7&source=paper&subject=math&paper_id=5c9820af3eaeefc905030afa&t=1553583001380"
    ],
            "ip" => "113.91.43.156",
        "status" => "200",
           "uri" => "/api/student/situation_analysis",
    "info_level" => "I",
     "timestamp" => "26-03-2019 14:50:01",
            "kv" => {
        "session_key" => "88d4faeacb0e97d24982405cf2e788b7",
            "subject" => "math",
           "paper_id" => "5c9820af3eaeefc905030afa"
    },
      "@version" => "1",
    "@timestamp" => 2019-03-26T06:50:01.000Z,
         "param" => "session_key=88d4faeacb0e97d24982405cf2e788b7&source=paper&subject=math&paper_id=5c9820af3eaeefc905030afa&t=1553583001380"
}
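Once events are flowing, whether documents are actually landing in es can be checked from the command line (index names per the output section above):

curl 'localhost:9200/_cat/indices?v' # list indices and document counts

curl 'localhost:9200/pisces.log/_search?size=1&pretty' # peek at one stored document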

 

kibana:

Download: https://www.elastic.co/cn/downloads/kibana

1.tar zxvf kibana-6.2.3-linux-x86_64.tar.gz

2. Start: ./bin/kibana

   Background start: nohup ./bin/kibana &

3. Allow external access:

config/kibana.yml

Change to: server.host: 0.0.0.0

4. Update elasticsearch.url to point at the elasticsearch address (a sketch follows)
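A minimal config/kibana.yml for this setup might look like the following (both keys exist in kibana 6.x; the url assumes es runs on the same host):

server.host: "0.0.0.0"
elasticsearch.url: "http://localhost:9200"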

After startup, add the index patterns under Management in the kibana web UI

 

HINT

I. Collecting logs on the same server and storing them, per service, in different es indices

1. Edit the filebeat.yml configuration

filebeat.inputs:
# two path sources; log files from different sources get different tags
- type: log
  paths:
    - /data/log/pisces/*
  tags: ["pisces"]

- type: log
  paths:
    - /data/log/aries/*
  tags: ["aries"]

2. Edit the logstash.conf configuration

# events carrying different tags are stored in different indices; likewise, the filter section can apply different parsing per log source
output {
	if "pisces" in [tags] {
		elasticsearch {
			hosts => ["10.6.14.77:9200"]
			index => "pisces.log"
		}
	}
	if "aries" in [tags] {
		elasticsearch {
			hosts => ["10.6.14.77:9200"]
			index => "aries.log"
		}
	}
}

II. Running multiple logstash instances on one server

Starting a second logstash instance directly on the same server fails immediately with the log message:

Logstash could not be started because there is already another instance using the configured data directory.  If you wish to run multiple instances, you must change the "path.data" setting.

Fix:

1. Create a dedicated path.data for each service

   e.g. /opt/sites/logstash-6.2.3/aquarius_data

        /opt/sites/logstash-6.2.3/aries_data

2. Pass path.data when running logstash

   /opt/sites/logstash-6.2.3/bin/logstash -f /opt/sites/logstash-6.2.3/conf/aries.conf --path.data /opt/sites/logstash-6.2.3/aries_data

   /opt/sites/logstash-6.2.3/bin/logstash -f /opt/sites/logstash-6.2.3/conf/aquarius.conf --path.data /opt/sites/logstash-6.2.3/aquarius_data

 

 

supervisor setup

Goal: start every service under supervisor and manage the related logs

Configs:

Elasticsearch

[program:Elasticsearch]
command =/data/elasticsearch-6.2.3/bin/elasticsearch
process_name=%(process_num)d
stopsignal=KILL
user=elsearch
redirect_stderr=true
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=20
stdout_logfile=/data/log/Elasticsearch/Elasticsearch.log
environment=PATH=/data/jdk1.8.0_201/bin:%(ENV_PATH)s

kibana

[program:kibana]
command = /data/kibana-6.2.3-linux-x86_64/bin/kibana
process_name=%(process_num)d
stopsignal=KILL
user=root
redirect_stderr=true
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=20
stdout_logfile=/data/log/kibana/kibana.log
environment=PATH=/data/jdk1.8.0_201/bin:%(ENV_PATH)s

logstash (if multiple logstash instances run on one server, each instance needs its own path.data passed at startup; a second program block is sketched below)

[program:logstash]
command =/data/logstash-6.2.3/bin/logstash -f /data/logstash-6.2.3/pisces.conf
directory=/data/logstash-6.2.3
process_name=%(process_num)d
stopsignal=KILL
user=root
redirect_stderr=true
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=20
stdout_logfile=/data/log/logstash/logstash.log
environment=PATH=/data/jdk1.8.0_201/bin:/data/logstash-6.2.3/bin:%(ENV_PATH)s
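A second instance can run as a separate program; a sketch, assuming the aries config and data directory created earlier live under /data/logstash-6.2.3:

[program:logstash_aries]
command =/data/logstash-6.2.3/bin/logstash -f /data/logstash-6.2.3/aries.conf --path.data /data/logstash-6.2.3/aries_data
directory=/data/logstash-6.2.3
process_name=%(process_num)d
stopsignal=KILL
user=root
redirect_stderr=true
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=20
stdout_logfile=/data/log/logstash/logstash_aries.log
environment=PATH=/data/jdk1.8.0_201/bin:/data/logstash-6.2.3/bin:%(ENV_PATH)s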

filebeat

[program:filebeat]
command = /data/filebeat-6.3.2-linux-x86_64/filebeat -e -c /data/filebeat-6.3.2-linux-x86_64/filebeat.yml
process_name=%(process_num)d
stopsignal=KILL
user=root
redirect_stderr=true
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=20
stdout_logfile=/data/log/filebeat/filebeat.log
environment=PATH=/data/jdk1.8.0_201/bin:%(ENV_PATH)s
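After adding or changing program blocks, reload supervisor and verify everything is running:

supervisorctl reread # pick up config changes
supervisorctl update # apply them (starts newly added programs)
supervisorctl status # list all programs and their states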

 

Added 2021-01-21:

Collecting stack-trace logs

Problem: logs are collected one line at a time, so multi-line stack traces on the servers cannot be captured as one event

Fix: install the multiline plugin and add a parsing rule to the logstash config so that an entire stack trace is stored in es as a single event

 

1. Install the multiline plugin

./bin/logstash-plugin install logstash-filter-multiline

If the install fails, changing the source in the Gemfile to http may help

2. Add the rule to the logstash config

    multiline {
            pattern => "\[%{WORD:info_level} %{DATESTAMP:timestamp}"
            negate => true
            what => "previous"
    }
# lines that do not match pattern are merged into the previous line (what => "previous") and stored in es as one event; negate inverts the match
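The same effect can also be achieved in filebeat itself (the multiline options below exist in filebeat 6.x), which avoids the extra logstash plugin; a sketch for the log format used above, where every new event starts with "[":

filebeat.inputs:
- type: log
  paths:
    - /data/log/*
  multiline.pattern: '^\[' # a new event starts with "["
  multiline.negate: true
  multiline.match: after # append non-matching lines to the preceding line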

 
