Goal:
Build a log pipeline with elasticsearch + kibana + logstash + filebeat to support log collection, parsing, visualization, and search.
Architecture:
10.6.14.77  es, kibana, logstash (all three default to localhost; running them on the same server requires no changes)
logstash    multiple instances on one server, one per service, to isolate each service's log collection
filebeat    one instance on every server whose logs need collecting
All of these services are managed on each server via supervisor.
Versions:
es 6.2.3       (log data storage)
kibana 6.2.3   (web visualization)
logstash 6.2.3 (log parsing and classification)
filebeat 6.3.2 (per-server log collection)
Prerequisites:
Java 8 must be installed.
Elastic Stack site: https://www.elastic.co/products/
elasticsearch:
1. unzip elasticsearch-6.2.3.zip
2. Start: ./bin/elasticsearch
Background start: ./bin/elasticsearch -d
Verify it is up: curl localhost:9200
3. If it fails with: can not run elasticsearch as root
Cause: a deliberate safety restriction. Elasticsearch can accept and execute user-supplied scripts, so running it as root is unsafe; create a dedicated user to run it instead.
Fix: create an elsearch group and user, then give it ownership of the install directory:
groupadd elsearch
useradd elsearch -g elsearch -p elasticsearch
chown -R elsearch:elsearch elasticsearch
4. If it fails with: max virtual memory areas vm.max_map_count [65530] is too low
sudo sysctl -w vm.max_map_count=262144
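The sysctl change above is lost on reboot. To make it permanent (a sketch, assuming a standard Linux host), add the setting to /etc/sysctl.conf:

```
# /etc/sysctl.conf -- persist the mmap count limit required by elasticsearch
vm.max_map_count = 262144
```

Then run sudo sysctl -p to apply it without rebooting.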
Check the startup info locally: curl localhost:9200
5. Allow external access:
In config/elasticsearch.yml set:
network.host: 0.0.0.0
filebeat:
Download: https://www.elastic.co/cn/downloads/beats/filebeat
1. Edit filebeat.yml
hint: by default filebeat outputs straight to elasticsearch; comment that section out and enable the logstash output instead.
filebeat.inputs: # input sources
- type: log
  # Change to true to enable this input configuration.
  enabled: true
  paths:
    - /data/log/* # log paths to read
  tail_files: true # input option: start at the end of each file; otherwise the whole file is read on startup
output.logstash: # send events to logstash
  hosts: ["10.6.14.77:5044"]
# hint: comment out the elasticsearch output
2. Test from the command line:
./filebeat -e -c filebeat.yml
logstash:
Download: https://www.elastic.co/cn/downloads/logstash
1. unzip logstash-6.2.3.zip
2. Test from the command line:
./bin/logstash -e "input {stdin{}} output {stdout{}}"
Reads from stdin and echoes events back to stdout.
3. Test with a config file:
-e: pass the config inline; useful for quick tests
-f: read the config from a file; use this in production
Sample config file:
input { stdin { } }
filter {
grok {
match => { "message" => "%{COMBINEDAPACHELOG}" }
}
date {
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
}
}
output {
stdout { codec => rubydebug }
}
./bin/logstash -f test.conf
4. Production config
input {
beats {
port => "5044"
}
} # listen on port 5044, filebeat's default logstash port
To keep several services separate, open one beats port per service instead, tagging each with its own field:
input {
beats {
add_field => {"log_type" => "pisces"} # tag logs arriving from each service's filebeat with its own log_type
port => 5044
}
beats {
add_field => {"log_type" => "aries"}
port => 5043
}
beats {
add_field => {"log_type" => "aquarius"}
port => 5045
}
}
filter {
if "pisces" in [tags]{
grok {
match => { "message" => "\[%{WORD:info_level} %{DATESTAMP:timestamp} %{WORD:temp}:%{NUMBER:temp}\] %{NUMBER:status} %{WORD:method} %{URIPATHPARAM:request} \(%{IP:ip}\) %{GREEDYDATA:C}"
}
} # parsing rule; the uppercase names are keywords from the grok pattern library
mutate {
split => ["request", "?"] # split the grok-extracted "request" field on "?"
add_field => {
"uri" => "%{[request][0]}"
}
add_field => {
"param" => "%{[request][1]}"
}
convert => ["C", "float"] # store the duration as a float so kibana can aggregate it later
}
kv { # split the "param" field added by mutate on "field_split" and keep only the keys listed in "include_keys"
source => "param"
field_split => "&?"
include_keys => ["subject", "paper_id", "session_key"]
target => "kv"
}
date { # use the log line's own timestamp (otherwise @timestamp is the time the event entered es)
match => ["timestamp", "dd-MM-YY HH:mm:ss"]
timezone => "+08:00"
}
if "_grokparsefailure" in [tags] { # drop lines that did not match the grok rule
drop {}
}
if ([uri] =~ "^\/klx") { # drop uninteresting requests
drop {}
}
if ([uri] == "/api/student/check_vip" or [uri] == "/api/correction/practice" or [uri] == "/api/student/recent_paper") {
drop {} # drop uninteresting requests
}
mutate {
add_field => {"par" => "%{kv}"}
}
json {
source => "par"
remove_field => ["par"]
}
}
if [log_type] == "aries" {
grok {
match => { "message" => "\[%{WORD:info_level} %{DATESTAMP:timestamp} %{WORD:temp}:%{NUMBER:temp}\] %{NUMBER:status} %{WORD:method} %{URIPATHPARAM:request} \(%{IP:ip}\) %{GREEDYDATA:C}"
}
}
mutate {
split => ["request", "?"]
add_field => {
"uri" => "%{[request][0]}"
}
add_field => {
"param" => "%{[request][1]}"
}
}
kv {
source => "param"
field_split => "&?"
include_keys => ["subject", "paper_id", "username", "group_id", "role"]
target => "kv"
}
mutate {
add_field => {"par" => "%{kv}"}
}
json {
source => "par"
remove_field => ["par"]
}
date {
match => ["timestamp", "dd-MM-YY HH:mm:ss"]
timezone => "+08:00"
}
if "_grokparsefailure" in [tags]{
drop {}
}
if ([uri] =~ "^\/klx") {
drop {}
}
if ([uri] == "/analysis_authority") {
drop {}
}
if ([uri] =~ "\/class_compare$") {mutate {replace => {"uri" => "/subject_analysis/class_compare"}}} # normalize certain matching uris
if ([uri] =~ "\/subject_compare$") {mutate {replace => {"uri" => "/subject_analysis/subject_compare"}}}
if ([uri] =~ "\/result_report$") {mutate {replace => {"uri" => "/subject_analysis/result_report"}}}
if ([role] == "%E5%AD%A6%E7%A7%91%E8%80%81%E5%B8%88") {mutate {replace => {"role" => "teacher"}}} # decode percent-encoded Chinese role names
if ([role] == "%E5%AD%A6%E7%A7%91%E7%BB%84%E9%95%BF") {mutate {replace => {"role" => "subject_leader"}}}
if ([role] == "%E7%8F%AD%E4%B8%BB%E4%BB%BB") {mutate {replace => {"role" => "class_manager"}}}
if ([role] == "%E5%B9%B4%E7%BA%A7%E4%B8%BB%E4%BB%BB") {mutate {replace => {"role" => "grade_manager"}}}
}
if "gardener" in [tags] {
grok {
match => { "message" => "\[%{WORD:info_level} %{DATESTAMP:timestamp} %{WORD:temp}:%{NUMBER:temp}\] %{NUMBER:status} %{WORD:method} %{URIPATHPARAM:request} \(%{IP:ip}\) %{GREEDYDATA:C}"
}
}
mutate {
split => ["request", "?"]
add_field => {
"uri" => "%{[request][0]}"
}
add_field => {
"param" => "%{[request][1]}"
}
}
kv {
source => "param"
field_split => "&?"
include_keys => ["session_key", "group_id", "subject", "paper_id", "username"]
target => "kv"
}
mutate {
add_field => {"par" => "%{kv}"}
}
json {
source => "par"
remove_field => ["par"]
}
date {
match => ["timestamp", "dd-MM-yy HH:mm:ss"]
timezone => "+08:00"
}
if "_grokparsefailure" in [tags] {
drop {}
}
}
mutate {
remove_field => ["request", "param", "beat", "input", "offset", "timestamp", "ip", "source", "prospector", "temp", "kv"]
}
translate { # map each parsed uri to a human-readable type
field => "[uri]"
destination => "[uri_type]"
dictionary => {
"/analysis/exam_list" => "学情列表"
"/analysis_v2/general" => "总体分析"
"/analysis_v2/report" => "成绩报表"
"/analysis_v2/title" => "加载单科学情"
"/analysis_v2/item_detail" => "讲评单题统计"
"/analysis_v2/paper" => "试卷讲评"
"/analysis_v2/report_card_download" => "下载成绩单"
"/my_students/student_answer" => "查看学生成绩"
"/analysis_v2/result" => "考试分析"
"/learning_tracking/generic" => "学情追踪基本学情"
"/learning_tracking/students_score" => "学情追踪学生成绩"
"/analysis_v2/check_explain" => "选中讲解"
"/analysis_v2/paper_setting" => "单科学情自定义"
"/analysis_v2/report_search" => "成绩单搜索"
"/union_exam_analysis/statement_list" => "联考报告列表"
"/analysis_v2/aim_item" => "举一反三"
"/analysis_v2/report_download" => "单科学情报表下载"
"/analysis_v2/wrong_item_download" => "单科学情错题号下载"
"/subject_analysis/class_compare" => "多科学情班级对比"
"/subject_analysis/subject_compare" => "多科学情学科对比"
"/subject_analysis/result_report" => "多科学情成绩报表"
"/subject_analysis/result_report_data" => "多科学情其他报表"
"/api/student/paper_list" => "单科考试列表"
"/api/student/analysis_report" => "学情报告"
"/api/correction/subjects" => "错题本首页"
"/api/student/situation_analysis" => "考情分析"
"/api/student/subject_wrong_knowledge" => "报告错题本按知识点"
"/api/student/subject_wrong_paper" => "报告错题本按考试"
"/api/student/items_analysis" => "报告试题详情"
"/api/correction/correct" => "提交订正"
"/api/correction/subject_book" => "错题本列表"
"/api/student/statement_list" => "多科考试列表"
"/api/correction/items_list" => "错题本试题列表"
"/api/student/depth_analysis" => "报告深度分析"
"/api/student/SW_subject" => "报告优劣势学科"
"/api/student/report_trend" => "报告成绩变化趋势"
"/api/correction/image_upload" => "上传订正图片"
"/api/student/item_grasp" => "报告试题标记掌握"
"/api/student/pdf_render" => "导出报告错题本"
"/api/correction/pdf_render" => "导出错题本"
"/api/correction/note" => "错题本笔记"
"/correction/list" => "错题本列表"
"/correction/general" => "单科错题总览"
"/correction/detail" => "单科单学生错题详情"
"/correction/urge" => "催订正"
"/analysis/paper_list" => "学情列表"
"/analysis/paper_info" => "单科学情总览"
"/analysis/paper_basic" => "单科学情基本信息"
"/analysis/paper_special_student" => "单科学情关注学生"
"/analysis/paper_items" => "单科学情逐题分析"
"/analysis/paper_students" => "单科学情成绩单"
"/analysis/paper_detail" => "单科试题详情"
}
}
}
output { # route logs to different es indexes by tags / log_type
if "pisces" in [tags]{
elasticsearch {
hosts => ["10.8.12.71:9200"]
index => "pisces.log"
}
}
if "gardener" in [tags]{
elasticsearch {
hosts => ["10.8.12.71:9200"]
index => "gardener.log"
}
}
if [log_type] == "aries" {
elasticsearch {
hosts => ["10.8.12.71:9200"]
index => "aries.log"
}
}
if [log_type] == "aquarius" {
elasticsearch {
hosts => ["10.8.12.71:9200"]
index => "aquarius.log"
}
}
stdout {}
}
grok patterns: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns
grok debugger: https://grokdebug.herokuapp.com/
logstash filter plugin docs: https://www.elastic.co/guide/en/logstash/current/plugins-filters-date.html
The grok rules above split a log line into the structure below before it is stored in es:
log: [I 26-03-2019 14:50:01 web:1971] 200 GET /api/student/situation_analysis?session_key=88d4faeacb0e97d24982405cf2e788b7&source=paper&subject=math&paper_id=5c9820af3eaeefc905030afa&t=1553583001380 (113.91.43.156) 40.97ms
{
"C" => "40.97ms",
"host" => "M7-10-6-12-27-14-77",
"message" => "[I 26-03-2019 14:50:01 web:1971] 200 GET /api/student/situation_analysis?session_key=88d4faeacb0e97d24982405cf2e788b7&source=paper&subject=math&paper_id=5c9820af3eaeefc905030afa&t=1553583001380 (113.91.43.156) 40.97ms",
"temp" => [
[0] "web",
[1] "1971"
],
"method" => "GET",
"request" => [
[0] "/api/student/situation_analysis",
[1] "session_key=88d4faeacb0e97d24982405cf2e788b7&source=paper&subject=math&paper_id=5c9820af3eaeefc905030afa&t=1553583001380"
],
"ip" => "113.91.43.156",
"status" => "200",
"uri" => "/api/student/situation_analysis",
"info_level" => "I",
"timestamp" => "26-03-2019 14:50:01",
"kv" => {
"session_key" => "88d4faeacb0e97d24982405cf2e788b7",
"subject" => "math",
"paper_id" => "5c9820af3eaeefc905030afa"
},
"@version" => "1",
"@timestamp" => 2019-03-26T06:50:01.000Z,
"param" => "session_key=88d4faeacb0e97d24982405cf2e788b7&source=paper&subject=math&paper_id=5c9820af3eaeefc905030afa&t=1553583001380"
}
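The mutate/kv steps above can be reproduced outside logstash for a sanity check. The following Python snippet (an illustration only, not part of the pipeline) performs the same uri/param split and whitelist-based kv extraction on the sample request, and also shows what the percent-encoded role values handled in the aries filter decode to:

```python
from urllib.parse import parse_qs, unquote

request = ("/api/student/situation_analysis?session_key=88d4faeacb0e97d24982405cf2e788b7"
           "&source=paper&subject=math&paper_id=5c9820af3eaeefc905030afa&t=1553583001380")

# mutate split => ["request", "?"]: uri is part 0, the query string is part 1
uri, param = request.split("?", 1)

# kv with include_keys: keep only the whitelisted keys (source and t are dropped)
include_keys = {"subject", "paper_id", "session_key"}
kv = {k: v[0] for k, v in parse_qs(param).items() if k in include_keys}

print(uri)  # /api/student/situation_analysis
print(kv)   # the three whitelisted keys with their values

# the percent-encoded role strings replaced in the aries filter are Chinese role names
print(unquote("%E5%AD%A6%E7%A7%91%E8%80%81%E5%B8%88"))  # 学科老师 -> mapped to "teacher"
```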
kibana:
Download: https://www.elastic.co/cn/downloads/kibana
1. tar zxvf kibana-6.2.3-linux-x86_64.tar.gz
2. Start: ./bin/kibana
Background start: nohup ./bin/kibana &
3. Allow external access:
In config/kibana.yml set:
server.host: 0.0.0.0
4. Point elasticsearch.url at the es instance.
After startup, add the indexes under Management in the kibana web UI.
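Steps 3 and 4 together amount to two lines in config/kibana.yml (a sketch; the url assumes es runs on the same host, per the architecture above):

```
server.host: "0.0.0.0"
elasticsearch.url: "http://localhost:9200"
```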
HINT
I: Collect logs from one server while keeping services separate, storing them in different es indexes
1. Edit filebeat.yml
filebeat.inputs:
# two path sources; log files from different sources get different tags
- type: log
  paths:
    - /data/log/pisces/*
  tags: ["pisces"]
- type: log
  paths:
    - /data/log/aries/*
  tags: ["aries"]
2. Edit logstash.conf
# data carrying different tags is written to different indexes; likewise, the filter can treat each log source differently
output {
if "pisces" in [tags] {
elasticsearch {
hosts => ["10.6.14.77:9200"]
index => "pisces.log"
}
}
if "aries" in [tags] {
elasticsearch {
hosts => ["10.6.14.77:9200"]
index => "aries.log"
}
}
}
II: Run multiple logstash instances on one server
Starting a second logstash instance on the same server fails immediately with:
Logstash could not be started because there is already another instance using the configured data directory. If you wish to run multiple instances, you must change the "path.data" setting.
Fix:
1. Create a separate path.data directory for each service, e.g.:
/opt/sites/logstash-6.2.3/aquarius_data
/opt/sites/logstash-6.2.3/aries_data
2. Pass path.data when starting logstash:
/opt/sites/logstash-6.2.3/bin/logstash -f /opt/sites/logstash-6.2.3/conf/aries.conf --path.data /opt/sites/logstash-6.2.3/aries_data
/opt/sites/logstash-6.2.3/bin/logstash -f /opt/sites/logstash-6.2.3/conf/aquarius.conf --path.data /opt/sites/logstash-6.2.3/aquarius_data
supervisor configs
Goal: start every service under supervisor and manage its logs there.
Configs:
Elasticsearch
[program:Elasticsearch]
command =/data/elasticsearch-6.2.3/bin/elasticsearch
process_name=%(process_num)d
stopsignal=KILL
user=elsearch
redirect_stderr=true
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=20
stdout_logfile=/data/log/Elasticsearch/Elasticsearch.log
environment=PATH=/data/jdk1.8.0_201/bin:%(ENV_PATH)s
kibana
[program:kibana]
command = /data/kibana-6.2.3-linux-x86_64/bin/kibana
process_name=%(process_num)d
stopsignal=KILL
user=root
redirect_stderr=true
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=20
stdout_logfile=/data/log/kibana/kibana.log
environment=PATH=/data/jdk1.8.0_201/bin:%(ENV_PATH)s
logstash (when running multiple logstash instances on one server, give each its own path.data and pass it on startup)
[program:logstash]
command =/data/logstash-6.2.3/bin/logstash -f /data/logstash-6.2.3/pisces.conf
directory=/data/logstash-6.2.3
process_name=%(process_num)d
stopsignal=KILL
user=root
redirect_stderr=true
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=20
stdout_logfile=/data/log/logstash/logstash.log
environment=PATH=/data/jdk1.8.0_201/bin:/data/logstash-6.2.3/bin:%(ENV_PATH)s
filebeat
[program:filebeat]
command = /data/filebeat-6.3.2-linux-x86_64/filebeat -e -c /data/filebeat-6.3.2-linux-x86_64/filebeat.yml
process_name=%(process_num)d
stopsignal=KILL
user=root
redirect_stderr=true
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=20
stdout_logfile=/data/log/filebeat/filebeat.log
environment=PATH=/data/jdk1.8.0_201/bin:%(ENV_PATH)s
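As a sketch, on the server that runs all four programs they can also be managed as one unit with a supervisor group (the names must match the [program:x] sections above):

```
[group:elk]
programs=Elasticsearch,kibana,logstash,filebeat
```

With this, supervisorctl restart elk:* restarts the whole stack at once.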
Added 2021-01-21:
Collecting stack traces
Problem: logs are collected one line at a time, so multi-line stack traces on the servers cannot be captured as single events.
Fix: install the multiline plugin and add a rule to the logstash config so each whole stack trace is stored in es as one event.
1. Install the multiline plugin
./bin/logstash-plugin install logstash-filter-multiline
If it will not install, change the source in Gemfile to http.
2. Add the rule to the logstash config
multiline {
pattern => "\[%{WORD:info_level} %{DATESTAMP:timestamp}"
negate => true
what => "previous"
}
# with negate => true, lines that do NOT match pattern are appended to the previous line ("what") and stored in es as one event
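The effect of negate => true with what => "previous" can be sketched in a few lines of Python (an illustration of the merge logic only, not the plugin itself): any line that does not start with the "[LEVEL dd-mm-yyyy" header is glued onto the previous event.

```python
import re

# header of a normal log line, e.g. "[E 26-03-2019 14:50:01 web:1971] ..."
HEADER = re.compile(r"^\[[A-Z]+ \d{2}-\d{2}-\d{4}")

def merge(lines):
    """Append every line that does not match HEADER to the previous event."""
    events = []
    for line in lines:
        if HEADER.match(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events

log = [
    "[I 26-03-2019 14:50:01 web:1971] 200 GET /api/x (1.2.3.4) 40.97ms",
    "[E 26-03-2019 14:50:02 web:1971] 500 GET /api/y (1.2.3.4) 3.10ms",
    "Traceback (most recent call last):",
    '  File "app.py", line 10, in handler',
    "ValueError: boom",
]
print(len(merge(log)))  # the traceback collapses into the error event -> 2 events
```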