ELK Log Collection System Plan
ELK architecture diagram
Software version information
- Installation method
yum install filebeat-5.4.0-x86_64.rpm -y
Usage
Access in a browser: http://xxx:5601
Expected results:
filebeat configuration file example
logstash configuration file example
elasticsearch configuration file example
kibana configuration file example
Automated Filebeat installation and configuration with Ansible
Ansible is a Python-based automation and operations tool.
Implementation steps:
- Check whether filebeat is already installed
- Stop the existing filebeat process
- Uninstall the old filebeat
- Create the target directory on the remote host
- Copy the installation package into the newly created directory
- Run the filebeat install command
- Copy the configuration file into the filebeat configuration directory
- Start filebeat and verify the elasticsearch index
- logstash and elasticsearch are single-node, so they are installed manually
- Installation
[root@modoule ~]# yum install epel-release -y
[root@modoule ~]# yum install ansible -y
- Configure hosts
[root@modoule ansible]# vim /etc/ansible/hosts
[app]
172.18.146.1
172.18.146.2
172.18.146.3
172.18.146.4
172.18.146.5
172.18.146.6
172.18.146.7
172.18.146.8
172.18.146.9
172.18.146.10
172.18.146.179
172.18.146.119
172.18.146.100
172.18.146.13
172.18.146.156
172.18.146.178
172.18.146.158
172.18.146.188
172.18.146.177
[mysql]
172.18.146.10 # master
172.18.146.11 # backup
[nginx]
# 172.18.146.20 # sync, cannot log in
172.18.146.1
172.18.146.2
172.18.146.3
[other]
172.18.146.17
172.18.146.16
172.18.146.18
172.18.146.15
172.18.146.11
172.18.146.19
172.18.146.51
172.18.146.50
172.18.146.52
172.18.146.53
172.18.146.59
172.18.146.60
172.18.146.90
[ELK]
172.18.146.110 # es
172.18.146.112 # logstash
172.18.146.113 # kibana
- Ansible playbook (automation task list)
---
- hosts: all
  remote_user: root
  gather_facts: false
  tasks:
    - name: ensure filebeat installed          # check whether filebeat is installed
      yum: name=filebeat state=present
    - name: stop filebeat                      # stop filebeat
      service: name=filebeat state=stopped
      register: result
    - name: uninstall filebeat                 # uninstall filebeat
      yum: name=filebeat state=absent
      when: result is success
    - name: create folder to store filebeat    # create the directory
      file:
        path: /workspace/app/app_pagekage_source
        state: directory
        force: no
    - name: copy filebeat to remote host       # copy the filebeat package to each node
      copy:
        src: /root/data/filebeat-5.4.0-x86_64.rpm
        dest: /workspace/app/app_pagekage_source/filebeat-5.4.0-x86_64.rpm
        force: no
      register: back
    - name: install filebeat from a local path # install filebeat
      yum:
        name: /workspace/app/app_pagekage_source/filebeat-5.4.0-x86_64.rpm
        state: present
      when: back is success
      register: res
    - name: get IP                             # check the IP so the copy-config script runs only on the control node
      shell: hostname -I | grep 141 | wc -l
      register: ip
    - debug: var=ip
    - name: copy filebeat.yml to remote host   # copy the config file to each node
      script: /root/test.sh
      when: ip.stdout == "1" and res is success
    - name: start filebeat                     # start filebeat
      service: name=filebeat state=started
Batch-copy script for the filebeat configuration file
#!/bin/bash
path=/root/haha
cd $path && a=$(ls) && \
for i in $a; do
    cd ./$i && scp filebeat.yml root@192.168.227.$i:/etc/filebeat/ && cd ..
done
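The loop above assumes each subdirectory of /root/haha is named after a host's last octet. A dry-run sketch of the same loop, using hypothetical directories 101 and 102 and collecting the commands instead of running scp, so no remote hosts are contacted:

```shell
#!/bin/bash
# Dry-run sketch of the batch-copy loop: directories 101 and 102 stand in for
# the per-host folders; the scp commands are collected, not executed.
path=$(mktemp -d)
mkdir -p $path/101 $path/102
touch $path/101/filebeat.yml $path/102/filebeat.yml
cmds=""
cd "$path" && a=$(ls) && \
for i in $a; do
    cd ./$i && cmds="$cmds scp filebeat.yml root@192.168.227.$i:/etc/filebeat/ ;" && cd ..
done
cd / && rm -rf "$path"
echo "$cmds"
```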
Generic log rotation and cleanup scripts for microservices
- Daily log rotation:
Enter each application directory, find the files ending in .log or .out and pack them into an archive, then truncate both kinds of log file.
Cron job: 00 00 * * * /server/scripts/zip_app_log.sh > /dev/null 2>&1
#!/bin/bash
yesterday=$(date -d "-1 day" +%Y%m%d)
cd /workspace/data/webapps && \
path=$(find ./ ! -name "." -type d -prune)
for i in $path
do
    cd $i && a=$(find ./ -name "*.log" -o -name "*.out") && \
    zip -r ${yesterday}.zip $a && \
    for j in $a; do echo -n "" > $j; done && \
    cd ..
done
- Daily cleanup of log files older than 7 days:
Enter each application directory, find the packed log archives older than 7 days, and delete them.
Cron job: 00 00 * * * /server/scripts/clean_app_log.sh > /dev/null 2>&1
#!/bin/bash
cd /workspace/data/webapps && \
path=$(find ./ ! -name "." -type d -prune)
for p in $path
do
cd $p && find ./ -type f -mtime +7 -name "*.zip" -exec rm -f {} \; \
&& cd ..
done
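A sketch of the 7-day cleanup against a temporary directory; `touch -d` backdates one archive so `find -mtime +7` has something to match (directory and file names are hypothetical):

```shell
#!/bin/bash
# Sandbox run of the cleanup logic: archives older than 7 days are removed,
# fresh ones survive.
base=$(mktemp -d)
mkdir -p $base/app1
touch -d "10 days ago" $base/app1/20190101.zip   # old archive: should be removed
touch $base/app1/today.zip                       # fresh archive: should survive
cd $base
path=$(find ./ ! -name "." -type d -prune)
for p in $path
do
    cd $p && find ./ -type f -mtime +7 -name "*.zip" -exec rm -f {} \; && cd ..
done
ls $base/app1
```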
MySQL logs
- MySQL log types: error log, general query log, slow query log, transaction log, binary log
The corresponding physical files are:
1. Error log: /workspace/data/mysql/mysql-error.log
Cleanup method:
mv mysql-error.log mysql-error.log-old
mysql -uroot -p'xxx' -e 'flush error logs'
mv mysql-error.log-old backup-directory
Cron script: /server/scripts/cut_mysql_error_log.sh # daily rotation, keep 7 days
#!/bin/bash
logs_path="/workspace/data/mysql"
cd $logs_path && \
/usr/bin/zip mysql-error_$(date +%F).zip \
`find ./ -type f -name "mysql-error.log"` && \
mysql -uroot -p'xxx' -e 'flush error logs'
Cron job: 00 00 * * * /server/scripts/cut_mysql_error_log.sh > /dev/null 2>&1
2. General query log: not enabled in production
3. Slow query log: /workspace/data/mysql/mysql-slow.log (not enabled in production)
Cleanup method:
mv mysql-slow.log mysql-slow.log-old
mysql -uroot -p'xxx' -e 'flush slow logs'
mv mysql-slow.log-old backup-directory
4. Transaction log: enabled in production; no action taken
5. Binary log: /workspace/data/mysql/mysql-bin
Cleanup method: keep binary logs for 7 days and purge them by point in time; set expire_logs_days = 7
Syntax:
mysql> PURGE { BINARY | MASTER } LOGS
    { TO 'log_name' | BEFORE datetime_expr }
Examples:
mysql> PURGE BINARY LOGS TO 'mysql-bin.000007';
# delete binary log files earlier than mysql-bin.000007
mysql> PURGE BINARY LOGS BEFORE '2019-05-01 10:26:36';
# delete binary logs from before May 1, 2019
mysql> flush logs; # rotate (regenerate) all logs
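To purge by a rolling window rather than a fixed date, the BEFORE cutoff can be computed with `date`. A minimal sketch that builds (but does not execute) a PURGE statement matching expire_logs_days = 7; the mysql connection details are placeholders:

```shell
#!/bin/bash
# Build a PURGE statement that drops binary logs older than 7 days.
cutoff=$(date -d "-7 days" "+%Y-%m-%d %H:%M:%S")
stmt="PURGE BINARY LOGS BEFORE '$cutoff';"
echo "$stmt"
# to execute: mysql -uroot -p'xxx' -e "$stmt"
```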
Nginx logs
- Nginx in production is used only as a proxy, on 4 servers (internal IPs): daily rotation, keep 7 days
172.18.146.186 path: /usr/local/nginx/logs
172.18.146.191 path: /apps/nginx/logs
172.18.146.192 path: /apps/nginx/logs
172.18.146.163 path: /apps/nginx/logs
Rotation and cleanup method:
Cron job: 00 00 * * * /bin/sh /server/scripts/cut_nginx_log.sh > /dev/null 2>&1
Daily rotation script: /server/scripts/cut_nginx_log.sh
#!/bin/bash
# daily rotation
logs_path="/apps/nginx/logs"
cd $logs_path && \
/usr/bin/zip Nginx_logs_$(date +%F -d "-1 day").zip \
`find ./ -regex ".*\.log\|.*\.err"` && \
find ./ -regex ".*\.log\|.*\.err" -exec rm -f {} \;
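The `find -regex` pattern above uses emacs-style alternation (`\|`) against the whole path. A sketch checking what it matches in a temporary directory (file names are hypothetical): .log and .err files are selected, other extensions are not:

```shell
#!/bin/bash
# Exercise the find -regex pattern from the rotation script in a sandbox.
base=$(mktemp -d)
touch $base/access.log $base/error.log $base/nginx.err $base/notes.txt
cd $base
matched=$(find ./ -regex ".*\.log\|.*\.err" | sort)
echo "$matched"
```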
7-day cleanup script: /server/scripts/clean_nginx_log.sh
#!/bin/bash
logs_path="/workspace/app/nginx/logs"
cd $logs_path && \
find ./ -type f -mtime +7 -name "*.zip" -exec rm -f {} \;
Kafka logs
- Kafka currently runs on three production servers; logs are cleaned once a week
172.18.146.1 path: /apps/kafka_2.11/logs
172.18.146.2 path: /apps/kafka_2.11/logs
172.18.146.3 path: /apps/kafka_2.11/logs
The development team confirmed Kafka logs are not collected; only periodic cleanup needs to be configured.
Cleanup script: /server/scripts/clean_kafka_log.sh
#!/bin/bash
logs_path="/workspace/app/kafka_2.11/logs"
cd $logs_path && \
find ./ -type f -mtime +7 -regex ".*/server.*\|.*/state-change.*" \
-exec rm -f {} \;
Cron job: 00 00 * * * /server/scripts/clean_kafka_log.sh > /dev/null 2>&1
Redis logs
- Redis currently runs on three production servers:
172.18.146.1 path: logging not enabled
172.18.146.2 path: /workspace/log/redis
172.18.146.3 path: /workspace/log/redis
Cron jobs: 00 00 * * * /server/scripts/cut_redis_log.sh > /dev/null 2>&1
           00 00 * * 0 /server/scripts/clean_redis_log.sh > /dev/null 2>&1
Scripts: logs are rotated daily and kept for seven days
#!/bin/bash
# daily rotation
logs_path="/workspace/log/redis"
cd $logs_path && \
/usr/bin/zip redis_$(date +%F -d "-1 day").zip \
`find ./ -name "redis.log"` && echo -n "" > `find ./ -name "redis.log"`
#!/bin/bash
# weekly cleanup: remove rotated archives older than 7 days
logs_path="/workspace/log/redis"
cd $logs_path && \
find ./ -type f -mtime +7 -name "redis_*.zip" -exec rm -f {} \;
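The rotation script truncates redis.log in place (`echo -n "" >`) instead of moving it, because redis-server keeps the log file descriptor open: an `mv` would leave the server writing to the renamed file. A minimal sketch showing that in-place truncation keeps the same inode (the temp file stands in for redis.log):

```shell
#!/bin/bash
# Truncating in place keeps the same inode, so a process holding the file
# open (like redis-server) keeps writing to it after rotation.
f=$(mktemp)
echo "old log data" > $f
before=$(stat -c %i $f)
echo -n "" > $f          # truncate in place, as the rotation script does
after=$(stat -c %i $f)
size=$(stat -c %s $f)
echo "inode before=$before after=$after size=$size"
rm -f $f
```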
RabbitMQ logs
- RabbitMQ has two production servers, with one in use: its logs are already rotated daily with the last seven days kept, and they are not collected this time
172.18.146.10 path: /var/log/rabbitmq
172.18.146.10 path: not running
Cron job: 00 00 * * * /server/scripts/clean_rabbitmq_log.sh > /dev/null 2>&1
Cleanup script: /server/scripts/clean_rabbitmq_log.sh
#!/bin/bash
logs_path="/var/log/rabbitmq"
cd $logs_path && \
find ./ -type f -mtime +7 -name "rabbit*.gz" -exec rm -f {} \;
- Periodic index deletion
Cron job: 0 1 * * * /server/scripts/clean_es_indexer.sh
#!/bin/bash
# es-index-clear
# keep only the last 15 days of log indexes
LAST_DATE=`date -d "-15 days" "+%Y.%m.%d"`
# delete the index dated 15 days ago
curl -XDELETE 'http://ip:port/*-'${LAST_DATE}'*'
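Because the script runs daily, deleting only the single date 15 days back is enough to keep a rolling 15-day window. A dry-run sketch that computes the date and prints the DELETE URL instead of calling curl (`ip:port` stays a placeholder):

```shell
#!/bin/bash
# Dry run of the index-cleanup request: build the URL for the index dated
# 15 days ago, matching the naming pattern *-YYYY.MM.DD*.
LAST_DATE=$(date -d "-15 days" "+%Y.%m.%d")
url="http://ip:port/*-${LAST_DATE}*"
echo "curl -XDELETE '$url'"
```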