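Background
Common services and hosts are already monitored with Prometheus, but the network devices have no monitoring yet. This post sets up SNMP-based monitoring for the network devices with Zabbix.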
Device overview
The devices are mainly H3C routers and switches.
H3C S5560 switches
Routers: MER5200, ER8300
One Synology NAS
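Steps
Enable telnet remote access on the network devices;
Enable the SNMP protocol on the devices;
Add the hosts in the Zabbix web UI to start monitoring;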
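SNMP overview
SNMP (Simple Network Management Protocol) has three versions: v1, v2c, and v3. v3 is the most secure of the three and requires a username, an authentication password, and a separate privacy (encryption) password.
Components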
- nginx
- php7
- mysql
- zabbix_server
- zabbix_agent
- zabbix_get
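zabbix_server from the latest Zabbix LTS release, 6.0, cannot be installed on CentOS 7, so version 5.0 is used here.
MySQL
Configure MySQL and run it as a container.
MySQL 8.0 is used, with a custom configuration file and persistent data directories.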
docker pull mysql:8.0
mkdir -pv /data/mysql/{conf,data,logs}
[root@localhost ~]# cat /data/mysql/conf/my.cnf
[mysqld]
default_authentication_plugin=mysql_native_password
character-set-server=utf8mb4
collation-server=utf8mb4_bin
default-storage-engine=INNODB
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
# Remove leading # to revert to previous value for default_authentication_plugin,
# this will increase compatibility with older clients. For background, see:
# https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_default_authentication_plugin
# default-authentication-plugin=mysql_native_password
skip-host-cache
skip-name-resolve
datadir=/var/lib/mysql
socket=/var/run/mysqld/mysqld.sock
secure-file-priv=/var/lib/mysql-files
user=mysql
pid-file=/var/run/mysqld/mysqld.pid
[client]
socket=/var/run/mysqld/mysqld.sock
!includedir /etc/mysql/conf.d/
# Start the container
docker run -d --restart=always --name mysql -p 3306:3306 -v /data/mysql/conf/my.cnf:/etc/mysql/my.cnf -v /data/mysql/data:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=xxxxxxxxxxx mysql:8.0
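The mysql client shipped in the default yum repo is the MariaDB (MySQL 5.x compatible) client. Connecting to the 8.0 database with it fails with:
ERROR 2059 (HY000): Authentication plugin 'sha256_password' cannot be loaded: /usr/lib64/mysql/plugin/sha256_password.so: cannot open shared object file: No such file or directory
Fix: install the MySQL community client.
wget https://repo.mysql.com//mysql80-community-release-el7-3.noarch.rpm
# Install the repo definition before installing the client from it
rpm -Uvh mysql80-community-release-el7-3.noarch.rpm
yum install mysql-community-client --nogpgcheck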
mysql -uroot -h127.0.0.1 -pxxxxxxx
MySQL [(none)]> create database zabbix character set utf8 collate utf8_bin;
MySQL [(none)]> create user zabbix@'%' identified by 'xxxxxxx';
MySQL [(none)]> grant all privileges on zabbix.* to zabbix@'%';
MySQL [(none)]> set global log_bin_trust_function_creators = 1;
MySQL [(none)]> flush privileges;
Zabbix configuration
rpm -Uvh https://repo.zabbix.com/zabbix/5.0/rhel/7/x86_64/zabbix-release-5.0-1.el7.noarch.rpm
vim /etc/yum.repos.d/zabbix.repo
[zabbix-frontend]
...
enabled=1
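# Install centos-release-scl first so the SCL repo is available for the frontend packages
yum install zabbix-server-mysql zabbix-agent centos-release-scl
yum install zabbix-web-mysql-scl zabbix-nginx-conf-scl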
vim /etc/opt/rh/rh-php72/php-fpm.d/zabbix.conf
listen.acl_users = apache,nginx
php_value[date.timezone] = Asia/Shanghai
vim /etc/opt/rh/rh-nginx116/nginx/conf.d/zabbix.conf
# Uncomment the following two directives and set them for your environment:
# listen 80;
# server_name example.com;
systemctl enable --now zabbix-server zabbix-agent rh-nginx116-nginx rh-php72-php-fpm
systemctl is-active zabbix-server;systemctl is-active zabbix-agent;systemctl is-active rh-php72-php-fpm;systemctl is-active rh-nginx116-nginx
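The package steps above do not show two things the server needs before it can start cleanly: the initial schema in the zabbix database, and the database connection settings in /etc/zabbix/zabbix_server.conf. The following is a minimal sketch of both, not the author's exact commands. It assumes the schema archive path used by the zabbix-server-mysql 5.0 package and a MySQL instance reachable over TCP on 127.0.0.1 (the container port published earlier); the function names are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumptions (not from the original post):
#   - the schema archive is the create.sql.gz shipped by zabbix-server-mysql 5.0
#   - MySQL is reachable over TCP on 127.0.0.1 (the published container port)

# Load the initial schema into the zabbix database.
# $1 = path to create.sql.gz; mysql prompts for the zabbix user's password.
import_schema() {
    zcat "$1" | mysql -uzabbix -p -h127.0.0.1 zabbix
}

# Print the zabbix_server.conf settings that must match the database.
# $1 = DB host, $2 = password of the zabbix DB user.
# 127.0.0.1 (rather than localhost) makes the server connect over TCP
# instead of looking for a local MySQL socket.
db_settings() {
    printf 'DBHost=%s\nDBName=zabbix\nDBUser=zabbix\nDBPassword=%s\n' "$1" "$2"
}

# Usage, before starting zabbix-server:
#   import_schema /usr/share/doc/zabbix-server-mysql*/create.sql.gz
#   db_settings 127.0.0.1 'xxxxxxx'   # merge the output into /etc/zabbix/zabbix_server.conf
```

After editing zabbix_server.conf, restart zabbix-server so the new settings are picked up.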
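Open the web interface and check that all component prerequisites show OK.
The default web login is user Admin (with a capital A) and password zabbix.
SNMP test tools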
yum install net-snmp-utils -y
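With net-snmp-utils installed, an SNMPv3 query can be tried from the Zabbix server before adding the host. This is a minimal sketch: the user name and passwords are placeholders, and the SHA/AES choices are assumptions that have to match whatever authentication and privacy protocols are configured on the device. The wrapper function name is hypothetical.

```shell
#!/bin/sh
# Sketch: query a device over SNMPv3 with authentication and privacy (authPriv).
# Placeholders; override them through the environment.
SNMP_USER=${SNMP_USER:-monitor}
SNMP_AUTH_PASS=${SNMP_AUTH_PASS:-changeme-auth}
SNMP_PRIV_PASS=${SNMP_PRIV_PASS:-changeme-priv}

# $1 = device address, $2 = OID (defaults to sysDescr.0)
snmpv3_walk() {
    # -a/-A: auth protocol and passphrase; -x/-X: privacy protocol and passphrase.
    # SHA and AES are assumptions; use the protocols configured on the device.
    snmpwalk -v3 -l authPriv \
        -u "$SNMP_USER" \
        -a SHA -A "$SNMP_AUTH_PASS" \
        -x AES -X "$SNMP_PRIV_PASS" \
        "$1" "${2:-1.3.6.1.2.1.1.1.0}"
}

# Usage:
#   SNMP_USER=... SNMP_AUTH_PASS=... SNMP_PRIV_PASS=... snmpv3_walk 192.0.2.10
```

If this returns the device description, the same three credentials are the ones to put into the Zabbix host configuration.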
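Adding hosts
Note: because SNMPv3 is used, macros have to be configured when adding a host. Define the username, the authentication password, and the privacy (encryption) password in the host macros.
Choosing templates
The H3C templates bundled with Zabbix cover most items. For the few that are missing, find the OID for the item through the Zhiliao community (知了社区) or by calling after-sales support, test it with snmpwalk, cross-check the value in the router/switch (RS) web UI or CLI, and then create the item and its trigger in Zabbix.
Alert notifications
Rough steps:
Write the WeChat alert script
In the web UI, create the alert media type and fill in the message template
In the web UI, link the target user to that media type
In the web UI, create the alert action
Associate the trigger with the action and test it
Alert flow: the monitoring template collects data -> triggers are configured on that data -> a trigger fires the configured action -> the action notifies the user through the media type
A WeCom (企业微信) group robot is used as the example.
# Directory where alert scripts are stored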
[root@localhost ~]# grep -i alertscript /etc/zabbix/zabbix_server.conf
### Option: AlertScriptsPath
# AlertScriptsPath=${datadir}/zabbix/alertscripts
AlertScriptsPath=/usr/lib/zabbix/alertscripts
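A script placed in that directory could look like the sketch below. It is not the author's script: the file name wecom_robot.sh and the function names are hypothetical, and it assumes the media type passes three script parameters in this order: the robot key (for example via {ALERT.SENDTO}), the subject ({ALERT.SUBJECT}), and the message ({ALERT.MESSAGE}). It posts a text message to the WeCom group robot webhook.

```shell
#!/bin/sh
# Hypothetical /usr/lib/zabbix/alertscripts/wecom_robot.sh
# Assumed script parameters: $1 = robot key, $2 = subject, $3 = message body.

# Escape a string for use inside a JSON string literal.
# Handles backslashes, double quotes and newlines; carriage returns are dropped.
json_escape() {
    printf '%s' "$1" | tr -d '\r' \
        | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
        | awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Build the JSON body for a WeCom robot text message: subject, newline, body.
build_payload() {
    printf '{"msgtype":"text","text":{"content":"%s\\n%s"}}' \
        "$(json_escape "$1")" "$(json_escape "$2")"
}

# POST the message to the group robot webhook.
send_alert() {
    curl -sS -m 10 -H 'Content-Type: application/json' \
        -d "$(build_payload "$2" "$3")" \
        "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=$1"
}

# Send only when called with exactly the three expected parameters.
if [ "$#" -eq 3 ]; then
    send_alert "$@"
fi
```

The script has to be executable by the zabbix user (for example chmod +x), and it can be tried by hand with a key, a subject, and a message before wiring it into an action.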
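Font issue
Making Zabbix graphs display Chinese characters
Copy a Chinese font from C:\Windows\Fonts on a Windows machine to the Linux host.
# The file in which Zabbix defines the graph font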
[root@localhost ~]# grep 'ZBX_GRAPH_FONT_NAME' /usr/share/zabbix/include/defines.inc.php
define('ZBX_GRAPH_FONT_NAME', 'graphfont'); // font file name
# Locate the font file; it is a symlink
[root@localhost ~]# find / -name "graphfont.*"
/usr/share/zabbix/assets/fonts/graphfont.ttf
[root@localhost ~]# ll /usr/share/zabbix/assets/fonts/
total 0
lrwxrwxrwx. 1 root root 33 Aug 24 21:07 graphfont.ttf -> /etc/alternatives/zabbix-web-font
[root@localhost ~]# ll /etc/alternatives/zabbix-web-font
lrwxrwxrwx. 1 root root 38 Aug 24 21:07 /etc/alternatives/zabbix-web-font -> /usr/share/fonts/dejavu/DejaVuSans.ttf
[root@localhost ~]# ll /usr/share/fonts/dejavu/DejaVuSans.ttf
-rw-r--r--. 1 root root 720012 Feb 27 2011 /usr/share/fonts/dejavu/DejaVuSans.ttf
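# Copy the Chinese font from the Windows host into this directory under a lowercase name
cp /root/SIMKAI.TTF /usr/share/zabbix/assets/fonts/simkai.ttf
# Then change ZBX_GRAPH_FONT_NAME to the new file name without the extension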
vim /usr/share/zabbix/include/defines.inc.php
[root@localhost fonts]# grep 'ZBX_GRAPH_FONT_NAME' /usr/share/zabbix/include/defines.inc.php
define('ZBX_GRAPH_FONT_NAME', 'simkai'); // font file name
[root@localhost fonts]# systemctl restart rh-php72-php-fpm.service
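Restoring an accidentally deleted Zabbix template
https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates
Find the branch matching your Zabbix version and the template by name, download it, and import it into Zabbix.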
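Problems
Running snmpwalk on the Zabbix machine returns detailed data for every OID on the monitored node, but the Zabbix server web UI shows a timeout or some other error for the same host.
Solution: the Zabbix server happened to be restarted, and that resolved the problem.
Zabbix does not resume collecting data automatically after a network device reboots
After clearing the data collection cache on the Zabbix server node, collection resumes on its own at the next scheduled polling time.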