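zabbix-server suddenly went down. Investigation showed that a sharp increase in the number of monitored hosts had exhausted the memory of its configuration cache. The log at the time of the crash: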
1015995:20211027:161344.761 [history_influxdb.so] Database Engine used: mysql
1015995:20211027:161344.761 [history_influxdb.so] Using compatibility with Zabbix 4
1015995:20211027:161344.761 loaded modules: history_influxdb.so
1015995:20211027:161344.766 current database version (mandatory/optional): 04040000/04040002
1015995:20211027:161344.766 required mandatory version: 04040000
1015995:20211027:161344.775 server #0 started [main process]
1015997:20211027:161344.776 server #1 started [configuration syncer #1]
1015997:20211027:161345.434 __mem_malloc: skipped 0 asked 24 skip_min 18446744073709551615 skip_max 0
1015997:20211027:161345.434 [file:dbconfig.c,line:94] __zbx_mem_realloc(): out of memory (requested 24 bytes)
1015997:20211027:161345.434 [file:dbconfig.c,line:94] __zbx_mem_realloc(): please increase CacheSize configuration parameter
1015997:20211027:161345.434 === memory statistics for configuration cache ===
1015997:20211027:161345.434 min chunk size: 18446744073709551615 bytes
1015997:20211027:161345.434 max chunk size: 0 bytes
1015997:20211027:161345.434 memory of total size 8388232 bytes fragmented into 76951 chunks
1015997:20211027:161345.434 of those, 0 bytes are in 0 free chunks
1015997:20211027:161345.434 of those, 7157032 bytes are in 76951 used chunks
The log says the cache is out of memory and that CacheSize needs to be increased:
[file:dbconfig.c,line:94] __zbx_mem_realloc(): out of memory (requested 24 bytes)
[file:dbconfig.c,line:94] __zbx_mem_realloc(): please increase CacheSize configuration parameter
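To pull these messages out of a long log quickly, a small filter along the following lines may help. This is only a sketch: the function name cache_errors is made up for illustration, and the log path in the usage comment is an assumption (a common packaged default), so check the LogFile setting in zabbix_server.conf for the real location.

```shell
# Hypothetical helper: print the shared-memory cache errors found in a
# Zabbix server log file. The patterns are the messages quoted in this post.
cache_errors() {
    grep -E 'out of memory|please increase .*Size configuration parameter|cannot get private shared memory' "$1"
}

# Usage (the log path is an assumption, adjust it to your LogFile setting):
# cache_errors /var/log/zabbix/zabbix_server.log
```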
So we raise the CacheSize memory limit in the configuration file:
vi /etc/zabbix/zabbix_server.conf
### Option: CacheSize
# Size of configuration cache, in bytes.
# Shared memory size for storing host, item and trigger data.
#
# Mandatory: no
# Range: 128K-64G
# Default:
#CacheSize=8m
Change it as needed:
### Option: CacheSize
# Size of configuration cache, in bytes.
# Shared memory size for storing host, item and trigger data.
#
# Mandatory: no
# Range: 128K-64G
# Default:
CacheSize=10G
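If the change has to be scripted rather than made in vi, something like the following could work. It is a sketch under assumptions: set_cache_size is a hypothetical helper name, 4G is only an illustrative value, and the function rewrites every line that consists of an optionally commented CacheSize= setting. Back up the file before trying it.

```shell
# Hypothetical helper: set CacheSize in a zabbix_server.conf-style file.
# Only lines that start with optional '#'/spaces followed by "CacheSize="
# are rewritten, so HistoryCacheSize, TrendCacheSize and the
# "### Option: CacheSize" comment line are left alone.
set_cache_size() {
    conf=$1
    value=$2
    tmp=$(mktemp) || return 1
    if sed -E "s/^[#[:space:]]*CacheSize=.*/CacheSize=${value}/" "$conf" > "$tmp"; then
        # Overwrite in place so the file keeps its owner and permissions.
        cat "$tmp" > "$conf"
    fi
    rm -f "$tmp"
}

# Usage (run as root; the value is illustrative):
# set_cache_size /etc/zabbix/zabbix_server.conf 4G
```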
Restart:
systemctl restart zabbix-server
The restart failed. The log shows:
1016076:20211027:161435.681 [history_influxdb.so] Database Engine used: mysql
1016076:20211027:161435.681 [history_influxdb.so] Using compatibility with Zabbix 4
1016076:20211027:161435.681 loaded modules: history_influxdb.so
1016076:20211027:161435.682 cannot initialize configuration cache: cannot get private shared memory of size 10737418240 for configuration cache: [22] Invalid argument
1016096:20211027:161445.928 Starting Zabbix Server. Zabbix 4.4.10 (revision 4db30afc70).
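The message indicates that the requested size exceeds the kernel's shared memory limit:
cannot initialize configuration cache: cannot get private shared memory of size 10737418240 for configuration cache: [22] Invalid argument
The simplest fix is to set CacheSize to a value within the kernel's shared memory limit.
Alternatively, the kernel limits can be raised. The steps below show how to change the maximum size of a single shared memory segment (kernel.shmmax) and the total number of shared memory pages (kernel.shmall).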
Check the kernel settings to see the current shared memory limits (kernel.shmmax is in bytes; kernel.shmall is in pages):
[root@ ~]# sysctl -a|grep shm
kernel.shm_next_id = -1
kernel.shm_rmid_forced = 0
kernel.shmall = 8589934592
kernel.shmmax = 8589934592
kernel.shmmni = 40960
sysctl: reading key "net.ipv6.conf.all.stable_secret"
sysctl: reading key "net.ipv6.conf.bond0.stable_secret"
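The failure can be confirmed with simple arithmetic: CacheSize=10G is 10737418240 bytes, which is larger than the kernel.shmmax of 8589934592 bytes shown above. The sketch below makes that comparison; to_bytes and fits_shmmax are hypothetical helper names, and the suffix handling assumes 1024-based K/M/G units, which matches the size reported in the log.

```shell
# Hypothetical helper: convert a size such as 8M or 10G to bytes
# (1024-based suffixes; a plain number is returned unchanged).
to_bytes() {
    case $1 in
        *[Kk]) echo $(( ${1%?} * 1024 )) ;;
        *[Mm]) echo $(( ${1%?} * 1024 * 1024 )) ;;
        *[Gg]) echo $(( ${1%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo "$1" ;;
    esac
}

# Hypothetical helper: succeed only if the requested cache size ($1)
# fits into a single shared memory segment of at most $2 bytes.
fits_shmmax() {
    [ "$(to_bytes "$1")" -le "$2" ]
}

# Values taken from this post: CacheSize=10G against shmmax=8589934592.
fits_shmmax 10G 8589934592 || echo "CacheSize=10G exceeds kernel.shmmax"

# On a live system the limit can be read directly:
# fits_shmmax 10G "$(sysctl -n kernel.shmmax)"
```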
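- kernel.shmmax:
One of the most important kernel parameters. It defines the maximum size of a single shared memory segment. The guidance below comes from Oracle database tuning, where the value should be large enough to hold the entire SGA (System Global Area) in one segment. Setting it too low can force several segments to be created, which may reduce performance. The cost is paid mainly at instance startup and when a ServerProcess is created: several virtual address ranges have to be set up at startup, and each new process has to attach to multiple segments, which causes a slight slowdown. At other times there is no impact. For Zabbix the same limit applies to the configuration cache: kernel.shmmax must be at least as large as CacheSize.
Recommended values (Oracle guidance):
32-bit Linux: the maximum is 4 GB (4294967296 bytes) minus 1 byte, i.e. 4294967295. The recommended value is more than half of physical memory, so on a 32-bit system 4294967295 is the usual choice. A 32-bit system limits the size of the SGA, so the SGA always fits in a single shared memory segment.
64-bit Linux: the maximum is physical memory minus 1 byte. The recommended value is more than half of physical memory. A value larger than SGA_MAX_SIZE is generally enough, and physical memory minus 1 byte can be used.
Both parameters are set in /etc/sysctl.conf: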
[root@gt-zabbix ~]# cat /etc/sysctl.conf
# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
kernel.shmmax = xxx
kernel.shmall = xxx
net.ipv6.conf.all.disable_ipv6=1
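- kernel.shmall:
This parameter controls the total number of shared memory pages that can be used. The Linux shared memory page size is typically 4 KB, and the size of a shared memory segment is always a whole multiple of the page size.
If the largest shared memory segment is 16 GB, the number of shared memory pages required is 16 GB / 4 KB = 4194304 pages.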
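The page count above can be reproduced in the shell. This is a minimal sketch: pages_for_bytes is a hypothetical helper name, and the 4096-byte page size is an assumption that should be checked with getconf PAGE_SIZE on the target machine.

```shell
# Hypothetical helper: number of pages needed to hold $1 bytes with a
# page size of $2 bytes, rounded up to a whole page.
pages_for_bytes() {
    bytes=$1
    page_size=$2
    echo $(( (bytes + page_size - 1) / page_size ))
}

# 16 GB segment with 4 KB pages, as in the text.
pages_for_bytes $(( 16 * 1024 * 1024 * 1024 )) 4096   # prints 4194304

# With the real page size of the machine:
# pages_for_bytes $(( 16 * 1024 * 1024 * 1024 )) "$(getconf PAGE_SIZE)"
```

Because kernel.shmall is counted in pages rather than bytes, a byte value pasted into it is far larger than the figure this calculation yields.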
Then run sysctl -p to apply the settings, and restart zabbix-server again.