Memory usage mismatch between kubectl top and docker stats

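| Name | Type | Unit | Description |
| --- | --- | --- | --- |
| container_memory_rss | gauge | bytes | RSS (Resident Set Size): the physical memory actually allocated to the process, as opposed to virtual memory backed by disk. It includes all allocated stack and heap memory plus shared libraries loaded into physical memory, and excludes memory that has been swapped out. |
| container_memory_usage_bytes | gauge | bytes | Current memory usage, counting all memory in use regardless of whether it has been accessed recently. |
| container_memory_max_usage_bytes | gauge | bytes | Maximum memory usage recorded. |
| container_memory_cache | gauge | bytes | Page cache usage: memory that holds file data cached from disk. |
| container_memory_swap | gauge | bytes | Swap usage. Swap uses disk to back memory: when physical memory is nearly exhausted or usage reaches a threshold, rarely used pages are moved to disk and loaded back into physical memory when needed. |
| container_memory_working_set_bytes | gauge | bytes | Current working set size. |
| container_memory_failcnt | counter | - | Number of times a memory allocation failed because usage hit the limit. |
| container_memory_failures_total | counter | - | Cumulative count of memory allocation failures. |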

  • container_memory_working_set_bytes = container_memory_usage_bytes - total_inactive_file
  • memory used = container_memory_usage_bytes - cache
  • cache = total_inactive_file + total_active_file (tmpfs pages are also counted as cache)

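Note: container_memory_working_set_bytes is the value to compare against container_spec_memory_limit_bytes when judging whether a container is at risk of being OOM-killed. It is also the figure kubelet uses for memory-based decisions.

total_inactive_anon and total_inactive_file are inactive memory that the kernel can reclaim: inactive anonymous pages can be swapped out to disk, and inactive file pages can be dropped or written back. cache holds disk data that is currently kept in memory. Because reclaimable file cache inflates container_memory_usage_bytes, container_memory_working_set_bytes is the more accurate of the two when judging memory pressure.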
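
The relationship between the working set and the memory limit can be sketched in Python. This is only an illustration: it assumes the working set subtracts total_inactive_file alone (consistent with the kubectl top formula used later in this post), and the function names and sample numbers are hypothetical rather than measurements from this cluster.

```python
def working_set_bytes(usage_bytes: int, total_inactive_file: int) -> int:
    """Working set = usage minus inactive file cache, floored at zero."""
    return max(usage_bytes - total_inactive_file, 0)


def limit_utilization(working_set: int, limit_bytes: int) -> float:
    """Fraction of the memory limit consumed by the working set."""
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    return working_set / limit_bytes


# Hypothetical sample values, in bytes (not taken from the post).
usage = 1_000_000_000
inactive_file = 200_000_000
limit = 1_600_000_000

ws = working_set_bytes(usage, inactive_file)
print(ws, limit_utilization(ws, limit))
```

A utilization close to 1.0 means that little reclaimable file cache is left between the container and its limit.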

  • https://segmentfault.com/a/1190000021402244?utm_source=tag-newest
  • https://blog.csdn.net/palet/article/details/82889493
  • https://zhuanlan.zhihu.com/p/96597715
  • https://www.ibm.com/support/pages/kubectl-top-pods-and-docker-stats-show-different-memory-statistics

kubectl top reports 12.5G.

docker stats reports 11.42G.

Both figures can be derived from memory_stats and memory.usage_in_bytes:

kubectl top (container_memory_working_set_bytes) = memory.usage_in_bytes - inactive_file, which gives 12.5G.
docker stats (memory used) = memory.usage_in_bytes - cache, which gives 11.42G.
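
The two calculations can be reproduced with a short Python sketch. It assumes the cgroup v1 memory.stat layout of one "key value" pair per line; the function names and the sample counters are hypothetical and are not the values behind the 12.5G and 11.42G figures.

```python
def parse_memory_stat(text: str) -> dict:
    """Parse the "key value" lines of a cgroup v1 memory.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def kubectl_top_memory(usage: int, stats: dict) -> int:
    """Working set: usage minus inactive file cache, floored at zero."""
    return max(usage - stats["total_inactive_file"], 0)


def docker_stats_memory(usage: int, stats: dict) -> int:
    """Memory used as in the formula above: usage minus the whole cache."""
    return usage - stats["cache"]


# Hypothetical memory.stat excerpt and usage_in_bytes value.
SAMPLE_STAT = """cache 300
rss 500
active_file 200
inactive_file 100
total_inactive_file 100
total_active_file 200
"""
SAMPLE_USAGE = 1000

stats = parse_memory_stat(SAMPLE_STAT)
top_value = kubectl_top_memory(SAMPLE_USAGE, stats)
docker_value = docker_stats_memory(SAMPLE_USAGE, stats)
print(top_value, docker_value, top_value - docker_value)
```

The gap between the two tools equals cache minus inactive_file, which is the active file cache (plus any tmpfs pages). That is why kubectl top shows the larger number.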

  • https://docs.signalfx.com/en/latest/integrations/integrations-reference/integrations.kubernetes.html

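Note: the following are official metric descriptions. Many are left in their original wording to avoid introducing ambiguity through mistranslation. For the CPU and disk metrics, the original text is recorded alongside the description. For reference only.

Docker's official notes: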

https://docs.docker.com/config/containers/runmetrics/

memory.stat:

| Metric | Description |
| --- | --- |
| cache | The amount of memory used by the processes of this control group that can be associated precisely with a block on a block device. When you read from and write to files on disk, this amount increases. This is the case if you use “conventional” I/O (open, read, write syscalls) as well as mapped files (with mmap). It also accounts for the memory used by tmpfs mounts, though the reasons are unclear. |
| rss | The amount of memory that doesn’t correspond to anything on disk: stacks, heaps, and anonymous memory maps. |
| mapped_file | Indicates the amount of memory mapped by the processes in the control group. It doesn’t give you information about how much memory is used; it rather tells you how it is used. |
| pgfault, pgmajfault | Indicate the number of times that a process of the cgroup triggered a “page fault” and a “major fault”, respectively. A page fault happens when a process accesses a part of its virtual memory space which is nonexistent or protected. The former can happen if the process is buggy and tries to access an invalid address (it is sent a SIGSEGV signal, typically killing it with the famous Segmentation fault message). The latter can happen when the process reads from a memory zone which has been swapped out, or which corresponds to a mapped file: in that case, the kernel loads the page from disk, and lets the CPU complete the memory access. It can also happen when the process writes to a copy-on-write memory zone: likewise, the kernel preempts the process, duplicates the memory page, and resumes the write operation on the process’s own copy of the page. “Major” faults happen when the kernel actually needs to read the data from disk. When it just duplicates an existing page, or allocates an empty page, it’s a regular (or “minor”) fault. |
| swap | The amount of swap currently used by the processes in this cgroup. |
| active_anon, inactive_anon | The amount of anonymous memory that has been identified as respectively active and inactive by the kernel. “Anonymous” memory is the memory that is not linked to disk pages. In other words, that’s the equivalent of the rss counter described above. In fact, the very definition of the rss counter is active_anon + inactive_anon - tmpfs (where tmpfs is the amount of memory used up by tmpfs filesystems mounted by this control group). Now, what’s the difference between “active” and “inactive”? Pages are initially “active”; and at regular intervals, the kernel sweeps over the memory, and tags some pages as “inactive”. Whenever they are accessed again, they are immediately retagged “active”. When the kernel is almost out of memory, and time comes to swap out to disk, the kernel swaps “inactive” pages. |
| active_file, inactive_file | Cache memory, with active and inactive similar to the anon memory above. The exact formula is cache = active_file + inactive_file + tmpfs. The exact rules used by the kernel to move memory pages between active and inactive sets are different from the ones used for anonymous memory, but the general principle is the same. When the kernel needs to reclaim memory, it is cheaper to reclaim a clean (=non modified) page from this pool, since it can be reclaimed immediately (while anonymous pages and dirty/modified pages need to be written to disk first). |
| unevictable | The amount of memory that cannot be reclaimed; generally, it accounts for memory that has been “locked” with mlock. It is often used by crypto frameworks to make sure that secret keys and other sensitive material never get swapped out to disk. |
| memory_limit, memsw_limit | These are not really metrics, but a reminder of the limits applied to this cgroup. The first one indicates the maximum amount of physical memory that can be used by the processes of this control group; the second one indicates the maximum amount of RAM+swap. |

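
The two identities quoted in the Docker notes, rss = active_anon + inactive_anon - tmpfs and cache = active_file + inactive_file + tmpfs, can each be rearranged to estimate tmpfs usage. The Python sketch below does that with hypothetical counters; the function names are illustrative, and on a live system the two estimates may not agree exactly because the counters are sampled at different moments.

```python
def tmpfs_from_cache(stats: dict) -> int:
    """Rearranged from: cache = active_file + inactive_file + tmpfs."""
    return stats["cache"] - stats["active_file"] - stats["inactive_file"]


def tmpfs_from_anon(stats: dict) -> int:
    """Rearranged from: rss = active_anon + inactive_anon - tmpfs."""
    return stats["active_anon"] + stats["inactive_anon"] - stats["rss"]


# Hypothetical counters, chosen so that both identities hold.
sample = {
    "cache": 350,
    "rss": 500,
    "active_file": 200,
    "inactive_file": 100,
    "active_anon": 400,
    "inactive_anon": 150,
}

print(tmpfs_from_cache(sample), tmpfs_from_anon(sample))
```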
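
CPU metrics

| Name | Type | Unit | Description | Original text |
| --- | --- | --- | --- | --- |
| container_cpu_usage_seconds_total | counter | seconds | Cumulative CPU time consumed by the container on each CPU. With multiple CPUs, the total CPU time is the sum over all CPUs. | Cumulative cpu time consumed per cpu in nanoseconds. |
| container_cpu_user_seconds_total | counter | seconds | Cumulative CPU time consumed by the container in user mode. | Cumulative user cpu time consumed in nanoseconds. |
| container_cpu_system_seconds_total | counter | seconds | Cumulative CPU time consumed by the container in system (kernel) mode. | Cumulative system cpu time consumed in nanoseconds. |
| container_cpu_cfs_throttled_seconds_total | counter | seconds | CFS stands for Completely Fair Scheduler, the Linux mechanism that controls CPU usage and can allocate CPU time in specified proportions. This metric is the total time the container has been throttled. | Total time duration the container has been throttled |
| container_cpu_cfs_throttled_periods_total | counter | - | Number of CFS period intervals in which the container was throttled. | Number of throttled period intervals |
| container_cpu_cfs_periods_total | counter | - | Number of CFS enforcement period intervals that have elapsed. | Number of elapsed enforcement period intervals |
| container_cpu_load_average_10s | gauge | - | Average CPU load of the container over the last 10 seconds. | Value of container cpu load average over the last 10 seconds |

The quoted originals mention nanoseconds, but the Prometheus metrics named *_seconds_total are exposed in seconds.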
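
One common use of the two CFS period counters is a throttling ratio: the share of enforcement periods in which the container was throttled. The Python sketch below computes it from two samples of each counter; the function name and the sample values are hypothetical.

```python
def throttle_ratio(throttled_start: float, throttled_end: float,
                   periods_start: float, periods_end: float) -> float:
    """Share of elapsed CFS periods in which the container was throttled."""
    elapsed_periods = periods_end - periods_start
    if elapsed_periods <= 0:
        # No enforcement period elapsed between the two samples.
        return 0.0
    return (throttled_end - throttled_start) / elapsed_periods


# Hypothetical samples of container_cpu_cfs_throttled_periods_total
# and container_cpu_cfs_periods_total taken at two points in time.
ratio = throttle_ratio(10, 40, 1000, 1300)
print(f"{ratio:.2%}")
```

A ratio that stays well above zero suggests the CPU limit is too tight for the workload.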

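CPU formulas

Note: the CPU metrics are cumulative time counters, so they must be converted with the rate function. Replace <component-name> with the name of the component:

rate(container_cpu_usage_seconds_total{name=~"<component-name>.*"}[5m])  # CPU usage per CPU (1.0 = one full core)

sum(rate(container_cpu_usage_seconds_total{name=~"<component-name>.*"}[5m]))  # total CPU usage across CPUs

rate(container_cpu_user_seconds_total{name=~"<component-name>.*"}[5m])  # user-mode CPU usage

rate(container_cpu_system_seconds_total{name=~"<component-name>.*"}[5m])  # system-mode CPU usage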
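
To show why a rate over a seconds counter yields CPU usage in cores, here is a simplified Python sketch of the calculation. It is not Prometheus's actual rate() implementation, which also handles counter resets and extrapolates to the window boundaries; the function name and sample values are hypothetical.

```python
def simple_rate(value_start: float, value_end: float,
                time_start: float, time_end: float) -> float:
    """Per-second increase of a counter between two samples."""
    elapsed = time_end - time_start
    if elapsed <= 0:
        raise ValueError("time_end must be later than time_start")
    return (value_end - value_start) / elapsed


# Hypothetical samples of container_cpu_usage_seconds_total for two CPUs,
# taken 300 seconds (5 minutes) apart: (value_start, value_end).
per_cpu_samples = [(120.0, 150.0), (80.0, 140.0)]

per_cpu = [simple_rate(start, end, 0.0, 300.0) for start, end in per_cpu_samples]
total = sum(per_cpu)
print(per_cpu, total)
```

CPU seconds consumed per wall-clock second is a dimensionless number: 0.1 means one tenth of a core was busy on average over the window.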

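Disk metrics

| Name | Type | Unit | Description | Original text |
| --- | --- | --- | --- | --- |
| container_fs_writes_bytes_total | counter | bytes | Cumulative number of bytes written. | Cumulative count of bytes written |
| container_fs_reads_bytes_total | counter | bytes | Cumulative number of bytes read. | Cumulative count of bytes read |
| container_fs_usage_bytes | gauge | bytes | Disk space used by the container. | Number of bytes that are consumed by the container on this filesystem. |
| container_fs_io_time_seconds_total | counter | seconds | Time spent performing I/O. | Cumulative count of seconds spent doing I/Os |
| container_fs_io_time_weighted_seconds_total | counter | seconds | Cumulative weighted I/O time. | Cumulative weighted I/O time in seconds |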

Network metrics

| Name | Type | Unit | Description | Original text |
| --- | --- | --- | --- | --- |
| container_network_receive_bytes_total | counter | bytes | Inbound traffic: cumulative bytes received. | Cumulative count of bytes received |
| container_network_transmit_bytes_total | counter | bytes | Outbound traffic: cumulative bytes transmitted. | Cumulative count of bytes transmitted |
| container_network_receive_packets_total | counter | - | Inbound packets received. | Cumulative count of packets received |
| container_network_transmit_packets_total | counter | - | Outbound packets transmitted. | Cumulative count of packets transmitted |
| container_network_receive_packets_dropped_total | counter | - | Inbound packets dropped. | Cumulative count of packets dropped while receiving |
| container_network_transmit_packets_dropped_total | counter | - | Outbound packets dropped. | Cumulative count of packets dropped while transmitting |
| container_network_receive_errors_total | counter | - | Errors on inbound traffic. | Cumulative count of errors encountered while receiving |
| container_network_transmit_errors_total | counter | - | Errors on outbound traffic. | Cumulative count of errors encountered while transmitting |
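
Like the CPU counters, the network byte counters only become useful once they are turned into a rate. The Python sketch below converts two samples of a byte counter into throughput in bits per second; the function name and the sample values are hypothetical.

```python
def throughput_bits_per_second(bytes_start: int, bytes_end: int,
                               elapsed_seconds: float) -> float:
    """Average throughput between two samples of a cumulative byte counter."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (bytes_end - bytes_start) * 8 / elapsed_seconds


# Hypothetical samples of container_network_receive_bytes_total, 60 s apart.
bps = throughput_bits_per_second(1_000_000, 4_000_000, 60.0)
print(bps)
```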