How to Deal with High Load on Linux: Handling a High Load Average

Latest recommended article published on 2023-08-23 17:24:20

weixin_39617669


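The top or uptime command shows the system load average. The three values in the output below are the load averages over the past 1, 5, and 15 minutes; three windows are reported so that the overall load trend is visible.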

[root@k8s-master ~]# uptime

10:54:36 up 8 days, 12:31, 1 user, load average: 0.25, 0.51, 1.19

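What load average means: the average number of processes in the runnable or uninterruptible state, that is, the average number of active processes. The average here is an exponentially damped moving average. The two states map to the following process states:

Runnable (Running or Runnable), shown as R by ps

Uninterruptible (Uninterruptible Sleep, also called Disk Sleep), shown as D by ps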
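To make "exponentially damped moving average" concrete, here is a minimal sketch of the update rule. It assumes the commonly described Linux behaviour of sampling the active task count every 5 seconds and decaying the old value by exp(-5/window); the function name damped_load and its arguments are made up for illustration, and the real kernel uses fixed-point arithmetic rather than floating point.

```shell
# damped_load ACTIVE SAMPLES [WINDOW_SECONDS]
# Starts from a load of zero, assumes ACTIVE tasks are active at every
# 5-second sample, and prints the damped average after SAMPLES updates.
damped_load() {
    awk -v n="$1" -v samples="$2" -v window="${3:-60}" 'BEGIN {
        decay = exp(-5 / window)          # share of the previous value that is kept
        load = 0
        for (i = 0; i < samples; i++)
            load = load * decay + n * (1 - decay)
        printf "%.2f\n", load
    }'
}

# 2 tasks kept active for one minute (12 samples), 1-minute window
damped_load 2 12    # prints 1.26
```

After a full minute with 2 active tasks the 1-minute figure has only reached about 1.26, not 2. This is why a short burst shows up mostly in the first number, while the 15-minute number moves slowly.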

# Find processes in state R or D
[root@k8s-master ~]# ps aux | awk '{if($8 ~ /R|D/) print $0 }'
root         9  0.0  0.0      0     0 ?        R    Jun30   6:08 [rcu_sched]
root     30474  0.0  0.0 157456  1912 pts/0    R+   11:10   0:00 ps aux
root     30475  0.0  0.0 113548  1232 pts/0    R+   11:10   0:00 awk {if($8 ~ /R|D/) print $0 }
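When the list is long, a count per state is often easier to read than the full rows. The sketch below is one way to do that; count_states is a hypothetical helper name, and it only looks at the first character of the STAT value, since the remaining characters are modifiers such as + or s.

```shell
# count_states: reads STAT values (one per line) on stdin and prints how
# many start with R (running/runnable) and D (uninterruptible sleep).
count_states() {
    awk '{ s = substr($1, 1, 1) }
         s == "R" { r++ }
         s == "D" { d++ }
         END { printf "R=%d D=%d\n", r, d }'
}

# Live usage: "stat=" asks ps for the state column without a header line
if command -v ps >/dev/null 2>&1; then
    ps -eo stat= | count_states
fi
```

The two counts together are roughly what the load average is averaging at that instant.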

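Given this definition, the following situations can raise the load average:

1. Processes in the Running state consume a lot of CPU (CPU-bound processes).

2. A large number of processes are Runnable, so the CPU switches context frequently (saving and restoring registers and the program counter).

The tasks managed by the operating system include processes (and threads), and also interrupt handlers, which are invoked when hardware raises a signal.

Context switches include process context switches (user-space resources such as virtual memory, the stack, and global variables, plus kernel-space state such as the kernel stack and registers), thread context switches, and interrupt context switches.

Privilege mode switch: a context switch from user mode to kernel mode.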

View system-wide context switching:

[root@k8s-master sysstat]# vmstat 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
 8  0      0 131084 203200 1801028    0    0  1724   166    2    8 12  7 79  2  0
 0  0      0 130480 203216 1801272    0    0    42    81 1829 5047  8  5 86  1  0
 0  0      0 129676 203232 1802092    0    0   161   110 1814 4840 16  6 76  1  0
 0  0      0 129344 203236 1802496    0    0    78    70 1939 5258 14 12 73  1  0
 0  0      0 128464 203248 1803156    0    0   120   106 1836 5195  9  6 84  1  0
 1  0      0 128000 203256 1803612    0    0    91    98 1807 4726 16  6 77  1  0

cs (context switch) is the number of context switches per second.

in (interrupt) is the number of interrupts per second.

r (Running or Runnable) is the length of the run queue, that is, the number of processes running or waiting for a CPU.

b (Blocked) is the number of processes in uninterruptible sleep.
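The r and cs columns can also be watched by a small filter instead of by eye. The following is only a sketch: flag_vmstat is a hypothetical helper, and the limits passed to it (CPU count for r, 10000 for cs) are arbitrary illustration values, not recommended thresholds.

```shell
# flag_vmstat MAX_R MAX_CS: reads vmstat output on stdin, locates the r and
# cs columns from the header row, and prints samples exceeding either limit.
flag_vmstat() {
    awk -v max_r="$1" -v max_cs="$2" '
        $1 == "r" {                        # header row: map names to positions
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("cs" in col) && $1 ~ /^[0-9]+$/ {  # data rows after the header
            if ($(col["r"]) > max_r || $(col["cs"]) > max_cs)
                printf "r=%d cs=%d\n", $(col["r"]), $(col["cs"])
        }'
}

# Live usage: two samples one second apart, run queue compared to CPU count
if command -v vmstat >/dev/null 2>&1; then
    vmstat 1 2 | flag_vmstat "$(nproc)" 10000
fi
```

Keep in mind that the first row vmstat prints reports averages since boot, so only the later rows reflect the current interval.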

Interrupt counts by type:

watch -d cat /proc/interrupts

View the context switches of each process:

pidstat -w -t 1 (-t also shows context-switch statistics per thread)
05:17:54 PM   UID       PID   cswch/s nvcswch/s  Command
05:17:55 PM     0         1      1.03      0.00  systemd
05:17:55 PM     0         3     21.65      0.00  ksoftirqd/0
05:17:55 PM     0         9    100.00      0.00  rcu_sched
05:17:55 PM     0       296     12.37      0.00  kworker/0:1H
05:17:55 PM     0       320      6.19      0.00  jbd2/vda1-8
05:17:55 PM     0      1139      1.03      0.00  iscsid
05:17:55 PM  1337      5766     13.40      0.00  envoy
05:17:55 PM  1337      6061     16.49      1.03  envoy
05:17:55 PM  1337      6065     16.49      0.00  envoy
05:17:55 PM     1      6141      1.03      2.06  python
05:17:55 PM  1337      6331     13.40      0.00  envoy
05:17:55 PM     0     10774     14.43      0.00  envoy
05:17:55 PM     0     10805      2.06      0.00  YDLive
05:17:55 PM     0     10844     15.46      0.00  envoy
05:17:55 PM     0     11532      3.09      1.03  coredns
05:17:55 PM     0     12767     40.21     15.46  etcd
05:17:55 PM     0     13170     10.31      0.00  kube-proxy
05:17:55 PM     0     18892      1.03      0.00  sshd
05:17:55 PM     0     19758      1.03      0.00  kworker/u2:2
05:17:55 PM     0     27112      8.25      0.00  kworker/0:2
05:17:55 PM     0     28294      1.03      1.03  pidstat
05:17:55 PM     0     28365      2.06      0.00  YDService

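cswch/s: voluntary context switches per second. These happen when a process cannot obtain a resource it needs; for example, they occur when system resources such as I/O or memory are insufficient.

nvcswch/s: involuntary context switches per second. These happen when the system forcibly reschedules the process, for example because its time slice has expired.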
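If pidstat is not installed, the kernel exposes the same two counters per process in /proc/PID/status as voluntary_ctxt_switches and nonvoluntary_ctxt_switches. The sketch below reads them; ctxt_totals is a hypothetical helper name. Note the difference from pidstat: these are cumulative totals since the process started, not per-second rates, so two readings have to be subtracted to get a rate.

```shell
# ctxt_totals: reads the contents of a /proc/PID/status file on stdin and
# prints the cumulative voluntary and involuntary context-switch counts.
ctxt_totals() {
    awk '$1 == "voluntary_ctxt_switches:"    { v = $2 }
         $1 == "nonvoluntary_ctxt_switches:" { n = $2 }
         END { printf "voluntary=%d nonvoluntary=%d\n", v, n }'
}

# Live usage: counters of the current shell (Linux only)
if [ -r "/proc/$$/status" ]; then
    ctxt_totals < "/proc/$$/status"
fi
```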

3. There are processes in the D state.

On CentOS, install the pidstat and mpstat tools:

yum install -y sysstat

# Simulate a process that consumes CPU
stress --cpu 1 --timeout 600

# Check the usage of each CPU (this machine has only one CPU)
# (one sample averaged over 5 seconds)
[root@k8s-master ~]# date;mpstat -P ALL 5 1
Thu Jul  9 14:32:06 CST 2020
Linux 3.10.0-957.27.2.el7.x86_64 (k8s-master.com)  07/09/2020  _x86_64_  (1 CPU)
02:32:07 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
02:32:12 PM  all   96.79    0.00    3.21    0.00    0.00    0.00    0.00    0.00    0.00    0.00
02:32:12 PM    0   96.79    0.00    3.21    0.00    0.00    0.00    0.00    0.00    0.00    0.00
Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
Average:     all   96.79    0.00    3.21    0.00    0.00    0.00    0.00    0.00    0.00    0.00
Average:       0   96.79    0.00    3.21    0.00    0.00    0.00    0.00    0.00    0.00    0.00

# Identify the processes using the most CPU; pidstat's -u option reports CPU metrics
[root@k8s-master ~]# pidstat -u 5 1
Linux 3.10.0-957.27.2.el7.x86_64 (k8s-master.com)  07/09/2020  _x86_64_  (1 CPU)
02:37:27 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
02:37:33 PM     0     25315   80.90    0.00    0.00   80.90     0  stress
02:37:33 PM     0     25539    0.00    0.39    0.00    0.39     0  pidstat
Average:      UID       PID    %usr %system  %guest    %CPU   CPU  Command
Average:        0     25315   80.90    0.00    0.00   80.90     -  stress
Average:        0     25539    0.00    0.39    0.00    0.39     -  pidstat

# Simulate a process that waits on I/O
stress -i 1 --timeout 600

mpstat shows that cpu0 spends 60% of its time waiting for I/O, and this wait cannot be interrupted:

[root@k8s-master ~]# mpstat -P ALL 5 1
Linux 3.10.0-957.27.2.el7.x86_64 (k8s-master.com)  07/09/2020  _x86_64_  (1 CPU)
03:07:38 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
03:07:43 PM  all    9.58    0.00   30.00   60.00    0.00    0.42    0.00    0.00    0.00    0.00
03:07:43 PM    0    9.58    0.00   30.00   60.00    0.00    0.42    0.00    0.00    0.00    0.00
Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
Average:     all    9.58    0.00   30.00   60.00    0.00    0.42    0.00    0.00    0.00    0.00
Average:       0    9.58    0.00   30.00   60.00    0.00    0.42    0.00    0.00    0.00    0.00

The ps aux command shown earlier also reports the stress process in state D.

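Why do processes end up in the D state?

A process in the uninterruptible state is in the middle of a critical kernel-mode code path that must not be interrupted. The most common case is waiting for an I/O response from a hardware device; this is the D state (Uninterruptible Sleep, also called Disk Sleep) shown by ps. For example, while a process reads from or writes to a disk, it cannot be interrupted by other processes or by interrupts until the disk responds, so that the data stays consistent. During that time the process is in the uninterruptible state. If it were interrupted at that point, the data on disk could easily become inconsistent with the data held by the process. The uninterruptible state is therefore a mechanism by which the system protects processes and hardware devices.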
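To see what a D-state process is actually waiting on, ps can print the kernel wait channel (the wchan column) next to the state. The sketch below filters a ps listing down to D-state rows; d_state is a hypothetical helper name, and on some systems wchan shows only "-" or requires root, so treat the output as a hint rather than a diagnosis.

```shell
# d_state: reads "PID STAT WCHAN COMMAND" rows on stdin and keeps only the
# rows whose state starts with D (the header row is dropped as well).
d_state() {
    awk '$2 ~ /^D/'
}

# Live usage: wchan:32 widens the wait-channel column to 32 characters
if command -v ps >/dev/null 2>&1; then
    ps -eo pid,stat,wchan:32,comm | d_state
fi
```

An empty result simply means that no process was in uninterruptible sleep at the moment ps ran; D states are often short-lived, so it can take several runs to catch one.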

Process state codes shown by ps aux (from man ps)

PROCESS STATE CODES

Here are the different values that the s, stat and state output specifiers (header "STAT" or "S") will display to describe the state of a process:

D    uninterruptible sleep (usually IO)
R    running or runnable (on run queue)
S    interruptible sleep (waiting for an event to complete)
T    stopped by job control signal
t    stopped by debugger during the tracing
W    paging (not valid since the 2.6.xx kernel)
X    dead (should never be seen)
Z    defunct ("zombie") process, terminated but not reaped by its parent

For BSD formats and when the stat keyword is used, additional characters may be displayed:

<    high-priority (not nice to other users)
N    low-priority (nice to other users)
L    has pages locked into memory (for real-time and custom IO)
s    is a session leader
l    is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
+    is in the foreground process group
