In performance tuning, you often hear that "CPU load should ideally not exceed the number of CPUs".
Command for checking CPU load
Run the uptime command; its output includes a line like this:
load average: 1.05, 0.70, 5.09
The three numbers are the load averages over the last 1, 5, and 15 minutes: here 1.05, 0.70, and 5.09 respectively.
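As a minimal sketch of reading those three values programmatically, the snippet below pulls them out of an uptime-style line. The function name parse_load_average is made up for illustration, and the regular expression assumes the common "load average: a, b, c" layout shown above; other platforms or locales may format the line differently.

```python
import re


def parse_load_average(uptime_output: str) -> tuple[float, float, float]:
    """Extract the 1-, 5-, and 15-minute load averages from uptime-style text."""
    # Assumed layout: "load average: 1.05, 0.70, 5.09" (commas optional).
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_output
    )
    if match is None:
        raise ValueError("no load average found in input")
    one, five, fifteen = (float(value) for value in match.groups())
    return one, five, fifteen


sample = "load average: 1.05, 0.70, 5.09"
print(parse_load_average(sample))  # (1.05, 0.7, 5.09)
```

On Unix-like systems, Python's os.getloadavg() returns the same three numbers directly, without parsing any command output.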
What load means
Searching Baidu turns up a jumble of inconsistent material; Wikipedia explains it more clearly:
For example, one can interpret a load average of "1.73 0.60 7.98" on a single-CPU system as:
- during the last minute, the system was overloaded by 73% on average (1.73 runnable processes, so that 0.73 processes had to wait for a turn for a single CPU system on average).
- during the last 5 minutes, the CPU was idling 40% of the time on average.
- during the last 15 minutes, the system was overloaded 698% on average (7.98 runnable processes, so that 6.98 processes had to wait for a turn for a single CPU system on average).
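Load minus the number of CPU threads = the average number of runnable tasks waiting for a CPU time slice (when the result is positive).
Clearly, the higher the load, the more pressure the system is under.
If the load equals the number of CPU threads and all of that load comes from runnable tasks, every CPU thread is busy and CPU utilization is roughly 100%. Any higher and tasks start to wait.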
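The arithmetic behind this interpretation can be sketched as follows. The helper interpret_load is a hypothetical name, and it applies the simplified reading from the Wikipedia example (all load is runnable tasks), so treat the result as a rough reading rather than an exact measurement.

```python
def interpret_load(load: float, cpus: int = 1) -> str:
    """Describe a load average relative to the number of CPU threads."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    ratio = load / cpus
    if ratio > 1:
        # Tasks beyond the number of CPU threads have to wait for a turn.
        waiting = load - cpus
        return f"overloaded by {(ratio - 1) * 100:.0f}%, about {waiting:.2f} tasks waiting"
    return f"idle about {(1 - ratio) * 100:.0f}% of the time"


# The three values from the Wikipedia example, on a single-CPU system.
for value in (1.73, 0.60, 7.98):
    print(value, "->", interpret_load(value, cpus=1))
```

For the single-CPU example this reproduces the quoted figures: 73% overloaded, 40% idle, and 698% overloaded.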
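Modern multi-core CPUs
Take an Intel Xeon as an example: with 4 physical cores and 2 hardware threads per core, it has 8 logical cores.
When comparing load against the number of CPU threads, the number to use is 8, not 4.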
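A small sketch of normalizing load by the logical CPU count is shown below. The function load_per_cpu is a made-up helper; os.cpu_count() is the standard-library call that reports logical CPUs (hardware threads), which is the count this section says to use.

```python
import os


def load_per_cpu(load: float, logical_cpus: int) -> float:
    """Return load divided by the number of logical CPUs (hardware threads)."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return load / logical_cpus


# The Xeon example: a load of 8.0 fills 8 logical CPUs exactly,
# but would look like 2x overload if divided by the 4 physical cores.
print(load_per_cpu(8.0, 8))  # 1.0
print(load_per_cpu(8.0, 4))  # 2.0

# On the current machine (os.getloadavg is only available on Unix-like systems).
if hasattr(os, "getloadavg"):
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    one_minute = os.getloadavg()[0]
    print(f"1-minute load per logical CPU: {load_per_cpu(one_minute, cpus):.2f}")
```

A value near or above 1.0 per logical CPU suggests tasks are starting to queue.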
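Difference between CPU utilization and CPU load
In our company's production environment, the CPU load currently exceeds 8 and sometimes reaches 15, yet CPU utilization is only 70%-75%.
CPU utilization is the fraction of time, over the measurement interval, that the CPUs spend executing work. On a highly concurrent server with frequent thread switching, many threads spend much of their time waiting on I/O rather than running. On Linux, tasks in uninterruptible sleep (typically waiting on disk I/O) count toward the load average even though they use no CPU time. That is why CPU utilization stays below the 85% alert threshold while the CPU load is already fairly high.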
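To make the utilization side concrete, here is a rough sketch of how it can be derived from two samples of the aggregate "cpu" line in Linux's /proc/stat. The function names and the sample counter values are invented for illustration (they are not measurements from the environment described above), and the sketch counts idle and iowait time as "not busy".

```python
def cpu_times(stat_line: str) -> tuple[int, int]:
    """Return (busy, total) jiffies from an aggregate 'cpu' line of /proc/stat."""
    # Fields after the label: user nice system idle iowait irq softirq steal ...
    # Guest time is already included in user time, so only the first 8 are summed.
    fields = [int(value) for value in stat_line.split()[1:9]]
    total = sum(fields)
    not_busy = fields[3] + (fields[4] if len(fields) > 4 else 0)  # idle + iowait
    return total - not_busy, total


def utilization(before: str, after: str) -> float:
    """Fraction of CPU time spent busy between two /proc/stat samples."""
    busy0, total0 = cpu_times(before)
    busy1, total1 = cpu_times(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    return (busy1 - busy0) / elapsed


# Made-up samples: 750 of 1000 elapsed jiffies were busy, 150 were iowait.
before = "cpu  1000 0 500 8000 500 0 0 0 0 0"
after = "cpu  1700 0 550 8100 650 0 0 0 0 0"
print(utilization(before, after))  # 0.75
```

In this made-up sample the CPUs are 75% busy while a share of the remaining time is iowait. Threads blocked on that I/O add to the Linux load average without raising utilization.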