Using top
Viewing CPU consumption
top
Listed below are top's available fields. They are always associated with the letter shown, regardless of the position you may have established for them with the 'o' (Order fields) interactive command. Any field is selectable as the sort field, and you control whether they are sorted high-to-low or low-to-high. For additional information on sort provisions see topic 3c. TASK Area Commands.
a: PID -- Process Id
The task's unique process ID, which periodically wraps, though never restarting at zero.
b: PPID -- Parent Process Pid
The process ID of a task's parent.
c: RUSER -- Real User Name
The real user name of the task's owner.
d: UID -- User Id
The effective user ID of the task's owner.
e: USER -- User Name
The effective user name of the task's owner.
f: GROUP -- Group Name
The effective group name of the task's owner.
g: TTY -- Controlling Tty
The name of the controlling terminal. This is usually the device (serial port, pty, etc.) from which the process was started, and which it uses for input or output. However, a task need not be associated with a terminal, in which case you'll see '?' displayed.
h: PR -- Priority
The priority of the task.
i: NI -- Nice value
The nice value of the task. A negative nice value means higher priority, whereas a positive nice value means lower priority. Zero in this field simply means priority will not be adjusted in determining a task's dispatchability.
j: P -- Last used CPU (SMP)
A number representing the last used processor. In a true SMP environment this will likely change frequently since the kernel intentionally uses weak affinity. Also, the very act of running top may break this weak affinity and cause more processes to change CPUs more often (because of the extra demand for cpu time).
k: %CPU -- CPU usage
The task's share of the elapsed CPU time since the last screen update, expressed as a percentage of total CPU time. In a true SMP environment, if 'Irix mode' is Off, top will operate in 'Solaris mode' where a task's cpu usage will be divided by the total number of CPUs. You toggle 'Irix/Solaris' modes with the 'I' interactive command.
l: TIME -- CPU Time
Total CPU time the task has used since it started. When 'Cumulative mode' is On, each process is listed with the cpu time that it and its dead children have used. You toggle 'Cumulative mode' with 'S', which is a command-line option and an interactive command. See the 'S' interactive command for additional information regarding this mode.
m: TIME+ -- CPU Time, hundredths
The same as 'TIME', but reflecting more granularity through hundredths of a second.
n: %MEM -- Memory usage (RES)
A task's currently used share of available physical memory.
o: VIRT -- Virtual Image (kb)
The total amount of virtual memory used by the task. It includes all code, data and shared libraries plus pages that have been swapped out. (Note: you can define the STATSIZE=1 environment variable and the VIRT will be calculated from the /proc/#/status VmSize field.)
VIRT = SWAP + RES.
p: SWAP -- Swapped size (kb)
The swapped out portion of a task's total virtual memory image.
q: RES -- Resident size (kb)
The non-swapped physical memory a task has used.
RES = CODE + DATA.
r: CODE -- Code size (kb)
The amount of physical memory devoted to executable code, also known as the 'text resident set' size or TRS.
s: DATA -- Data+Stack size (kb)
The amount of physical memory devoted to other than executable code, also known as the 'data resident set' size or DRS.
t: SHR -- Shared Mem size (kb)
The amount of shared memory used by a task. It simply reflects memory that could be potentially shared with other processes.
u: nFLT -- Page Fault count
The number of major page faults that have occurred for a task. A page fault occurs when a process attempts to read from or write to a virtual page that is not currently present in its address space. A major page fault is when disk access is involved in making that page available.
v: nDRT -- Dirty Pages count
The number of pages that have been modified since they were last written to disk. Dirty pages must be written to disk before the corresponding physical memory location can be used for some other virtual page.
w: S -- Process Status
The status of the task which can be one of:
'D' = uninterruptible sleep
'R' = running
'S' = sleeping
'T' = traced or stopped
'Z' = zombie
Tasks shown as running should be more properly thought of as 'ready to run' -- their task_struct is simply represented on the Linux run-queue. Even without a true SMP machine, you may see numerous tasks in this state depending on top's delay interval and nice value.
x: Command -- Command line or Program name
Display the command line used to start a task or the name of the associated program. You toggle between command line and name with 'c', which is both a command-line option and an interactive command. When you've chosen to display command lines, processes without a command line (like kernel threads) will be shown with only the program name in parentheses, as in this example: ( mdrecoveryd ). Either form of display is subject to potential truncation if it's too long to fit in this field's current width. That width depends upon other fields selected, their order and the current screen width.
Note: The 'Command' field/column is unique, in that it is not fixed-width. When displayed, this column will be allocated all remaining screen width (up to the maximum 512 characters) to provide for the potential growth of program names into command lines.
y: WCHAN -- Sleeping in Function
Depending on the availability of the kernel link map ('System.map'), this field will show the name or the address of the kernel function in which the task is currently sleeping. Running tasks will display a dash ('-') in this column.
Note: By displaying this field, top's own working set will be increased by over 700Kb. Your only means of reducing that overhead will be to stop and restart top.
z: Flags -- Task Flags
This column represents the task's current scheduling flags which are expressed in hexadecimal notation and with zeros suppressed. These flags are officially documented in <linux/sched.h>. Less formal documentation can also be found on the 'Fields select' and 'Order fields' screens.
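The difference between Irix mode and Solaris mode described under the %CPU field (k) can be illustrated with a small calculation. This is only a sketch of the arithmetic the man page describes (usage divided by the total number of CPUs); the function name and the sample numbers are made up for illustration.

```python
def solaris_mode_cpu(irix_percent: float, num_cpus: int) -> float:
    """Convert an Irix-mode %CPU reading to the Solaris-mode value.

    In Solaris mode a task's CPU usage is divided by the total number
    of CPUs, so a fully busy machine adds up to 100%.
    """
    if num_cpus < 1:
        raise ValueError("num_cpus must be at least 1")
    return irix_percent / num_cpus


# Hypothetical readings on a 4-CPU machine:
# a task saturating one core, and a task saturating two cores.
print(solaris_mode_cpu(100.0, 4))
print(solaris_mode_cpu(200.0, 4))
```

In Irix mode a multithreaded task can therefore show more than 100%, while in Solaris mode the same task never exceeds 100%.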
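By default only the more important columns are displayed: PID, USER, PR, NI, VIRT, RES, SHR, S, %CPU, %MEM, TIME+ and COMMAND.
To see the consumption of each core, press 1 in the top view; usage is then shown per core.
By default the top view shows CPU consumption per process. Press Shift+H in the top view to show CPU consumption per thread.
[Figure: top in thread view]
In this view PID is the thread ID, and the %CPU that follows is the percentage of CPU consumed by that thread.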
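cs: context switches; a high value means the CPU is switching contexts frequently.
The memory-related columns are swpd, free, buff and cache under memory, and si and so under swap.
swpd is the amount of virtual memory in use, in KB; free is the idle physical memory; buff is the memory used as buffers.
cache is the memory used as cache.
si under swap is the amount of data read from disk into memory per second; so is the amount written from memory out to disk per second.
A high swpd value usually means physical memory is running short and the OS has moved part of the data held in physical memory out to disk.
Because a Java application runs as a single process, it will not touch the swap area as long as the JVM memory settings are not too large. Excessive physical memory consumption may be caused by JVM memory settings that are too large, by creating too many Java threads, or by placing too many objects in physical memory through DirectByteBuffer.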
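The column names above match the output of vmstat, so the sketch below assumes that is the tool in use. It parses captured vmstat text into dictionaries and flags rows with swap activity. The sample text is invented for illustration, not taken from a real machine, and the helper names are hypothetical.

```python
# Hypothetical vmstat output, made up for this example.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0  10240 152340  48212 903112    0   12     5    40  310  820 35  6 58  1  0
"""


def parse_vmstat(text):
    """Return one dict per data row, keyed by vmstat column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # The column-name line is the one that starts with "r b".
    header_idx = next(
        i for i, ln in enumerate(lines) if ln.split()[:2] == ["r", "b"]
    )
    names = lines[header_idx].split()
    rows = []
    for ln in lines[header_idx + 1:]:
        values = ln.split()
        # Skip repeated headers or malformed lines.
        if len(values) == len(names) and all(v.isdigit() for v in values):
            rows.append(dict(zip(names, map(int, values))))
    return rows


def is_swapping(row):
    """True when the row shows any swap-in or swap-out traffic."""
    return row["si"] > 0 or row["so"] > 0


row = parse_vmstat(SAMPLE)[0]
print(row["swpd"], row["free"], row["buff"], row["cache"], row["si"], row["so"])
print("swapping:", is_swapping(row))
```

A non-zero swpd on its own only says that swap has been used at some point; sustained non-zero si or so is the stronger sign that memory is short right now.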
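1. Identify the ID of the thread with high CPU usage:
High us
When us is too high, the running applications are consuming most of the CPU. For a Java application, the most important thing in this situation is to find the code that is consuming the CPU.
This can be done as follows.
First use the thread-level commands Linux provides to find the thread that is consuming the most CPU and its ID, and convert that thread ID to hexadecimal. Then dump the application's thread information with kill -3 [javapid] or jstack, and use the hexadecimal value to find the thread with the matching nid; that thread is the one consuming the CPU. When sampling, repeat this procedure several times to make sure you have found the thread that is really consuming the CPU.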
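The conversion and lookup steps above can be sketched as follows. This assumes the dump prints nid in lowercase hexadecimal, as the text describes, and that thread entries are separated by blank lines. The dump content, thread names and class names are invented for illustration.

```python
import re

# Hypothetical thread dump, made up for this example.
SAMPLE_DUMP = '''\
"main" #1 prio=5 os_prio=0 tid=0x00007f3c4c009800 nid=0x3038 waiting on condition [0x00007f3c53d5e000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)

"worker-1" #12 prio=5 os_prio=0 tid=0x00007f3c4c1f2000 nid=0x3039 runnable [0x00007f3c2a5f4000]
   java.lang.Thread.State: RUNNABLE
        at com.example.Busy.spin(Busy.java:17)
'''


def nid_for(thread_id: int) -> str:
    """Convert the decimal thread ID shown by top to the nid form."""
    return f"0x{thread_id:x}"


def find_thread(dump: str, thread_id: int):
    """Return the dump entry whose nid matches thread_id, or None."""
    target = nid_for(thread_id)
    for entry in dump.split("\n\n"):
        match = re.search(r"nid=(0x[0-9a-f]+)", entry)
        if match and match.group(1) == target:
            return entry
    return None


# Suppose top's thread view reported thread ID 12345 as the busy one.
print(nid_for(12345))
print(find_thread(SAMPLE_DUMP, 12345))
```

Running the lookup against several dumps taken a few seconds apart shows whether the same stack keeps appearing, which is the repeated sampling the text recommends.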
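The main reason a Java application causes high us is that threads stay in the Runnable state, usually because they are executing non-blocking operations, loops, regular expressions or pure computation. Another possible cause of high us is frequent Full GC.
For example, if every request has to allocate a lot of memory, heavy traffic leads to continuous GC and slower responses, which in turn causes more requests to pile up and memory consumption to get worse. In the worst case the system may end up performing Full GC continuously. When GC is frequent, find the cause by analyzing the JVM's memory consumption.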
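High sy
A high sy value means Linux is spending more time on context switching. The main reason a Java application causes this is that it has started many threads and most of them are constantly blocking, for example waiting on locks or on IO.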
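File IO consumption analysis
When Linux operates on files, it places the data in the file cache and keeps it there until memory runs short or the system has to release memory to user processes. This is why, when checking memory on Linux, you often see very little free physical memory but a large cache; it is how Linux speeds up file IO. When physical memory is sufficient, Linux usually performs real IO only when writing files and when reading a file for the first time.
On Linux, the main way to track the file IO consumption of threads is pidstat.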
pidstat -d -t -p [pid] 1 100
iostat
Running iostat with no arguments shows the IO history of each device.
iostat
Linux 2.6.32-220.el6.x86_64(idc01-sys-mo-01)
avg-cpu:
Device:
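Device: the device name or partition name.
tps: the number of IO requests per second; this is also a number worth watching when looking at IO consumption.
Blk_read/s: the number of blocks read per second; a block is usually 512 bytes.
Blk_wrtn/s: the number of blocks written per second.
Blk_read: the total number of blocks read.
Blk_wrtn: the total number of blocks written.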
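Block counts are easier to judge once converted to a byte rate. The sketch below assumes the 512-byte block size mentioned above; the function name and the sample rate are made up for illustration.

```python
# Block size stated in the text; treat it as an assumption about the system.
BLOCK_SIZE_BYTES = 512


def blocks_to_kib_per_s(blocks_per_s: float) -> float:
    """Convert a Blk_read/s or Blk_wrtn/s value to KiB per second."""
    return blocks_per_s * BLOCK_SIZE_BYTES / 1024


# A hypothetical device reading 2048 blocks per second.
print(blocks_to_kib_per_s(2048.0))
```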
You can also run iostat -x xvda 3 5 to sample IO consumption at a fixed interval (here every 3 seconds, 5 reports in total).
iostat -x xvda 3 5
Linux 2.6.32-220.el6.x86_64(idc01-sys-mo-01)
avg-cpu:
Device:
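r/s: the number of read requests per second.
w/s: the number of write requests per second.
await: the average wait time of each IO operation, in milliseconds.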