Meet the stats - today: mpstat
Monday, May 10. 2010
In this installment of the "Meet the Stats" series I want to talk with you about mpstat. In my opinion, mpstat is one of the most useful tools to find out what your processors are really doing.
Using mpstat
Let's execute mpstat on a system. I've used my fileserver for this task on a Saturday morning; it's a system with four cores, so mpstat reports four lines to me.
$ mpstat 1
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 4 0 28 679 154 573 3 12 8 0 767 1 2 0 96
1 4 0 22 504 145 485 2 9 4 0 661 1 2 0 97
2 4 0 30 579 81 425 3 12 6 0 519 1 2 0 97
3 5 0 25 505 250 517 3 12 5 0 758 1 3 0 96
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 0 0 9 567 182 372 0 2 0 0 338 0 1 0 99
1 0 0 21 454 174 468 1 1 2 0 468 1 1 0 98
2 0 0 12 480 15 304 1 4 1 0 249 2 1 0 97
3 27 0 15 157 68 147 1 2 0 0 422 0 1 0 99
jmoekamp@hivemind:~$
Internals
You need some knowledge about the inner workings of an operating system to really understand the output of this command, but a basic understanding is relatively easy to reach.
- All those operations are normal occurrences in a Solaris system. You can't say "oh ... I have too many events of this kind" just by looking at these numbers, because the observed pattern is possibly normal for your load. So it's very reasonable to run mpstat from time to time during load times just to get a baseline. Debugging in the event of a major fsckup is much easier with such historical data, because otherwise you may chase a pattern that looks pathological but is just the way things go in your application.
- Forget about the wt column. It's the "wait time" column, but it isn't computed anymore; it's simply set to zero. The reason for keeping this column is the binary compatibility guarantee: you can't leave it out, as one column less could break programs, and you can't fill it with a dash, as programs may expect a number here.
- Unless otherwise stated, the numbers are events per second. Exceptions are the last four columns and obviously the first one.
A recommended reading
In the following description I sacrificed complete correctness for understandability, as I simplified some of the dependencies. To understand the full implications of all the numbers presented by all the *stat commands, you should start to gather some knowledge about the internals of Solaris. There is an excellent book about it: "Solaris Internals: Solaris 10 and OpenSolaris Kernel Architecture", written by Richard McDougall and Jim Mauro. The ISBN of this great book is 0-13-148209-2.
DTrace examples
The DTrace examples are from the standard cheat sheet I'm using at customer sites. However, they aren't mine; I've gathered them from Prefetch.net's DTrace Cookbook.
The numbers and their meaning
usr, sys and idl have obvious meanings: they tell you the percentage of time the system spends in userland, in kernelland, or idling. When you really want to know something about the load on your system, look at these values and forget about the load average.
The other values are a little more difficult to explain. Basically, you can divide columns 2-11 into four groups: the virtual memory part (columns 2-3), the interrupt part (columns 4-6), the scheduling part (columns 7-9) and the locks part (columns 10-11).
The virtual memory part
To understand the meaning of the two columns regarding the virtual memory subsystem, you should have some knowledge about the concept of virtual memory. But I will try to give you some insight into this part without getting overly complex. At first it's important to know that memory is organized in pages. Those pages are chunks of memory. The possible page sizes are hardware dependent, but let's assume that we have pages with a size of 8 kilobytes.
As you may know, a modern operating system doesn't allocate physical memory right away when your application requests memory. Instead it allocates virtual memory.
When you access a memory page for the first time, a page fault occurs. This page fault leads to the mapping of a physical memory page to the virtual memory page.
The mapping is done by adding an entry to the hash page table. And here minor and major page faults differ:
- minf:
When the memory subsystem doesn't find a mapping in the hash page table, but knows that a page with the same content is already on the list of free pages, a minor fault occurs. The page is just inserted into the hash page table, and the system works with the data already in memory. You can measure which applications create minor page faults with a short DTrace one-liner:
# dtrace -n 'vminfo:::as_fault{@execs[execname]=count()}'
dtrace: description 'vminfo:::as_fault' matched 1 probe
^C
dtrace 104
jmoekamp@hivemind:~#
- majf:
A major fault has much more severe consequences. It occurs when there is no mapping to a physical page in the hash page table and the content of the page has been moved out to the swap space. Reading it back in obviously takes some time.
# dtrace -n 'vminfo:::maj_fault{@execs[execname] = count() }'
dtrace: description 'vminfo:::maj_fault' matched 1 probe
^C
Obviously major faults have a bigger impact on system performance than minor faults, as the latter don't need to access a rotating-rust device, a.k.a. a hard disk. However, even minor faults can have a significant impact on performance. But that's enough stuff for an article of its own, or for an evening with the book mentioned above.
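The counters above are Solaris-specific, but you can watch the same kind of counters for your own process on other Unix systems, too. As a minimal sketch (my own Linux/Python analogy, not from the DTrace cheat sheet), `resource.getrusage()` exposes per-process minor and major fault counters, and touching freshly allocated memory for the first time makes the minor fault counter climb:

```python
# Watch the process's own fault counters while it first touches new memory.
# ru_minflt / ru_majflt are the per-process analogues of mpstat's minf / majf.
import resource

def fault_counters():
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return ru.ru_minflt, ru.ru_majflt

minf_before, majf_before = fault_counters()

# Allocate a few hundred pages and write to each one. The first write to a
# page forces the kernel to back the virtual page with a physical one,
# which shows up as a minor fault.
page = resource.getpagesize()
buf = bytearray(512 * page)
for offset in range(0, len(buf), page):
    buf[offset] = 1

minf_after, majf_after = fault_counters()
print("minor faults caused:", minf_after - minf_before)
```

This is only an analogy for the idea, of course: mpstat shows system-wide, per-CPU rates, while getrusage() shows cumulative counts for one process.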
The interrupt part
- xcal:
xcals, or cross calls, are a special kind of interrupt. Whenever a processor needs another processor to do something for it, a so-called cross call is issued. There are several reasons to issue cross calls, like updating certain tables on other processors.
# dtrace -n 'sysinfo:::xcalls{@execs[execname] = count();}'
dtrace: description 'sysinfo:::xcalls' matched 1 probe
^C
firefox-bin 6
thunderbird-bin 6
VBoxHeadless 9
dtrace 24
pageout 1660
sched 1834
jmoekamp@hivemind:~#
- intr:
An interrupt preempts the current work on the processor and forces it to execute the code needed to handle the interrupt, for example to trigger the processing of incoming network packets. To get some insight into the drivers generating interrupts, it's interesting to use the intrstat command:
jmoekamp@hivemind:~# intrstat
device | cpu0 %tim cpu1 %tim cpu2 %tim cpu3 %tim
-------------+------------------------------------------------------------
[...]
e1000g#0 | 0 0,0 0 0,0 0 0,0 0 0,0
e1000g#1 | 0 0,0 0 0,0 0 0,0 0 0,0
ehci#0 | 0 0,0 0 0,0 0 0,0 0 0,0
ehci#1 | 0 0,0 0 0,0 0 0,0 0 0,0
hci1394#0 | 0 0,0 0 0,0 123 0,0 0 0,0
[...]
pci-ide#0 | 0 0,0 0 0,0 247 0,3 0 0,0
rge#0 | 0 0,0 0 0,0 0 0,0 11 0,0
- ithr:
ithr, or "interrupts as threads", refers to a special mechanism to handle interrupts. Many interrupts are handled in threads that are triggered by an interrupt. This column counts the interrupts handled by such threads.
Interrupts are important for the operation of the system, but they interrupt (they are called "interrupts" for a reason) the application running on the processor. Thus a high number of interrupts can significantly slow down the application.
There are some tricks to reduce this interruption. For example, you can force the interrupts onto a subset of all processors by declaring most of the CPUs as "non-interrupt".
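If the idea of "preempting the current work" feels abstract, here's a loose userland analogy sketched in Python rather than in D (my own illustration, not from the cheat sheet): a signal preempts the normal code flow of a process and forces it into a handler, much like a hardware interrupt preempts the work on a CPU and forces it into the interrupt handler.

```python
# Userland analogy to an interrupt: a signal preempts whatever the process
# is currently doing and diverts it into a registered handler.
import os
import signal

events = []

def handler(signum, frame):
    # Runs "in between" the normal code flow, like an interrupt handler.
    events.append(signum)

signal.signal(signal.SIGUSR1, handler)   # register the "interrupt handler"
os.kill(os.getpid(), signal.SIGUSR1)     # "raise the interrupt" at ourselves

print("handled signals:", events)
```

Like real interrupts, frequent signal delivery steals time from the work the process actually wants to do, which is why the kernel tries to keep interrupt handling cheap and, as described above, can confine it to a subset of CPUs.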
The scheduling part
- csw:
Context switches take place when the currently running thread has nothing left to compute on the processor, for example because it waits for data from the disk. The thread gives the processor back to the scheduler, and a different thread is scheduled onto the processor. As the new thread has a totally different set of register contents, for example, the OS has to switch from the context of the old thread to that of the new one. This is called a context switch. Obviously there is a performance penalty bound to this event, as the switching takes some time. When you want to know which processes cause the context switches, the sysinfo:::pswitch probe helps you:
# dtrace -n 'sysinfo:::pswitch{@execs[execname] = count(); }'
dtrace: description 'sysinfo:::pswitch' matched 3 probes
^C
fmd 1
[...]
VBoxHeadless 2054
sched 9657
jmoekamp@hivemind:~#
- icsw:
Involuntary context switches are the forced variant of a context switch. Whenever a process has consumed its time slice, or when a higher-priority process is ready for execution, an involuntary context switch is done. It just forces the process off the processor.
# dtrace -n 'sysinfo::preempt:inv_swtch{@execs[execname] = count();}'
dtrace: description 'sysinfo::preempt:inv_swtch' matched 1 probe
^C
VBoxHeadless 1
VBoxSVC 1
gam_server 1
thunderbird-bin 2
firefox-bin 3
gnome-netstatus- 3
- migr:
A "thread migration" is counted when a thread is scheduled on a different processor than the last time it ran. This can have a big performance impact, as the caches of the new processor aren't warmed for the thread, leading to more cache misses and thus to more accesses to the slower main memory instead of the caches.
# dtrace -n ' sched:::off-cpu{self->cpu = cpu;}
sched:::on-cpu /self->cpu != cpu/
{
printf("%s migrated from cpu %d to cpu %d\n",execname,self->cpu,cpu);
self->cpu = 0;
}'
dtrace: description ' sched:::off-cpu' matched 6 probes
^C
CPU ID FUNCTION:NAME
2 10067 resume:on-cpu firefox-bin migrated from cpu 0 to cpu 2
2 10067 resume:on-cpu thunderbird-bin migrated from cpu 0 to cpu 2
2 10067 resume:on-cpu nskernd migrated from cpu 0 to cpu 2
2 10067 resume:on-cpu nskernd migrated from cpu 0 to cpu 2
2 10067 resume:on-cpu VBoxHeadless migrated from cpu 0 to cpu 2
2 10067 resume:on-cpu sched migrated from cpu 0 to cpu 2
2 10067 resume:on-cpu gnome-netstatus- migrated from cpu 0 to cpu 2
2 10067 resume:on-cpu gnome-netstatus- migrated from cpu 0 to cpu 2
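You don't need DTrace to see the two flavors of context switch for a single process: on Linux (and other Unix systems), `getrusage()` reports voluntary and involuntary context switch counters, the per-process cousins of csw and icsw. A minimal Python sketch of that idea (my analogy, not from the original cheat sheet):

```python
# Per-process context-switch counters: ru_nvcsw counts voluntary switches
# (the process gave the CPU back, like csw), ru_nivcsw counts involuntary
# ones (the scheduler forced it off, like icsw).
import resource
import time

def switch_counters():
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return ru.ru_nvcsw, ru.ru_nivcsw

vol_before, invol_before = switch_counters()

# Each sleep voluntarily hands the CPU back to the scheduler, so every
# iteration should register at least one voluntary context switch.
for _ in range(10):
    time.sleep(0.01)

vol_after, invol_after = switch_counters()
print("voluntary context switches:", vol_after - vol_before)
```

As for migrations: on Linux, `os.sched_setaffinity()` can pin a process to a single CPU, which is one blunt way to avoid the cold-cache penalty of thread migrations described above.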
The locks part
- smtx:
smtx, or "spins on mutexes", reports how often the code flow on the processor wasn't able to acquire a mutex lock right away. Mutex is shorthand for "mutual exclusion". A mutex lock provides exclusive read and write access to the thread owning it.
# dtrace -n 'lockstat:::adaptive-spin, lockstat:::adaptive-block
> {
> @execs[execname,probename] = count();
> }'
dtrace: description 'lockstat:::adaptive-spin, lockstat:::adaptive-block' matched 2 probes
^C
gnome-netstatus- adaptive-spin 2
sched adaptive-block 5
zpool-datapool adaptive-block 5
VBoxHeadless adaptive-spin 21
zpool-datapool adaptive-spin 78
sched adaptive-spin 262
#
- srw:
srw, or "spins on reader/writer locks", counts the number of spins on reader/writer locks. rwlocks are another kind of lock in Solaris. They allow just one thread to own the write lock, but multiple threads that just read are able to acquire the read lock at the same time. When you want to know which processes are responsible for the srw events, a DTrace one-liner can help you:
# dtrace -n 'lockstat:::rw-block
{
@execs[execname] = count();
}'
dtrace: description 'lockstat:::rw-block' matched 1 probe
^C
VBoxHeadless 8
#
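To make the idea behind these lock columns concrete outside the kernel, here's a small Python sketch (again my own analogy, not from the cheat sheet): two threads contending for one mutex. While one thread holds the lock, the other has to wait for it, which is the userland counterpart of what smtx counts for kernel mutexes.

```python
# Two threads incrementing a shared counter under one mutex. The lock
# serializes the critical section, so the final count is exact, but every
# time one thread holds the lock the other must wait -- that's contention.
import threading

lock = threading.Lock()
counter = 0

def worker(iterations):
    global counter
    for _ in range(iterations):
        with lock:          # contended critical section
            counter += 1

threads = [threading.Thread(target=worker, args=(10000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("final counter:", counter)   # 20000: the lock kept the updates intact
```

A reader/writer lock would relax this by letting any number of pure readers in concurrently while still serializing writers, which is exactly the trade-off the srw column is about.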
Do you want to learn more?
man pages
docs.sun.com: intrstat
docs.sun.com: mpstat
Misc
Prefetch.net: Dtrace Cookbook