Profiling the kernel

There are several facilities to see where the kernel spendsits resources. A simple one is the profiling function, thatstores the current EIP (instruction pointer) at each clock tick.

Boot the kernel with command line option profile=2(or some other number instead of 2). This will causea file/proc/profile to be created.The number given after profile= is the number of positionsEIP is shifted right when profiling. So a large number gives acoarse profile.The counters are reset by writing to/proc/profile.The utility readprofile will output statistics for you.It does not sort - you have to invokesort explicitly.But given a memory map it will translate addresses to kernel symbols.

See kernel/profile.c and fs/proc/proc_misc.cand readprofile(1).

For example:

# echo > /proc/profile
...
# readprofile -m System.map-2.5.59 | sort -nr | head -2
510502 total                                      0.1534
508548 default_idle                           10594.7500

The first column gives the number of timer ticks.The last column gives the number of ticks divided by thesize of the function.

The command readprofile -r is equivalent toecho > /proc/profile.

2.10.1 Oprofile

A more advanced mechanism is given by oprofile.

Prepare kernel

Build a kernel (2.5.43 or later) with CONFIG_PROFILING=yand CONFIG_OPROFILE=y. Now the kernel knows about theoprofilefs virtual filesystem. The utilitiesmentioned below will mount it on/dev/oprofile.It is a good idea to add idle=poll to the kernelcommand line; this will make sure time spent in the idle threadis properly accounted for.

Install oprofile

Get the oprofile utility fromhttp://oprofile.sourceforge.net/.Configure with./configure --with-kernel-support,then make and (as root)make install.This yields binaries and a man page:

% ls /usr/local/bin
op_dump  op_merge    op_start  op_time       opcontrol    oprofiled
op_help  op_session  op_stop   op_to_source  oprof_start  oprofpp
% ls /usr/local/share/oprofile
stl.pat
% ls /usr/local/share/doc/oprofile
oprofile.html
% ls /usr/local/man/man1
op_help.1   op_time.1       opcontrol.1  oprofiled.1
op_merge.1  op_to_source.1  oprofile.1   oprofpp.1
%
All man pages are links to the oprofile man page.Thus, there are two sources of information:
% man oprofile
% mozilla /usr/local/share/doc/oprofile/oprofile.html

Setup oprofile

Make sure you are root and your PATH contains /usr/local/bin.The first action required is invokingopcontrol --setup ....This will create a setup file /root/.oprofile/daemonrc.You need the big kernel toplevelvmlinux file (not bzImage,and not the smaller arch/i386/boot/compressed/vmlinux).

# opcontrol --setup --vmlinux=/foo/vmlinux
# cat /root/.oprofile/daemonrc
IGNORE_MYSELF=0
SEPARATE_LIB_SAMPLES=0
SEPARATE_KERNEL_SAMPLES=0
VMLINUX=/foo/vmlinux
BUF_SIZE=0
one_enabled=1
#
The precise setup call needed depends on your hardware.The above example is for an old Pentium without hardware performancecounters. For a P3, use
opcontrol --setup --vmlinux=/foo/vmlinux --ctr0-event=CPU_CLK_UNHALTED --ctr0-count=100000
For a P4, use
opcontrol --setup --vmlinux=/foo/vmlinux --ctr0-event=GLOBAL_POWER_EVENTS --ctr0-unit-mask=1 --ctr0-count=100000
For an Athlon or x86-64, use
opcontrol --setup --vmlinux=/foo/vmlinux --ctr0-event=RETIRED_INSNS --ctr0-count=100000
There are many other possible setup options. The command op_helpwill list them once you have done the setup :-).(See also the sourceforge site for Intel P6/PII/PIII, Intel P4 and AMD events.)

The general idea is that one specifies a certain type of event(such as CPU_CLK_UNHALTED, a CPU clock cycle), and a count(like the 100000 above). Now once every count occurrencesof this event the value of EIP is recorded.Pick the value of count suitably: with a 400 MHz machine a countof 100000 for CPU_CLK_UNHALTED means 4000 interrupts/second.Too small a count and the machine dies an interrupt death.Too large a count and the profile will be very coarse.

Use of oprofile

First start the oprofile daemon.

# opcontrol --start-daemon
Using log file /var/lib/oprofile/oprofiled.log
Daemon started.
# ps ax | grep oprofile
... /usr/local/bin/oprofiled ...

Next clear out old profiling data.

# opcontrol --reset

Next start measuring, do whatever should be measured,and stop measuring.

# opcontrol --start
Profiler running.
# do_something
# opcontrol --stop
Stopping profiling.
#

One now has profiling data below /var/lib/oprofile/.See below for what to do with the data.

When no more profiling is needed, kill the daemon:

# opcontrol --shutdown
Stopping profiling.
Killing daemon.
#

To clean up all data generated by oprofile (after generating anydesired output):

# rm -r /var/lib/oprofile

A partial cleanup is done by the opcontrol --resetmentioned above.

Output

To get a printout of the data for /bin/foo, use, e.g.oprofpp -l -i /bin/foo. The output will be boring,unless/bin/foo was not stripped after compilation,so that it contains symbol information.

The programs and libraries that were invoked (and hence are suitablearguments foroprofpp -l -i ...) and the numbers of ticks spentin each are given by the commandop_time.With the -l option the output will be split according to symbol.

# op_time | tail -3
3134       9.0531  0.0000 /lib/i686/libc-2.2.4.so
4813      13.9032  0.0000 /usr/bin/find
26212     75.7178  0.0000 /foo/vmlinux
# op_time -l | tail -6
c012c7c0 703      2.04509     kmem_cache_alloc        /foo/vmlinux
c02146c0 786      2.28655     __copy_to_user_ll       /foo/vmlinux
c0169b70 809      2.35345     ext3_readdir            /foo/vmlinux
c01476f0 854      2.48436     link_path_walk          /foo/vmlinux
c016fcd0 1446     4.20655     ext3_find_entry         /foo/vmlinux
00000000 4591     13.3556     (no symbol)             /usr/bin/find
#

To get statistics for the kernel only, use the vmlinuxname specified.

# oprofpp -l -i /foo/vmlinux | tail
c012ca30 488      1.86174     kmem_cache_free
c010e280 496      1.89226     mask_and_ack_8259A
c010a61a 506      1.93041     restore_all
c0119220 603      2.30047     do_softirq
c0110b30 663      2.52938     delay_tsc
c012c7c0 703      2.68198     kmem_cache_alloc
c02146c0 786      2.99863     __copy_to_user_ll
c0169b70 809      3.08637     ext3_readdir
c01476f0 854      3.25805     link_path_walk
c016fcd0 1446     5.51656     ext3_find_entry
#

One can get disassembly or annotated source with indicationon where the counts occurred.

# op_to_source -a -i /foo/vmlinux
...
/* 424 1.618% */
c0169ac0 <ext3_check_dir_entry>:
 /* 6 0.02289% */
c0169ac0:       push   %edi
 /* 56 0.2136% */
c0169ac1:       push   %esi
c0169ac2:       push   %ebx
c0169ac3:       mov    0x18(%esp,1),%esi
 /* 43 0.164% */
c0169ac7:       xor    %ebx,%ebx
c0169ac9:       mov    0x14(%esp,1),%edi
c0169acd:       movzwl 0x4(%esi),%ecx
 /* 23 0.08775% */
c0169ad1:       cmp    $0xb,%ecx
c0169ad4:       jg     c0169ae0 <ext3_check_dir_entry+0x20>
c0169ad6:       mov    $0xc032b160,%ebx
...

# op_to_source --source-dir=. --output-dir=/tmp -i /path/binary
# diff -u ./source.c /tmp
...
 int get_line(register FILE *f, int *length)
+/* get_line 24 54.55% */
...
+       /* 5 11.36% */
        c = Getc (f);
...

This profile with annotated source is available only for binariescompiled with-g2 (and not stripped) so that they stillcontain source line numbers.


homepages.cwi.nl/~aeb/linux/profile.html

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值