系统级性能优化通常包括两个阶段:性能剖析(performance profiling)和代码优化。性能剖析的目标是寻找性能瓶颈,查找引发性能问题的原因及热点代码。代码优化的目标是针对具体性能问题而优化代码或编译选项,以改善软件性能。本篇主要讲性能分析中常用的工具——perf。
Perf 原理如下图所示:
Perf依赖如下组建实现功能:
1)PMU:性能监控单元(Performance Monitor Unit), CPU提供的一个性能监视单元,用于统计CPU性能数据;
2)Tracepoint:散落在内核源代码中的一些 hook,它们可以在特定的代码被运行到时被触发,这一特性可以被各种 trace/debug 工具所使用。
3)内核运行状态计数,例如: 1) 进程切换 2) Page fault 3) 中断计数,记录在内核数据中。
perf event事件分为三种:
- Hardware Event由PMU部件产生,在特定的条件下探测性能事件是否发生以及发生的次数。比如cache命中。
- Software Event是内核产生的事件,分布在各个功能模块中,统计和操作系统相关性能事件。比如进程切换,tick数等。
- Tracepoint Event是内核中静态tracepoint所触发的事件,这些tracepoint用来判断程序运行期间内核的行为细节,比如slab分配器的分配次数等。
下面开始介绍Perf的使用:
0:编译Perf
安装依赖:
sudo apt-get install libgtk2.0-dev libaudit-dev libdwarf-dev libbfd-dev libnuma-dev libperl-dev libzstd-dev libdw-dev
Perf是开源代码,你可以选择安装DEB包的方式使用,也是自行编译。任何人都可以下载编译,Perf代码在LINUX内核目录树中,进入内核tools/perf目录,执行 make
如果想用gdb debug perf,则需要修改MAKEFILE重新编译:
export DEBUG=1
之后修改tools/perf/Makefile.config,将编译优化级别降低为-O0.
1.安装perf
$ uname -r
5.4.0-81-generic
$ sudo apt install linux-tools-common
$ sudo apt install linux-tools-5.4.0-81-generic
$ sudo apt install trace-cmd
$ sudo apt install kernelshark
$ sudo apt-get install linux-tools-common linux-tools-"$(uname -r)" linux-cloud-tools-"$(uname -r)" linux-tools-generic linux-cloud-tools-generic
2.生成data:
使用方法,以分析pid为123456的cpu-clock情况,延时5秒,并查看函数调用
sudo perf record -p 123456 -g -e cpu-clock sleep 5
生成perf.data,下面拿一个进行堆排序的程序为例,对齐进行火焰图分析:
czl@czl-VirtualBox:~/flame$ pidof a.out
10007
czl@czl-VirtualBox:~/flame$ sudo perf record -F 99 -p 10007 -g -- sleep 30
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.100 MB perf.data (747 samples) ]
czl@czl-VirtualBox:~/flame$
perf record 表示采集系统事件, 没有使用 -e 指定采集事件, 则默认采集 cycles(即 CPU clock 周期), -F 99 表示每秒 99 次, -p 13204 是进程号, 即对哪个进程进行分析, -g 表示记录调用栈, sleep 30 则是持续 30 秒.-F 指定采样频率为 99Hz(每秒99次), 如果 99次 都返回同一个函数名, 那就说明 CPU 这一秒钟都在执行同一个函数, 可能存在性能问题.
此时生成的perf.data已经包含了这段时间内的系统采样信息,保存下来的静态信息可以用来进行下一步的分析,这样做的好处是可以不用实时在线,后续可以用离线的数据进行分析。比如,可以执行如下命令,看一下当前数据代表的perf top输出。
$ sudo perf report -i perf.data
热点汇编指令:
3.分析perf.data
执行如下命令
sudo perf report -n --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 747 of event 'cpu-clock:pppH'
# Event count (approx.): 7545454470
#
# Children Self Samples Command Shared Object Symbol
# ........ ........ ............ ....... ................. ..................................
#
70.41% 5.89% 44 a.out libc-2.27.so [.] __GI___libc_write
|
|--64.52%--__GI___libc_write
| entry_SYSCALL_64_after_hwframe
| |
| --64.39%--do_syscall_64
| |
| |--42.03%--__x64_sys_write
| | ksys_write
| | |
| | |--39.89%--vfs_write
| | | |
| | | --38.42%--__vfs_write
| | | |
| | | --38.02%--tty_write
| | | |
| | | |--35.07%--n_tty_write
| | | | |
| | | | |--21.02%--mutex_lock
| | | | | |
| | | | | --20.62%--_cond_resched
| | | | | |
| | | | | --20.35%--__schedule
| | | | | |
| | | | | --20.21%--finish_task_switch
| | | | |
| | | | |--5.35%--pty_write
| | | | | |
| | | | | |--4.28%--tty_flip_buffer_push
| | | | | | |
| | | | | | --4.15%--queue_work_on
| | | | | |
| | | | | --0.80%--__lock_text_start
| | | | |
| | | | |--3.88%--do_output_char
| | | | | |
| | | | | --3.75%--pty_write
| | | | | |
| | | | | |--2.95%--tty_flip_buffer_push
| | | | | | queue_work_on
| | | | | |
| | | | | --0.67%--__lock_text_start
| | | | |
| | | | |--1.07%--wait_woken
| | | | | schedule_timeout
| | | | | schedule
| | | | | __schedule
| | | | | finish_task_switch
| | | | |
| | | | --0.54%--tty_write_room
| | | | tty_buffer_space_avail
| | | |
| | | --0.80%--__virt_addr_valid
| | |
| | --1.47%--__fdget_pos
| | |
| | --1.34%--__fget_light
| |
| --20.88%--exit_to_usermode_loop
| |
| --20.62%--schedule
| |
| --20.48%--__schedule
| |
| --20.35%--finish_task_switch
|
|--1.07%--0x3020203120203220
| __GI___libc_write
|
|--0.94%--0x3220203120203020
| __GI___libc_write
|
|--0.80%--0x3620203420203720
| __GI___libc_write
|
|--0.67%--0x3520203420203620
| __GI___libc_write
|
|--0.67%--0x3220203320203420
| __GI___libc_write
|
|--0.54%--0x3220203420203520
| __GI___libc_write
|
--0.54%--0x3220203020203120
__GI___libc_write
64.52% 0.00% 0 a.out [kernel.kallsyms] [k] entry_SYSCALL_64_after_hwframe
|
---entry_SYSCALL_64_after_hwframe
|
--64.39%--do_syscall_64
|
|--42.03%--__x64_sys_write
| ksys_write
| |
| |--39.89%--vfs_write
| | |
| | --38.42%--__vfs_write
| | |
| | --38.02%--tty_write
| | |
| | |--35.07%--n_tty_write
| | | |
| | | |--21.02%--mutex_lock
| | | | |
| | | | --20.62%--_cond_resched
| | | | |
| | | | --20.35%--__schedule
| | | | |
| | | | --20.21%--finish_task_switch
| | | |
| | | |--5.35%--pty_write
| | | | |
| | | | |--4.28%--tty_flip_buffer_push
| | | | | |
| | | | | --4.15%--queue_work_on
| | | | |
| | | | --0.80%--__lock_text_start
| | | |
| | | |--3.88%--do_output_char
| | | | |
| | | | --3.75%--pty_write
| | | | |
| | | | |--2.95%--tty_flip_buffer_push
| | | | | queue_work_on
| | | | |
| | | | --0.67%--__lock_text_start
| | | |
| | | |--1.07%--wait_woken
| | | | schedule_timeout
| | | | schedule
| | | | __schedule
| | | | finish_task_switch
| | | |
| | | --0.54%--tty_write_room
| | | tty_buffer_space_avail
| | |
| | --0.80%--__virt_addr_valid
| |
| --1.47%--__fdget_pos
| |
| --1.34%--__fget_light
|
--20.88%--exit_to_usermode_loop
|
--20.62%--schedule
|
--20.48%--__schedule
|
--20.35%--finish_task_switch
64.39% 1.47% 11 a.out [kernel.kallsyms] [k] do_syscall_64
|
--62.92%--do_syscall_64
|
|--42.03%--__x64_sys_write
| ksys_write
| |
| |--39.89%--vfs_write
| | |
| | --38.42%--__vfs_write
| | |
| | --38.02%--tty_write
| | |
| | |--35.07%--n_tty_write
| | | |
| | | |--21.02%--mutex_lock
| | | | |
| | | | --20.62%--_cond_resched
| | | | |
| | | | --20.35%--__schedule
| | | | |
| | | | --20.21%--finish_task_switch
| | | |
| | | |--5.35%--pty_write
| | | | |
| | | | |--4.28%--tty_flip_buffer_push
| | | | | |
| | | | | --4.15%--queue_work_on
| | | | |
| | | | --0.80%--__lock_text_start
| | | |
| | | |--3.88%--do_output_char
| | | | |
| | | | --3.75%--pty_write
| | | | |
| | | | |--2.95%--tty_flip_buffer_push
| | | | | queue_work_on
| | | | |
| | | | --0.67%--__lock_text_start
| | | |
| | | |--1.07%--wait_woken
| | | | schedule_timeout
| | | | schedule
| | | | __schedule
| | | | finish_task_switch
| | | |
| | | --0.54%--tty_write_room
| | | tty_buffer_space_avail
| | |
| | --0.80%--__virt_addr_valid
| |
| --1.47%--__fdget_pos
| |
| --1.34%--__fget_light
|
--20.88%--exit_to_usermode_loop
|
--20.62%--schedule
|
--20.48%--__schedule
|
--20.35%--finish_task_switch
42.17% 0.13% 1 a.out [kernel.kallsyms] [k] __x64_sys_write
|
--42.03%--__x64_sys_write
ksys_write
|
|--39.89%--vfs_write
| |
| --38.42%--__vfs_write
| |
| --38.02%--tty_write
| |
| |--35.07%--n_tty_write
| | |
| | |--21.02%--mutex_lock
| | | |
| | | --20.62%--_cond_resched
| | | |
| | | --20.35%--__schedule
| | | |
| | | --20.21%--finish_task_switch
| | |
| | |--5.35%--pty_write
| | | |
| | | |--4.28%--tty_flip_buffer_push
| | | | |
| | | | --4.15%--queue_work_on
| | | |
| | | --0.80%--__lock_text_start
| | |
| | |--3.88%--do_output_char
| | | |
| | | --3.75%--pty_write
| | | |
| | | |--2.95%--tty_flip_buffer_push
| | | | queue_work_on
| | | |
| | | --0.67%--__lock_text_start
| | |
| | |--1.07%--wait_woken
| | | schedule_timeout
| | | schedule
| | | __schedule
| | | finish_task_switch
| | |
| | --0.54%--tty_write_room
| | tty_buffer_space_avail
| |
| --0.80%--__virt_addr_valid
|
--1.47%--__fdget_pos
|
--1.34%--__fget_light
42.03% 0.13% 1 a.out [kernel.kallsyms] [k] ksys_write
|
--41.90%--ksys_write
|
|--39.89%--vfs_write
| |
| --38.42%--__vfs_write
| |
| --38.02%--tty_write
| |
| |--35.07%--n_tty_write
| | |
| | |--21.02%--mutex_lock
| | | |
| | | --20.62%--_cond_resched
| | | |
| | | --20.35%--__schedule
| | | |
| | | --20.21%--finish_task_switch
| | |
| | |--5.35%--pty_write
| | | |
| | | |--4.28%--tty_flip_buffer_push
| | | | |
| | | | --4.15%--queue_work_on
| | | |
| | | --0.80%--__lock_text_start
| | |
| | |--3.88%--do_output_char
| | | |
| | | --3.75%--pty_write
| | | |
| | | |--2.95%--tty_flip_buffer_push
| | | | queue_work_on
| | | |
| | | --0.67%--__lock_text_start
| | |
| | |--1.07%--wait_woken
| | | schedule_timeout
| | | schedule
| | | __schedule
| | | finish_task_switch
| | |
| | --0.54%--tty_write_room
| | tty_buffer_space_avail
| |
| --0.80%--__virt_addr_valid
|
--1.47%--__fdget_pos
|
--1.34%--__fget_light
41.90% 0.27% 2 a.out [kernel.kallsyms] [k] __schedule
|
--41.63%--__schedule
finish_task_switch
41.63% 41.63% 311 a.out [kernel.kallsyms] [k] finish_task_switch
|
|--7.36%--0x3220203120203020
| __GI___libc_write
| entry_SYSCALL_64_after_hwframe
| do_syscall_64
| |
| |--5.35%--exit_to_usermode_loop
| | schedule
| | __schedule
| | finish_task_switch
| |
| --2.01%--__x64_sys_write
| ksys_write
| vfs_write
| __vfs_write
| tty_write
| n_tty_write
| mutex_lock
| _cond_resched
| __schedule
| finish_task_switch
|
|--5.35%--0x3220203020203120
| __GI___libc_write
| entry_SYSCALL_64_after_hwframe
| do_syscall_64
| |
| |--2.81%--exit_to_usermode_loop
| | schedule
| | __schedule
| | finish_task_switch
| |
| --2.54%--__x64_sys_write
| ksys_write
| vfs_write
| __vfs_write
| tty_write
| n_tty_write
| |
| --2.28%--mutex_lock
| _cond_resched
| __schedule
| finish_task_switch
|
|--5.09%--0x3220203120203320
| __GI___libc_write
| entry_SYSCALL_64_after_hwframe
| do_syscall_64
| |
| |--2.54%--__x64_sys_write
| | ksys_write
| | vfs_write
| | __vfs_write
| | tty_write
| | n_tty_write
| | mutex_lock
| | _cond_resched
| | __schedule
| | finish_task_switch
| |
| --2.54%--exit_to_usermode_loop
| schedule
| __schedule
| finish_task_switch
|
|--4.55%--0x3620203720203820
| __GI___libc_write
| entry_SYSCALL_64_after_hwframe
| do_syscall_64
| |
| |--2.68%--__x64_sys_write
| | ksys_write
| | vfs_write
| | __vfs_write
| | tty_write
| | n_tty_write
| | mutex_lock
| | _cond_resched
| | __schedule
| | finish_task_switch
| |
| --1.87%--exit_to_usermode_loop
| schedule
| __schedule
| finish_task_switch
|
|--4.28%--0x3220203420203520
| __GI___libc_write
| entry_SYSCALL_64_after_hwframe
| do_syscall_64
| |
| |--2.68%--__x64_sys_write
| | ksys_write
| | vfs_write
| | __vfs_write
| | tty_write
| | n_tty_write
| | mutex_lock
| | _cond_resched
| | __schedule
| | finish_task_switch
| |
| --1.61%--exit_to_usermode_loop
| schedule
| __schedule
| finish_task_switch
|
|--4.15%--0x3520203420203620
| __GI___libc_write
| entry_SYSCALL_64_after_hwframe
| do_syscall_64
| |
| |--2.28%--__x64_sys_write
| | ksys_write
| | vfs_write
| | __vfs_write
| | tty_write
| | n_tty_write
| | mutex_lock
| | _cond_resched
| | __schedule
| | finish_task_switch
| |
| --1.87%--exit_to_usermode_loop
| schedule
| __schedule
| finish_task_switch
|
|--4.15%--0x3220203320203420
| __GI___libc_write
| entry_SYSCALL_64_after_hwframe
| do_syscall_64
| |
| |--2.68%--__x64_sys_write
| | ksys_write
| | vfs_write
| | __vfs_write
| | tty_write
| | n_tty_write
| | |
| | --2.41%--mutex_lock
| | _cond_resched
| | __schedule
| | finish_task_switch
| |
| --1.47%--exit_to_usermode_loop
| schedule
| __schedule
| finish_task_switch
|
|--4.02%--0x3620203420203720
| __GI___libc_write
| entry_SYSCALL_64_after_hwframe
| do_syscall_64
| |
| |--2.68%--__x64_sys_write
| | ksys_write
| | vfs_write
| | __vfs_write
| | tty_write
| | n_tty_write
| | |
| | |--2.14%--mutex_lock
| | | _cond_resched
| | | __schedule
| | | finish_task_switch
| | |
| | --0.54%--wait_woken
| | schedule_timeout
| | schedule
| | __schedule
| | finish_task_switch
| |
| --1.34%--exit_to_usermode_loop
| schedule
| __schedule
| finish_task_switch
|
--2.68%--0x3020203120203220
__GI___libc_write
entry_SYSCALL_64_after_hwframe
do_syscall_64
|
|--1.47%--exit_to_usermode_loop
| schedule
| __schedule
| finish_task_switch
|
--1.20%--__x64_sys_write
ksys_write
vfs_write
__vfs_write
tty_write
n_tty_write
mutex_lock
_cond_resched
__schedule
finish_task_switch
39.89% 0.54% 4 a.out [kernel.kallsyms] [k] vfs_write
|
--39.36%--vfs_write
|
--38.42%--__vfs_write
|