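1. perf
Perf is a tool for software performance analysis. Through it, applications can use the PMU (Performance Monitoring Unit), tracepoints, and special counters in the kernel to collect performance statistics. It can analyze the performance of a specific application (per thread), of the kernel, or of application code and kernel together, giving a complete picture of where an application's performance bottlenecks lie.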
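As a rough illustration of what using the kernel's counters looks like at the lowest level, the sketch below reads one software counter through the perf_event_open system call, the interface the perf tool itself is built on. The names count_page_faults and touch_pages are hypothetical helpers introduced here for illustration, not part of perf. The call can fail where perf events are restricted (for example by the kernel.perf_event_paranoid setting or a container's seccomp profile), in which case the helper simply returns -1.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: counts page faults (a software event, so no
 * hardware PMU is required) raised by the calling thread while fn() runs.
 * Returns 0 and stores the count in *count on success, -1 on failure. */
int count_page_faults(void (*fn)(void), long long *count)
{
    assert(fn != NULL && count != NULL);

    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_SOFTWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_SW_PAGE_FAULTS;
    attr.disabled = 1;       /* start stopped; enabled explicitly below */
    attr.exclude_kernel = 1; /* user space only; needs fewer privileges */
    attr.exclude_hv = 1;

    /* glibc provides no wrapper for perf_event_open, so go through syscall().
     * pid = 0 and cpu = -1 mean: the calling thread, on whichever CPU it runs. */
    int fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
    if (fd < 0)
        return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    fn();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    /* With the default read format the counter is a single 64-bit value. */
    ssize_t n = read(fd, count, sizeof(*count));
    close(fd);
    return n == (ssize_t)sizeof(*count) ? 0 : -1;
}

/* Hypothetical workload: writes one byte to each of 64 freshly allocated
 * pages; the first write to a fresh page typically raises a page fault. */
void touch_pages(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return;
    size_t len = 64 * (size_t)page;
    volatile char *p = malloc(len);
    if (p == NULL)
        return;
    for (size_t i = 0; i < len; i += (size_t)page)
        p[i] = 1;
    free((void *)p);
}
```

From a main function this would be called as count_page_faults(touch_pages, &n), checking the return value before using n. The exact count depends on the allocator and the kernel, so treat it as indicative only.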
2. Installing perf
[root@VM_0_11_centos selinux]# yum -y install perf
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
* base: mirrors.aliyun.com
3. Common commands
1) perf list: shows the performance events supported on the current system, that is, every event that can serve as a perf sampling point.
[root@VM_0_11_centos selinux]# perf list
List of pre-defined events (to be used in -e):
alignment-faults [Software event]
bpf-output [Software event]
context-switches OR cs [Software event]
cpu-clock [Software event]
cpu-migrations OR migrations [Software event]
dummy [Software event]
emulation-faults [Software event]
major-faults [Software event]
minor-faults [Software event]
page-faults OR faults [Software event]
task-clock [Software event]
msr/tsc/ [Kernel PMU event]
rNNN [Raw hardware event descrip
cpu/t1=v1[,t2=v2,t3 ...]/modifier [Raw hardware event descrip
(see 'man perf-list' on how to encode it)
mem:<addr>[/len][:access] [Hardware breakpoint]
[root@VM_0_11_centos selinux]#
2) perf stat: collects overall performance statistics for a command. The example uses the following test program:
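//test.c
#include <stdio.h>
#include <stdlib.h>

/* Busy work: one million iterations of a trivial assignment.
   Build without optimization (gcc's default) so the loop is not removed. */
void longa(void)
{
    int i, j;
    for (i = 0; i < 1000000; i++)
        j = i; //am I silly or crazy? I feel boring and desperate.
}

/* Calls longa() 10 times. */
void foo2(void)
{
    int i;
    for (i = 0; i < 10; i++)
        longa();
}

/* Calls longa() 100 times. */
void foo1(void)
{
    int i;
    for (i = 0; i < 100; i++)
        longa();
}

int main(void)
{
    foo1();
    foo2();
    return 0;
}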
[root@VM_0_11_centos perf]# gcc -o test -g test.c
[root@VM_0_11_centos perf]# ll
total 16
-rwxr-xr-x 1 root root 9688 Feb 21 20:39 test
-rw-r--r-- 1 root root 356 Feb 21 20:39 test.c
[root@VM_0_11_centos perf]#
[root@VM_0_11_centos perf]# perf stat ./test
Performance counter stats for './test':
269.33 msec task-clock # 0.982 CPUs utilized
22 context-switches # 0.082 K/sec
0 cpu-migrations # 0.000 K/sec
112 page-faults # 0.416 K/sec
<not supported> cycles
<not supported> instructions
<not supported> branches
<not supported> branch-misses
0.274310192 seconds time elapsed
0.269620000 seconds user
0.000000000 seconds sys
[root@VM_0_11_centos perf]#
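The output above tells us that test is a CPU-bound program: the "CPUs utilized" figure reported next to task-clock (0.982) is close to 1, meaning the process was running on a CPU for almost all of its elapsed time. The hardware events (cycles, instructions, branches, branch-misses) are reported as <not supported>, which is consistent with the perf list output earlier showing no hardware events; this typically happens on virtual machines that do not expose the hardware PMU to the guest.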
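The derived columns in the perf stat output above can be reproduced by hand, which makes it clearer what they measure. The sketch below assumes that "CPUs utilized" is task-clock divided by elapsed wall-clock time and that the K/sec rates are event counts divided by task-clock time; the numbers in the run above are consistent with both assumptions. The function names are hypothetical, chosen for this example.

```c
#include <assert.h>

/* CPUs utilized = CPU time consumed by the task / wall-clock time elapsed.
 * task_clock_ms is in milliseconds, elapsed_s in seconds. */
double cpus_utilized(double task_clock_ms, double elapsed_s)
{
    assert(elapsed_s > 0.0);
    return (task_clock_ms / 1000.0) / elapsed_s;
}

/* Event rate in thousands of events per second of task-clock time
 * (the "K/sec" column). Events per millisecond is the same quantity. */
double rate_k_per_sec(double count, double task_clock_ms)
{
    assert(task_clock_ms > 0.0);
    return count / task_clock_ms;
}
```

With the figures from the run above, cpus_utilized(269.33, 0.274310192) is about 0.982, rate_k_per_sec(22, 269.33) is about 0.082 for context switches, and rate_k_per_sec(112, 269.33) is about 0.416 for page faults, matching the reported columns.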
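3) perf top: shows, in real time, which functions across the system are consuming the most CPU. To try it, compile and run the following busy loop in one terminal and run perf top in another; the program's main function should then appear near the top of the list: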
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    long long i = 0;
    /* Spin forever to keep one CPU busy. */
    while (1) {
        i++;
    }
}
4) perf record: records samples to perf.data (the default output file); perf report: generates a report from the recorded data.
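[root@VM_0_11_centos perf]# perf record – e cpu-clock ./test
Workload failed: No such file or directory
The run above failed because the option was entered as "– e" (an en dash followed by a space) instead of "-e": perf most likely took "–" as the name of the program to run and could not find it. A perf.data file is still written, but it contains no samples from test. The intended command is:
perf record -e cpu-clock ./test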
[root@VM_0_11_centos perf]# ll
total 28
-rw------- 1 root root 9920 Feb 21 20:52 perf.data
-rwxr-xr-x 1 root root 9688 Feb 21 20:51 test
-rw-r--r-- 1 root root 356 Feb 21 20:51 test.c
[root@VM_0_11_centos perf]# perf report