How to use perf

Since I didn't see anything here about perf, a relatively new tool for profiling the kernel and user applications on Linux, I decided to add this information.

First of all, this is a tutorial about Linux profiling with perf.

You can use perf if your Linux kernel is newer than 2.6.32, or oprofile if it is older. Neither tool requires you to instrument your program (as gprof does). However, to get correct call graphs in perf you need to build your program with -fno-omit-frame-pointer. For example: g++ -fno-omit-frame-pointer -O2 main.cpp.

You can watch a "live" analysis of your application with perf top:

sudo perf top -p `pidof a.out` -K

Or you can record performance data of a running application and analyze it afterwards:

1) To record performance data:

perf record -p `pidof a.out`

or to record for 10 seconds:

perf record -p `pidof a.out` sleep 10

or to record with call graphs:

perf record -g -p `pidof a.out`
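One thing to watch with the backtick form: pidof prints several PIDs separated by spaces when more than one process matches, while perf record -p expects a comma-separated list. The sketch below is one way to handle that. The function names pids_to_csv and record_by_name are my own hypothetical helpers, not part of perf.

```shell
# Hypothetical helper: join the PIDs printed by pidof with commas,
# the list format that `perf record -p` accepts.
pids_to_csv() {
    printf '%s\n' "$1" | tr -s ' ' ','
}

# Hypothetical wrapper: record call graphs for every process named $1
# for $2 seconds (default 10).
record_by_name() {
    name="$1"
    seconds="${2:-10}"
    pids=$(pidof "$name") || { echo "no process named $name" >&2; return 1; }
    perf record -g -p "$(pids_to_csv "$pids")" sleep "$seconds"
}
```

With these defined, record_by_name a.out 10 would record all running a.out processes for 10 seconds.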

2) To analyze the recorded data:

perf report --stdio

perf report --stdio --sort=dso -g none

perf report --stdio -g none

perf report --stdio -g
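The report commands above differ only in how samples are grouped and whether call chains are printed. As a rough sketch, they can be wrapped in one function with named views. The name report_view, the view names, and the PERF override variable are my own assumptions for illustration; only the perf options themselves come from the commands above.

```shell
# Hypothetical wrapper around the report commands shown above.
# Usage: report_view VIEW [DATA_FILE]
# Set PERF=echo to preview the arguments instead of running perf.
report_view() {
    view="$1"
    data="${2:-perf.data}"
    case "$view" in
        default)   set -- ;;                     # perf's default report layout
        modules)   set -- --sort=dso -g none ;;  # one row per shared object
        flat)      set -- -g none ;;             # overhead rows without call chains
        callgraph) set -- -g ;;                  # overhead rows with call chains
        *) echo "unknown view: $view" >&2; return 2 ;;
    esac
    "${PERF:-perf}" report --stdio "$@" -i "$data"
}
```

For example, report_view modules shows the load per shared object from perf.data in the current directory, and PERF=echo report_view modules only prints the arguments that would be passed.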

Or you can record performance data of an application and analyze it afterwards, just by launching the application in this way and waiting for it to exit:

perf record ./a.out
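If you do this often, the record and report steps can be chained. This is only a minimal sketch: profile_run and the PERF override variable are hypothetical names of mine, and I am assuming you want call graphs recorded (-g) and a report without call chains printed at the end.

```shell
# Hypothetical wrapper: record a command from start to exit, then report.
# Usage: profile_run DATA_FILE COMMAND [ARGS...]
# Set PERF=echo to preview both command lines instead of running perf.
profile_run() {
    data="$1"
    shift
    # "--" separates perf's own options from the command being profiled.
    "${PERF:-perf}" record -g -o "$data" -- "$@" &&
    "${PERF:-perf}" report --stdio -g none -i "$data"
}
```

For example, profile_run a.perf.data ./a.out would write the samples to a.perf.data and print the report as soon as a.out exits.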

This is an example of profiling a test program.

The test program is in the file main.cpp (I will put main.cpp at the bottom of the message).

I compile it in this way:

g++ -m64 -fno-omit-frame-pointer -g main.cpp -L.  -ltcmalloc_minimal -o my_test

I use libtcmalloc_minimal.so since it is compiled with -fno-omit-frame-pointer, while libc malloc seems to be compiled without this option. Then I run my test program:

./my_test 100000000

Then I record performance data of a running process:

perf record -g  -p `pidof my_test` -o ./my_test.perf.data sleep 30

Then I analyze load per module:

perf report --stdio -g none --sort comm,dso -i ./my_test.perf.data

# Overhead  Command  Shared Object
# ........  .......  ............................
#
    70.06%  my_test  my_test

and so on ...
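When the module list is long, it can help to filter it. The sketch below assumes the output layout shown above (comment lines starting with #, then rows of percentage, command, shared object). The function name hot_modules and the default 1% threshold are my own choices.

```shell
# Hypothetical filter: read `perf report --stdio --sort comm,dso` output on
# stdin and print "percent command shared_object" for rows at or above a
# minimum overhead in percent (default 1).
hot_modules() {
    awk -v min="${1:-1}" '
        /^#/ || NF < 3 { next }           # skip header comments and blank lines
        $1 ~ /^[0-9.]+%$/ {
            pct = $1
            sub(/%$/, "", pct)            # "70.06%" becomes "70.06"
            if (pct + 0 >= min + 0) print pct, $2, $3
        }'
}
```

It would be used at the end of the report pipeline, for example: perf report --stdio -g none --sort comm,dso -i ./my_test.perf.data | hot_modules 5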

Then call chains are analyzed:

perf report --stdio -g graph -i ./my_test.perf.data | c++filt

0.16%  my_test  [kernel.kallsyms]             [k] _spin_lock

and so on ...

So at this point you know where your program spends time.

And this is main.cpp for the test:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

// Short loop that allocates and frees a double on every fifth iteration.
time_t f1(time_t time_value)
{
    for (int j = 0; j < 10; ++j) {
        ++time_value;
        if (j % 5 == 0) {
            double *p = new double;
            delete p;
        }
    }
    return time_value;
}

// Longer loop without allocations, followed by a call to f1.
time_t f2(time_t time_value)
{
    for (int j = 0; j < 40; ++j) {
        ++time_value;
    }
    time_value = f1(time_value);
    return time_value;
}

// One "request": some allocations plus repeated calls to f1 and f2.
time_t process_request(time_t time_value)
{
    for (int j = 0; j < 10; ++j) {
        int *p = new int;
        delete p;
        for (int m = 0; m < 10; ++m) {
            ++time_value;
        }
    }
    for (int i = 0; i < 10; ++i) {
        time_value = f1(time_value);
        time_value = f2(time_value);
    }
    return time_value;
}

int main(int argc, char* argv2[])
{
    // The number of requests to process comes from the first argument.
    int number_loops = argc > 1 ? atoi(argv2[1]) : 1;
    time_t time_value = time(0);
    printf("number loops %d\n", number_loops);
    // Cast to long so the format matches the argument type.
    printf("time_value: %ld\n", (long)time_value);
    for (int i = 0; i < number_loops; ++i) {
        time_value = process_request(time_value);
    }
    printf("time_value: %ld\n", (long)time_value);
    return 0;
}

Original:

http://stackoverflow.com/questions/1777556/alternatives-to-gprof#comment3480484_1779343

Reposted from: https://www.cnblogs.com/mydomain/p/3204523.html
