Analyzing CPU Usage with Perf (2) - Using Perf and Reading Flamegraphs

Decide a Sampling Frequency

Perf is a sampling-based profiler, so it must collect enough samples to produce a reliable result.
The higher the sampling frequency, the shorter the sampling time needed to collect them, and vice versa.

We suggest an odd sampling frequency below 99 Hz; an odd value helps avoid sampling in lockstep with periodic activity on the system.
Currently, we suggest sampling at 61 Hz for 180 seconds.
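
The trade-off between frequency and time can be put into numbers. The following is a minimal Python sketch, assuming the ideal case where every sampling tick yields one sample; the function name seconds_needed is illustrative and not part of perf.

```python
def seconds_needed(target_samples: int, freq_hz: int) -> int:
    """Whole seconds needed to reach target_samples at freq_hz.

    Assumes every sampling tick yields a sample, so the result is a
    lower bound on the time actually required.
    """
    if target_samples <= 0 or freq_hz <= 0:
        raise ValueError("target_samples and freq_hz must be positive")
    # Integer ceiling division, to avoid floating-point rounding.
    return (target_samples + freq_hz - 1) // freq_hz


# Nominal sample count for the suggested setting: 61 Hz for 180 seconds.
nominal = 61 * 180
print(nominal)                      # 10980
print(seconds_needed(nominal, 61))  # 180
print(seconds_needed(nominal, 31))  # 355: a lower frequency needs a longer run
```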

Note

Perf samples on a best-effort basis, so the total number of samples is typically lower than “sampling frequency multiplied by sampling time”.

Profiling the Whole System

Use perf to record system-wide activity for 180 seconds at a 61 Hz sampling rate; both user- and kernel-level stacks are sampled.
Save the perf output to the RAM disk /tmp to reduce system overhead.

$ perf record -F 61 -a -g -o /tmp/whole_system.data -- sleep 180
$ perf script -i /tmp/whole_system.data > $NFS/whole_system.perf

Profiling One Process

Use perf to record the activity of one process for 180 seconds at a 61 Hz sampling rate; both user- and kernel-level stacks are sampled.
Save the perf output to the RAM disk /tmp to reduce system overhead.

$ perf record -F 61 -p <PID> -g -o /tmp/process_name.data -- sleep 180
$ perf script -i /tmp/process_name.data > $NFS/process_name.perf
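
The two recipes above differ only in the target (-a versus -p <PID>) and the file name. As a sketch of how they could be scripted, the Python helper below only assembles the argument lists and does not run perf; the function names perf_record_cmd and perf_script_cmd are illustrative assumptions, not part of perf.

```python
import shlex


def perf_record_cmd(freq_hz, seconds, data_path, pid=None):
    """Build the 'perf record' argument list used in this section.

    With pid=None the whole system is profiled (-a); otherwise only
    the given process is profiled (-p <PID>).
    """
    cmd = ["perf", "record", "-F", str(freq_hz)]
    if pid is None:
        cmd.append("-a")
    else:
        cmd += ["-p", str(pid)]
    # -g records call stacks; 'sleep' bounds the recording time.
    cmd += ["-g", "-o", data_path, "--", "sleep", str(seconds)]
    return cmd


def perf_script_cmd(data_path):
    """Build the 'perf script' argument list that dumps the samples as text."""
    return ["perf", "script", "-i", data_path]


print(shlex.join(perf_record_cmd(61, 180, "/tmp/whole_system.data")))
print(shlex.join(perf_record_cmd(61, 180, "/tmp/process_name.data", pid=1234)))
print(shlex.join(perf_script_cmd("/tmp/whole_system.data")))
```

The lists can be passed to subprocess.run(cmd, check=True); for perf script, redirect stdout to the output file. The PID 1234 is a placeholder.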

Using Perf to Collect Data

Sample on-CPU functions for the specified command, at 61 Hertz.
$ perf record -F 61 <command>

Sample on-CPU functions for the specified PID, at 61 Hertz, until Ctrl-C.
$ perf record -F 61 -p <PID>

Sample on-CPU functions for the specified PID, at 61 Hertz, for 180 seconds.
$ perf record -F 61 -p <PID> -- sleep 180

Sample CPU stack traces (via frame pointers) for the specified PID, at 61 Hertz, for 180 seconds.
$ perf record -F 61 -p <PID> -g -- sleep 180

Sample CPU stack traces for the entire system, at 61 Hertz, for 180 seconds (< Linux 4.11).
$ perf record -F 61 -ag -- sleep 180

Sample CPU stack traces for the entire system, at 61 Hertz, for 180 seconds (>= Linux 4.11).
$ perf record -F 61 -g -- sleep 180

Using Perf to Generate Reports

Show perf.data in a text user interface (TUI).
$ perf report

Show perf.data with a column for sample count.
$ perf report -n

Show perf.data as a text report, with data coalesced and percentages.
Because curses support in BusyBox is limited, copy the perf.data file from the recording step to a PC and run this command there.
$ perf report --stdio

List all events from perf.data.
$ perf script > out.perf 

List all perf.data events, with customized fields (< Linux 4.1).
$ perf script -f time,event,trace

Dump raw contents from perf.data as hex (for debugging).
$ perf script -D

Disassemble and annotate instructions with percentages (needs some debuginfo).
$ perf annotate --stdio

Compare Performance Between Two Reports

First generate two perf files, such as whole_system_0815.perf and whole_system_0816.perf. Then compare the two reports with a PC tool such as Beyond Compare.
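
If no graphical diff tool is at hand, a plain unified diff gives a similar view. The following is a minimal sketch using Python's standard difflib module. The two report snippets are made-up placeholders, not real perf output, and the function name diff_reports is illustrative.

```python
import difflib


def diff_reports(old_text, new_text, old_name="old.perf", new_name="new.perf"):
    """Return the unified diff of two text reports as a list of lines."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",
        )
    )


# Hypothetical report contents, for illustration only.
old_report = "40.00% decode\n20.00% render\n"
new_report = "40.00% decode\n35.00% render\n"

for line in diff_reports(old_report, new_report,
                         "whole_system_0815.perf", "whole_system_0816.perf"):
    print(line)
```

In practice, read the two files from disk and pass their contents to diff_reports. Raw perf script output contains timestamps, so a text report with aggregated percentages is likely to diff more cleanly.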

Generating Flamegraph

This section uses the FlameGraph scripts to generate a flamegraph in SVG format on a PC.
(Run the .pl scripts on the PC, not on the DUT.)

Copy the perf output to the host system and generate a flamegraph.
The same procedure applies when profiling a single process.

$ cd $FLAME_GRAPH_DIR
$ ./stackcollapse-perf.pl whole_system.perf > whole_system.folded
$ ./flamegraph.pl whole_system.folded > whole_system.svg
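
The intermediate .folded file is useful in its own right. As far as we understand the format, each line holds one unique call stack with frames separated by semicolons, followed by a space and the number of samples for that stack. Under that assumption, the Python sketch below sums the sample counts, which gives the total number of samples in the profile. The names parse_folded and total_samples and the sample lines are illustrative.

```python
def parse_folded(lines):
    """Yield (frames, count) for each non-empty folded-stack line.

    Assumed line format: 'frame1;frame2;...;frameN <count>'.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The count follows the last space; frames may contain spaces.
        stack, _, count = line.rpartition(" ")
        yield stack.split(";"), int(count)


def total_samples(lines):
    """Total number of samples across all folded stacks."""
    return sum(count for _, count in parse_folded(lines))


# Hypothetical folded stacks, for illustration only.
sample = [
    "av_main;main;decode 30",
    "av_main;main;render 20",
]
print(total_samples(sample))  # 50
```

For a real profile, pass an open file instead of the list, for example total_samples(open("whole_system.folded")).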

There is a batch script to generate graphs.

$ python ./perfToFlame.py --prog $FLAME_GRAPH_DIR --input $INPUT_FOLDER --out $OUTPUT_FOLDER

Note

The FlameGraph scripts can be downloaded from
https://github.com/brendangregg/FlameGraph. The batch script is perfToFlame.py.

Analyzing the Flamegraph

Confidence of Collected Data

Because perf samples on a best-effort basis, the total number of collected samples is sometimes too small.
In that case, extend the sampling time to get more samples; otherwise the result is unreliable.

Based on our experience, the total number of samples X should satisfy the following:

  • For whole-system profiling, X >= max( 0.7 x sampling freq x sampling time, 1500 ).
  • For single-process profiling, X >= max( 0.7 x sampling freq x sampling time x CPU usage of the process, 1500 ).
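
The rule of thumb above can be checked with a few lines of Python. This is a minimal sketch; the function names are illustrative, and cpu_usage is assumed to be a fraction between 0 and 1 (use 1.0 for whole-system profiling).

```python
def min_samples(freq_hz, seconds, cpu_usage=1.0, floor=1500):
    """Minimum sample count: max(0.7 x freq x time x CPU usage, 1500)."""
    return max(0.7 * freq_hz * seconds * cpu_usage, floor)


def has_enough_samples(total, freq_hz, seconds, cpu_usage=1.0):
    """True if the collected sample count meets the minimum."""
    return total >= min_samples(freq_hz, seconds, cpu_usage)


# Whole system at 61 Hz for 180 seconds: 0.7 x 61 x 180 = 7686.
print(round(min_samples(61, 180)))                 # 7686
# A process using 10% CPU: 768.6 is below the floor, so 1500 applies.
print(round(min_samples(61, 180, cpu_usage=0.1)))  # 1500
print(has_enough_samples(9000, 61, 180))           # True
print(has_enough_samples(1200, 61, 180, 0.1))      # False
```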

Viewing the Flamegraph

In a flamegraph, each block represents a function.
A block may sit on top of another block, which means the lower function calls the upper function.

The bottom-most block is the main function or the entire system.
The top-most blocks are the functions that were actually running when perf took a sample.

The wider a block is, the more CPU time it and its child functions use.
So look for wide, “flat” blocks at the top of the graph: they are likely to contain CPU-hungry logic.
Height is not related to CPU time; it only shows the depth of the call stack.
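
The same reading can be done numerically on a folded-stack file. The Python sketch below assumes the folded format "frame1;...;frameN count", with the caller first and the running function last. It computes two numbers per function: the inclusive count, which corresponds to the block width, and the on-CPU count, which is the part of that width where the block is top-most. The function names in the sample are hypothetical.

```python
from collections import Counter


def block_widths(folded_lines):
    """Return (inclusive, on_cpu) sample counters keyed by function name."""
    inclusive, on_cpu = Counter(), Counter()
    for line in folded_lines:
        stack, _, count = line.strip().rpartition(" ")
        if not stack:
            continue  # skip empty or malformed lines
        frames = stack.split(";")
        samples = int(count)
        # Count each function once per stack, even if it recurses.
        for name in set(frames):
            inclusive[name] += samples
        # The last frame is the function running when the sample was taken.
        on_cpu[frames[-1]] += samples
    return inclusive, on_cpu


# Hypothetical folded stacks, for illustration only.
sample = [
    "main;decode;memcpy 30",
    "main;decode 10",
    "main;render 20",
]
inclusive, on_cpu = block_widths(sample)
print(inclusive["decode"])    # 40: width of the decode block
print(on_cpu["decode"])       # 10: samples where decode itself was running
print(on_cpu.most_common(1))  # [('memcpy', 30)]: the widest top-most block
```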

Placing the mouse cursor over a block shows information about it, including:

  • Function name
  • Number of samples
  • Percentage of samples relative to the bottom-most block (i.e. the whole system or the process)

Clicking a block zooms in on that block.
To zoom out, click the Reset Zoom label in the top-left corner of the graph.

The following is a flamegraph of the whole system running av_main, together with its interpretation:

[Figure: flamegraph of the whole system running av_main]

Note

Thread names are not available in whole-system profiling.
Do not use Internet Explorer to view the flamegraph, because it does not support some SVG features.

Other Perf Capabilities

Perf offers further subcommands that are worth exploring.
Some of the features below may require additional kernel config options and compilation flags to work.

Command         Description
annotate        Read perf.data (created by perf record) and display annotated code.
archive         Create archive with object files with build-ids found in perf.data file.
bench           General framework for benchmark suites.
buildid-cache   Manage build-id cache.
buildid-list    List the build-ids in a perf.data file.
diff            Read perf.data files and display the differential profile.
evlist          List the event names in a perf.data file.
inject          Filter to augment the events stream with additional information.
kmem            Tool to trace/measure kernel memory (slab) properties.
kvm             Tool to trace/measure KVM guest OS.
list            List all symbolic event types.
lock            Analyze lock events.
mem             Profile memory accesses.
record          Run a command and record its profile into perf.data.
report          Read perf.data (created by perf record) and display the profile.
sched           Tool to trace/measure scheduler properties (latencies).
script          Read perf.data (created by perf record) and display trace output.
stat            Run a command and gather performance counter statistics.
test            Runs sanity tests.
timechart       Tool to visualize total system behavior during a workload.
top             System profiling tool.
trace           strace-inspired tool.