Analyzing CPU Usage with Perf (2) - Using Perf and Reading Flamegraphs

Chia-Te Kuan

Published 2024-09-04 11:19:29

Article link: https://blog.csdn.net/agathakuan/article/details/141888199

Contents

Decide a Sampling Frequency
Profiling the Whole System
Profiling One Process
Using Perf to Collect Data
Using Perf to Generate Reports
Compare Performance Between Two Reports
Generating Flamegraph
Analyzing Flamegraph
Reference

Decide a Sampling Frequency

Perf is a sampling-based profiler, so it must collect enough samples to produce a reliable result.
The higher the sampling frequency, the shorter the sampling time needed to collect them, and vice versa.

The suggested sampling frequency is an odd number lower than 99, which helps avoid sampling in lockstep with periodic system activity.
Currently, we suggest sampling at 61 hertz for 180 seconds.

Note

Perf samples on a best-effort basis, so the total number of samples will be less than the sampling frequency multiplied by the sampling time.
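
The trade-off between frequency and duration can be sketched with plain shell arithmetic. The helper names nominal_samples and seconds_needed are illustrative and not part of perf; they only compute the nominal figure (frequency times duration) that best-effort sampling will fall short of.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not part of perf).

# nominal_samples FREQ_HZ SECONDS
# Nominal sample count: sampling frequency times sampling time.
nominal_samples() {
    echo $(( $1 * $2 ))
}

# seconds_needed TARGET_SAMPLES FREQ_HZ
# Shortest duration (rounded up to whole seconds) whose nominal
# sample count reaches TARGET_SAMPLES.
seconds_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

nominal_samples 61 180      # prints 10980
seconds_needed 10980 99     # prints 111
```

At the suggested 61 hertz, 180 seconds gives a nominal 10980 samples; a higher frequency would reach the same figure in less time, at the cost of more overhead per second.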

Profiling the Whole System

Use perf to record system-wide activity for 180 seconds at a 61 hertz sampling rate. Both user- and kernel-level stacks are sampled.
Save the perf output to the RAM disk /tmp to reduce system overhead.

$ perf record -F 61 -a -g -o /tmp/whole_system.data -- sleep 180
$ perf script -i /tmp/whole_system.data > $NFS/whole_system.perf

Profiling One Process

Use perf to record the activity of one process for 180 seconds at a 61 hertz sampling rate. Both user- and kernel-level stacks are sampled.
Save the perf output to the RAM disk /tmp to reduce system overhead.

$ perf record -F 61 -p <PID> -g -o /tmp/process_name.data -- sleep 180
$ perf script -i /tmp/process_name.data > $NFS/process_name.perf

Using Perf to Collect Data

Sample on-CPU functions for the specified command, at 61 Hertz.
$ perf record -F 61 <command>

Sample on-CPU functions for the specified PID, at 61 Hertz, until Ctrl-C.
$ perf record -F 61 -p <PID>

Sample on-CPU functions for the specified PID, at 61 Hertz, for 180 seconds.
$ perf record -F 61 -p <PID> -- sleep 180

Sample CPU stack traces (via frame pointers) for the specified PID, at 61 Hertz, for 180 seconds.
$ perf record -F 61 -p <PID> -g -- sleep 180

Sample CPU stack traces for the entire system, at 61 Hertz, for 180 seconds. The explicit -a works on every kernel version.
$ perf record -F 61 -ag -- sleep 180

Sample CPU stack traces for the entire system, at 61 Hertz, until Ctrl-C (>= Linux 4.11, where system-wide is the default when no command or target is given).
$ perf record -F 61 -g

Using Perf to Generate Reports

Show perf.data in a text user interface (TUI).
$ perf report

Show perf.data with a column for sample count.
$ perf report -n

Show perf.data as a text report, with data coalesced and percentages.
Because curses support in BusyBox is limited, copy the perf.data from the recording step to a PC and run this command there.
$ perf report --stdio

List all events from perf.data.
$ perf script > out.perf

List all perf.data events, with customized fields (< Linux 4.1). On Linux 4.1 and later, the option is spelled -F.
$ perf script -f time,event,trace

Dump raw contents from perf.data as hex (for debugging).
$ perf script -D

Disassemble and annotate instructions with percentages (needs some debuginfo).
$ perf annotate --stdio

Compare Performance Between Two Reports

First generate two perf files, such as whole_system_0815.perf and whole_system_0816.perf. Then compare the two perf reports using a PC tool such as Beyond Compare.
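
As a rough complement to a visual diff, the change in samples per function can be computed from two folded files. This is a minimal sketch, assuming both captures have been folded with stackcollapse-perf.pl (the step shown in the Generating Flamegraph section) and were recorded with the same frequency and duration; the function name leaf_delta is hypothetical. The FlameGraph repository also ships difffolded.pl, and perf itself has a diff subcommand, which may be better choices for a full comparison.

```shell
#!/bin/sh
# leaf_delta BEFORE.folded AFTER.folded   (hypothetical helper)
# Prints "<change in samples> <function>" for every function that was
# on top of the stack (actually running) in either capture.
leaf_delta() {
    awk '
    {
        count = $NF
        stack = $0
        sub(/ [0-9]+$/, "", stack)          # drop the trailing sample count
        n = split(stack, frames, ";")       # frames[n] is the running function
        if (FILENAME == ARGV[1]) before[frames[n]] += count
        else                     after[frames[n]]  += count
        seen[frames[n]] = 1
    }
    END {
        for (f in seen) printf "%+d %s\n", after[f] - before[f], f
    }' "$1" "$2" | LC_ALL=C sort -k2
}

# Example (file names are illustrative):
# leaf_delta whole_system_0815.folded whole_system_0816.folded
```

A large positive number means the function was sampled on-CPU more often in the second capture.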

Generating Flamegraph

This section uses the FlameGraph scripts to generate a flamegraph in SVG format on a PC.
(Please run the XXX.pl scripts on the PC, not on the DUT.)

Copy perf output to the host system and generate a flamegraph.
Same procedure if you profile just one process.

$ cd $FLAME_GRAPH_DIR
$ ./stackcollapse-perf.pl whole_system.perf > whole_system.folded
$ ./flamegraph.pl whole_system.folded > whole_system.svg
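
If only one command is of interest in a whole-system capture, the folded file can be filtered before it is rendered. The sketch below assumes the default stackcollapse-perf.pl output, in which each line starts with the command name followed by a semicolon; the helper name only_comm is hypothetical.

```shell
#!/bin/sh
# only_comm COMM FILE   (hypothetical helper)
# Keep only the folded stacks whose first frame is the command COMM.
# Uses a literal prefix match, so characters such as "." in COMM are
# not treated as regular-expression wildcards.
only_comm() {
    awk -v prefix="$1;" 'index($0, prefix) == 1' "$2"
}

# Example, run from $FLAME_GRAPH_DIR (output file name is illustrative):
# only_comm av_main whole_system.folded | ./flamegraph.pl > av_main.svg
```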

There is a batch script to generate graphs.

$ python ./perfToFlame.py --prog $FLAME_GRAPH_DIR --input $INPUT_FOLDER --out $OUTPUT_FOLDER

Note

Flamegraph can be downloaded from https://github.com/brendangregg/FlameGraph.
The batch script can be downloaded as perfToFlame.py.

Analyzing Flamegraph

Confidence of Collected Data

Because Perf samples on a best-effort basis, the total number of collected samples is sometimes too low.
In that case, extend the sampling time to collect more samples; otherwise the result is not reliable.

Based on our experience, the total number of samples X should satisfy the following:

For whole-system profiling, X >= max( 0.7 x sampling frequency x sampling time, 1500 ).
For single-process profiling, X >= max( 0.7 x sampling frequency x sampling time x CPU usage of the process, 1500 ).
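
The two thresholds can be checked with a short script. This is a sketch under stated assumptions: the helper names min_samples and folded_samples are hypothetical, CPU usage is passed as an integer percentage, and the sample total is read from a folded file produced by stackcollapse-perf.pl, where every line ends with its sample count.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not part of perf or FlameGraph).

# min_samples FREQ_HZ SECONDS [CPU_PERCENT]
# Integer form of max(0.7 x freq x time x CPU usage, 1500).
# CPU_PERCENT defaults to 100, which gives the whole-system threshold.
min_samples() {
    cpu=${3:-100}
    need=$(( 7 * $1 * $2 * cpu / 1000 ))
    [ "$need" -lt 1500 ] && need=1500
    echo "$need"
}

# folded_samples FILE
# Total number of samples in a folded file (sum of the last field).
folded_samples() {
    awk '{ total += $NF } END { print total + 0 }' "$1"
}

min_samples 61 180        # whole system: prints 7686
min_samples 61 180 10     # process using 10% CPU: prints 1500
```

With the suggested 61 hertz and 180 seconds, a whole-system capture should therefore contain at least 7686 samples; compare that figure with the output of folded_samples on the folded file.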

Viewing the Flamegraph

In a flamegraph, each block represents a function.
A block sits on top of the block of the function that called it, so the lower function calls the upper one.

The bottom-most block is the main function or the entire system.
The top-most blocks are the functions that were actually running when perf took a sample.

The wider a block is, the more CPU time it and its child functions use.
So look for wide, flat top blocks (plateaus), which are likely to contain CPU-hungry logic.
Height is not related to CPU time; it only shows the depth of the call stack.
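
The same "wide top block" search can be approximated in text form by ranking the functions that were on top of the stack. This is a minimal sketch, assuming a folded file from stackcollapse-perf.pl in which frames are separated by semicolons and the last frame is the running function; the helper name top_leaves is hypothetical.

```shell
#!/bin/sh
# top_leaves FILE [N]   (hypothetical helper)
# Rank functions by the number of samples in which they were the
# top-most frame, and print the N largest (default 10).
top_leaves() {
    awk '
    {
        count = $NF
        stack = $0
        sub(/ [0-9]+$/, "", stack)      # drop the trailing sample count
        n = split(stack, frames, ";")   # frames[n] is the top-most frame
        total[frames[n]] += count
    }
    END { for (f in total) print total[f], f }' "$1" |
        LC_ALL=C sort -k1,1nr -k2 | head -n "${2:-10}"
}

# Example (file name is illustrative):
# top_leaves whole_system.folded 20
```

Functions near the top of this list correspond to the widest top blocks in the graph.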

Placing the mouse cursor over a block can get information about the block, including:

Function name
Number of samples
Percentage of samples relative to the bottom-most block (i.e. the whole system or the process)

Clicking a block zooms in on that block.
To zoom out, click the Reset Zoom label in the top-left corner of the graph.

The following is a flamegraph of the whole system running av_main.
The interpretation of this graph would be:

[Image: flamegraph of the whole system running av_main]

Note

Thread name is not available in whole system profiling.
Do not use Internet Explorer to view the flamegraph, because some SVG features are not supported.

Other Perf Capabilities

Other possible combinations that can be explored with perf.
The features below may require additional kernel config options and compilation flags to work.

Command	Description
annotate	Read perf.data (created by perf record) and display annotated code.
archive	Create archive with object files with build-ids found in perf.data file.
bench	General framework for benchmark suites.
buildid-cache	Manage build-id cache.
buildid-list	List the buildids in a perf.data file.
diff	Read perf.data files and display the differential profile.
evlist	List the event names in a perf.data file.
inject	Filter to augment the events stream with additional information.
kmem	Tool to trace/measure kernel memory(slab) properties.
kvm	Tool to trace/measure kvm guest os.
list	List all symbolic event types.
lock	Analyze lock events.
mem	Profile memory accesses.
record	Run a command and record its profile into perf.data.
report	Read perf.data (created by perf record) and display the profile.
sched	Tool to trace/measure scheduler properties (latencies).
script	Read perf.data (created by perf record) and display trace output.
stat	Run a command and gather performance counter statistics.
test	Runs sanity tests.
timechart	Tool to visualize total system behavior during a workload.
top	System profiling tool.
trace	Strace inspired tool.