Installing and Using Linux Perf
Installing Perf on Ubuntu (Ubuntu 18.04)
- Install with the apt package manager
# Install linux-tools-common
zouren@ubuntu:~$ sudo apt-get install linux-tools-common
# Check whether perf is available
zouren@ubuntu:~$ perf --version
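# If perf is missing, install the tools package built for your specific kernel version, following the hint printed on the command line (run uname -r to see the kernel version). On my machine the command is: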
zouren@ubuntu:~$ sudo apt-get install linux-tools-5.4.0-122-generic
# Check again
zouren@ubuntu:~$ perf --version
perf version 5.4.192
For the detailed procedure, see: 基于Ubuntu 18.04 安装perf工具 - 简书 (jianshu.com)
- Build Perf from source: get the Perf source code
# Check your kernel version, then download the matching kernel source
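The apt steps above can be folded into one small check. This is only a sketch: it assumes the kernel-specific package is named linux-tools- followed by the output of uname -r, which matches the package installed above, and it prints a suggested command instead of running it. The variable name pkg is mine.

```shell
# Package name for the running kernel, e.g. linux-tools-5.4.0-122-generic
pkg="linux-tools-$(uname -r)"

# On Ubuntu, linux-tools-common alone installs a wrapper that fails until the
# kernel-specific package is present, so test whether perf really runs.
if perf --version >/dev/null 2>&1; then
    perf --version
else
    echo "perf is not usable yet; try: sudo apt-get install linux-tools-common ${pkg}"
fi
```

If the suggested package cannot be found, the hint printed by the perf wrapper itself is the more reliable guide.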
>$ uname -r
5.4.0-122-generic
# Download the kernel source, either manually or with wget
>$ wget http://ftp.sjtu.edu.cn/sites/ftp.kernel.org/pub/linux/kernel/v5.x/linux-5.4.122.tar.gz
# After the download finishes, extract the kernel source
>$ tar -zxvf linux-5.4.122.tar.gz
# Enter the following directory
>$ cd linux-5.4.122/tools/perf/
# Build and install from source. If some dependencies are missing, install them first; they are listed in the second link below
>$ make -j10 && make install
# Check the perf installation
>$ perf --version
perf version 5.4.192
Linux kernel source download site: Index of /sites/ftp.kernel.org/pub/linux/kernel/ (sjtu.edu.cn)
Dependencies for building Perf from source: ubuntu源码安装性能分析工具perf - 知乎 (zhihu.com)
Using Perf
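One detail worth checking before downloading: on Ubuntu, the string printed by uname -r (5.4.0-122-generic here) carries Ubuntu's ABI number, not the upstream patch level, which may be why perf reports 5.4.192 above. The sketch below assumes that Ubuntu exposes the upstream version as the third field of /proc/version_signature, and that the mirror keeps the directory layout seen in the wget command above. The function name kernel_tarball_url is a hypothetical helper of mine.

```shell
# Hypothetical helper: build the mirror URL for an upstream kernel version such as 5.4.122.
kernel_tarball_url() {
    version="$1"
    major="${version%%.*}"   # "5.4.122" -> "5"
    echo "http://ftp.sjtu.edu.cn/sites/ftp.kernel.org/pub/linux/kernel/v${major}.x/linux-${version}.tar.gz"
}

# Assumption: this file exists only on Ubuntu kernels; its third field is the upstream version.
if [ -r /proc/version_signature ]; then
    upstream="$(cut -d' ' -f3 /proc/version_signature)"
    echo "upstream kernel version: ${upstream}"
    kernel_tarball_url "${upstream}"
fi

# The version used in this post
kernel_tarball_url 5.4.122
```

The last call prints the same URL that was passed to wget above. In my experience perf built from a nearby 5.4.x tree generally works with the running kernel, so an exact match is nice to have rather than required.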
>$ perf --help
usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]
The most commonly used perf commands are:
   annotate        Read perf.data (created by perf record) and display annotated code
   archive         Create archive with object files with build-ids found in perf.data file
   bench           General framework for benchmark suites
   buildid-cache   Manage build-id cache.
   buildid-list    List the buildids in a perf.data file
   c2c             Shared Data C2C/HITM Analyzer.
   config          Get and set variables in a configuration file.
   data            Data file related processing
   diff            Read perf.data files and display the differential profile
   evlist          List the event names in a perf.data file
   ftrace          simple wrapper for kernel's ftrace functionality
   inject          Filter to augment the events stream with additional information
   kallsyms        Searches running kernel for symbols
   kmem            Tool to trace/measure kernel memory properties
   kvm             Tool to trace/measure kvm guest os
   list            List all symbolic event types
   lock            Analyze lock events
   mem             Profile memory accesses
   record          Run a command and record its profile into perf.data
   report          Read perf.data (created by perf record) and display the profile
   sched           Tool to trace/measure scheduler properties (latencies)
   script          Read perf.data (created by perf record) and display trace output
   stat            Run a command and gather performance counter statistics
   test            Runs sanity tests.
   timechart       Tool to visualize total system behavior during a workload
   top             System profiling tool.
   version         display the version of perf binary
   probe           Define new dynamic tracepoints
   trace           strace inspired tool
See 'perf help COMMAND' for more information on a specific command.
Example program (matrix multiplication)
// example.c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define N 1024
double A[N][N];
double B[N][N];
double C[N][N];
int main(){
    /* init srand */
    srand((unsigned)time(NULL));
    /* init matrix */
    for(int i = 0 ; i < N ; ++i){
        for(int j = 0 ; j < N ; ++j){
            A[i][j] = rand()/32767.0;
            B[i][j] = rand()/32767.0;
            C[i][j] = 0;
        }
    }
    /* mult matrix */
    for(int i = 0 ; i < N ; ++i){
        for(int j = 0 ; j < N ; ++j){
            for(int k = 0 ; k < N ; ++k){
                C[i][j] += A[i][k] * B[k][j];
            }
        }
    }
    return 0;
}
Basic perf usage
# 1. Compile the target into an executable
>$ gcc -g example.c -o example
# 2. Analyze performance with perf
>$ perf stat ./example
Performance counter stats for './example':
3,617.38 msec task-clock # 0.998 CPUs utilized
10 context-switches # 0.003 K/sec
0 cpu-migrations # 0.000 K/sec
6,189 page-faults # 0.002 M/sec
<not supported> cycles
<not supported> instructions
<not supported> branches
<not supported> branch-misses
3.624806187 seconds time elapsed
3.614249000 seconds user
0.003993000 seconds sys
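To make the right-hand column of the report less mysterious, here is a small sketch that recomputes it from the raw numbers. It assumes that "CPUs utilized" is task-clock divided by elapsed wall time and that the per-second rates are divided by task-clock time; both assumptions reproduce the figures printed above, but I have not checked them against the perf source. The variable names are mine.

```shell
# Values copied from the perf stat report above
task_clock_ms=3617.38
elapsed_s=3.624806187
context_switches=10
page_faults=6189

report="$(awk -v t="$task_clock_ms" -v e="$elapsed_s" \
              -v cs="$context_switches" -v pf="$page_faults" 'BEGIN {
    cpu_s = t / 1000                                        # task-clock in seconds
    printf "CPUs utilized: %.3f\n", cpu_s / e               # CPU time / wall time
    printf "context-switches: %.3f K/sec\n", cs / cpu_s / 1000
    printf "page-faults: %.3f M/sec\n", pf / cpu_s / 1000000
}')"
echo "$report"
```

This prints 0.998, 0.003 K/sec and 0.002 M/sec, the same values as in the report. A utilization close to 1 is what you would expect from a single-threaded, compute-bound program like example.c.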
# If you get an error instead of a report like the one above, you may need to edit /etc/sysctl.conf and set kernel.perf_event_paranoid to -1, as follows
>$ sudo gedit /etc/sysctl.conf
# Then add the line kernel.perf_event_paranoid = -1 to the file and reload the settings
>$ sudo sysctl -p
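If no graphical editor is available, the same change can be scripted. The sketch below first prints the current value, then shows an append that does not duplicate the line when run twice. ensure_sysctl_line is a hypothetical helper of mine, and it is exercised on a scratch file here rather than on the real /etc/sysctl.conf.

```shell
# Print the current setting; lower values are more permissive
cat /proc/sys/kernel/perf_event_paranoid 2>/dev/null || echo "perf_event_paranoid not available"

# Hypothetical helper: append a line to a file only if that exact line is not already there
ensure_sysctl_line() {
    line="$1"
    file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Try it on a scratch file; calling it twice still leaves a single line
scratch="$(mktemp)"
ensure_sysctl_line "kernel.perf_event_paranoid = -1" "$scratch"
ensure_sysctl_line "kernel.perf_event_paranoid = -1" "$scratch"
count="$(grep -c 'perf_event_paranoid' "$scratch")"
echo "matching lines: $count"
rm -f "$scratch"
```

To apply it for real, run the helper from a root shell with /etc/sysctl.conf as the file, then run sudo sysctl -p as above. For a change that only lasts until the next reboot, sudo sysctl -w kernel.perf_event_paranoid=-1 should be enough. Keep in mind that -1 removes the restrictions for all users, so it is best kept to development machines.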