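Background:
A collection of perf usage examples.
Environment:
Based on the linux-ps project (linux-ps · GitCode), with a custom kernel-build bb file added in meta-ls (a self-built demo layer). perf and stress-ng are already supported.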
Listing Events
Listing all currently known events:
perf list
Listing sched tracepoints:
perf list 'sched:*'
Counting Events
CPU counter statistics for the specified command:
perf stat command
Detailed CPU counter statistics (includes extras) for the specified command:
perf stat -d command
CPU counter statistics for the specified PID, until Ctrl-C:
perf stat -p PID
CPU counter statistics for the entire system, for 5 seconds:
perf stat -a sleep 5
Various basic CPU statistics, system wide, for 10 seconds:
perf stat -e cycles,instructions,cache-references,cache-misses,bus-cycles -a sleep 10
Various CPU level 1 data cache statistics for the specified command:
perf stat -e L1-dcache-loads,L1-dcache-load-misses,L1-dcache-stores command
Various CPU data TLB statistics for the specified command:
perf stat -e dTLB-loads,dTLB-load-misses,dTLB-prefetch-misses command
Various CPU last level cache statistics for the specified command:
perf stat -e LLC-loads,LLC-load-misses,LLC-stores,LLC-prefetches command
Using raw PMC counters, e.g., counting unhalted core cycles:
perf stat -e r003c -a sleep 5
PMCs (Performance Monitoring Counters): counting cycles and frontend stalls via raw specification:
perf stat -e cycles -e cpu/event=0x0e,umask=0x01,inv,cmask=0x01/ -a sleep 5
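The raw code r003c used above can be read as two hex bytes. As a minimal sketch, assuming the Intel x86 layout (unit mask in bits 8-15, event select in bits 0-7), the helper below builds such a code. raw_event is a hypothetical name, not part of perf, and other architectures encode raw events differently. Modifiers such as inv and cmask live in higher bits and are not covered here.

```shell
# raw_event EVENT UMASK: print a perf raw event code of the form r<umask><event>.
# Assumption: Intel x86 encoding. raw_event is a hypothetical helper.
raw_event() {
    printf 'r%02x%02x\n' "$2" "$1"
}

# Example: event select 0x3c with unit mask 0x00 (unhalted core cycles):
#   perf stat -e "$(raw_event 0x3c 0x00)" -a sleep 5
```

The same event can also be written in the named form shown above, for example cpu/event=0x3c,umask=0x00/, which is usually easier to read.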
Count syscalls per-second system-wide:
perf stat -e raw_syscalls:sys_enter -I 1000 -a
Count system calls by type for the specified PID, until Ctrl-C:
perf stat -e 'syscalls:sys_enter_*' -p PID
Count system calls by type for the entire system, for 5 seconds:
perf stat -e 'syscalls:sys_enter_*' -a sleep 5
Count scheduler events for the specified PID, until Ctrl-C:
perf stat -e 'sched:*' -p PID
Count scheduler events for the specified PID, for 10 seconds:
perf stat -e 'sched:*' -p PID sleep 10
Count ext4 events for the entire system, for 10 seconds:
perf stat -e 'ext4:*' -a sleep 10
Count block device I/O events for the entire system, for 10 seconds:
perf stat -e 'block:*' -a sleep 10
Count all vmscan events, printing a report every second:
perf stat -e 'vmscan:*' -a -I 1000
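Raw counts are often easier to compare once reduced to a ratio. The following is a minimal sketch that derives instructions per cycle (IPC) from the cycles and instructions counters. It assumes the CSV output of perf stat -x, where the first field is the count and the third field is the event name, and that the events are reported exactly as "cycles" and "instructions"; check both on your perf version. ipc_from_csv is a hypothetical helper.

```shell
# ipc_from_csv: read "perf stat -x," CSV on stdin and print instructions/cycles.
# Assumption: field 1 is the count, field 3 is the event name.
# Exits non-zero if no usable cycles count was seen.
ipc_from_csv() {
    awk -F, '
        $3 == "cycles"       { c = $1 }
        $3 == "instructions" { i = $1 }
        END {
            if (c + 0 > 0) printf "%.2f\n", i / c
            else exit 1
        }'
}

# perf stat writes its counters to stderr, hence the 2>&1:
#   perf stat -x, -e cycles,instructions -a sleep 10 2>&1 | ipc_from_csv
```

Parsing the CSV form avoids depending on the human-readable layout, which is more likely to change between versions.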
Profiling
Sample on-CPU functions for the specified command, at 99 Hertz:
perf record -F 99 command
Sample on-CPU functions for the specified PID, at 99 Hertz, until Ctrl-C:
perf record -F 99 -p PID
Sample on-CPU functions for the specified PID, at 99 Hertz, for 10 seconds:
perf record -F 99 -p PID sleep 10
Sample CPU stack traces (via frame pointers) for the specified PID, at 99 Hertz, for 10 seconds:
perf record -F 99 -p PID -g -- sleep 10
Sample CPU stack traces for the PID, using dwarf (dbg info) to unwind stacks, at 99 Hertz, for 10 seconds:
perf record -F 99 -p PID --call-graph dwarf sleep 10
Sample CPU stack traces for the entire system, at 99 Hertz, for 10 seconds (< Linux 4.11):
perf record -F 99 -ag -- sleep 10
Sample CPU stack traces for the entire system, at 99 Hertz, for 10 seconds (>= Linux 4.11):
perf record -F 99 -g -- sleep 10
If the previous command didn't work, try forcing perf to use the cpu-clock event:
perf record -F 99 -e cpu-clock -ag -- sleep 10
Sample CPU stack traces for a container identified by its /sys/fs/cgroup/perf_event cgroup:
perf record -F 99 -e cpu-clock --cgroup=docker/1d567f4393190204...etc... -a -- sleep 10
Sample CPU stack traces for the entire system, with dwarf stacks, at 99 Hertz, for 10 seconds:
perf record -F 99 -a --call-graph dwarf sleep 10
Sample CPU stack traces for the entire system, using last branch record for stacks, ... (>= Linux 4.?):
perf record -F 99 -a --call-graph lbr sleep 10
Sample CPU stack traces, once every 10,000 Level 1 data cache misses, for 5 seconds:
perf record -e L1-dcache-load-misses -c 10000 -ag -- sleep 5
Sample CPU stack traces, once every 100 last level cache misses, for 5 seconds:
perf record -e LLC-load-misses -c 100 -ag -- sleep 5
Sample on-CPU kernel instructions, for 5 seconds:
perf record -e cycles:k -a -- sleep 5
Sample on-CPU user instructions, for 5 seconds:
perf record -e cycles:u -a -- sleep 5
Sample on-CPU user instructions precisely (using PEBS), for 5 seconds:
perf record -e cycles:up -a -- sleep 5
Perform branch tracing (needs HW support), for 1 second:
perf record -b -a sleep 1
Sample CPUs at 49 Hertz, and show top addresses and symbols, live (no perf.data file):
perf top -F 49
Sample CPUs at 49 Hertz, and show top process names and segments, live:
perf top -F 49 -ns comm,dso
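When picking a frequency and duration for a system-wide recording, it can help to estimate how many samples to expect. This is a rough sketch only, assuming about one sample per CPU per tick of the requested frequency; idle CPUs typically produce fewer, so treat the result as an approximate upper bound. max_samples is a hypothetical helper.

```shell
# max_samples HZ SECONDS CPUS: approximate upper bound on the sample count
# for a system-wide "perf record -F HZ -a -- sleep SECONDS".
# Assumption: roughly one sample per CPU per tick.
max_samples() {
    echo $(( $1 * $2 * $3 ))
}

# Example for the 99 Hertz, 10 second recordings above, on this machine:
#   max_samples 99 10 "$(nproc)"
```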
Static Tracing
Trace new processes, until Ctrl-C:
perf record -e sched:sched_process_exec -a
Sample (take a subset of) context-switches, until Ctrl-C:
perf record -e context-switches -a
Trace all context-switches, until Ctrl-C:
perf record -e context-switches -c 1 -a
Include raw settings used (see: man perf_event_open):
perf record -vv -e context-switches -a
Trace all context-switches via sched tracepoint, until Ctrl-C:
perf record -e sched:sched_switch -a
Sample context-switches with stack traces, until Ctrl-C:
perf record -e context-switches -ag
Sample context-switches with stack traces, for 10 seconds:
perf record -e context-switches -ag -- sleep 10
Sample CS, stack traces, and with timestamps (< Linux 3.17, -T now default):
perf record -e context-switches -ag -T
Sample CPU migrations, for 10 seconds:
perf record -e migrations -a -- sleep 10
Trace all connect()s with stack traces (outbound connections), until Ctrl-C:
perf record -e syscalls:sys_enter_connect -ag
Trace all accept()s with stack traces (inbound connections), until Ctrl-C:
perf record -e 'syscalls:sys_enter_accept*' -ag
Trace all block device (disk I/O) requests with stack traces, until Ctrl-C:
perf record -e block:block_rq_insert -ag
Sample at most 100 block device requests per second, until Ctrl-C:
perf record -F 100 -e block:block_rq_insert -a
Trace all block device issues and completions (has timestamps), until Ctrl-C:
perf record -e block:block_rq_issue -e block:block_rq_complete -a
Trace all block completions, of size at least 100 Kbytes, until Ctrl-C:
perf record -e block:block_rq_complete --filter 'nr_sector > 200'
Trace all block completions, synchronous writes only, until Ctrl-C:
perf record -e block:block_rq_complete --filter 'rwbs == "WS"'
Trace all block completions, all types of writes, until Ctrl-C:
perf record -e block:block_rq_complete --filter 'rwbs ~ "W"'
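The size filter in the block-completion examples above relies on nr_sector being counted in 512-byte units, so 100 Kbytes corresponds to 200 sectors. Below is a small sketch of that arithmetic. kb_to_sectors and size_filter are hypothetical helpers, and >= is used here so that the filter matches "at least" literally.

```shell
# kb_to_sectors KB: convert Kbytes to 512-byte sectors.
kb_to_sectors() {
    echo $(( $1 * 1024 / 512 ))
}

# size_filter KB: print a tracepoint filter matching requests of at least KB Kbytes.
size_filter() {
    printf 'nr_sector >= %d\n' "$(kb_to_sectors "$1")"
}

# Example:
#   perf record -e block:block_rq_complete --filter "$(size_filter 100)" -a
```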
Sample minor faults (RSS growth) with stack traces, until Ctrl-C:
perf record -e minor-faults -ag
Trace all minor faults with stack traces, until Ctrl-C:
perf record -e minor-faults -c 1 -ag
Sample page faults with stack traces, until Ctrl-C:
perf record -e page-faults -ag
Trace all ext4 calls, and write to a non-ext4 location, until Ctrl-C:
perf record -e 'ext4:*' -o /tmp/perf.data -a
Trace kswapd wakeup events, until Ctrl-C:
perf record -e vmscan:mm_vmscan_wakeup_kswapd -ag
Add Node.js USDT probes (Linux 4.10+):
perf buildid-cache --add $(which node)
Trace the node httpserverrequest USDT event (Linux 4.10+):
perf record -e sdt_node:httpserverrequest -a
Dynamic Tracing
Add a tracepoint for the kernel tcp_sendmsg() function entry ("--add" is optional):
perf probe --add tcp_sendmsg
Remove the tcp_sendmsg() tracepoint (or use "--del"):
perf probe -d tcp_sendmsg
Add a tracepoint for the kernel tcp_sendmsg() function return:
perf probe 'tcp_sendmsg%return'
Show available variables for the kernel tcp_sendmsg() function (needs debuginfo):
perf probe -V tcp_sendmsg
Show available variables for the kernel tcp_sendmsg() function, plus external vars (needs debuginfo):
perf probe -V tcp_sendmsg --externs
Show available line probes for tcp_sendmsg() (needs debuginfo):
perf probe -L tcp_sendmsg
Show available variables for tcp_sendmsg() at line number 81 (needs debuginfo):
perf probe -V tcp_sendmsg:81
Add a tracepoint for tcp_sendmsg(), with three entry argument registers (platform specific):
perf probe 'tcp_sendmsg %ax %dx %cx'
Add a tracepoint for tcp_sendmsg(), with an alias ("bytes") for the %cx register (platform specific):
perf probe 'tcp_sendmsg bytes=%cx'
Trace previously created probe when the bytes (alias) variable is greater than 100:
perf record -e probe:tcp_sendmsg --filter 'bytes > 100'
Add a tracepoint for tcp_sendmsg() return, and capture the return value:
perf probe 'tcp_sendmsg%return $retval'
Add a tracepoint for tcp_sendmsg(), and "size" entry argument (reliable, but needs debuginfo):
perf probe 'tcp_sendmsg size'
Add a tracepoint for tcp_sendmsg(), with size and socket state (needs debuginfo):
perf probe 'tcp_sendmsg size sk->__sk_common.skc_state'
Tell me how on Earth you would do this, but don't actually do it (needs debuginfo):
perf probe -nv 'tcp_sendmsg size sk->__sk_common.skc_state'
Trace previous probe when size is non-zero, and state is not TCP_ESTABLISHED(1) (needs debuginfo):
perf record -e probe:tcp_sendmsg --filter 'size > 0 && skc_state != 1' -a
Add a tracepoint for tcp_sendmsg() line 81 with local variable seglen (needs debuginfo):
perf probe 'tcp_sendmsg:81 seglen'
Add a tracepoint for do_sys_open() with the filename as a string (needs debuginfo):
perf probe 'do_sys_open filename:string'
Add a tracepoint for myfunc() return, and include the retval as a string:
perf probe 'myfunc%return +0($retval):string'
Add a tracepoint for the user-level malloc() function from libc:
perf probe -x /lib64/libc.so.6 malloc
Add a tracepoint for this user-level static probe (USDT, aka SDT event):
perf probe -x /usr/lib64/libpthread-2.24.so %sdt_libpthread:mutex_entry
List currently available dynamic probes:
perf probe -l
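Dynamic probes persist until they are deleted, so a typical session adds a probe, records it, and removes it again. The sketch below strings those three steps together using only commands shown above. run and trace_function are hypothetical helpers, and the DRY_RUN variable is an assumption of this sketch (not a perf feature) that prints the commands instead of running them.

```shell
# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# trace_function FUNC SECONDS: add a probe on FUNC, record it system-wide
# for SECONDS, then delete the probe even if recording failed.
trace_function() {
    tf_func=$1
    tf_secs=$2
    run perf probe --add "$tf_func" &&
        run perf record -e "probe:$tf_func" -a -- sleep "$tf_secs"
    tf_status=$?
    run perf probe --del "$tf_func"
    return "$tf_status"
}

# Preview without touching the system:
#   DRY_RUN=1; trace_function tcp_sendmsg 5
```

Deleting the probe unconditionally keeps stale probes from piling up in the output of perf probe -l.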
Mixed
Trace system calls by process, showing a summary refreshing every 2 seconds:
perf top -e raw_syscalls:sys_enter -ns comm
Trace sent network packets by on-CPU process, rolling output (no clear):
stdbuf -oL perf top -e net:net_dev_xmit -ns comm | strings
Sample stacks at 99 Hertz, and, context switches:
perf record -F99 -e cpu-clock -e cs -a -g
Sample stacks to 2 levels deep, and, context switch stacks to 5 levels (needs 4.8):
perf record -F99 -e cpu-clock/max-stack=2/ -e cs/max-stack=5/ -a -g
Special
Record cacheline events (Linux 4.10+):
perf c2c record -a -- sleep 10
Report cacheline events from previous recording (Linux 4.10+):
perf c2c report
Reporting
Show perf.data in an ncurses browser (TUI) if possible:
perf report
Show perf.data with a column for sample count:
perf report -n
Show perf.data as a text report, with data coalesced and percentages:
perf report --stdio
Report, with stacks in folded format: one line per stack (needs 4.4):
perf report --stdio -n -g folded
List all events from perf.data:
perf script
List all perf.data events, with data header (newer kernels; was previously default):
perf script --header
List all perf.data events, with customized fields (< Linux 4.1):
perf script -f time,event,trace
List all perf.data events, with customized fields (>= Linux 4.1):
perf script -F time,event,trace
List all perf.data events, with my recommended fields (needs record -a; newer kernels):
perf script --header -F comm,pid,tid,cpu,time,event,ip,sym,dso
List all perf.data events, with my recommended fields (needs record -a; older kernels):
perf script -f comm,pid,tid,cpu,time,event,ip,sym,dso
Dump raw contents from perf.data as hex (for debugging):
perf script -D
Disassemble and annotate instructions with percentages (needs some debuginfo):
perf annotate --stdio
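The text output of perf script is easy to post-process with standard tools. Here is a minimal sketch that counts events per process name. It assumes input with one process name per line, as perf script -F comm is expected to produce; the exact padding may vary by version, so surrounding whitespace is trimmed. count_by_comm is a hypothetical helper.

```shell
# count_by_comm: read one process name per line on stdin and print
# "<count> <name>" lines, highest count first (ties sorted by name).
count_by_comm() {
    awk '
        {
            sub(/^[ \t]+/, "")
            sub(/[ \t]+$/, "")
            if ($0 != "") n[$0]++
        }
        END { for (c in n) printf "%d %s\n", n[c], c }' |
        sort -k1,1nr -k2
}

# Example:
#   perf script -F comm | count_by_comm
```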