ftrace
By: Yonatan Goldschmidt
Recently, while conducting research, I wanted to follow a specific kernel code flow: I was about to modify the behavior of an existing kernel mechanism, and I needed to understand the flow of data between different functions.
As with any open source project you’re trying to analyze, your go-to assistant is the source code. Linux, besides being open source, provides tons of dynamic tools for debugging, and they are a great supplement to static reading of the code. In this post, I’ll focus on one of these tools, ftrace, and present a modification I made to make its filtering capabilities more versatile.
ftrace
At its base, ftrace (Function Tracer) is a dynamic function instrumentation infrastructure. It can be used to set dynamic traces on virtually all kernel functions, and also supports a large set of static tracepoints, used to record core kernel events. It is available in most modern Linux distributions.
It’s a powerful tool for tracing flows and events in the kernel. It provides both an in-kernel API and a user-mode control interface via tracefs, and it’s arguably one of the most comprehensive tracing tools in Linux.
ftrace has many modes of operation. In this post I’ll focus on the function_graph mode, which lets you record the call graph originating from a given function, recursively.
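To make the mode concrete, here is a minimal Python sketch of the tracefs writes that enable function_graph for a single function. It is an illustration under stated assumptions, not the author's setup: it assumes tracefs is mounted at /sys/kernel/tracing (older systems expose it under /sys/kernel/debug/tracing), that the kernel was built with the function_graph tracer, and that it runs as root. The helper names graph_trace_settings and apply_settings are hypothetical.

```python
from pathlib import Path

# Assumption: the usual tracefs mount point; adjust for your system.
TRACEFS = Path("/sys/kernel/tracing")


def graph_trace_settings(function):
    """Return the (tracefs file, value) writes, in order, for graph-tracing one function."""
    return [
        ("tracing_on", "0"),                   # pause tracing while reconfiguring
        ("current_tracer", "function_graph"),  # select the function_graph tracer
        ("set_graph_function", function),      # only graph calls rooted at this function
        ("tracing_on", "1"),                   # resume tracing
    ]


def apply_settings(settings, tracefs=TRACEFS):
    """Write each setting to its control file under the given tracefs directory."""
    for name, value in settings:
        (Path(tracefs) / name).write_text(value + "\n")
```

After calling apply_settings(graph_trace_settings("vfs_read")) as root and triggering a read, the recorded graph can be read back from the trace file in the same directory.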
In case you’ve never used it (or just as a quick reminder), here’s how it looks when tracing the kernel path of a vfs_read call on /proc/version:
 8)              |  vfs_read() {
 8)              |    rw_verify_area() {
 8)              |      security_file_permission() {
 8)              |        apparmor_file_permission() {
 8)              |          common_file_perm() {
 8)   0.110 us   |            aa_file_perm();
 8)   0.322 us   |          }
 8)   0.483 us   |        }
 8)   0.079 us   |        __fsnotify_parent();
 8)   0.082 us   |        fsnotify();
 8)   1.004 us   |      }
 8)   1.180 us   |    }
 8)              |    __vfs_read() {
 8)              |      proc_reg_read() {
 8)              |        seq_read() {
 8)   0.106 us   |          rcu_all_qs();
 8)              |          kvmalloc_node() {
 8)              |            __kmalloc_node() {
 8)   0.099 us   |              kmalloc_slab();
 8)   0.086 us   |              rcu_all_qs();
 8)   0.098 us   |              should_failslab();
 8)   0.128 us   |              memcg_kmem_get_cache();
 8)   0.092 us   |              memcg_kmem_put_cache();
 8)   1.562 us   |            }
 8)   1.743 us   |          }
 8)   0.089 us   |          single_start();
 8)              |          version_proc_show() {
 8)              |            seq_printf() {
 8)   0.836 us   |              seq_vprintf();
 8)   1.004 us   |            }
 8)   1.158 us   |          }
....
A call graph like this is a great aid in understanding the flow when read alongside the code.
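When the output gets long, it can help to process it programmatically. The following is a rough Python sketch that recovers each call's nesting depth and duration from function_graph lines by tracking the opening and closing braces. The regular expression is my own approximation of the default line layout shown above (CPU number, optional duration, a pipe, then the call); other output options may need adjustments, and parse_graph is a hypothetical helper name.

```python
import re

# CPU number, optional overhead marker, optional "<duration> us", a pipe, then the call text.
LINE_RE = re.compile(r"^\s*(\d+)\)[^\d|]*(?:([\d.]+)\s*us)?\s*\|\s*(.*)$")


def parse_graph(lines):
    """Return a list of (depth, function, duration_us) tuples in call order."""
    calls = []
    open_calls = []  # indices into `calls` for functions that have not returned yet
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # headers, "....", blank lines
        _cpu, duration, body = match.groups()
        body = body.strip()
        micros = float(duration) if duration else None
        if body.endswith("{"):
            # Function entry: the duration is only known at the matching "}".
            name = body[:-1].strip().rstrip("()")
            calls.append([len(open_calls), name, None])
            open_calls.append(len(calls) - 1)
        elif body.startswith("}"):
            # Function exit: attach the duration to the most recent open call.
            if open_calls:
                calls[open_calls.pop()][2] = micros
        elif body.endswith(";"):
            # Leaf call: entry and exit reported on one line.
            calls.append([len(open_calls), body[:-1].rstrip("()"), micros])
    return [tuple(call) for call in calls]
```

With the parsed tuples in hand, filtering by depth or sorting by duration is a one-liner, which is handy for spotting the expensive branches of a flow.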
This isn’t a beginner’s guide to ftrace, so from this point onward I will assume you know its basics. If you want a longer introduction to ftrace and its additional modes, I recommend reading the articles in [1], [2] and [3].
Back to my research: I was trying to understand the receive flow of a specific TCP packet, that is, which functions it goes through.
The TCP receive entry-point function is tcp_v4_rcv. We can trace it with function_graph, but we’ll soon encounter a problem: graph-tracing this function generates heaps of data. On my desktop, under a not-so-heavy network load, the trace file was growing at 5MB/s! That’s a lot of data.