For example, to monitor all mkdir calls made, the best I could come up with was:
#!/bin/sh
set -eux
d=debug/tracing
mkdir -p debug
if ! mountpoint -q debug; then
mount -t debugfs nodev debug
fi
# Stop tracing.
echo 0 > "${d}/tracing_on"
# Clear previous traces.
echo > "${d}/trace"
# Enable tracing mkdir
echo sys_enter_mkdir > "${d}/set_event"
# Set tracer type.
echo function > "${d}/current_tracer"
# Filter only sys_mkdir as a workaround.
echo SyS_mkdir > "${d}/set_ftrace_filter"
# Start tracing.
echo 1 > "${d}/tracing_on"
# Generate two mkdir calls.
rm -rf /tmp/a
rm -rf /tmp/b
mkdir /tmp/a
mkdir /tmp/b
# View the trace.
cat "${d}/trace"
# Stop tracing.
echo 0 > "${d}/tracing_on"
umount debug
Running it with sudo gives:
# tracer: function
#
# entries-in-buffer/entries-written: 4/4 #P:16
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
mkdir-31254 [015] .... 2010985.576760: sys_mkdir(pathname: 7ffc54b32c77, mode: 1ff)
mkdir-31254 [015] .... 2010985.576763: SyS_mkdir
mkdir-31255 [007] .... 2010985.578363: sys_mkdir(pathname: 7fff02d90c77, mode: 1ff)
mkdir-31255 [007] .... 2010985.578365: SyS_mkdir
My problem with this is that it outputs two lines for each syscall:
sys_mkdir which is the event that I want
SyS_mkdir which is the filtered function workaround, which I don't want to see
If I instead try to do:
echo > "${d}/set_ftrace_filter"
or don't touch that file at all, then it shows a ton of functions, which makes it hard to find the syscall at all.
Is there a nicer way to disable regular functions and keep just the syscall events?
I guess I could use just SyS_mkdir and disable the syscall event, but using the more specific event feels cleaner. Also:
the event shows arguments, which is nicer.
syscall function names change across kernel versions. E.g., it is already __x64_sys_mkdir instead of SyS_mkdir on Linux v4.18.
Tested on Ubuntu 18.04, Linux kernel 4.15.
Solution
In addition, it's worth mentioning another concise way to get this information. You can do something like:
stap -e 'probe syscall.mkdir { printf("%s[%d] -> %s(%s)\n", execname(), pid(), name, argstr) }'
The output:
systemd-journal[318] -> mkdir("/var/log/journal/c8d2562a041649cdbfd1ac5e24dbe0db", 0755)
systemd-journal[318] -> mkdir("/var/log/journal/c8d2562a041649cdbfd1ac5e24dbe0db", 0755)
mkdir[4870] -> mkdir("wtf", 0777)
...
Another way:
stap -e 'probe kernel.function("sys_mkdir") { printf("%s[%d] (%s)\n", execname(), pid(), $$parms) }'
The output:
systemd-journal[318] (pathname=0x55b74f7ab8b0 mode=0x1ed)
systemd-journal[318] (pathname=0x55b74f7ab8b0 mode=0x1ed)
mkdir[8532] (pathname=0x7ffcf30af761 mode=0x1ff)
...
You can customize the output as you like.
P.S. SystemTap is based on kprobes. Its architecture doc will help you understand its internals.
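If you would rather stay within ftrace, the question's goal (event lines only, no function lines) can likely be reached by selecting the nop tracer instead of function, since events enabled through set_event are recorded independently of the function tracer. The following is a minimal sketch under that assumption. trace_event_only is a hypothetical helper name introduced here for illustration, and the script has not been checked against the 4.15 kernel from the question.

```shell
#!/bin/sh
# Sketch: record only a syscall tracepoint, with no function-tracer lines.
# trace_event_only is a hypothetical helper name, not part of ftrace.
#   $1: tracing directory (e.g. the debug/tracing path used in the question)
#   $2: event name (e.g. sys_enter_mkdir)
trace_event_only() {
    d=$1
    event=$2
    # Stop tracing while reconfiguring.
    echo 0 > "${d}/tracing_on"
    # Select the nop tracer so that no function entries are recorded.
    echo nop > "${d}/current_tracer"
    # Clear previous traces.
    echo > "${d}/trace"
    # Enable only the requested event.
    echo "${event}" > "${d}/set_event"
    # Start tracing.
    echo 1 > "${d}/tracing_on"
}
```

With debugfs mounted as in the question, calling trace_event_only debug/tracing sys_enter_mkdir as root, running the two mkdir commands, and then reading debug/tracing/trace should leave only the sys_mkdir(pathname: ..., mode: ...) lines. The set_ftrace_filter workaround is no longer needed because the function tracer is not active.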