Using the ftrace tool
Tracepoint principles and applications
Principles
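A tracepoint is essentially a trace point that the kernel adds to its code in order to trace functions and other events. Trace points are added through a family of trace macros; the commonly used ones are:
TRACE_EVENT, DEFINE_TRACE, DECLARE_TRACE, DECLARE_TRACE_NOARGS, and so on.
Kernel configuration dependencies: TRACE_EVENT does not depend on any special kernel configuration. If the kernel enables ftrace (which automatically enables CONFIG_TRACEPOINTS), a global variable __tracepoint_##name is required, and it has to be declared through DECLARE_TRACE. Note that the GPIO trace event header defines NOTRACE when CONFIG_TRACING_EVENTS_GPIO is not enabled, and once NOTRACE is defined the TRACEPOINTS_ENABLED macro is not defined. TRACEPOINTS_ENABLED controls DEFINE_TRACE: if TRACEPOINTS_ENABLED is undefined, DEFINE_TRACE is an empty macro with no implementation.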
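The relationship between the declare macro, the define macro and the enable switch can be sketched in user space. This is only a model under stated assumptions, not the kernel API: every DEMO_ and demo_ name is hypothetical, the per-event object demo_tracepoint_<name> merely mirrors the role of __tracepoint_##name, and a single probe pointer with a plain flag stands in for the kernel's probe list and static key.

```c
#include <stddef.h>

/* Set to 0 to model a build in which tracepoints are compiled out. */
#define DEMO_TRACEPOINTS_ENABLED 1

/* Hypothetical stand-in for the kernel's struct tracepoint (one probe only). */
struct demo_tracepoint {
	const char *name;
	int enabled;		/* the kernel uses a static key here */
	void (*probe)(void *data, int value);
	void *data;
};

#if DEMO_TRACEPOINTS_ENABLED
/* Like DECLARE_TRACE: declare the per-event object and emit trace_<name>(). */
#define DEMO_DECLARE_TRACE(name)                                      \
	extern struct demo_tracepoint demo_tracepoint_##name;         \
	static inline void trace_##name(int value)                    \
	{                                                             \
		struct demo_tracepoint *tp = &demo_tracepoint_##name; \
		if (tp->enabled && tp->probe)                         \
			tp->probe(tp->data, value);                   \
	}                                                             \
	extern struct demo_tracepoint demo_tracepoint_##name
/* Like DEFINE_TRACE: provide storage for the per-event object. */
#define DEMO_DEFINE_TRACE(name) \
	struct demo_tracepoint demo_tracepoint_##name = { #name, 0, NULL, NULL }
#else
/*
 * Compiled out: trace_<name>() does nothing and no object is defined.
 * Both macros end in a harmless forward declaration so that the
 * trailing semicolon at the use site stays valid C.
 */
#define DEMO_DECLARE_TRACE(name)                                    \
	static inline void trace_##name(int value) { (void)value; } \
	struct demo_tracepoint
#define DEMO_DEFINE_TRACE(name) struct demo_tracepoint
#endif

DEMO_DECLARE_TRACE(demo_event);	/* normally placed in a header */
DEMO_DEFINE_TRACE(demo_event);	/* normally placed in exactly one .c file */

/* Attach a probe and switch the trace point on. Returns 0 on success. */
static int demo_register_probe(struct demo_tracepoint *tp,
			       void (*probe)(void *data, int value),
			       void *data)
{
	if (!tp || !probe)
		return -1;
	tp->probe = probe;
	tp->data = data;
	tp->enabled = 1;
	return 0;
}

/* Switch the trace point off and detach its probe. */
static void demo_unregister_probe(struct demo_tracepoint *tp)
{
	tp->enabled = 0;
	tp->probe = NULL;
	tp->data = NULL;
}

/* Example probe: accumulates every traced value into an int. */
static void demo_sum_probe(void *data, int value)
{
	*(int *)data += value;
}
```

With the switch set to 1, calling trace_demo_event(5) does nothing until demo_register_probe(&demo_tracepoint_demo_event, demo_sum_probe, &sum) has been called. With the switch set to 0, the define macro allocates nothing, which is the situation described above for an undefined TRACEPOINTS_ENABLED.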
Applications
Using the strace tool
Using the perf tool
Besides perf itself, you also need to download FlameGraph, the flame graph generator.
sudo perf record -e cpu-clock -g -p 24905 -- sleep 100
sudo perf script -i perf.data > perf.unfold
sudo /usr/bin/FlameGraph/stackcollapse-perf.pl perf.unfold > perf.fold
sudo /usr/bin/FlameGraph/flamegraph.pl perf.fold > perf.svg
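For detailed perf usage, see "Linux kernel profiling with perf".
A flame graph mainly shows the function calls a process goes through; see the blog post "perf + 火焰图分析程序" (analyzing programs with perf and flame graphs) for how to use it.
Analyzing and using perf results: see "Linux Perf 性能分析工具及火焰图浅析" (a brief look at the Linux perf profiling tool and flame graphs).
Tool: FlameGraph
The perf source lives in the tools directory of the Linux kernel tree. To cross-compile perf from the kernel tools:
First change into the tools/perf directory, then build with the following command: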
make ARCH=arm64 CROSS_COMPILE=/aarch64_eabi_gcc6.2.0_glibc2.24.0_fp/bin/aarch64-unknown-linux-gnueabi- perf LDFLAGS+=--static NO_LIBELF=1 V=1 WERROR=0 NO_SLANG=1 NO_GTK2=1 NO_LIBAUDIT=1 NO_LIBNUMA=1 NO_LIBPERL=1 NO_STRLCPY=1
To record several events at once, pass a comma-separated list to the -e option:
perf record -e cpu-clock,sched:,syscalls: …
The available events can be listed with the perf list command.
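Using the crash tool
Using Kdump
Kernel sources that work with Kdump: "github Kdump" and "k"
The kernel boot parameters (that is, the boot arguments passed by u-boot) need an extra crashkernel=512M, which reserves 512 MB of memory for the capture kernel. Adjust this value to the amount of memory on the board.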
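To make the size suffix concrete, here is a minimal user-space sketch that pulls the reservation size out of a command line string. It assumes only the simple crashkernel=size[KMG] form used above (the range and @offset variants are not handled), and the demo_ function names are hypothetical, not kernel functions.

```c
#include <stdlib.h>
#include <string.h>

/* Parse "<number>[K|M|G]" into bytes. Returns 0 if no number is found. */
static unsigned long long demo_parse_size(const char *s)
{
	char *end;
	unsigned long long bytes = strtoull(s, &end, 0);

	if (end == s)
		return 0;

	switch (*end) {
	case 'G': case 'g':
		bytes <<= 10;
		/* fall through */
	case 'M': case 'm':
		bytes <<= 10;
		/* fall through */
	case 'K': case 'k':
		bytes <<= 10;
		break;
	default:
		break;
	}
	return bytes;
}

/* Find "crashkernel=" in a command line and return the size in bytes. */
static unsigned long long demo_crashkernel_bytes(const char *cmdline)
{
	static const char key[] = "crashkernel=";
	const size_t key_len = sizeof(key) - 1;
	const char *p = cmdline;

	while ((p = strstr(p, key)) != NULL) {
		/* Accept the key only at the start of a parameter. */
		if (p == cmdline || p[-1] == ' ')
			return demo_parse_size(p + key_len);
		p += key_len;
	}
	return 0;	/* parameter absent: nothing reserved */
}
```

For example, demo_crashkernel_bytes("console=ttyS0 crashkernel=512M") evaluates to 536870912, that is 512 * 1024 * 1024 bytes.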
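Kdump walkthroughs: "Linux Crash之Kdump" and "centos7 Kernel Crash dump"
How a kernel crash works inside the kernel
How panic is implemented
A Linux kernel panic exists so that, when the kernel hits an abnormal error (such as a NULL pointer dereference, an out-of-bounds memory access, or an invalid free), the system is halted and the trace information is recorded. The kernel implements panic as follows: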
/**
 * panic - halt the system
 * @fmt: The text string to print
 *
 * Display a message, then perform cleanups.
 *
 * This function never returns.
 */
void panic(const char *fmt, ...)
{
	static char buf[1024];
	va_list args;
	long i, i_next = 0, len;
	int state = 0;
	int old_cpu, this_cpu;
	/*
	 * crash_kexec_post_notifiers is a kernel parameter. It is also checked
	 * in kexec_should_crash(): when it is set, crash_kexec() is not run
	 * immediately but only after panic() has called the panic notifiers.
	 */
	bool _crash_kexec_post_notifiers = crash_kexec_post_notifiers;

	/*
	 * Disable local interrupts. This will prevent panic_smp_self_stop
	 * from deadlocking the first cpu that invokes the panic, since
	 * there is nothing to prevent an interrupt handler (that runs
	 * after setting panic_cpu) from invoking panic() again.
	 */
	local_irq_disable();		// keep a local interrupt from re-entering panic() and deadlocking the panicking CPU
	preempt_disable_notrace();	// disable preemption so the panic path on this CPU cannot be preempted

	/*
	 * It's possible to come here directly from a panic-assertion and
	 * not have preempt disabled. Some functions called from here want
	 * preempt to be disabled. No point enabling it later though...
	 *
	 * Only one CPU is allowed to execute the panic code from here. For
	 * multiple parallel invocations of panic, all other CPUs either
	 * stop themself or will wait until they are stopped by the 1st CPU
	 * with smp_send_stop().
	 *
	 * `old_cpu == PANIC_CPU_INVALID' means this is the 1st CPU which
	 * comes here, so go ahead.
	 * `old_cpu == this_cpu' means we came from nmi_panic() which sets
	 * panic_cpu to this CPU.  In this case, this is also the 1st CPU.
	 */
	this_cpu = raw_smp_processor_id();	// current CPU id; only one CPU may run the panic code at a time
	/*
	 * atomic_cmpxchg() stores this_cpu in panic_cpu.counter only if
	 * panic_cpu.counter currently holds PANIC_CPU_INVALID; otherwise it
	 * leaves it unchanged. Either way it returns the value that
	 * panic_cpu.counter held before the call.
	 */
	old_cpu = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, this_cpu);

	if (old_cpu != PANIC_CPU_INVALID && old_cpu != this_cpu)
		panic_smp_self_stop();

	console_verbose();
	bust_spinlocks(1);
	va_start(args, fmt);
	len = vscnprintf(buf, sizeof(buf), fmt, args);
	va_end(args);

	if (len && buf[len - 1] == '\n')
		buf[len - 1] = '\0';

	pr_emerg("Kernel panic - not syncing: %s\n", buf);
#ifdef CONFIG_DEBUG_BUGVERBOSE
	/*
	 * Avoid nested stack-dumping if a panic occurs during oops processing
	 */
	if (!test_taint(TAINT_DIE) && oops_in_progress <= 1)
		dump_stack();
#endif

	/*
	 * If kgdb is enabled, give it a chance to run before we stop all
	 * the other CPUs or else we won't be able to debug processes left
	 * running on them.
	 */
	kgdb_panic(buf);

	/*
	 * If we have crashed and we have a crash kernel loaded let it handle
	 * everything else.
	 * If we want to run this after calling panic_notifiers, pass
	 * the "crash_kexec_post_notifiers" option to the kernel.
	 *
	 * Bypass the panic_cpu check and call __crash_kexec directly.
	 */
	if (!_crash_kexec_post_notifiers) {
		printk_safe_flush_on_panic();
		__crash_kexec(NULL);

		/*
		 * Note smp_send_stop is the usual smp shutdown function, which
		 * unfortunately means it may not be hardened to work in a
		 * panic situation.
		 */
		smp_send_stop();
	} else {
		/*
		 * If we want to do crash dump after notifier calls and
		 * kmsg_dump, we will need architecture dependent extra
		 * works in addition to stopping other CPUs.
		 */
		crash_smp_send_stop();
	}

	/*
	 * Run any panic handlers, including those that might need to
	 * add information to the kmsg dump output.
	 */
	atomic_notifier_call_chain(&panic_notifier_list, 0, buf);

	/* Call flush even twice. It tries harder with a single online CPU */
	printk_safe_flush_on_panic();
	kmsg_dump(KMSG_DUMP_PANIC);

	/*
	 * If you doubt kdump always works fine in any situation,
	 * "crash_kexec_post_notifiers" offers you a chance to run
	 * panic_notifiers and dumping kmsg before kdump.
	 * Note: since some panic_notifiers can make crashed kernel
	 * more unstable, it can increase risks of the kdump failure too.
	 *
	 * Bypass the panic_cpu check and call __crash_kexec directly.
	 */
	if (_crash_kexec_post_notifiers)
		__crash_kexec(NULL);

#ifdef CONFIG_VT
	unblank_screen();
#endif
	console_unblank();

	/*
	 * We may have ended up stopping the CPU holding the lock (in
	 * smp_send_stop()) while still having some valuable data in the console
	 * buffer.  Try to acquire the lock then release it regardless of the
	 * result.  The release will also print the buffers out.  Locks debug
	 * should be disabled to avoid reporting bad unlock balance when
	 * panic() is not being called from OOPS.
	 */
	debug_locks_off();
	console_flush_on_panic(CONSOLE_FLUSH_PENDING);

	panic_print_sys_info();

	if (!panic_blink)
		panic_blink = no_blink;

	if (panic_timeout > 0) {
		/*
		 * Delay timeout seconds before rebooting the machine.
		 * We can't use the "normal" timers since we just panicked.
		 */
		pr_emerg("Rebooting in %d seconds..\n", panic_timeout);

		for (i = 0; i < panic_timeout * 1000; i += PANIC_TIMER_STEP) {
			touch_nmi_watchdog();
			if (i >= i_next) {
				i += panic_blink(state ^= 1);
				i_next = i + 3600 / PANIC_BLINK_SPD;
			}
			mdelay(PANIC_TIMER_STEP);
		}
	}
	if (panic_timeout != 0) {
		/*
		 * This will not be a clean reboot, with everything
		 * shutting down.  But if there is a chance of
		 * rebooting the system it will be rebooted.
		 */
		if (panic_reboot_mode != REBOOT_UNDEFINED)
			reboot_mode = panic_reboot_mode;
		emergency_restart();
	}
#ifdef __sparc__
	{
		extern int stop_a_enabled;
		/* Make sure the user can actually press Stop-A (L1-A) */
		stop_a_enabled = 1;
		pr_emerg("Press Stop-A (L1-A) from sun keyboard or send break\n"
			 "twice on console to return to the boot prom\n");
	}
#endif
#if defined(CONFIG_S390)
	disabled_wait();
#endif
	pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf);

	/* Do not scroll important messages printed above */
	suppress_printk = 1;
	local_irq_enable();
	for (i = 0; ; i += PANIC_TIMER_STEP) {
		touch_softlockup_watchdog();
		if (i >= i_next) {
			i += panic_blink(state ^= 1);
			i_next = i + 3600 / PANIC_BLINK_SPD;
		}
		mdelay(PANIC_TIMER_STEP);
	}
}
EXPORT_SYMBOL(panic);
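The step in panic() that decides which CPU may continue (the atomic_cmpxchg() on panic_cpu) can be tried out in user space. The following is a sketch using C11 atomics rather than the kernel's atomic_t API. The names demo_panic_cpu, DEMO_PANIC_CPU_INVALID and demo_may_run_panic() are hypothetical, and the CPU number is passed in as a plain argument instead of coming from raw_smp_processor_id().

```c
#include <stdatomic.h>

#define DEMO_PANIC_CPU_INVALID (-1)

/* Stand-in for the kernel's panic_cpu: the CPU that owns the panic path. */
static atomic_int demo_panic_cpu = DEMO_PANIC_CPU_INVALID;

/*
 * Returns 1 if this_cpu may go on executing the panic path, or 0 if it
 * has to stop itself (the panic_smp_self_stop() case in the listing).
 */
static int demo_may_run_panic(int this_cpu)
{
	int old_cpu = DEMO_PANIC_CPU_INVALID;

	/*
	 * If demo_panic_cpu still equals old_cpu (INVALID), it is set to
	 * this_cpu. Otherwise it is left alone and old_cpu receives the
	 * current owner. In both cases old_cpu ends up holding the value
	 * seen before the call, like the return value of atomic_cmpxchg().
	 */
	atomic_compare_exchange_strong(&demo_panic_cpu, &old_cpu, this_cpu);

	/* First CPU to arrive, or a re-entry on the owning CPU (nmi_panic). */
	return old_cpu == DEMO_PANIC_CPU_INVALID || old_cpu == this_cpu;
}
```

Calling demo_may_run_panic(2) first makes CPU 2 the owner and returns 1. A later demo_may_run_panic(3) returns 0, while a repeated demo_may_run_panic(2) returns 1 again, which corresponds to the old_cpu == this_cpu case described in the comment inside panic().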