A quick first look at the kernel printk()

最新推荐文章于 2024-10-02 20:28:37 发布

巭犇

最新推荐文章于 2024-10-02 20:28:37 发布

阅读量375

点赞数 1

文章标签： linux 运维服务器

本文链接：https://blog.csdn.net/zyqash/article/details/127381620

版权

本文深入探讨了Linux内核的打印与日志系统，包括使用printk API进行内核消息记录，内存环形缓冲区的工作原理，以及如何通过dmesg和journalctl查看日志。此外，还介绍了不同级别的log级别及其用法，以及如何将内核消息直接写入控制台设备。最后讨论了用户空间生成内核消息的方法。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

Understanding kernel logging and printk

Using the kernel memory ring buffer

Kernel logging and systemd's journalctl

Using printk log levels

The pr_ convenience macros

Wiring to the console

Generating kernel messages from the user space

To emit a message, the user space C developer will often use the trusty printf(3) glibc API (or perhaps the cout when writing C++ code). However, it's important to understand that in kernel space, there are no libraries. Hence, we simply do not have access to the good old printf() API. Instead, it has essentially been reimplemented within the kernel as the printk() kernel API (curious as to where its code is? its here within the kernel source tree: kernel/printk/printk.c:printk()).

Emitting a message via the printk() API is simple and very much similar to doing so with printf(3). In our simple kernel module, here's where the action occurs:

printk(KERN_INFO "Hello, world\n");

But the similarities end there. The key difference between printf and printk is this: the user space printf(3) library API works by formatting a text string as requested and invoking the write(2) system call, which in turn actually performs a write to the stdout device, which, by default, is the Terminal window (or console device). The kernel printk API also formats its text string as requested, but its output destination differs. It writes to at least one place – the first one in the following list – and possibly to a few more:

A kernel log buffer in RAM (volatile)
A log file, the kernel log file (non-volatile)
The console device

When you emit a message via printk, it's guaranteed that the output goes into a log buffer in kernel memory (RAM). This, in effect, constitutes the kernel log. It's important to note that you will never see the printk output directly when working in graphical mode with an X server process running (the default environment when working on a typical Linux distro). So, the obvious question here is: how do you see the kernel log buffer content? There are a few ways. For now, let's just make use of the quick and easy way.

Use the dmesg(1) utility! By default, dmesg will dump the entire kernel log buffer content to stdout.

Within the kernel log, as displayed by the dmesg(1) utility, the numbers in the leftmost column are a simple timestamp, in [seconds.microseconds] format, of time elapsed since system boot (it is not recommended to treat it as being perfectly accurate, though). By the way, this timestamp is a Kconfig variable – a kernel config option – named CONFIG_PRINTK_TIME; it can be overridden by the printk.time kernel parameter.

Understanding kernel logging and printk

There is still a lot to cover regarding the logging of kernel messages via the printk kernel API. This section delves into some of the details. It's important for a budding kernel developer like you to clearly understand these.

We saw earlier, in the A quick first look at the kernel printk section, the essentials of using the kernel printk API's functionality. Here, we explore a lot more with respect to the printk() API's usage. In our simple kernel module, here's the line of code that emits the "Hello, world" message:

printk(KERN_INFO "Hello, world\n");

Again, printk is similar to printf in terms of the format string and how that works – but the similarities end there. For emphasis, we repeat: a key difference between printf and printk is that printf(3) is a user space library API that works by invoking the write(2) system call, which writes to the stdout device, which by default is usually the Terminal window (or console device). The printk, on the other hand, is a kernel space API whose output instead goes to at least one place, the first one shown in the list below, and possibly to more places:

A kernel log buffer in RAM (volatile)
A log file, the kernel log file (non-volatile)
The console device

Using the kernel memory ring buffer

The kernel log buffer is simply a memory buffer within a kernel address space where the printk output is saved (logged). More technically, it's the global __log_buf[] variable. Its definition in the kernel source is as follows:

kernel/printk/printk.c:
#define __LOG_BUF_LEN (1 << CONFIG_LOG_BUF_SHIFT)
static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);

It's architected as a ring buffer; it has a finite size (__LOG_BUF_LEN bytes), and once it's full, it gets overwritten from byte zero. Hence, it's called a "ring" or circular, buffer). Here, we can see that the size is based on the Kconfig variable CONFIG_LOG_BUF_SHIFT This value is shown and can be overridden as part of the kernel (menu)config here: General Setup > Kernel log buffer size.

It's an integer value with a range of 12 - 25 (we can always search init/Kconfig and see its spec), with a default value of 18. So, the size of the log buffer = 2 18 = 256 KB. However, the actual runtime size is affected by other config directives as well, notably LOG_CPU_MAX_BUF_SHIFT, which makes the size a function of the number of CPUs on the system. Furthermore, the relevant Kconfig file says, "Also this option is ignored when the log_buf_len kernel parameter is used as it forces an exact (power of two) size of the ring buffer." So, that's interesting; we can often override defaults by passing a kernel parameter (via the bootloader)!

Kernel parameters are useful, many, and varied, and are well worth checking out. See the official documentation here: https://www.kernel.org/doc/html/latest/admin-guide/ke rnel-parameters.html. A snippet from the Linux kernel documentation on the log_buf_len kernel parameter reveals the details:

log_buf_len=n[KMG] 

Sets the size of the printk ring buffer,
in bytes. n must be a power of two and greater
than the minimal size. The minimal size is defined
by LOG_BUF_SHIFT kernel config parameter. There is
also CONFIG_LOG_CPU_MAX_BUF_SHIFT config parameter
that allows to increase the default size depending
on the number of CPUs. See init/Kconfig for more
details.

Whatever the size of the kernel log buffer, two issues when dealing with the printk API become obvious:

Its messages are being logged in volatile memory (RAM); if the system crashes or power cycles in any manner, we will lose the precious kernel log (often eliminating our ability to debug).
The log buffer isn't very large by default, typically just 256 KB; voluminous prints will overwhelm the ring buffer, making it wrap around, thus losing information.

Kernel logging and systemd's journalctl

An obvious solution to the previously mentioned issues is to write (append) the kernel printk to a file. This is precisely how most modern Linux distributions are set up. The location of the log file varies with the distro: conventionally, the Red Hatbased ones write into the /var/log/messages file and the Debian-based ones into /var/log/syslog. Traditionally, the kernel printk would hook into the user space system logger daemon (syslogd) to perform file logging, thus automatically getting the benefit of more sophisticated features, such as log rotation, compression, and archival.

Over the past several years, though, system logging has been completely taken over by a useful and powerful new framework for system initialization called systemd (it replaces, or often works in addition to, the old SysV init framework). Indeed, systemd is now routinely used on even embedded Linux devices. Within the systemd framework, logging is performed by a daemon process called systemd-journal, and the journalctl(1) utility is the user interface to it.

Using printk log levels

To understand and use printk log levels, let's begin by reproducing that single line of code – the first printk from our helloworld_lkm kernel module:

printk(KERN_INFO "Hello, world\n");

Let's now address the elephant in the room: what exactly does KERN_INFO mean? Firstly, be careful now: it's not what your knee-jerk reaction says it is – a parameter. No! Notice that there is no comma character between it and the format string; just white space. KERN_INFO is merely one of eight log levels that a kernel printk gets logged at. A key thing to understand right away is that this log level is not a priority of any sort; its presence allows us to filter messages based on log level. The kernel defines eight possible log levels for printk; here they are:

[root@ecs-3370 linux]# cat  kern_levels.h 
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __KERN_LEVELS_H__
#define __KERN_LEVELS_H__

#define KERN_SOH	"\001"		/* ASCII Start Of Header */
#define KERN_SOH_ASCII	'\001'

#define KERN_EMERG	KERN_SOH "0"	/* system is unusable */
#define KERN_ALERT	KERN_SOH "1"	/* action must be taken immediately */
#define KERN_CRIT	KERN_SOH "2"	/* critical conditions */
#define KERN_ERR	KERN_SOH "3"	/* error conditions */
#define KERN_WARNING	KERN_SOH "4"	/* warning conditions */
#define KERN_NOTICE	KERN_SOH "5"	/* normal but significant condition */
#define KERN_INFO	KERN_SOH "6"	/* informational */
#define KERN_DEBUG	KERN_SOH "7"	/* debug-level messages */

#define KERN_DEFAULT	""		/* the default kernel loglevel */

/*
 * Annotation for a "continued" line of log printout (only done after a
 * line that had no enclosing \n). Only to be used by core/arch code
 * during early bootup (a continued line is not SMP-safe otherwise).
 */
#define KERN_CONT	KERN_SOH "c"

/* integer equivalents of KERN_<LEVEL> */
#define LOGLEVEL_SCHED		-2	/* Deferred messages from sched code
					 * are set to this special level */
#define LOGLEVEL_DEFAULT	-1	/* default (or last) loglevel */
#define LOGLEVEL_EMERG		0	/* system is unusable */
#define LOGLEVEL_ALERT		1	/* action must be taken immediately */
#define LOGLEVEL_CRIT		2	/* critical conditions */
#define LOGLEVEL_ERR		3	/* error conditions */
#define LOGLEVEL_WARNING	4	/* warning conditions */
#define LOGLEVEL_NOTICE		5	/* normal but significant condition */
#define LOGLEVEL_INFO		6	/* informational */
#define LOGLEVEL_DEBUG		7	/* debug-level messages */

#endif

So, now we see that the KERN_ log levels are merely strings ("0", "1", ..., "7") that get prefixed to the kernel message being emitted by printk; nothing more. This gives us the useful ability to filter messages based on log level. The comment on the right of each of them clearly shows the developer when to use which log level.

Let's quickly look at a couple of actual examples from within the Linux kernel source tree. When the kernel's hangcheck-timer device driver (somewhat akin to a software watchdog) determines that a certain timer expiry (60 seconds by default) was delayed for over a certain threshold (by default, 180 seconds), it restarts the system! Here we show the relevant kernel code – the place where the hangcheck-timer driver emits printk in this regard:

// drivers/char/hangcheck-timer.c
[...]
if (hangcheck_reboot) {
printk(KERN_CRIT "Hangcheck: hangcheck is restarting the machine.\n");
emergency_restart();
} else {
[...]

Check out how the printk API was called with log level set to KERN_CRIT.

The pr_ convenience macros

The convenience pr_() macros given here ease coding pain. The clunky printk(KERN_FOO ""); is replaced with the elegant pr_foo("");, where is the log level; their use is encouraged:

// include/linux/printk.h:
[...]
/*
* These can be used to print at the various log levels.
* All of these will print unconditionally, although note that pr_debug()
* and other debug macros are compiled out unless either DEBUG is defined
* or CONFIG_DYNAMIC_DEBUG is set.
*/
#define pr_emerg(fmt, ...) \
printk(KERN_EMERG pr_fmt(fmt), ##__VA_ARGS__)
#define pr_alert(fmt, ...) \
printk(KERN_ALERT pr_fmt(fmt), ##__VA_ARGS__)
#define pr_crit(fmt, ...) \
printk(KERN_CRIT pr_fmt(fmt), ##__VA_ARGS__)
#define pr_err(fmt, ...) \
printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
#define pr_warning(fmt, ...) \
printk(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
#define pr_warn pr_warning
#define pr_notice(fmt, ...) \
printk(KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__)
#define pr_info(fmt, ...) \
printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
[...]
/* pr_devel() should produce zero code unless DEBUG is defined */
#ifdef DEBUG
#define pr_devel(fmt, ...) \
printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
#else
#define pr_devel(fmt, ...) \
no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
#endif

Writing a kernel message directly to the console device is at times very useful; the next section deals with the details on how we can achieve this.

Wiring to the console

Recall that the printk output might go to up to three locations:

The first being the kernel memory log buffer (always)
The second being non-volatile log files
The last one (that we'll address here): the console device

Traditionally, the console device is a pure kernel feature, the initial Terminal window that the superuser logs into (/dev/console) in a non-graphical environment. Interestingly, on Linux, we can define several consoles – a teletype terminal (tty) window (such as /dev/console), a text-mode VGA, a framebuffer, or even a serial port served over USB (this being common on embedded systems during development; see more on Linux consoles in the Further reading section of this chapter). For example, when we connect a Raspberry Pi to an x86-64 laptop via a USB-toRS232 TTL UART (USB-to-serial) cable (see the Further reading section of this chapter for a blog article on this very useful accessory and how to set it up on the Raspberry Pi!) and then use minicom(1) (or screen(1)) to get a serial console, this is what shows up as the tty device – it's the serial port:

rpi # tty
/dev/ttyS0

The point here is that the console is often the target of important-enough log messages, including those originating from deep within the kernel. Linux's printk uses a proc-based mechanism for conditionally delivering its data to the console device. To understand this better, let's first check out the relevant proc pseudo-file:

[root@ecs-3370 linux]# cat /proc/sys/kernel/printk
4	4	1	7

We interpret the preceding four numbers as printk log levels (with 0 being the highest and 7 the lowest in terms of "urgency"). The preceding four-integer sequence's meaning is this:

The current (console) log level - The implication being that all messages less than this value will appear on the console device! The default level for messages that lack an explicit log level The minimum allowed log level The boot-time default log level

Generating kernel messages from the user space

A popular debug technique that we programmers use is to sprinkle prints at various points in the code, often allowing us to narrow down the source of an issue. This is indeed a useful debugging technique and is called instrumenting the code. Kernel developers often use the venerable printk API for just this purpose.

echo "test_script: msg 1" > /dev/kmsg