Because the examples in the following sections often refer to modules, a couple of initial concepts have to be made clear.
Kernel code can be either statically linked to the main image or loaded dynamically as a module when needed. Not all kernel components are suitable to be compiled as modules. Device drivers and extensions to basic functionalities are good examples of kernel components often compiled as modules. You can refer to Linux Device Drivers for a detailed discussion of the advantages and disadvantages of modules, as well as the mechanisms that the kernel can use to dynamically load them when they are needed and unload them when they are no longer needed.
Every module must provide two special functions, called init_module and cleanup_module. The first one is called at module load time to initialize the module. The second one is invoked by the kernel when removing the module, to release any resources (memory included) that have been allocated for use by the module.
The kernel provides two macros, module_init and module_exit, that allow developers to use arbitrary names for the two routines. The following snapshot is an example from the drivers/net/3c59x.c Ethernet driver:
    module_init(vortex_init);
    module_exit(vortex_cleanup);
In the section "Memory Optimizations," we will see how those two macros are defined and how their definition can change based on the kernel configuration. Most of the kernel uses these two macros, but a few modules still use the old default names init_module and cleanup_module. In the rest of this chapter, I will use module_init and module_exit to refer to the initialization and cleanup functions.
Let's first see how module initialization code used to be written with older kernels, and then how the current kernel model, based on a set of new macros, works.
7.2.1. Old Model: Conditional Code
Regardless of whether a kernel component is compiled as a module or is built statically into the kernel, it needs to be initialized. Because of that, the initialization code of a kernel component may need to distinguish between the two cases by means of conditional directives to the compiler. In the old model, this forced developers to use conditional directives like #ifdef all over the place.
Here is a snapshot from the drivers/net/3c59x.c driver of kernel 2.2.14: note how many times #ifdef MODULE and #if defined (MODULE) are used.
    ...
    #if defined(MODULE) && LINUX_VERSION_CODE > 0x20115
    MODULE_AUTHOR("Donald Becker <becker@cesdis.gsfc.nasa.gov>");
    MODULE_DESCRIPTION("3Com 3c590/3c900 series Vortex/Boomerang driver");
    MODULE_PARM(debug, "i");
    ...
    #endif
    ...
    #ifdef MODULE
    ...
    int init_module(void)
    {
        ...
    }
    #else
    int tc59x_probe(struct device *dev)
    {
        ...
    }
    #endif /* not MODULE */
    ...
    static int vortex_scan(struct device *dev, struct pci_id_info pci_tbl[])
    {
        ...
    #if defined(CONFIG_PCI) || (defined(MODULE) && !defined(NO_PCI))
        ...
    #ifdef MODULE
        if (compaq_ioaddr) {
            vortex_probe1(0, 0, dev, compaq_ioaddr, compaq_irq,
                          compaq_device_id, cards_found++);
            dev = 0;
        }
    #endif
        return cards_found ? 0 : -ENODEV;
    }
    ...
    #ifdef MODULE
    void cleanup_module(void)
    {
        ... ... ...
    }
    #endif
This snapshot shows how the old model let a programmer specify some of the things done differently, depending on whether the code is compiled as a module or statically into the kernel image:
- The initialization code is executed differently: the snapshot shows that the cleanup_module routine is defined (and therefore used) only when the driver is compiled as a module.
- Pieces of code could be included or excluded from the module: for example, vortex_scan calls vortex_probe1 only when the driver is compiled as a module.
This model made source code harder to follow, and therefore harder to debug. Moreover, the same logic had to be repeated in every module.
7.2.2. New Model: Macro-Based Tagging
Now let's compare the snapshot from the previous section to its counterpart from the same file from a 2.6 kernel:
    static char version[] __devinitdata = DRV_NAME " ... ";

    static struct vortex_chip_info {
        ...
    } vortex_info_tbl[] __devinitdata = {
        {"3c590 Vortex 10Mbps",
        ... ... ...
    }

    static int __init vortex_init (void)
    {
        ...
    }

    static void __exit vortex_cleanup (void)
    {
        ...
    }

    module_init(vortex_init);
    module_exit(vortex_cleanup);
You can see that #ifdef directives are no longer necessary.
To remove the mess introduced by conditional code, and therefore make code more readable, kernel developers introduced a set of macros that module developers can now use to write cleaner initialization code (most drivers are good candidates for the use of those macros). The snapshot just shown uses a few of them: __init, __exit, and __devinitdata.
Later sections describe how some of the new macros are used and how they work.
These macros allow the kernel to determine behind the scenes, for each module, what code is to be included in the kernel image, what code is to be excluded because it is not needed, what code is to be executed only at initialization time, etc. This removes the burden from each programmer to replicate the same logic in each module.[*]
[*] Note that the use of these macros does not eliminate completely the use of conditional directives. The kernel still uses conditional directives to set off options that the user can configure when compiling the kernel.
It should be clear that for these macros to allow programmers to replace the old conditional directives, as shown in the example of the previous section, they must be able to provide at least the following two services:
- Define routines that need to be executed when a new kernel component is enabled, either because it is statically included in the kernel or because it is loaded at runtime as a module
- Define some kind of order between initialization functions so that dependencies between kernel components can be respected and enforced
7.3. Optimized Macro-Based Tagging
The Linux kernel uses a variety of different macros to mark functions and data structures with special properties: for instance, to mark an initialization routine. Most of those macros are defined in include/linux/init.h. Some of those macros tell the linker to place code or data structures with common properties into specific, dedicated memory areas (memory sections) as well. By doing so, it becomes easier for the kernel to take care of an entire class of objects (routines or data structures) with a common property in a simple manner. We will see an example in the section "Memory Optimizations."
Figure 7-3 shows some of the kernel memory sections: on the left side are the names of the pointers that delimit the beginning and the end of each section (when meaningful); on the right side are the names of the macros used to place data and code into the associated sections. The figure does not include all the memory sections and associated macros; there are too many to list conveniently.

Figure 7-3. Some of the memory sections used by initialization code
Tables 7-1 and 7-2 list some of the macros used to tag routines and data structures, respectively, along with a brief description. We will not look at all of them for lack of space, but we will spend a few words on the xxx_initcall macros in the section "xxx_initcall Macros" and on __init and __exit in the section "__init and __exit Macros."
The purpose of this section is not to describe how the kernel image is built, how modules are handled, etc., but rather to give you just a few hints about why those macros exist, and how the ones most commonly used by device drivers work.
Macro | Kind of routines the macro is used for |
---|---|
__init | Boot-time initialization routine: for routines that are not needed anymore at the end of the boot phase. This information can be used to get rid of the routine under some conditions (see the later section "Memory Optimizations"). |
__exit | Counterpart to __init. Called when the associated kernel component is shut down. Often used to mark module_exit functions. This information can be used to get rid of the routine under some conditions (see the later section "Memory Optimizations"). |
core_initcall, postcore_initcall, arch_initcall, subsys_initcall, fs_initcall, device_initcall, late_initcall | Set of macros, listed in decreasing order of priority, used to tag initialization routines that need to be executed at boot time. See the later section "xxx_initcall Macros." |
__initcall | Obsolete macro, defined as an alias to device_initcall. See the later section "Legacy code." |
__exitcall [a] | One-shot exit function, called when the associated kernel component is shut down. So far, it has been used only to mark module_exit routines. See the later section "Memory Optimizations." |

[a] __exitcall and __initcall are defined on top of __exit_call and __init_call.

Macro | Kind of data the macro is used for |
---|---|
__initdata | Initialized data structure used at boot time only. |
__exitdata | Data structure used only by routines tagged with __exitcall. It follows that if a routine tagged with __exitcall is not going to be used, the same is true of data tagged with __exitdata. The same kind of optimization can therefore be applied to __exitdata and __exitcall. |
Before we go into some more detail on a few of the macros listed in Tables 7-1 and 7-2, it is worth stressing the following points:
- Most macros come in pairs: one (or a set of them) takes care of initialization, and a sister macro (or a sister set) takes care of removal. For example, __exit is __init's sister; __exitcall is __initcall's sister, etc.
- Each macro takes care of one of two points (never both): one is when a routine is to be executed (e.g., __initcall, __exitcall); the other is the memory section a routine or a data structure is to be placed in (e.g., __init, __exit).
- The same routine can be tagged with more than one macro. For example, the following snapshot says that pci_proc_init is to be run at boot time (__initcall), and can be freed once it is executed (__init):

      static int __init pci_proc_init(void)
      {
          ...
      }

      __initcall(pci_proc_init);
7.3.1. Initialization Macros for Device Initialization Routines
Table 7-3 lists a set of macros commonly used to tag routines used by device drivers to initialize their devices, and that can introduce memory optimizations when the kernel does not have support for Hotplug. In the section "Example of PCI NIC Driver Registration" in Chapter 6, you can find an example of their use. In the later section "Other Optimizations," you can see when the macros in Table 7-3 facilitate memory optimizations.
Name | Description |
---|---|
__devinit | Used to tag routines that initialize a device. For instance, for a PCI driver, the routine to which pci_driver->probe is initialized is tagged with this macro. Routines that are exclusively invoked by another routine tagged with __devinit are commonly tagged with __devinit as well. |
__devexit | Used to tag routines to be invoked when a device is removed. |
__devexit_p | Used to initialize pointers to routines tagged with __devexit. __devexit_p(fn) returns fn if the driver is compiled as a module or the kernel supports Hotplug, and returns NULL otherwise. See the later section "Other Optimizations." |
__devinitdata | Used to tag initialized data structures that are used by functions that take care of device initialization (i.e., are tagged with __devinit), and that therefore share their properties. |
__devexitdata | Same as __devinitdata but associated with __devexit. |