Chapter 7. Kernel Infrastructure for Component Initialization

To fully understand a kernel component, you have to know not only what a given set of routines does, but also when those routines are invoked and by whom. The initialization of a subsystem is one of the basic tasks handled by the kernel according to its own model. This infrastructure is worth studying to help you understand how core components of the networking stack are initialized, including NIC device drivers. The purpose of this chapter is to show how the kernel handles routines used to initialize kernel components, both for components statically included into the kernel and those loaded as kernel modules, with a special emphasis on network devices. We will therefore see:
We will not cover all details of the initialization infrastructure, but you'll have a sufficient overview to navigate the source code comfortably.
7.1. Boot-Time Kernel Options

Linux allows users to pass kernel configuration options to their boot loaders, which then pass the options to the kernel; experienced users can use this mechanism to fine-tune the kernel at boot time. During the boot phase, as shown in Figure 5-1 in Chapter 5, the two calls to parse_args take care of the boot-time configuration input. We will see in the next section why parse_args is called twice, with details in the later section "Two-Pass Parsing."
parse_args is a routine that parses an input string with parameters in the form name_variable=value, looking for specific keywords and invoking the right handlers. parse_args is also used when loading a module, to parse the command-line parameters provided (if any). We do not need to know the details of how parse_args implements the parsing, but it is interesting to see how a kernel component can register a handler for a keyword and how the handler is invoked. To have a clear picture we need to learn:
All the parsing code is in kernel/params.c. We'll cover the points in the list one by one.

7.1.1. Registering a Keyword

Kernel components can register a keyword and the associated handler with the __setup macro, defined in include/linux/init.h. This is its syntax:

    __setup(string, function_handler)
where string is the keyword and function_handler is the associated handler. The macro instructs the kernel to execute function_handler when the boot-time input string includes string. string has to end with the = character to make the parsing easier for parse_args. Any text following the = is passed as input to function_handler. The following is an example from net/core/dev.c, where netdev_boot_setup is registered as the handler for the netdev= keyword:

    __setup("netdev=", netdev_boot_setup);
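To make the keyword-to-handler association concrete, here is a minimal user-space sketch, not the kernel's implementation. The names demo_setup_entry, demo_table, demo_netdev_handler, and demo_dispatch are hypothetical. The sketch assumes that a plain static array can stand in for the registrations created by __setup, and it only shows how the text after the = reaches the handler.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one __setup registration. */
struct demo_setup_entry {
    const char *str;              /* keyword, ending with '=' */
    int (*handler)(char *arg);    /* receives the text after '=' */
};

/* Records the last argument seen, so the effect is observable. */
static char demo_last_arg[64];

static int demo_netdev_handler(char *arg)
{
    strncpy(demo_last_arg, arg, sizeof(demo_last_arg) - 1);
    demo_last_arg[sizeof(demo_last_arg) - 1] = '\0';
    return 1;
}

/* Registration table; two keywords may point to one handler. */
static const struct demo_setup_entry demo_table[] = {
    { "netdev=", demo_netdev_handler },
    { "ether=",  demo_netdev_handler },
};

/* Returns the handler's result if a registered keyword matched, else 0. */
static int demo_dispatch(char *option)
{
    size_t i;

    for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++) {
        size_t n = strlen(demo_table[i].str);

        /* Requiring the trailing '=' in the keyword keeps this a prefix test. */
        if (strncmp(option, demo_table[i].str, n) == 0)
            return demo_table[i].handler(option + n);
    }
    return 0;
}
```

Calling demo_dispatch on a string that begins with a registered keyword hands everything after the = to the handler; a string with no registered keyword is left untouched and the function returns 0.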
The same handler can be associated with different keywords. For instance, net/ethernet/eth.c registers the same handler, netdev_boot_setup, for the ether= keyword.

When a piece of code is compiled as a module, the __setup macro is ignored (i.e., defined as a no-op). You can check how the definition of the __setup macro changes in include/linux/init.h depending on whether the code that includes that file is compiled as a module.

start_kernel calls parse_args twice to parse the boot configuration string because boot-time options are divided into two classes, and each call takes care of one class:
Default options
Early options

The handling of boot-time options has changed with the 2.6 kernel, but not all the kernel code has been updated accordingly. Before the latest changes, there used to be only the __setup macro. Because of this, legacy code that is yet to be updated now uses the macro __obsolete_setup. When the user passes the kernel an option that is declared with the __obsolete_setup macro, the kernel prints a message warning about its obsolete status and reports the file and source code line where the option is declared.

Figure 7-1 summarizes the relationship between the various macros: all of them are wrappers around the generic routine __setup_param. Note that the input routine passed to __setup is placed into the .init.setup memory section. The effect of this action will become clear in the section "Boot-Time Initialization Routines."
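The wrapper relationship can be illustrated with a simplified user-space sketch. All DEMO_ macro names, the demo_param structure, and the keywords are hypothetical. The sketch omits the placement into the .init.setup section that the kernel macros perform, and it assumes that each wrapper fills in the same kind of record, differing only in the handler and in an early flag.

```c
#include <stddef.h>

/* Hypothetical record filled in by every wrapper. */
struct demo_param {
    const char *str;               /* keyword */
    int (*setup_func)(char *);     /* handler, or NULL for obsolete options */
    int early;                     /* nonzero for early options */
};

/* Generic form: all wrappers below expand to this. */
#define DEMO_SETUP_PARAM(keyword, unique_id, fn, is_early) \
    static const struct demo_param demo_param_##unique_id = \
        { keyword, fn, is_early }

/* Regular option: handler supplied, early flag clear. */
#define DEMO_SETUP(keyword, fn)          DEMO_SETUP_PARAM(keyword, fn, fn, 0)

/* Early option: same record, early flag set. */
#define DEMO_EARLY_SETUP(keyword, fn)    DEMO_SETUP_PARAM(keyword, fn, fn, 1)

/* Obsolete option: no handler, so a parser can only warn about it. */
#define DEMO_OBSOLETE_SETUP(keyword, id) DEMO_SETUP_PARAM(keyword, id, NULL, 0)

static int demo_alpha_setup(char *arg) { (void)arg; return 1; }
static int demo_beta_setup(char *arg)  { (void)arg; return 1; }

/* Each line defines one record named demo_param_<unique_id>. */
DEMO_SETUP("alpha=", demo_alpha_setup);
DEMO_EARLY_SETUP("beta=", demo_beta_setup);
DEMO_OBSOLETE_SETUP("oldopt=", oldopt);
```

Using the handler name as the unique identifier keeps the generated record names distinct without asking the caller for an extra argument; the obsolete wrapper has no handler, so it needs an explicit identifier.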
Figure 7-1. __setup_param macro and its wrappers

7.1.2. Two-Pass Parsing

Because boot-time options used to be handled differently in previous kernel versions, and not all of them have been converted to the new model, the kernel handles both models. When the new infrastructure fails to recognize a keyword, it asks the obsolete infrastructure to handle it. If the obsolete infrastructure also fails, the keyword and value are passed on to the init process that will be invoked at the end of the init kernel thread via run_init_process (shown in Figure 5-1 in Chapter 5). The keyword and value are added either to the arg parameter list or to the envp environment variable list.

The previous section explained that, to allow early options to be handled in the necessary order, boot-string parsing and handler invocation are handled in two passes, shown in Figure 7-2 (the figure shows a snapshot from start_kernel, introduced in Chapter 5):
The second pass first checks whether there is a match with the options implemented according to the new infrastructure. These options are stored in kernel_param data structures, filled in by the module_param macro introduced in the section "Module Options" in Chapter 5. The same macro makes sure that all of those data structures are placed into a specific memory section (__param), delimited by the pointers __start___param and __stop___param. When one of these options is recognized, the associated parameter is initialized to the value provided with the boot string. When there is no match for an option, unknown_bootoption tries to see whether the option should be handled by the obsolete model handler (Figure 7-2).
Figure 7-2. Two-pass option parsing
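The flow described above can be approximated with a short user-space sketch. It is not the kernel's code: the demo_ names, the two tables, and the keywords are hypothetical, and the sketch assumes that an early option found again during the second pass is simply treated as already handled.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical obsolete-model record: keyword, handler, early flag. */
struct demo_obs_param {
    const char *str;
    int (*setup_func)(char *);
    int early;
};

/* Hypothetical new-model record: parameter name and the variable it sets. */
struct demo_new_param {
    const char *name;
    int *value;
};

enum demo_result { DEMO_NEW_MODEL, DEMO_OBSOLETE, DEMO_ALREADY_EARLY, DEMO_TO_INIT };

static int demo_early_calls, demo_late_calls, demo_level;

static int demo_early_handler(char *arg) { (void)arg; demo_early_calls++; return 1; }
static int demo_late_handler(char *arg)  { (void)arg; demo_late_calls++;  return 1; }

static const struct demo_obs_param demo_obs[] = {
    { "earlyopt=", demo_early_handler, 1 },
    { "lateopt=",  demo_late_handler,  0 },
};

static const struct demo_new_param demo_new[] = {
    { "level", &demo_level },
};

#define DEMO_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Pass 1: run only the handlers whose early flag is set. */
static void demo_first_pass(char *opt)
{
    size_t i;

    for (i = 0; i < DEMO_COUNT(demo_obs); i++) {
        size_t n = strlen(demo_obs[i].str);

        if (demo_obs[i].early && strncmp(opt, demo_obs[i].str, n) == 0)
            demo_obs[i].setup_func(opt + n);
    }
}

/* Pass 2: new model first, then the obsolete table, then hand over to init. */
static enum demo_result demo_second_pass(char *opt)
{
    size_t i;

    for (i = 0; i < DEMO_COUNT(demo_new); i++) {
        size_t n = strlen(demo_new[i].name);

        if (strncmp(opt, demo_new[i].name, n) == 0 && opt[n] == '=') {
            *demo_new[i].value = atoi(opt + n + 1);
            return DEMO_NEW_MODEL;
        }
    }
    for (i = 0; i < DEMO_COUNT(demo_obs); i++) {
        size_t n = strlen(demo_obs[i].str);

        if (strncmp(opt, demo_obs[i].str, n) != 0)
            continue;
        if (demo_obs[i].early)
            return DEMO_ALREADY_EARLY;   /* handled during pass 1 */
        demo_obs[i].setup_func(opt + n);
        return DEMO_OBSOLETE;
    }
    return DEMO_TO_INIT;                 /* unrecognized: left for init */
}
```

Running demo_first_pass over every option and then demo_second_pass over the same options invokes each handler exactly once, with early handlers running before any others.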
Obsolete and new model options are placed into two different memory areas:
    __setup_start ... __setup_end
    __start___param ... __stop___param

See Chapter 5 for more details on module parameters. Also note that all obsolete model options, regardless of whether they have the early flag set, are placed into the __setup_start ... __setup_end memory area.

7.1.3. .init.setup Memory Section

The two inputs to the __setup macro we introduced in the previous section are placed into a data structure of type obs_kernel_param, defined in include/linux/init.h:

    struct obs_kernel_param {
        const char *str;
        int (*setup_func)(char *);
        int early;
    };
str is the keyword, setup_func is the handler, and early is the flag we introduced in the section "Two-Pass Parsing." The __setup_param macro places all of the obs_kernel_param instances into a dedicated memory area. This is done mainly for two reasons:
7.1.4. Use of Boot Options to Configure Network Devices

In light of what we saw in the previous sections, let's see how the networking code uses boot options. We already mentioned in the section "Registering a Keyword" that both the ether= and netdev= keywords are registered to use the same handler, netdev_boot_setup. When this handler is invoked to process the input parameters (i.e., the string that follows the matching keyword), it stores the result into data structures of type netdev_boot_setup, defined in include/linux/netdevice.h. The handler and the data structure type happen to share the same name, so make sure you do not confuse the two.

    struct netdev_boot_setup {
        char name[IFNAMSIZ];
        struct ifmap map;
    };
name is the device's name, and ifmap, defined in include/linux/if.h, is the data structure that stores the input configuration:

    struct ifmap {
        unsigned long mem_start;
        unsigned long mem_end;
        unsigned short base_addr;
        unsigned char irq;
        unsigned char dma;
        unsigned char port;
        /* 3 bytes spare */
    };
The same keyword can be provided multiple times (for different devices) in the boot-time string, as in the following example:
However, the maximum number of devices that can be configured at boot time with this mechanism is NETDEV_BOOT_SETUP_MAX, which is also the size of the static array dev_boot_setup used to store the configurations:

    static struct netdev_boot_setup dev_boot_setup[NETDEV_BOOT_SETUP_MAX];
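The consequence of a fixed-size table is easy to see in a small user-space sketch. It is only an illustration: the demo_ names are hypothetical, the structures are trimmed to a few fields, and the two constants are assumed values chosen for the sketch rather than taken from the kernel headers.

```c
#include <string.h>

#define DEMO_IFNAMSIZ        16  /* assumed name length for this sketch */
#define DEMO_BOOT_SETUP_MAX   8  /* assumed table size for this sketch */

/* Trimmed, hypothetical version of the configuration record. */
struct demo_ifmap {
    unsigned long  mem_start;
    unsigned long  mem_end;
    unsigned short base_addr;
    unsigned char  irq;
};

struct demo_boot_setup {
    char name[DEMO_IFNAMSIZ];
    struct demo_ifmap map;
};

/* Zero-initialized: an empty name marks a free slot. */
static struct demo_boot_setup demo_boot_table[DEMO_BOOT_SETUP_MAX];

/* Stores one configuration in the first free slot.
 * Returns 1 on success, 0 when the table is already full. */
static int demo_boot_setup_add(const char *name, const struct demo_ifmap *map)
{
    int i;

    for (i = 0; i < DEMO_BOOT_SETUP_MAX; i++) {
        if (demo_boot_table[i].name[0] == '\0') {
            strncpy(demo_boot_table[i].name, name, DEMO_IFNAMSIZ - 1);
            demo_boot_table[i].name[DEMO_IFNAMSIZ - 1] = '\0';
            demo_boot_table[i].map = *map;
            return 1;
        }
    }
    return 0;
}
```

Once every slot is taken, further configurations are rejected, which is why the table size caps the number of devices that can be configured this way.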
netdev_boot_setup is pretty simple: it extracts the input parameters from the string, fills in an ifmap structure, and adds the latter to the dev_boot_setup array with netdev_boot_setup_add. At the end of the booting phase, the networking code can use the netdev_boot_setup_check function to check whether a given interface is associated with a boot-time configuration. The lookup on the array dev_boot_setup is based on the device name dev->name:

    int netdev_boot_setup_check(struct net_device *dev)
    {
        struct netdev_boot_setup *s = dev_boot_setup;
        int i;

        for (i = 0; i < NETDEV_BOOT_SETUP_MAX; i++) {
            /* Skip unused slots; compare only the length of the stored name. */
            if (s[i].name[0] != '\0' && s[i].name[0] != ' ' &&
                !strncmp(dev->name, s[i].name, strlen(s[i].name))) {
                /* Copy the boot-time settings into the device. */
                dev->irq        = s[i].map.irq;
                dev->base_addr  = s[i].map.base_addr;
                dev->mem_start  = s[i].map.mem_start;
                dev->mem_end    = s[i].map.mem_end;
                return 1;
            }
        }
        return 0;
    }
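The parsing step can be sketched in user space as follows. This is not the kernel's parser, which relies on its own helpers that are not reproduced here. The demo_ names are hypothetical, and the comma-separated field order (IRQ, base address, device name) is an assumption made for the sketch.

```c
#include <stdlib.h>
#include <string.h>

#define DEMO_IFNAMSIZ 16  /* assumed name length for this sketch */

/* Trimmed, hypothetical configuration record. */
struct demo_boot_cfg {
    char           name[DEMO_IFNAMSIZ];
    unsigned short base_addr;
    unsigned char  irq;
};

/* Parses "irq,base_addr,name" (assumed field order) into cfg.
 * Returns 1 on success, 0 if the string does not have that shape. */
static int demo_parse_boot_cfg(const char *arg, struct demo_boot_cfg *cfg)
{
    char *end;
    unsigned long irq, base;

    irq = strtoul(arg, &end, 0);
    if (end == arg || *end != ',')
        return 0;
    arg = end + 1;

    base = strtoul(arg, &end, 0);   /* base 0 accepts 0x... hex input */
    if (end == arg || *end != ',')
        return 0;
    arg = end + 1;

    /* The remainder is the device name; it must fit in the buffer. */
    if (*arg == '\0' || strlen(arg) >= DEMO_IFNAMSIZ)
        return 0;

    memset(cfg, 0, sizeof(*cfg));
    cfg->irq = (unsigned char)irq;
    cfg->base_addr = (unsigned short)base;
    strcpy(cfg->name, arg);
    return 1;
}

/* Mirrors the prefix comparison used by the lookup shown above. */
static int demo_name_matches(const char *dev_name, const struct demo_boot_cfg *cfg)
{
    return cfg->name[0] != '\0' &&
           strncmp(dev_name, cfg->name, strlen(cfg->name)) == 0;
}
```

Because the comparison covers only the length of the stored name, a stored name that is a prefix of the device name also matches; a stored name of eth would match both eth0 and eth1 in this sketch.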
Devices with special capabilities, features, or limitations can define their own keywords and handlers if they need additional parameters on top of the basic ones provided by ether= and netdev= (one driver that does this is PLIP).