include/linux/mod_devicetable.h
include/linux/pci.h
pci_device_id
Device identifier. This is not a local ID used by Linux, but an ID defined according to the PCI standard.
pci_dev
Each PCI device is assigned a pci_dev instance, just as network devices are assigned net_device instances. This is the structure used by the kernel to refer to a PCI device.
pci_driver
Defines the interface between the PCI layer and the device drivers. This structure consists mostly of function pointers.
pci_driver
char *name
Name of the driver
const struct pci_device_id *id_table
Vector of IDs the kernel will use to associate devices to this driver
int (*probe)(struct pci_dev *dev, const struct pci_device_id *id)
Function invoked by the PCI layer when it finds a match between the ID of a device that still needs a driver and an entry in the id_table mentioned previously. This function should enable the hardware, allocate the net_device structure, and initialize and register the new device. In this function, the driver also allocates any additional data structures (e.g., buffer rings used during transmission or reception) that it may need to work properly.
void (*remove)(struct pci_dev *dev)
Function invoked by the PCI layer when the driver is unregistered from the kernel or when a hot-pluggable device is removed. It is the counterpart of probe and is used to clean up any data structure and state. Network devices use this function to release the allocated I/O ports and I/O memory, to unregister the device, and to free the net_device data structure and any other auxiliary data structure that the device driver allocated, usually in its probe function.
int (*suspend)(struct pci_dev *dev, pm_message_t state)
int (*resume)(struct pci_dev *dev)
Functions invoked by the PCI layer when the system goes into suspend mode and when it is resumed, respectively.
int (*enable_wake)(struct pci_dev *dev, u32 state, int enable)
With this function, a driver can enable or disable the capability of the device to wake the system up by generating specific Power Management Event signals.
struct pci_dynids dynids
Dynamic IDs
Registering a PCI NIC Device Driver
PCI devices are uniquely identified by a combination of parameters, including vendor, model, etc. These parameters are stored by the kernel in a data structure of type pci_device_id, defined as follows:
struct pci_device_id {
        unsigned int vendor, device;
        unsigned int subvendor, subdevice;
        unsigned int class, class_mask;
        unsigned long driver_data;
};
Most of the fields are self-explanatory. vendor and device are usually sufficient to identify the device. subvendor and subdevice are rarely needed and are usually set to a wildcard value (PCI_ANY_ID). class and class_mask represent the class the device belongs to; NETWORK is the class that covers network devices. driver_data is not part of the PCI ID; it is a private parameter used by the driver.
Each device driver registers with the kernel a vector of pci_device_id instances that lists the IDs of the devices it can handle.
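To make the role of the wildcards and of class_mask concrete, the following user-space sketch compares table entries against a detected device. The comparison is modeled on the matching rule used by the PCI layer, but this is not kernel code: fake_pci_dev, id_matches, and match_table are hypothetical names introduced for illustration, and the table is assumed to end with an all-zero entry.

```c
#include <stddef.h>

#define PCI_ANY_ID (~0u)        /* wildcard: matches any value */

/* User-space copy of the structure shown above. */
struct pci_device_id {
    unsigned int vendor, device;
    unsigned int subvendor, subdevice;
    unsigned int class, class_mask;
    unsigned long driver_data;
};

/* Hypothetical stand-in for the identification fields of a detected device. */
struct fake_pci_dev {
    unsigned int vendor, device;
    unsigned int subsystem_vendor, subsystem_device;
    unsigned int class;
};

/* Return nonzero if one table entry matches the device. */
static int id_matches(const struct pci_device_id *id,
                      const struct fake_pci_dev *dev)
{
    return (id->vendor == PCI_ANY_ID || id->vendor == dev->vendor) &&
           (id->device == PCI_ANY_ID || id->device == dev->device) &&
           (id->subvendor == PCI_ANY_ID ||
            id->subvendor == dev->subsystem_vendor) &&
           (id->subdevice == PCI_ANY_ID ||
            id->subdevice == dev->subsystem_device) &&
           /* only the class bits selected by class_mask are compared */
           !((id->class ^ dev->class) & id->class_mask);
}

/* Walk a table that ends with an all-zero entry; return the first match. */
static const struct pci_device_id *
match_table(const struct pci_device_id *table, const struct fake_pci_dev *dev)
{
    for (; table->vendor || table->subvendor || table->class_mask; table++)
        if (id_matches(table, dev))
            return table;
    return NULL;
}
```

With a class_mask of 0 the class is ignored entirely, which is why entries that specify only vendor and device can leave both class fields at 0.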
PCI device drivers register and unregister with the kernel with pci_register_driver and pci_unregister_driver, respectively.
pci_register_driver requires a pci_driver data structure as an argument. Thanks to the pci_driver’s id_table vector, the kernel knows what devices the driver can handle, and thanks to all the virtual functions that are part of pci_driver, the kernel has a mechanism to interact with any device that will be associated with the driver.
One of the great advantages of PCI is its elegant support for probing to find the IRQ and other resources each device needs. A module can be passed input parameters at load time to tell it how to configure all the devices for which it is responsible, but sometimes (especially with buses such as PCI) it is easier to let the driver itself check the devices on the system and configure the ones for which it is responsible. The user can still fall back on manual configuration if necessary.
The /sys filesystem exports information about system buses (PCI, USB, etc.), including the various devices and relationships between them. /sys also allows an administrator to define new IDs for a given device driver so that besides the static IDs registered by the drivers with their pci_driver structures’ id_table vector, the kernel can use the user-configured parameters.
There are two types of probing:
Static
Given a device PCI ID, the kernel can look up the right PCI driver (i.e., the pci_driver instance) based on the id_table vectors. This is called static probing.
Dynamic
This is a lookup based on IDs the user configures manually, a rare practice but one that is occasionally useful, for example when debugging. Dynamic refers to the system administrator’s ability to add an ID; it does not mean the ID can change on its own.
Since dynamic IDs are configured on a running system, they are useful only when the kernel is compiled with support for Hotplug.
Power Management and Wake-on-LAN
PCI power management events are processed by the suspend and resume functions of the pci_driver data structure. Besides taking care of the PCI state, by saving and restoring it, respectively, these functions need to take special steps in the case of NICs:
- suspend mainly stops the device egress queue so that no transmission will be allowed on the device.
- resume re-enables the egress queue so that the device is available again for transmissions.
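The ordering of these steps can be sketched in user space as follows. This is only a model under stated assumptions: fake_nic, fake_suspend, fake_resume, and fake_xmit are hypothetical names, a small array stands in for the PCI configuration state, and zeroing it simulates the state being lost while the device is powered down. A real driver would use the kernel's own helpers instead.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of a NIC; not a kernel structure. */
struct fake_nic {
    unsigned int config[16];        /* stands in for the PCI state */
    unsigned int saved_config[16];  /* copy taken at suspend time */
    bool tx_queue_stopped;          /* egress queue state */
};

/* suspend: stop the egress queue first, then save the PCI state. */
static int fake_suspend(struct fake_nic *nic)
{
    nic->tx_queue_stopped = true;
    memcpy(nic->saved_config, nic->config, sizeof(nic->config));
    /* simulate the device losing its state while powered down */
    memset(nic->config, 0, sizeof(nic->config));
    return 0;
}

/* resume: restore the PCI state first, then re-enable the egress queue. */
static int fake_resume(struct fake_nic *nic)
{
    memcpy(nic->config, nic->saved_config, sizeof(nic->config));
    nic->tx_queue_stopped = false;
    return 0;
}

/* A transmission attempt fails while the queue is stopped. */
static int fake_xmit(const struct fake_nic *nic)
{
    return nic->tx_queue_stopped ? -1 : 0;
}
```

The two functions are deliberately mirror images: the queue is stopped before the state is saved, and restarted only after the state has been restored, so no transmission can reach a device whose state is not valid.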
Wake-on-LAN (WOL) is a feature that allows an NIC to wake up a system that’s in standby mode when it receives a specific type of frame. WOL is normally disabled by default. The feature can be turned on and off with pci_enable_wake.
When the WOL feature was first introduced, only one kind of frame could wake up a system: “Magic Packets.” These special frames have two main characteristics:
- The destination MAC address belongs to the receiving NIC (whether the address is unicast, multicast, or broadcast).
- Somewhere (anywhere) in the frame a sequence of 48 bits is set (i.e., FF:FF:FF:FF:FF:FF) followed by the NIC MAC address repeated at least 16 times in a row.
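The second characteristic can be illustrated with a short user-space sketch that builds the pattern (six 0xFF bytes followed by 16 copies of the MAC address) and searches a frame for it. The function and macro names are hypothetical, and the check on the destination MAC address (the first characteristic) is intentionally left out.

```c
#include <stddef.h>
#include <string.h>

#define MAC_LEN       6
#define MAGIC_REPEATS 16
#define MAGIC_LEN     (MAC_LEN + MAC_LEN * MAGIC_REPEATS)   /* 102 bytes */

/* Fill buf (at least MAGIC_LEN bytes) with the Magic Packet pattern. */
static void build_magic_pattern(unsigned char *buf,
                                const unsigned char mac[MAC_LEN])
{
    int i;

    memset(buf, 0xFF, MAC_LEN);                 /* FF:FF:FF:FF:FF:FF */
    for (i = 0; i < MAGIC_REPEATS; i++)         /* MAC repeated 16 times */
        memcpy(buf + MAC_LEN + i * MAC_LEN, mac, MAC_LEN);
}

/* Return 1 if the pattern for this MAC appears anywhere in the frame. */
static int frame_has_magic_pattern(const unsigned char *frame, size_t len,
                                   const unsigned char mac[MAC_LEN])
{
    unsigned char pattern[MAGIC_LEN];
    size_t off;

    build_magic_pattern(pattern, mac);
    for (off = 0; off + MAGIC_LEN <= len; off++)
        if (memcmp(frame + off, pattern, MAGIC_LEN) == 0)
            return 1;
    return 0;
}
```

On real hardware this scan is performed by the NIC itself, since the host CPU is not running while the system is in standby.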
Now it is possible to allow other frame types to wake up the system, too. A handful of devices can enable or disable the WOL feature based on a parameter that can be set at module load time. The ethtool tool allows an administrator to configure what kind of frames can wake up the system. Whenever a WOL-enabled device recognizes a frame whose type is allowed to wake up the system, it generates a power management notification that does the job.
Let’s use the Intel PRO/100 Ethernet driver in drivers/net/e100.c to illustrate a driver registration:
- The first field (which corresponds to vendor in the structure’s definition) has the fixed value of PCI_VENDOR_ID_INTEL, which is initialized to the vendor ID assigned to Intel.
- The third and fourth fields (subvendor and subdevice) are often initialized to the wildcard value PCI_ANY_ID, because the first two fields (vendor and device) are sufficient to identify the devices.
- Many drivers use the macro __devinitdata on the table of devices to mark it as initialization data, although e100_id_table does not.
The module is initialized by e100_init_module, as specified by the module_init macro. When the function is executed by the kernel at boot time or at module loading time, it calls pci_module_init. This function registers the driver and, indirectly, all the associated NICs.
The Big Picture
When the system boots, it creates a sort of database that associates each bus to a list of detected devices that use the bus. For example, the descriptor for the PCI bus includes, among other parameters, a list of detected PCI devices. Each PCI device is uniquely identified by a large collection of fields in the structure pci_device_id, although only a few are usually necessary. We also saw how PCI device drivers define an instance of pci_driver and register with the PCI layer with pci_register_driver (or its alias, pci_module_init). By the time device drivers are loaded, the kernel has already built its database.
When device driver A is loaded, it registers with the PCI layer by calling pci_register_driver and providing its instance of pci_driver. The pci_driver structure includes a vector with the IDs of those PCI devices it can drive. The PCI layer then uses that table to see what devices match in its list of detected PCI devices. It thus creates the driver’s device list shown in (b). In addition, for each matching device, the PCI layer invokes the probe function provided by the matching driver in its pci_driver structure. The probe function creates and registers the associated network device. In this case, device Dev3 needs an additional device driver, called B. When driver B eventually registers with the kernel, Dev3 will be assigned to it. (c) shows the results of loading the driver.
When the driver is unloaded later, the module’s module_exit routine invokes pci_unregister_driver. The PCI layer then, thanks to its database, goes through all the devices associated with the driver and invokes the driver’s remove function. This function unregisters the network device.
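The whole sequence (device list built first, probe invoked on registration, remove invoked on unregistration) can be modeled in user space. The sketch below is a toy model, not the PCI layer: every fake_ and demo_ name is hypothetical, matching is reduced to vendor and device, and returning the number of bound devices from fake_register_driver is a convenience of this sketch rather than the kernel's behavior.

```c
#include <stddef.h>

struct fake_dev;

struct fake_id { unsigned int vendor, device; };

/* Reduced model of pci_driver: name, ID table, probe, and remove. */
struct fake_driver {
    const char *name;
    const struct fake_id *id_table;     /* ends with a {0, 0} entry */
    int  (*probe)(struct fake_dev *dev, const struct fake_id *id);
    void (*remove)(struct fake_dev *dev);
};

/* Reduced model of pci_dev. */
struct fake_dev {
    unsigned int vendor, device;
    struct fake_driver *driver;         /* NULL while no driver is bound */
    int netdev_registered;              /* set by probe, cleared by remove */
};

#define MAX_DEVS 8
static struct fake_dev *bus_devices[MAX_DEVS];  /* the bus "database" */
static size_t bus_ndevs;

/* Called at "boot time" for each detected device. */
static void fake_bus_add_device(struct fake_dev *dev)
{
    if (bus_ndevs < MAX_DEVS)
        bus_devices[bus_ndevs++] = dev;
}

static const struct fake_id *fake_match(const struct fake_driver *drv,
                                        const struct fake_dev *dev)
{
    const struct fake_id *id;

    for (id = drv->id_table; id->vendor; id++)
        if (id->vendor == dev->vendor && id->device == dev->device)
            return id;
    return NULL;
}

/* Probe every unbound device that matches the driver's table. */
static int fake_register_driver(struct fake_driver *drv)
{
    int bound = 0;
    size_t i;

    for (i = 0; i < bus_ndevs; i++) {
        struct fake_dev *dev = bus_devices[i];
        const struct fake_id *id;

        if (dev->driver)
            continue;
        id = fake_match(drv, dev);
        if (id && drv->probe(dev, id) == 0) {
            dev->driver = drv;
            bound++;
        }
    }
    return bound;
}

/* Invoke remove on every device bound to the driver, then unbind it. */
static void fake_unregister_driver(struct fake_driver *drv)
{
    size_t i;

    for (i = 0; i < bus_ndevs; i++) {
        if (bus_devices[i]->driver == drv) {
            drv->remove(bus_devices[i]);
            bus_devices[i]->driver = NULL;
        }
    }
}

/* Example callbacks: "register" and "unregister" a network device. */
static int demo_probe(struct fake_dev *dev, const struct fake_id *id)
{
    (void)id;
    dev->netdev_registered = 1;
    return 0;
}

static void demo_remove(struct fake_dev *dev)
{
    dev->netdev_registered = 0;
}
```

A device whose ID appears in no registered table, like Dev3 before driver B is loaded, simply stays on the bus list with no driver bound to it.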