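1. Background
During an experiment with pinctrl, I noticed that on some platforms the pin controller driver starts very late, while the pinctrl client drivers start very early. The driver model applies the pin multiplexing settings before it calls a driver's probe, so one would expect a hard ordering dependency: the pin controller must come up first, otherwise the clients cannot use pin multiplexing. In practice that is not what happens. Even when the pin controller driver starts late, the clients still get working pin multiplexing from the pinctrl subsystem. What makes this possible is the deferred probe mechanism.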
2. The commit
I found the commit that introduced probe deferral on GitHub. The original commit is linked below; later commits fixed bugs on top of it:
drivercore: Add driver probe deferral mechanism · torvalds/linux@d1c3414 · GitHub
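Its commit message explains which problem the mechanism solves: ordering dependencies between drivers. With the mechanism in place, drivers no longer have to be probed in dependency order.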
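3. How probe deferral works
The mechanism is implemented mainly in drivers/base/dd.c. It also adds a list node, deferred_probe, through which a device whose probe was deferred is linked into a list of pending devices. In the 4.1 kernel used here the node sits in struct device_private and is reached as dev->p->deferred_probe.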
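To make the role of that list node concrete, here is a minimal userspace sketch in C. It is not the kernel code: toy_device, pending_list and the toy_* helpers are names made up for illustration, and the small list implementation only imitates the kernel's struct list_head. The point is that the node is embedded in the device itself, so queuing a device allocates nothing, and a device that is already queued is not queued twice.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Recover the enclosing structure from a pointer to its embedded node. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

int list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

void list_add_to_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

void list_del_and_init(struct list_head *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	init_list_head(node);
}

/* Hypothetical stand-in for a device: only the fields this sketch needs. */
struct toy_device {
	const char *name;
	struct list_head deferred_probe; /* links the device into pending_list */
};

/* Devices waiting for another probe attempt, oldest first. */
static struct list_head pending_list = { &pending_list, &pending_list };

void toy_device_init(struct toy_device *dev, const char *name)
{
	dev->name = name;
	init_list_head(&dev->deferred_probe);
}

/* Queue a device for a later retry; a device already queued is left alone. */
void toy_deferred_probe_add(struct toy_device *dev)
{
	assert(dev != NULL);
	if (list_is_empty(&dev->deferred_probe))
		list_add_to_tail(&dev->deferred_probe, &pending_list);
}

/* Take the oldest pending device off the list, or return NULL if none. */
struct toy_device *toy_deferred_probe_pop(void)
{
	struct list_head *first = pending_list.next;

	if (first == &pending_list)
		return NULL;
	list_del_and_init(first);
	return container_of(first, struct toy_device, deferred_probe);
}
```

Calling toy_deferred_probe_add() for devices a, b and then a again leaves two entries on the list, and toy_deferred_probe_pop() hands them back in the order a, b.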
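In brief, the mechanism works as follows:
1). In really_probe(), which runs after a dev and a drv have been matched, a driver that cannot probe its device yet for some reason returns -EPROBE_DEFER from probe to say that it needs to be probed again later.
2). driver_deferred_probe_add(dev) is then called to put the device on the deferred probe list.
3). The deferred_probe_pending_list list is processed at two points in time.
The first is in driver_bound(), after some dev and drv have been probed successfully.
During boot, however, driver_deferred_probe_trigger() usually has no effect at this point, because driver_deferred_probe_enable is still false.
The second is late_initcall(deferred_probe_initcall).
This is where the kernel actually creates deferwq, the workqueue that runs deferred probes, and actually processes the nodes hanging on deferred_probe_pending_list.
This happens very late (late_initcall). As the comment in the code puts it: "this initcall makes sure that deferred probing is delayed until late_initcall time".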
deferred_probe_initcall()
->driver_deferred_probe_trigger()
->deferred_probe_work_func()
->bus_probe_device()
4). After late_initcall
In scenarios such as loading a .ko module, deferred probes are triggered from driver_bound(). By then driver_deferred_probe_enable is true and the workqueue has been created, so driver_deferred_probe_trigger() takes effect.
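The four steps above can be sketched as a small userspace model in C. This is a simplified illustration under several assumptions, not the code in drivers/base/dd.c: all toy_* names, supplier_probe and consumer_probe are hypothetical, the pending list is a fixed-size array, and the retries run by direct recursion, whereas the kernel queues them on a workqueue. Only the value 517 is taken from the kernel's EPROBE_DEFER definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_EPROBE_DEFER 517 /* same value as the kernel's EPROBE_DEFER */
#define TOY_MAX_DEVICES 8

struct toy_device {
	const char *name;
	int (*probe)(struct toy_device *dev);
	bool bound;
	int probe_calls;
};

/* Devices waiting for a retry, oldest first. */
static struct toy_device *pending[TOY_MAX_DEVICES];
static size_t pending_count;

/* Models driver_deferred_probe_enable: false until the late initcall. */
static bool deferred_probe_enabled;

void toy_trigger(void);

/* Models really_probe(): bind on success, queue the device on a deferral. */
int toy_really_probe(struct toy_device *dev)
{
	int ret;

	dev->probe_calls++;
	ret = dev->probe(dev);
	if (ret == -TOY_EPROBE_DEFER) {
		assert(pending_count < TOY_MAX_DEVICES);
		pending[pending_count++] = dev;
		return ret;
	}
	if (ret == 0) {
		dev->bound = true;
		toy_trigger(); /* models the trigger call in driver_bound() */
	}
	return ret;
}

/* Models the trigger plus the work function: retry everything pending. */
void toy_trigger(void)
{
	struct toy_device *batch[TOY_MAX_DEVICES];
	size_t i, n;

	if (!deferred_probe_enabled)
		return; /* before the late initcall the trigger does nothing */

	/* Move the pending entries to a private batch, then retry each one. */
	n = pending_count;
	for (i = 0; i < n; i++)
		batch[i] = pending[i];
	pending_count = 0;
	for (i = 0; i < n; i++)
		toy_really_probe(batch[i]);
}

/* Models deferred_probe_initcall(): enable the mechanism and run it once. */
void toy_late_initcall(void)
{
	deferred_probe_enabled = true;
	toy_trigger();
}

/* Example probes: the consumer cannot probe until the supplier has. */
static bool supplier_ready;

int supplier_probe(struct toy_device *dev)
{
	(void)dev;
	supplier_ready = true;
	return 0;
}

int consumer_probe(struct toy_device *dev)
{
	(void)dev;
	return supplier_ready ? 0 : -TOY_EPROBE_DEFER;
}
```

Probing the consumer first queues it. A toy_trigger() call before toy_late_initcall() changes nothing, toy_late_initcall() retries the consumer and it is deferred again, and only the successful supplier probe finally causes the retry that binds the consumer.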
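4. Experiment
To exercise the deferred probe mechanism described above, I wrote two drivers, driver1.c and driver2.c, and made one depend on the other: driver2's probe succeeds only after driver1's probe has run.
The code for driver1.c: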
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/platform_device.h>

/* Defined and exported by driver2.c, so driver2.ko must be loaded first. */
extern int g_val;

static int driver1_dummy_probe(struct platform_device *dev)
{
	printk("[LJW]driver1_dummy_probe=====>\n");
	/* Tell driver2 that driver1 has been probed. */
	g_val = 1;
	printk("[LJW]driver1_dummy_probe<=====\n");
	return 0;
}

static int driver1_dummy_remove(struct platform_device *dev)
{
	printk("[LJW]driver1_dummy_remove=====>\n");
	printk("[LJW]driver1_dummy_remove<=====\n");
	return 0;
}

/* The device and the driver are matched by name on the platform bus. */
static struct platform_device driver1_dummy_device = {
	.name = "driver1_compatible",
};

static struct platform_driver driver1_dummy_driver = {
	.probe = driver1_dummy_probe,
	.remove = driver1_dummy_remove,
	.driver = {
		.name = "driver1_compatible",
	},
};

static int __init driver1_init(void)
{
	int ret;

	printk("[LJW]driver1_init=====>\n");
	ret = platform_device_register(&driver1_dummy_device);
	if (ret < 0) {
		printk("[FAILED]platform_device_register failed for driver1_dummy_device!\n");
		return ret;
	}
	printk("[SUCCESS]platform_device_register for driver1_dummy_device\n");

	ret = platform_driver_register(&driver1_dummy_driver);
	if (ret < 0) {
		printk("[FAILED]platform_driver_register failed for driver1_dummy_driver!\n");
		platform_device_unregister(&driver1_dummy_device);
		return ret;
	}
	printk("[SUCCESS]platform_driver_register for driver1_dummy_driver\n");
	return 0;
}

static void __exit driver1_exit(void)
{
	platform_driver_unregister(&driver1_dummy_driver);
	platform_device_unregister(&driver1_dummy_device);
}

module_init(driver1_init);
module_exit(driver1_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("liaojunwu");
The code for driver2.c:
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/platform_device.h>

/* Set to 1 by driver1's probe; exported so that driver1.ko can reach it. */
int g_val = 0;
EXPORT_SYMBOL(g_val);

static int driver2_dummy_probe(struct platform_device *dev)
{
	printk("[LJW]driver2_dummy_probe=====>\n");
	/* g_val stays 0 until driver1's probe has run. */
	if (g_val == 0) {
		printk("[LJW]driver2 probe failed, return -EPROBE_DEFER!\n");
		return -EPROBE_DEFER;
	}
	printk("[LJW]driver2_dummy_probe<=====\n");
	return 0;
}

static int driver2_dummy_remove(struct platform_device *dev)
{
	printk("[LJW]driver2_dummy_remove=====>\n");
	printk("[LJW]driver2_dummy_remove<=====\n");
	return 0;
}

/* The device and the driver are matched by name on the platform bus. */
static struct platform_device driver2_dummy_device = {
	.name = "driver2_compatible",
};

static struct platform_driver driver2_dummy_driver = {
	.probe = driver2_dummy_probe,
	.remove = driver2_dummy_remove,
	.driver = {
		.name = "driver2_compatible",
	},
};

static int __init driver2_init(void)
{
	int ret;

	printk("[LJW]driver2_init=====>\n");
	ret = platform_device_register(&driver2_dummy_device);
	if (ret < 0) {
		printk("[FAILED]platform_device_register failed for driver2_dummy_device!\n");
		return ret;
	}
	printk("[SUCCESS]platform_device_register for driver2_dummy_device\n");

	ret = platform_driver_register(&driver2_dummy_driver);
	if (ret < 0) {
		printk("[FAILED]platform_driver_register failed for driver2_dummy_driver!\n");
		platform_device_unregister(&driver2_dummy_device);
		return ret;
	}
	printk("[SUCCESS]platform_driver_register for driver2_dummy_driver\n");
	return 0;
}

static void __exit driver2_exit(void)
{
	platform_driver_unregister(&driver2_dummy_driver);
	platform_device_unregister(&driver2_dummy_device);
}

module_init(driver2_init);
module_exit(driver2_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("liaojunwu");
Makefile:
.PHONY: build clean

# Use bash, because the default /bin/sh does not support the source command
SHELL := /bin/bash
KERNELDIR := /home/liaojunwu/linux/code/sdk_ori_code/sdk_ori
CURRENT_PATH := $(shell pwd)

obj-m := driver1.o
obj-m += driver2.o

build: pre_build kernel_modules

pre_build:
	source /opt/fsl-imx-x11/4.1.15-2.1.0/environment-setup-cortexa7hf-neon-poky-linux-gnueabi

kernel_modules:
	$(MAKE) -C $(KERNELDIR) M=$(CURRENT_PATH) modules

clean:
	$(MAKE) -C $(KERNELDIR) M=$(CURRENT_PATH) clean
	rm -rf *.mod.c *.o *.ko *.order *.symvers
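Experiment results:
The results show the whole deferred probe process clearly. Although driver2.ko is loaded first, driver2's probe does not succeed until driver1's probe has finished: the first attempt returns -EPROBE_DEFER, and the deferred probe mechanism retries it at the right time, which guarantees the correct order. The results also show that the retried driver2 probe actually runs in the workqueue, so at that point it executes concurrently with some of driver1's code (the original commit had some bugs in how it handled this concurrency, which later commits fixed; the history on GitHub has the details). Finally, once driver2's probe succeeds, driver_bound() triggers deferred probing one more time.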