Measuring kernel latencies to ensure real-time constraints

Device drivers in the kernel often need to perform tasks in response to events. There is not one but many ways to do this: deferred execution mechanisms such as the Linux workqueue, the tasklet, the kernel thread, and so on. Different methods have different scheduling priorities and thus different response latencies. They also differ in their execution context (e.g., process vs. interrupt context), which affects which method is more suitable for a specific purpose.

In this article, I describe what I found when experimenting with the different methods, in terms of their response latencies, and how system load and user-space task priorities affect them:

  1. workqueue
  2. tasklet
  3. kernel thread

The latency here means the time between a task being invoked and it being executed. It depends on the Linux scheduler latency, the deferred execution method (workqueue vs. tasklet vs. kthread), and the priorities of competing tasks. The first item, the scheduler latency, is the time between a service being requested and the scheduler being executed. This was a significant issue for early Linux kernels because the kernel was not preemptive, so the scheduler might not run for a fairly long time after an event was raised. In recent kernels, the scheduler latency has been greatly reduced thanks to kernel preemption. The caveat, however, is that some synchronization techniques, such as spinlocks, can still prevent preemption from happening, and thus can still slow down kernel response in some conditions.

The latency measured here is not the scheduler latency, but the time for the scheduled task to be executed. This also depends on the scheduling algorithm and on the priority of the task and of competing tasks. A detailed discussion of the Linux scheduler can be found here (http://oreilly.com/catalog/linuxkernel/chapter/ch10.html). For now, it suffices to say that the scheduler does three things in order:

  1. execute deferred tasks in the task queue
  2. execute the bottom half (deferred tasklets and soft irqs)
  3. find a process to run based on its scheduling policy (SCHED_FIFO, SCHED_RR, or SCHED_OTHER) and its priority.

From this, we can see that the tasklet has the highest priority: it runs even before the scheduler looks at the priorities of any kernel task. Here, a kernel task means an execution unit with a kernel struct task_struct data structure. This includes any userspace process, any POSIX thread (implemented by the Native POSIX Thread Library, NPTL), and any kernel workqueue worker (whether serving the kernel global queue or a queue created by a module). Naturally, any of the latter group will have a higher latency than a tasklet. We will see below that the tasklet is indeed the one with the lowest latencies in all conditions, especially when the system load is high. That is not to say we should always use tasklets, however: because a tasklet runs in interrupt context, it cannot be used for operations that may sleep (e.g., some memory allocations and I/O), and it may itself prevent kernel preemption and increase kernel latency.

To choose a method wisely, we can measure runtime performance to understand how quick each method is, and how scheduling policies and priorities change the latencies. Here, I describe data I got from some tests. The tests are done with a kernel module that implements three execution methods: a workqueue (without delay), a tasklet (without delay), and waking up an existing kthread. Each method is tested N times (N = 10,000), and the average and maximum latencies are recorded. The background system load is 10 real-time user-space threads (policy SCHED_RR, priority 1). The test thread runs with policy SCHED_RR, priority 20; its priority is set higher than the background threads to avoid starvation.

System without high-priority load

Latency   Workqueue (global)   Workqueue (private)   Tasklet     Kthread     Userspace
Avg       6 us                 6 us                  5 us        5 us        8 us
Stdev     1.414 us             1.000 us              2.646 us    2.646 us    3.606 us
Max       21 us                19 us                 135 us      30 us       12 us

System with high-priority load

Latency   Workqueue (global)   Workqueue (private)   Tasklet     Kthread     Userspace
Avg       101 us               101 us                6 us        195 us      49693 us
Stdev     222.948 us           242.535 us            99.895 us   333.742 us  43963.480 us
Max       950022 us            950070 us             9992 us     950160 us   49887 us

It can be seen that when userspace carries a high-priority load, kernel performance is affected as well as userspace performance. The latencies of kernel tasks (as opposed to tasklets) increase from the microsecond level to almost a second. The good news, however, is that kernel tasks are never blocked outright by userspace load, no matter how high its priority. This is not the case for userspace threads: if the real-time test thread has lower priority than the real-time background threads, the test thread never gets enough time slices to execute.

Conclusion
The conclusion from these results is that kernel tasks (threads) have best-case latencies at the microsecond level and worst-case latencies of around 1 second. The worst case for tasklets is lower, at the millisecond level. Kernel tasks and threads are not starved for longer than a few seconds by userspace workload, while userspace threads may be.

Download
Source code for the kernel module and the test tool can be downloaded from GitHub: http://github.com/dankex/tools/tree/master/linux-kernel/wake_latency/
