Original article: https://blog.csdn.net/21cnbao/article/details/8038279
Problems with the vanilla kernel
The Linux kernel cannot be preempted while it holds a spinlock or while it is in interrupt context, so the time from a high-priority task being woken up to it actually getting to run is not fully deterministic. The stock kernel also does nothing about priority inversion. The RT-Preempt patch is a set of patches applied on top of the community kernel that make Linux capable of meeting hard real-time requirements. This article describes trying the patch out on a PC. Our test environment is Ubuntu 10.10, initially running the kernel that ships with it:
- barry@barry-VirtualBox:/lib/modules$ uname -a
- 2.6.35-32-generic #67-Ubuntu SMP Mon Mar 5 19:35:26 UTC 2012 i686 GNU/Linux
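As an aside on what "handling priority inversion" would mean: user space has long offered priority-inheritance mutexes through the pthread API. The fragment below is a minimal sketch of our own (an illustration, not code from the patch) that creates such a mutex with PTHREAD_PRIO_INHERIT; the RT-Preempt patch applies the same priority-inheritance idea to in-kernel locks, as described later.
- #include <pthread.h>
- #include <stdio.h>
- /* Illustration only: a mutex using the priority-inheritance protocol.
-    When a low-priority thread holds it and a high-priority thread blocks
-    on it, the holder is temporarily boosted to the waiter's priority,
-    so it cannot be starved by medium-priority work. */
- static pthread_mutex_t pi_lock;
- int main(void)
- {
-     pthread_mutexattr_t attr;
-     pthread_mutexattr_init(&attr);
-     if (pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT) != 0 ||
-         pthread_mutex_init(&pi_lock, &attr) != 0) {
-         fprintf(stderr, "PI mutex not available\n");
-         return 1;
-     }
-     printf("priority-inheritance mutex created\n");
-     return 0;
- }
Build it with gcc -o pi_demo pi_demo.c -pthread.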
On Ubuntu 10.10, install the rt-tests suite with apt-get install rt-tests and run its cyclictest tool. With the options below (-p 80 sets the priority of the first thread, -t5 starts five threads, -n makes them use clock_nanosleep), it creates five realtime threads with the SCHED_FIFO policy, priorities 76-80 and wakeup intervals of 1000, 1500, 2000, 2500 and 3000 microseconds:
- barry@barry-VirtualBox:~/development/panda/android$ sudo cyclictest -p 80 -t5 -n
- [sudo] password for barry:
- policy: fifo: loadavg: 9.22 8.57 6.75 11/374 21385
- T: 0 (20606) P:80 I:1000 C: 18973 Min: 26 Act: 76 Avg: 428 Max: 12637
- T: 1 (20607) P:79 I:1500 C: 12648 Min: 31 Act: 68 Avg: 447 Max: 10320
- T: 2 (20608) P:78 I:2000 C: 9494 Min: 28 Act: 151 Avg: 383 Max: 9481
- T: 3 (20609) P:77 I:2500 C: 7589 Min: 29 Act: 889 Avg: 393 Max: 12670
- T: 4 (20610) P:76 I:3000 C: 6325 Min: 37 Act: 167 Avg: 553 Max: 13673
As the output shows, the wakeup jitter of rt threads on the stock kernel is very unstable: the minima are 26-37 µs, the averages lie in the 383-553 µs range, and the maxima range from 9481 to 13673 µs.
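What cyclictest reports is exactly this wakeup jitter: the gap between the moment a thread asked to be woken and the moment it actually got to run. The sketch below is a simplified illustration of such a measurement loop, written for this article rather than taken from cyclictest's sources; like the -n option it sleeps with clock_nanosleep() on CLOCK_MONOTONIC and then checks how late the wakeup was:
- #include <stdio.h>
- #include <time.h>
- #define NSEC_PER_SEC 1000000000LL
- /* Return t1 - t0 in nanoseconds. */
- static long long ts_diff(struct timespec t0, struct timespec t1)
- {
-     return (t1.tv_sec - t0.tv_sec) * NSEC_PER_SEC + (t1.tv_nsec - t0.tv_nsec);
- }
- int main(void)
- {
-     struct timespec next, now;
-     long interval = 1000000; /* 1000 us period, like thread T: 0 above */
-     int i;
-     clock_gettime(CLOCK_MONOTONIC, &next);
-     for (i = 0; i < 1000; i++) {
-         /* ask to be woken at an absolute point in time ... */
-         next.tv_nsec += interval;
-         while (next.tv_nsec >= NSEC_PER_SEC) {
-             next.tv_nsec -= NSEC_PER_SEC;
-             next.tv_sec++;
-         }
-         clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
-         /* ... and record how late we actually woke up: this is the
-            latency that cyclictest prints as Act/Avg/Max. */
-         clock_gettime(CLOCK_MONOTONIC, &now);
-         printf("latency: %lld ns\n", ts_diff(next, now));
-     }
-     return 0;
- }
Compile with gcc -o jitter jitter.c -lrt and run it as a SCHED_FIFO task (e.g. chrt -f 80 ./jitter) to mimic the measurement above.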
We run the same test again, but this time introduce extra load while it runs, e.g. mount /dev/sdb1 ~/development. The results become:
- barry@barry-VirtualBox:~$ sudo cyclictest -p 80 -t5 -n
- policy: fifo: loadavg: 0.14 0.29 0.13 2/308 1908
- T: 0 ( 1874) P:80 I:1000 C: 28521 Min: 0 Act: 440 Avg: 2095 Max: 331482
- T: 1 ( 1875) P:79 I:1500 C: 19014 Min: 2 Act: 988 Avg: 2099 Max: 330503
- T: 2 ( 1876) P:78 I:2000 C: 14261 Min: 7 Act: 534 Avg: 2096 Max: 329989
- T: 3 ( 1877) P:77 I:2500 C: 11409 Min: 4 Act: 554 Avg: 2073 Max: 328490
- T: 4 ( 1878) P:76 I:3000 C: 9507 Min: 12 Act: 100 Avg: 2081 Max: 328991
The irq, softirq and spinlock activity triggered by the mount clearly inflates the maximum jitter, which now reaches 331482 µs. This shows how unpredictable the time for an RT thread to start running is on a stock kernel (hard real time means this time must be predictable and bounded).
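The latency source behind spikes like this is any long stretch of non-preemptible kernel code. As a deliberately bad, purely illustrative example (a hypothetical module written for this article, not something from the mount path), the snippet below holds a spinlock with interrupts disabled for about 5 ms; on a stock kernel nothing else, not even a SCHED_FIFO thread, can run on that CPU until the section ends:
- #include <linux/module.h>
- #include <linux/init.h>
- #include <linux/spinlock.h>
- #include <linux/delay.h>
- static DEFINE_SPINLOCK(demo_lock);
- static int __init latency_demo_init(void)
- {
-     unsigned long flags;
-     /* Disable interrupts on this CPU and busy-wait for ~5 ms. Any
-        realtime thread woken during this window has to wait, which
-        shows up as a large Max value in cyclictest. */
-     spin_lock_irqsave(&demo_lock, flags);
-     mdelay(5);
-     spin_unlock_irqrestore(&demo_lock, flags);
-     return 0;
- }
- static void __exit latency_demo_exit(void) { }
- module_init(latency_demo_init);
- module_exit(latency_demo_exit);
- MODULE_LICENSE("GPL");
On an RT-Preempt kernel the same spinlock_t turns into a sleeping rtmutex and the section stays preemptible, which is exactly the change described in the next section.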
If instead we build a kernel with "Voluntary Kernel Preemption (Desktop)" selected, which is close to the old 2.4 situation of having no kernel preemption at all, and run the same case, the timing becomes so unpredictable as to be unacceptable:
- barry@barry-VirtualBox:~$ sudo /usr/local/bin/cyclictest -p 80 -t5 -n
- # /dev/cpu_dma_latency set to 0us
- policy: fifo: loadavg: 0.23 0.30 0.15 3/247 5086
- T: 0 ( 5082) P:80 I:1000 C: 5637 Min: 60 Act:15108679 Avg:11195196 Max:15108679
- T: 1 ( 5083) P:80 I:1500 C: 5723 Min: 48 Act:12364955 Avg:6389691 Max:12364955
- T: 2 ( 5084) P:80 I:2000 C: 4821 Min: 32 Act:11119979 Avg:8061814 Max:11661123
- T: 3 ( 5085) P:80 I:2500 C: 3909 Min: 27 Act:11176854 Avg:4563549 Max:11176854
- T: 4 ( 5086) P:80 I:3000 C: 3598 Min: 37 Act:9951432 Avg:8761137 Max:116026155
Enabling the RT-Preempt Patch
The main changes the RT-Preempt patch makes to the Linux kernel are:
- Making in-kernel locking primitives (using spinlocks) preemptible through reimplementation with rtmutexes.
- Critical sections protected by e.g. spinlock_t and rwlock_t are now preemptible. The creation of non-preemptible sections (in kernel) is still possible with raw_spinlock_t (same APIs as spinlock_t); see the sketch after this list.
- Implementing priority inheritance for in-kernel spinlocks and semaphores. For more information on priority inversion and priority inheritance please consult Introduction to Priority Inversion.
- Converting interrupt handlers into preemptible kernel threads: the RT-Preempt patch runs (soft) interrupt handlers in kernel thread context, represented by a task_struct just like an ordinary user-space process. However, it is also possible to register an IRQ so that it is still handled in kernel (hard interrupt) context.
- Converting the old Linux timer API into separate infrastructures for high resolution kernel timers plus one for timeouts, leading to userspace POSIX timers with high resolution.
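To make the first and third points concrete, here is a rough sketch (hypothetical handler and device name, our own illustration rather than code from the patch): raw_spinlock_t keeps a genuinely non-preemptible section even on an RT kernel, and IRQF_NO_THREAD registers a handler that stays in hard interrupt context instead of being moved into a thread:
- #include <linux/interrupt.h>
- #include <linux/spinlock.h>
- /* raw_spinlock_t remains a real spinning lock even with RT-Preempt,
-    so the section below is never preempted. A plain spinlock_t here
-    would be converted into a sleeping rtmutex. */
- static DEFINE_RAW_SPINLOCK(hw_lock);
- static irqreturn_t demo_isr(int irq, void *dev_id)
- {
-     unsigned long flags;
-     raw_spin_lock_irqsave(&hw_lock, flags);
-     /* touch the timing-critical hardware registers here */
-     raw_spin_unlock_irqrestore(&hw_lock, flags);
-     return IRQ_HANDLED;
- }
- static int demo_request(unsigned int irq, void *dev)
- {
-     /* IRQF_NO_THREAD keeps this handler in hard interrupt context
-        instead of the default threaded handling on RT kernels. */
-     return request_irq(irq, demo_isr, IRQF_NO_THREAD, "demo", dev);
- }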
In this experiment we use the RT-Preempt kernel tree at git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git, on its v3.4-rt-rebase branch, and select the "Fully Preemptible Kernel" preemption model when configuring the kernel:
───────────────── Preemption Model ─────────────────
    ( ) No Forced Preemption (Server)
    ( ) Voluntary Kernel Preemption (Desktop)
    ( ) Preemptible Kernel (Low-Latency Desktop)
    ( ) Preemptible Kernel (Basic RT)
    (X) Fully Preemptible Kernel (RT)
In addition, the kernel needs tickless operation and high-resolution timer support:
───────────────── Processor type and features ─────────────────
    [*] Tickless System (Dynamic Ticks)
    [*] High Resolution Timer Support
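Once high resolution timers are enabled, a quick sanity check from user space is to query the resolution of CLOCK_MONOTONIC. The small program below is our own illustration: with hrtimers it should report 1 ns rather than the 1/HZ tick period (e.g. 4000000 ns for HZ=250):
- #include <stdio.h>
- #include <time.h>
- int main(void)
- {
-     struct timespec res;
-     /* With CONFIG_HIGH_RES_TIMERS this prints 0 s 1 ns; without it,
-        the clock resolution is one scheduler tick. */
-     if (clock_getres(CLOCK_MONOTONIC, &res) == 0)
-         printf("CLOCK_MONOTONIC resolution: %ld s %ld ns\n",
-                (long)res.tv_sec, res.tv_nsec);
-     return 0;
- }
Compile with gcc -o clockres clockres.c -lrt.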
After make modules_install, make install and mkinitramfs we get an RT kernel that can be booted on Ubuntu. The detailed build procedure is described at http://www.linuxidc.com/Linux/2012-01/50749.htm; simply adjust the version numbers and similar details. The commands we ran are listed below.
Install the modules
- barry@barry-VirtualBox:~/development/linux-2.6$ sudo make modules_install
- ....
- INSTALL /lib/firmware/whiteheat_loader.fw
- INSTALL /lib/firmware/whiteheat.fw
- INSTALL /lib/firmware/keyspan_pda/keyspan_pda.fw
- INSTALL /lib/firmware/keyspan_pda/xircom_pgs.fw
- INSTALL /lib/firmware/cpia2/stv0672_vp4.bin
- INSTALL /lib/firmware/yam/1200.bin
- INSTALL /lib/firmware/yam/9600.bin
- DEPMOD 3.4.11-rt19
Install the kernel
- barry@barry-VirtualBox:~/development/linux-2.6$ sudo make install
- sh /home/barry/development/linux-2.6/arch/x86/boot/install.sh 3.4.11-rt19 arch/x86/boot/bzImage \
- System.map "/boot"
Create the initrd
barry@barry-VirtualBox:~/development/linux-2.6$ sudo mkinitramfs 3.4.11-rt19 -o /boot/initrd.img-3.4.11-rt19
Update the grub configuration
Add a new boot entry to grub.cfg: copy an existing menuentry and change all the relevant version numbers to 3.4.11-rt19. Our entry looks like this:
- menuentry 'Ubuntu, with Linux 3.4.11-rt19' --class ubuntu --class gnu-linux --class gnu --class os {
- recordfail
- insmod part_msdos
- insmod ext2
- set root='(hd0,msdos1)'
- search --no-floppy --fs-uuid --set a0db5cf0-6ce3-404f-9808-88ce18f0177a
- linux /boot/vmlinuz-3.4.11-rt19 root=UUID=a0db5cf0-6ce3-404f-9808-88ce18f0177a ro quiet splash
- initrd /boot/initrd.img-3.4.11-rt19
- }
At boot, select the 3.4.11-rt19 entry.
Trying out the RT-Preempt Patch
Running the same cyclictest benchmark now gives quite different results:
- barry@barry-VirtualBox:~$ sudo cyclictest -p 80 -t5 -n
- WARNING: Most functions require kernel 2.6
- policy: fifo: loadavg: 0.71 0.42 0.17 1/289 1926
- T: 0 ( 1921) P:80 I:1000 C: 7294 Min: 7 Act: 89 Avg: 197 Max: 3177
- T: 1 ( 1922) P:79 I:1500 C: 4863 Min: 10 Act: 85 Avg: 186 Max: 2681
- T: 2 ( 1923) P:78 I:2000 C: 3647 Min: 15 Act: 93 Avg: 160 Max: 2504
- T: 3 ( 1924) P:77 I:2500 C: 2918 Min: 23 Act: 67 Avg: 171 Max: 2114
- T: 4 ( 1925) P:76 I:3000 C: 2432 Min: 19 Act: 134 Avg: 339 Max: 3129
Again we repeat the test while introducing extra load, e.g. mount /dev/sdb1 ~/development; the results become:
- barry@barry-VirtualBox:~$ sudo cyclictest -p 80 -t5 -n
- # /dev/cpu_dma_latency set to 0us
- policy: fifo: loadavg: 0.11 0.12 0.13 1/263 2860
- T: 0 ( 2843) P:80 I:1000 C: 28135 Min: 5 Act: 198 Avg: 200 Max: 7387
- T: 1 ( 2844) P:80 I:1500 C: 18756 Min: 22 Act: 169 Avg: 188 Max: 6875
- T: 2 ( 2845) P:80 I:2000 C: 14067 Min: 7 Act: 91 Avg: 149 Max: 7288
- T: 3 ( 2846) P:80 I:2500 C: 11254 Min: 19 Act: 131 Avg: 155 Max: 6287
- T: 4 ( 2847) P:80 I:3000 C: 9378 Min: 25 Act: 58 Avg: 172 Max: 6121
The latencies now stay within a predictable range; nothing like the 331482 µs jitter of the stock kernel shows up. It has to be said that the jitter is still larger than we expected, reaching several milliseconds; we believe this is because the whole test runs inside a VirtualBox virtual machine. According to other published figures, this jitter should be on the order of a few tens of microseconds.
Running ps aux on this kernel shows the threaded irq handlers:
- USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
- root 1 0.8 0.1 2880 1788 ? Ss 18:39 0:03 init
- root 2 0.0 0.0 0 0 ? S 18:39 0:00 kthreadd
- ...
- root 45 0.0 0.0 0 0 ? S 18:39 0:00 irq/14-ata_piix
- root 46 0.0 0.0 0 0 ? S 18:39 0:00 irq/15-ata_piix
- root 50 0.0 0.0 0 0 ? S 18:39 0:00 irq/19-ehci_hcd
- root 51 0.0 0.0 0 0 ? S 18:39 0:00 irq/22-ohci_hcd
- root 55 0.0 0.0 0 0 ? S 18:39 0:00 irq/12-i8042
- root 56 0.0 0.0 0 0 ? S 18:39 0:00 irq/1-i8042
- root 57 0.0 0.0 0 0 ? S 18:39 0:00 irq/8-rtc0
- root 863 0.0 0.0 0 0 ? S 18:39 0:00 irq/19-eth0
- root 864 0.0 0.0 0 0 ? S 18:39 0:00 irq/16-eth1
- root 1002 0.5 0.0 0 0 ? S 18:39 0:01 irq/21-snd_inte
- ...
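Because each interrupt handler is now an ordinary kernel thread with its own PID, its scheduling policy and priority can be tuned like any other task (the chrt utility does the same job from the shell). The sketch below is a hypothetical helper written for this article; the PID of, say, irq/19-eth0 has to be looked up first, e.g. from the ps output above:
- #include <sys/types.h>
- #include <sched.h>
- #include <stdio.h>
- #include <stdlib.h>
- int main(int argc, char *argv[])
- {
-     struct sched_param param;
-     pid_t pid;
-     if (argc != 3) {
-         fprintf(stderr, "usage: %s <irq-thread-pid> <fifo-priority>\n", argv[0]);
-         return 1;
-     }
-     pid = atoi(argv[1]);
-     param.sched_priority = atoi(argv[2]);
-     /* Raise (or lower) the SCHED_FIFO priority of a threaded IRQ
-        handler, e.g. to favour the interrupt of a latency-critical
-        device over the others. Needs root. */
-     if (sched_setscheduler(pid, SCHED_FIFO, &param) == -1) {
-         perror("sched_setscheduler");
-         return 1;
-     }
-     return 0;
- }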
Writing an application with an RT thread on such a kernel typically involves the following steps:
- Setting a real time scheduling policy and priority.
- Locking memory so that page faults caused by virtual memory will not undermine deterministic behavior
- Pre-faulting the stack, so that a future stack fault will not undermine deterministic behavior
In the example test_rt.c below, mlockall() prevents the physical pages backing the process's virtual address space from being swapped out, while stack_prefault() deliberately grows the stack downwards by 8 KB in advance, so that subsequent function calls and local variables no longer cause the stack to grow (which would rely on page faults and memory allocation):
- #include <stdlib.h>
- #include <stdio.h>
- #include <time.h>
- #include <sched.h>
- #include <sys/mman.h>
- #include <string.h>
- #define MY_PRIORITY (49) /* we use 49 as PREEMPT_RT uses 50
-                             as the priority of kernel tasklets
-                             and interrupt handlers by default */
- #define MAX_SAFE_STACK (8*1024) /* The maximum stack size which is
-                                    guaranteed safe to access without
-                                    faulting */
- #define NSEC_PER_SEC (1000000000) /* The number of nsecs per sec. */
- void stack_prefault(void) {
-     unsigned char dummy[MAX_SAFE_STACK];
-     memset(dummy, 0, MAX_SAFE_STACK);
-     return;
- }
- int main(int argc, char* argv[])
- {
-     struct timespec t;
-     struct sched_param param;
-     int interval = 50000; /* 50 us, expressed in nanoseconds */
-     /* Declare ourself as a real time task */
-     param.sched_priority = MY_PRIORITY;
-     if (sched_setscheduler(0, SCHED_FIFO, &param) == -1) {
-         perror("sched_setscheduler failed");
-         exit(-1);
-     }
-     /* Lock memory */
-     if (mlockall(MCL_CURRENT|MCL_FUTURE) == -1) {
-         perror("mlockall failed");
-         exit(-2);
-     }
-     /* Pre-fault our stack */
-     stack_prefault();
-     clock_gettime(CLOCK_MONOTONIC, &t);
-     /* start after one second */
-     t.tv_sec++;
-     while (1) {
-         /* wait until next shot */
-         clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &t, NULL);
-         /* do the stuff */
-         /* calculate next shot */
-         t.tv_nsec += interval;
-         while (t.tv_nsec >= NSEC_PER_SEC) {
-             t.tv_nsec -= NSEC_PER_SEC;
-             t.tv_sec++;
-         }
-     }
- }
Compile it with gcc -o test_rt test_rt.c -lrt and run it with root privileges so that setting the SCHED_FIFO policy succeeds. That is all for this post; a series of follow-up posts will describe the main changes RT-Preempt makes to the kernel and how it works.