Linux kernel hangs during initialization: Linux kernel interrupt initialization

Latest recommended article published on 2023-02-14 20:01:22

Daniel FC


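The interrupt array is stored in a read-only data section. Once the IDT has been initialized, is that memory still used for anything? Is it reclaimed?

The interrupt array is defined in the .init.rodata section, in entry_32.S:

    .section .init.rodata,"a"
    ENTRY(interrupt)
    .text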

Data in the .init sections is freed once init completes:

    /* Init code and data - will be freed after init */
    . = ALIGN(PAGE_SIZE);
    .init.begin : AT(ADDR(.init.begin) - LOAD_OFFSET) {
        __init_begin = .; /* paired with __init_end */
    }

    #if defined(CONFIG_X86_64) && defined(CONFIG_SMP)
    /*
     * percpu offsets are zero-based on SMP. PERCPU_VADDR() changes the
     * output PHDR, so the next output section - .init.text - should
     * start another segment - init.
     */
    PERCPU_VADDR(0, :percpu)
    #endif

    INIT_TEXT_SECTION(PAGE_SIZE)
    #ifdef CONFIG_X86_64
    :init
    #endif

    INIT_DATA_SECTION(16)

The INIT_DATA_SECTION macro is defined in include/asm-generic/vmlinux.lds.h:

    #define INIT_DATA_SECTION(initsetup_align) \
        .init.data : AT(ADDR(.init.data) - LOAD_OFFSET) { \
            INIT_DATA \
            INIT_SETUP(initsetup_align) \
            INIT_CALLS \
            CON_INITCALL \
            SECURITY_INITCALL \
            INIT_RAM_FS \
        }

INIT_DATA is defined in the same file, and it is what pulls in .init.rodata:

    /* init and exit section handling */
    #define INIT_DATA \
        *(.init.data) \
        DEV_DISCARD(init.data) \
        CPU_DISCARD(init.data) \
        MEM_DISCARD(init.data) \
        KERNEL_CTORS() \
        *(.init.rodata) \
        MCOUNT_REC() \
        DEV_DISCARD(init.rodata) \
        CPU_DISCARD(init.rodata) \
        MEM_DISCARD(init.rodata)

The call path that frees the init memory:

start_kernel() -> rest_init() -> new kernel thread: kernel_init() -> init_post() -> free_initmem();
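To make the effect of that last call concrete, here is a minimal user-space sketch of what freeing the init region amounts to. It is not the kernel's implementation: the `image` buffer, `init_begin`, `init_end` and `free_initmem_model` are hypothetical names that only stand in for the linker symbols `__init_begin` / `__init_end` and for `free_initmem()`, and the poison byte is an assumption of this model.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096u
#define POISON_FREE_INITMEM 0xcc  /* poison byte used by this model */

/* Hypothetical stand-in for the kernel image: two pages of init data
 * (where something like the interrupt array would live) followed by
 * one page that must survive boot. */
static unsigned char image[3 * PAGE_SIZE];
static unsigned char *const init_begin = image;                  /* models __init_begin */
static unsigned char *const init_end = image + 2 * PAGE_SIZE;    /* models __init_end */

/* Models free_initmem(): poison every page in [init_begin, init_end)
 * and report how many pages were handed back. */
static size_t free_initmem_model(void)
{
    size_t pages = 0;

    assert(init_begin <= init_end);
    for (unsigned char *p = init_begin; p < init_end; p += PAGE_SIZE) {
        memset(p, POISON_FREE_INITMEM, PAGE_SIZE);
        pages++;
    }
    return pages;
}
```

After `free_initmem_model()` runs, anything that still points into the first two pages reads poison, which is why data placed in .init.rodata must not be referenced once boot has finished.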

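softirq, tasklet, workqueue

Before running any softirq, do_softirq does the following:

it checks in_interrupt and returns immediately if that is true;

it disables interrupts;

it calls local_bh_disable, which increments preempt_count and disables softirqs on the local CPU;

This has the following effects:

on a given CPU, all deferrable functions run serially;

no process switch happens while softirqs are running;

a softirq handler must not sleep, because it runs in interrupt context with bottom halves disabled (whether interrupts themselves stay disabled while the handlers run is revisited further down);

Because do_softirq disables softirqs only on the local CPU, softirqs on other CPUs run normally. In addition, softirq_action holds nothing but a reentrant function, with no data structure that needs cross-CPU protection, so even softirqs of the same type can run on different CPUs at the same time. As stated above, though, all kinds of deferrable functions run serially on any one CPU;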

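tasklets are built on top of softirqs, so they share most of the properties above. The difference is that tasklet_struct carries data that needs cross-CPU protection, so tasklet_action checks a state flag before running each tasklet: if the tasklet is already running on another CPU, it is put back on the local CPU's tasklet_head list to be retried on the next pass.

This gives tasklets a property that plain softirqs do not have: a given tasklet can run on only one CPU at a time. Different tasklets, of course, can run on different CPUs simultaneously.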
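The re-queue behaviour described above can be sketched in a few lines. This is a single-threaded model, not kernel code: the struct names, `STATE_SCHED` / `STATE_RUN`, and the `*_model` functions are hypothetical, and the real kernel manipulates the state bits atomically, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { STATE_SCHED = 1u << 0, STATE_RUN = 1u << 1 };  /* model the tasklet state bits */

struct tasklet_model {
    struct tasklet_model *next;
    unsigned state;
    void (*func)(unsigned long);
    unsigned long data;
};

struct cpu_model {
    struct tasklet_model *head;  /* models this CPU's tasklet list */
    bool softirq_raised;         /* models the tasklet softirq being raised again */
};

static int calls;  /* bumped by the example handler below */
static void count_calls(unsigned long data) { calls += (int)data; }

/* Queue a tasklet on a CPU unless it is already scheduled. */
static void tasklet_schedule_model(struct cpu_model *cpu, struct tasklet_model *t)
{
    assert(t->func != NULL);
    if (t->state & STATE_SCHED)
        return;
    t->state |= STATE_SCHED;
    t->next = cpu->head;
    cpu->head = t;
    cpu->softirq_raised = true;
}

/* One pass over the CPU's list. Returns how many tasklets actually ran. */
static int tasklet_action_model(struct cpu_model *cpu)
{
    struct tasklet_model *list = cpu->head;
    int ran = 0;

    cpu->head = NULL;
    cpu->softirq_raised = false;
    while (list) {
        struct tasklet_model *t = list;
        list = list->next;

        if (!(t->state & STATE_RUN)) {   /* not running anywhere: take it */
            t->state |= STATE_RUN;
            t->state &= ~(unsigned)STATE_SCHED;
            t->func(t->data);
            t->state &= ~(unsigned)STATE_RUN;
            ran++;
            continue;
        }
        /* Running on another CPU: put it back and retry on a later pass. */
        t->next = cpu->head;
        cpu->head = t;
        cpu->softirq_raised = true;
    }
    return ran;
}
```

Setting `STATE_RUN` by hand before calling `tasklet_action_model` plays the role of "another CPU is executing this tasklet": the pass runs nothing, and the tasklet is back on the list with the softirq raised again.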

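A workqueue runs in process context. It makes no assumption about interrupt state when it executes, so it may sleep.

TODO: TSS… the concept of the TSS, and how the TSS is switched on an interrupt;

When an interrupt can occur:

① while a process is running, which is already well understood;

② while the system is handling an interrupt, when the handler has already disabled interrupts:

A. a maskable interrupt occurs;

B. a non-maskable interrupt occurs;

TSS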

The processor transfers execution to another task in one of four cases:

The current program, task, or procedure executes a JMP or CALL instruction to a TSS descriptor in the GDT.

The current program, task, or procedure executes a JMP or CALL instruction to a task-gate descriptor in the GDT or the current LDT.

An interrupt or exception vector points to a task-gate descriptor in the IDT.

The current task executes an IRET when the NT flag in the EFLAGS register is set.

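Note that not every jmp/call causes a task switch, and likewise not every interrupt/exception/iret does:

a jmp/call causes a task switch only when its operand is a TSS descriptor or a task gate;

an interrupt/exception causes one only when the corresponding IDT entry is a task gate. The Linux IDT contains a single task gate, which handles the double fault;

an iret switches back to the previous task only when the nested task (NT) flag is set.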
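The three rules above reduce to a small decision function. The following is only a sketch of that reasoning; the enum names and `causes_task_switch` are hypothetical and do not correspond to any kernel or hardware interface.

```c
#include <assert.h>
#include <stdbool.h>

enum transfer { XFER_JMP, XFER_CALL, XFER_INTERRUPT, XFER_EXCEPTION, XFER_IRET };

enum target {  /* what the selector or IDT entry refers to */
    TGT_CODE_SEGMENT, TGT_CALL_GATE, TGT_INTERRUPT_GATE, TGT_TRAP_GATE,
    TGT_TASK_GATE, TGT_TSS_DESCRIPTOR, TGT_NONE
};

/* Does this control transfer make the processor switch tasks? */
static bool causes_task_switch(enum transfer x, enum target t, bool nt_flag)
{
    switch (x) {
    case XFER_JMP:
    case XFER_CALL:
        /* Only a TSS descriptor or a task gate as operand switches tasks. */
        return t == TGT_TSS_DESCRIPTOR || t == TGT_TASK_GATE;
    case XFER_INTERRUPT:
    case XFER_EXCEPTION:
        /* Only an IDT entry that is a task gate switches tasks. */
        return t == TGT_TASK_GATE;
    case XFER_IRET:
        /* Only with the nested task flag set. */
        return nt_flag;
    }
    assert(!"unknown transfer kind");
    return false;
}
```

For example, `causes_task_switch(XFER_INTERRUPT, TGT_INTERRUPT_GATE, false)` is false, which is the ordinary Linux interrupt path, while `causes_task_switch(XFER_EXCEPTION, TGT_TASK_GATE, false)` is true, which is the double fault case.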

All of these methods for dispatching a task identify the task to be dispatched with a segment selector that points to a task gate or the TSS for the task. When dispatching a task with a CALL or JMP instruction, the selector in the instruction may select the TSS directly or a task gate that holds the selector for the TSS. When dispatching a task to handle an interrupt or exception, the IDT entry for the interrupt or exception must contain a task gate that holds the selector for the interrupt- or exception-handler TSS.

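The above is quoted from Intel Manual 3A, chapter 7.

TODO: TSS

Points of interest about the TSS:

Where is it stored? The GDT;

When is it used? On a task switch: jmp/call/exec|intr/iret;

How is it operated on?

How many TSSs are there?

If the TSS descriptors are stored in the GDT, where are the TSSs themselves located?

Because the GDT is per-CPU on an SMP system, the GDT layout shows that each CPU has one general-purpose TSS descriptor and one TSS descriptor dedicated to the double fault;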

FROM ULK3: 3.3.2. Task State Segment

The 80x86 architecture includes a specific segment type called the Task State Segment (TSS), to store hardware contexts.

Although Linux doesn't use hardware context switches, it is nonetheless forced to set up a TSS for each distinct CPU in the system.

This is done for two main reasons:

When an 80x86 CPU switches from User Mode to Kernel Mode, it fetches the address of the Kernel Mode stack from the TSS (see the sections "Hardware Handling of Interrupts and Exceptions" in Chapter 4 and "Issuing a System Call via the sysenter Instruction" in Chapter 10).

When a User Mode process attempts to access an I/O port by means of an in or out instruction, the CPU may need to access an I/O Permission Bitmap stored in the TSS to verify whether the process is allowed to address the port.

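Note what this says: Linux does not use hardware context switches!

It does not forbid them either (PS. I currently know of no way to disable task switching on an Intel CPU), so on a given CPU all tasks (kernel paths or user processes) share the same TSS (descriptor). Let's not split hairs over the double fault case :)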
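For reference, the two fields the ULK3 passage is about sit at fixed offsets in the 32-bit TSS. The layout below follows the 32-bit TSS format in the Intel manual; the struct name `tss32`, its field names, and the helper `set_kernel_stack` are assumptions made for this sketch and are not the kernel's own definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit Task State Segment, 104 bytes. */
struct tss32 {
    uint16_t prev_task_link, reserved0;
    uint32_t esp0;                /* stack pointer loaded on a switch to ring 0 */
    uint16_t ss0, reserved1;      /* stack segment loaded on a switch to ring 0 */
    uint32_t esp1;
    uint16_t ss1, reserved2;
    uint32_t esp2;
    uint16_t ss2, reserved3;
    uint32_t cr3, eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint16_t es, reserved4, cs, reserved5, ss, reserved6;
    uint16_t ds, reserved7, fs, reserved8, gs, reserved9;
    uint16_t ldt_selector, reserved10;
    uint16_t debug_trap;          /* bit 0 is the T flag */
    uint16_t io_map_base;         /* offset of the I/O permission bitmap */
};

_Static_assert(sizeof(struct tss32) == 104, "32-bit TSS is 104 bytes");
_Static_assert(offsetof(struct tss32, esp0) == 4, "esp0 at offset 4");
_Static_assert(offsetof(struct tss32, io_map_base) == 102, "I/O map base at offset 102");

/* Hypothetical helper: with one shared TSS per CPU, switching processes
 * only needs to repoint the ring-0 stack fields at the new kernel stack. */
static void set_kernel_stack(struct tss32 *tss, uint16_t ss0, uint32_t esp0)
{
    assert(tss != NULL);
    tss->ss0 = ss0;
    tss->esp0 = esp0;
}
```

This is the sense in which one TSS per CPU is enough: the CPU only reads `ss0:esp0` when entering Kernel Mode, so software can keep a single TSS and update those fields instead of switching TSSs.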

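Softirqs exist precisely to avoid keeping interrupts disabled for too long, so why would interrupts be disabled while they run?

Refer to: Intel Manual 3A: 6.8 ENABLING AND DISABLING INTERRUPTS

Disabling interrupts does not block non-maskable interrupts or exceptions, which is how nested interrupts arise.

When the IF flag is set, interrupts delivered to the INTR or through the local APIC pin are processed as normal external interrupts.

While interrupts are disabled, the interrupt is not acked and the INTR state is not cleared. Once the IF flag is set again, does the earlier INTR state still get handled?

To settle this, two things need to be understood:

how the CPU handles external interrupts through INTR;

how the APIC works (looking at the Linux APIC driver if necessary);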

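Softirqs do deferrable, time-consuming work, which of course cannot be done inside the interrupt handler itself.

In the do_softirq code, the first thing done before running the deferrable functions is to re-enable interrupts. Before that, however, bottom halves are disabled (local_bh_disable). So even if a hardware interrupt arrives, the return path neither preempts this code nor invokes the scheduler; execution simply resumes here.

The consequence is that code in softirq context keeps running until the restart limit is reached, at which point the daemon thread (ksoftirqd) is woken.

Going back to the do_softirq() code confirms this: interrupts are indeed enabled before the softirq_action handlers are actually run, so softirq does not keep interrupts disabled while executing the deferred functions.

The analysis above overlooks one effect: local_bh_disable makes softirqs run serially on the local CPU, because do_softirq checks in_interrupt at its very start.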
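The control flow discussed in these paragraphs can be modelled in user space as follows. This is a sketch under stated assumptions, not the kernel's do_softirq: `struct softirq_cpu`, `do_softirq_model`, `busy_handler` and the constants are hypothetical, the restart limit of 10 is an assumption of the model, and the caller is assumed to have interrupts enabled on entry.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SOFTIRQS_MODEL 4
#define MAX_RESTART_MODEL 10  /* assumed restart limit for this model */

struct softirq_cpu {
    unsigned pending;         /* one bit per softirq type */
    unsigned preempt_count;   /* nonzero models in_interrupt() */
    bool irqs_enabled;
    bool ksoftirqd_woken;
    void (*action[NR_SOFTIRQS_MODEL])(struct softirq_cpu *);
};

static int runs;  /* how many times the example handler ran */

/* Example handler that always re-raises itself, like a softirq under load. */
static void busy_handler(struct softirq_cpu *cpu)
{
    assert(cpu->irqs_enabled);  /* handlers run with interrupts enabled */
    runs++;
    cpu->pending |= 1u << 0;
}

static void do_softirq_model(struct softirq_cpu *cpu)
{
    int restarts = MAX_RESTART_MODEL;

    if (cpu->preempt_count)     /* already in interrupt context: bail out */
        return;

    cpu->irqs_enabled = false;  /* read the pending mask with interrupts off */
    cpu->preempt_count++;       /* models local_bh_disable */

    while (cpu->pending && restarts-- > 0) {
        unsigned pending = cpu->pending;

        cpu->pending = 0;
        cpu->irqs_enabled = true;   /* re-enable before running the handlers */
        for (int nr = 0; nr < NR_SOFTIRQS_MODEL; nr++)
            if ((pending & (1u << nr)) && cpu->action[nr])
                cpu->action[nr](cpu);
        cpu->irqs_enabled = false;
    }
    if (cpu->pending)           /* still busy after the limit: defer to the thread */
        cpu->ksoftirqd_woken = true;

    cpu->preempt_count--;       /* models local_bh_enable */
    cpu->irqs_enabled = true;   /* assumes the caller had interrupts enabled */
}
```

With `busy_handler` installed as `action[0]` and bit 0 pending, one call runs the handler for the allowed number of passes and then sets `ksoftirqd_woken`; a second call made with `preempt_count` already nonzero returns at once, which is the serialization effect the last paragraph points out.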

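I also have another, half-formed idea:

The reason an interrupt/exception handler must be as short as possible is that the irq line is acked, and its state cleared, only after the handler has finished, so that a new irq on that line can be recognized.

Interrupt state can be observed from two angles here:

the irq line, outside the CPU;

inside the CPU;

Inside the CPU, clearing the IF flag stops the CPU from responding to external interrupts, but external interrupts can still occur. Handled or not, the irq line state stays asserted, and it is seen as soon as the IF flag is set again. If, however, the irq line state is not cleared promptly after an interrupt occurs, a new interrupt of the same kind cannot be recognized, because the line state does not change.

The idea really is half-formed. The fuzzy part is when the irq line gets acked, which can be checked in do_IRQ: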

do_IRQ() -> handle_irq() -> e.g. handle_level_irq():

    void handle_level_irq(unsigned int irq, struct irq_desc *desc)
    {
        struct irqaction *action;
        irqreturn_t action_ret;

        raw_spin_lock(&desc->lock);
        mask_ack_irq(desc, irq);
        /* ... */
        action = desc->action;
        if (unlikely(!action || (desc->status & IRQ_DISABLED)))
            goto out_unlock;

        desc->status |= IRQ_INPROGRESS;
        raw_spin_unlock(&desc->lock);

        action_ret = handle_IRQ_event(irq, action);
        /* ... */
        if (!(desc->status & (IRQ_DISABLED | IRQ_ONESHOT)))
            unmask_irq(desc, irq);
    out_unlock:
        raw_spin_unlock(&desc->lock);
    }

So mask_ack_irq() is called at the very start of interrupt handling, and the interrupt is acked right there. But there is a mask as well: even though the interrupt has been acked, it is now masked. This disables that interrupt at the external APIC, so if the APIC sees it again it need not change the state of the irq line.

    static inline void mask_ack_irq(struct irq_desc *desc, int irq)
    {
        if (desc->chip->mask_ack)
            desc->chip->mask_ack(irq);
        else {
            desc->chip->mask(irq);
            if (desc->chip->ack)
                desc->chip->ack(irq);
        }
        desc->status |= IRQ_MASKED;
    }

from ULK3:

Each IRQ line can be selectively disabled. Thus, the PIC can be programmed to disable IRQs. That is, the PIC can be told to stop issuing interrupts that refer to a given IRQ line, or to resume issuing them. Disabled interrupts are not lost; the PIC sends them to the CPU as soon as they are enabled again. This feature is used by most interrupt handlers, because it allows them to process IRQs of the same type serially.

Selective enabling/disabling of IRQs is not the same as global masking/unmasking of maskable interrupts. When the IF flag of the eflags register is clear, each maskable interrupt issued by the PIC is temporarily ignored by the CPU.
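The difference between the two kinds of masking in this quote can be shown with a toy interrupt controller. This is a simplified sketch, not a model of any real PIC or APIC: `struct pic_model`, `struct cpu_flags` and the `pic_*` functions are hypothetical, and serving the lowest-numbered line first is just a choice made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct pic_model {
    uint8_t irr;  /* request bits: lines waiting to be serviced */
    uint8_t imr;  /* mask bits: 1 means the line is disabled */
};

struct cpu_flags { bool interrupt_flag; };  /* models IF in eflags */

/* A device asserts its line; the request is latched even if masked. */
static void pic_raise(struct pic_model *pic, unsigned line)
{
    assert(line < 8);
    pic->irr |= (uint8_t)(1u << line);
}

/* Selectively disable or enable one line. */
static void pic_set_mask(struct pic_model *pic, unsigned line, bool masked)
{
    assert(line < 8);
    if (masked)
        pic->imr |= (uint8_t)(1u << line);
    else
        pic->imr &= (uint8_t)~(1u << line);
}

/* Returns the line delivered to the CPU, or -1 if nothing can be delivered. */
static int pic_deliver(struct pic_model *pic, const struct cpu_flags *cpu)
{
    uint8_t ready = pic->irr & (uint8_t)~pic->imr;

    if (!cpu->interrupt_flag || !ready)
        return -1;  /* IF clear or everything masked: requests stay latched */
    for (unsigned line = 0; line < 8; line++) {
        if (ready & (1u << line)) {
            pic->irr &= (uint8_t)~(1u << line);  /* acknowledged */
            return (int)line;
        }
    }
    return -1;
}
```

Raising line 3 while it is masked, or while `interrupt_flag` is false, makes `pic_deliver` return -1, yet the request bit survives; once both obstacles are removed, the same call returns 3. In this model neither form of masking loses the interrupt, it only postpones delivery.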

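How can kernel code be placed above 1M while in real mode?

At boot, the second part of the kernel is placed starting at 0x100000, that is, above 1M.

How is this possible when the CPU is still in real mode at that point?

The answer is simple: the kernel is put there by the bootloader. From reading the u-boot code, u-boot enters protected mode while it loads the kernel image, and just before handing control to the Linux kernel it takes the CPU back to real mode.

OK, a real-mode CPU can address memory above 1M, but only just under 64k of it (with the A20 line enabled), which is clearly not enough to hold the second part of the kernel image.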
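The "just under 64k" figure follows directly from real-mode address arithmetic (segment shifted left by four bits, plus offset). A short sketch, with the hypothetical helper names `real_mode_linear` and `bytes_above_1mib`, and assuming the A20 line is enabled so the sum does not wrap:

```c
#include <assert.h>
#include <stdint.h>

#define ONE_MIB 0x100000u

/* Linear address formed by a real-mode segment:offset pair (A20 enabled). */
static uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Number of bytes at or above 1 MiB that real mode can reach. */
static uint32_t bytes_above_1mib(void)
{
    uint32_t top = real_mode_linear(0xFFFF, 0xFFFF);  /* 0xFFFF0 + 0xFFFF = 0x10FFEF */

    assert(top >= ONE_MIB);
    return top - ONE_MIB + 1;                          /* 0xFFF0 bytes */
}
```

The highest reachable address is 0x10FFEF, so real mode sees 0xFFF0 = 65520 bytes starting at 0x100000, which is 16 bytes short of 64 KiB.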

Addendum: the kernel's interrupt driver architecture:
