Linux with GICv3

IRQ subsystem in the Linux kernel

.arch/arm64/kernel/head.S
__primary_switched:
	...
	adr_l   x8, vectors         // load VBAR_EL1 with virtual
	msr vbar_el1, x8            // vector table address
	...
.arch/arm64/kernel/entry.S
    .macro kernel_ventry, el, label, regsize = 64
    ...
    b   el\()\el\()_\label
    .endm
    
    .align  11
ENTRY(vectors)
    kernel_ventry   1, sync_invalid         // Synchronous EL1t
    kernel_ventry   1, irq_invalid          // IRQ EL1t
    kernel_ventry   1, fiq_invalid          // FIQ EL1t
    kernel_ventry   1, error_invalid        // Error EL1t

    kernel_ventry   1, sync             // Synchronous EL1h
    kernel_ventry   1, irq              // IRQ EL1h
    kernel_ventry   1, fiq_invalid          // FIQ EL1h
    kernel_ventry   1, error_invalid        // Error EL1h

    kernel_ventry   0, sync             // Synchronous 64-bit EL0
    kernel_ventry   0, irq              // IRQ 64-bit EL0
    kernel_ventry   0, fiq_invalid          // FIQ 64-bit EL0
    kernel_ventry   0, error_invalid        // Error 64-bit EL0

#ifdef CONFIG_COMPAT
    kernel_ventry   0, sync_compat, 32      // Synchronous 32-bit EL0
    kernel_ventry   0, irq_compat, 32       // IRQ 32-bit EL0
    kernel_ventry   0, fiq_invalid_compat, 32   // FIQ 32-bit EL0
    kernel_ventry   0, error_invalid_compat, 32 // Error 32-bit EL0
#else
    kernel_ventry   0, sync_invalid, 32     // Synchronous 32-bit EL0
    kernel_ventry   0, irq_invalid, 32      // IRQ 32-bit EL0
    kernel_ventry   0, fiq_invalid, 32      // FIQ 32-bit EL0
    kernel_ventry   0, error_invalid, 32        // Error 32-bit EL0
#endif
END(vectors)

el0_irq:
    kernel_entry 0
    ...
    ct_user_exit
    irq_handler
    ...
    b   ret_to_user
ENDPROC(el0_irq)

    .macro  irq_handler
    ldr_l   x1, handle_arch_irq
    mov x0, sp
    irq_stack_entry
    blr x1
    irq_stack_exit
    .endm

ret_to_user:
	...
    kernel_exit 0
ENDPROC(ret_to_user)

	.macro  kernel_exit, el	
	...
	eret

.arch/arm64/kernel/irq.c
void __init set_handle_irq(void (*handle_irq)(struct pt_regs *))
{
	...
    handle_arch_irq = handle_irq;
}

.drivers/irqchip/irq-gic-v3.c
static int __init gic_init_bases(...)
{
	...
	set_handle_irq(gic_handle_irq);
	...
}

static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
{
     u32 irqnr;

    irqnr = gic_read_iar();

    if (likely(irqnr > 15 && irqnr < 1020) || irqnr >= 8192) {
        int err; 

        if (static_branch_likely(&supports_deactivate_key))
            gic_write_eoir(irqnr);
        else
            isb();

        err = handle_domain_irq(gic_data.domain, irqnr, regs);
        if (err) {
            WARN_ONCE(true, "Unexpected interrupt received!\n");
            if (static_branch_likely(&supports_deactivate_key)) {
                if (irqnr < 8192)
                    gic_write_dir(irqnr);
            } else {
                gic_write_eoir(irqnr);
            }
        }
        return;
    }    
    if (irqnr < 16) {
        gic_write_eoir(irqnr);
        if (static_branch_likely(&supports_deactivate_key))
            gic_write_dir(irqnr);
#ifdef CONFIG_SMP
        /*
         * Unlike GICv2, we don't need an smp_rmb() here.
         * The control dependency from gic_read_iar to
         * the ISB in gic_write_eoir is enough to ensure
         * that any shared data read by handle_IPI will
         * be read after the ACK.
         */
        handle_IPI(irqnr, regs);
#else
        WARN_ONCE(true, "Unexpected SGI received!\n");
#endif
    }
}
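
The excerpts above form a chain: the vector table branches to el0_irq, the irq_handler macro calls through the handle_arch_irq pointer, and the GICv3 driver installs gic_handle_irq into that pointer at init time. A minimal userspace sketch of that registration-and-dispatch pattern follows. All demo_* names are hypothetical stand-ins, and refusing a second registration is an assumption of this sketch, since the kernel excerpt elides that part of set_handle_irq().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct pt_regs. */
struct demo_pt_regs {
    unsigned long pc;
};

/* Plays the role of handle_arch_irq: a single global root IRQ handler. */
static void (*demo_handle_arch_irq)(struct demo_pt_regs *) = NULL;

/* Counts how often the demo controller handler has run. */
static int demo_irq_count;

/* Plays the role of set_handle_irq(): the irqchip driver installs its handler. */
static inline int demo_set_handle_irq(void (*handler)(struct demo_pt_regs *))
{
    if (demo_handle_arch_irq != NULL)
        return -1;                  /* assumption: first registration wins */
    demo_handle_arch_irq = handler;
    return 0;
}

/* Plays the role of gic_handle_irq(); a real handler would read the IAR here. */
static inline void demo_gic_handle_irq(struct demo_pt_regs *regs)
{
    (void)regs;
    demo_irq_count++;
}

/* Plays the role of the irq_handler macro: an indirect call through the pointer. */
static inline void demo_irq_entry(struct demo_pt_regs *regs)
{
    assert(demo_handle_arch_irq != NULL);
    demo_handle_arch_irq(regs);
}
```

Calling demo_set_handle_irq(demo_gic_handle_irq) once and then demo_irq_entry() mimics one interrupt travelling from the exception vector to the controller driver.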

[Figure: Linux interrupt handling procedure]

Brief history of the GIC architecture

[Figure: brief history of the GIC architecture]

GICv3 fundamentals

Interrupt types

  • SPI (Shared Peripheral Interrupt)
    This is a global peripheral interrupt that can be routed to a specified PE, or to one of a group of PEs.
  • PPI (Private Peripheral Interrupt)
    This is a peripheral interrupt that targets a single, specific PE.
    An example of a PPI is an interrupt from the Generic Timer of a PE.
  • SGI (Software Generated Interrupt)
    SGIs are typically used for inter-processor communication, and are generated by a write to an SGI register in the GIC.
  • LPI (Locality-specific Peripheral Interrupt)
    LPIs are new in GICv3 and are always message-based interrupts.

Interrupt Identifiers

[Figure: INTID ranges for each interrupt type]
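
The range checks in gic_handle_irq() (irqnr < 16, irqnr > 15 && irqnr < 1020, irqnr >= 8192) follow the INTID allocation. A small sketch of that mapping, assuming the base GICv3 ranges only (the extended PPI/SPI ranges of later revisions are ignored); the enum and function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Interrupt classes by INTID, for base GICv3 (assumption: no extended ranges). */
enum intid_class {
    INTID_SGI,       /* 0..15      */
    INTID_PPI,       /* 16..31     */
    INTID_SPI,       /* 32..1019   */
    INTID_SPECIAL,   /* 1020..1023 */
    INTID_RESERVED,  /* 1024..8191 */
    INTID_LPI        /* 8192 and above */
};

/* Map an INTID, as returned by an IAR read, to its interrupt class. */
static inline enum intid_class classify_intid(uint32_t intid)
{
    if (intid <= 15)
        return INTID_SGI;
    if (intid <= 31)
        return INTID_PPI;
    if (intid <= 1019)
        return INTID_SPI;
    if (intid <= 1023)
        return INTID_SPECIAL;
    if (intid <= 8191)
        return INTID_RESERVED;
    return INTID_LPI;
}

/* Usage example: INTID 27 is a PPI, INTID 8192 is the first LPI. */
static inline void classify_intid_example(void)
{
    assert(classify_intid(27) == INTID_PPI);
    assert(classify_intid(8192) == INTID_LPI);
}
```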

Interrupt state machine

  • Inactive
    The interrupt source is not currently asserted.
  • Pending
    The interrupt source has been asserted, but the interrupt has not yet been acknowledged by a PE.
  • Active
    The interrupt source has been asserted, and the interrupt has been acknowledged by a PE.
  • Active and Pending
    An instance of the interrupt has been acknowledged, and another instance is now pending.
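
The four states can be written down as a small transition function. This is a sketch for an edge-triggered interrupt only; a level-sensitive interrupt also depends on whether the line is still asserted, which is not modelled here. The enum and function names are hypothetical.

```c
#include <assert.h>

enum irq_state { IRQ_INACTIVE, IRQ_PENDING, IRQ_ACTIVE, IRQ_ACTIVE_PENDING };
enum irq_event { EV_ASSERT, EV_ACKNOWLEDGE, EV_DEACTIVATE };

/* Next state of one edge-triggered interrupt; irrelevant events leave the state unchanged. */
static inline enum irq_state irq_next_state(enum irq_state s, enum irq_event e)
{
    switch (e) {
    case EV_ASSERT:                 /* the source signals a new instance */
        if (s == IRQ_INACTIVE)
            return IRQ_PENDING;
        if (s == IRQ_ACTIVE)
            return IRQ_ACTIVE_PENDING;
        return s;
    case EV_ACKNOWLEDGE:            /* a PE reads the IAR */
        return (s == IRQ_PENDING) ? IRQ_ACTIVE : s;
    case EV_DEACTIVATE:             /* software deactivates the interrupt */
        if (s == IRQ_ACTIVE)
            return IRQ_INACTIVE;
        if (s == IRQ_ACTIVE_PENDING)
            return IRQ_PENDING;
        return s;
    }
    return s;
}

/* Usage example: a second edge while the first is being handled. */
static inline void irq_state_example(void)
{
    enum irq_state s = IRQ_INACTIVE;

    s = irq_next_state(s, EV_ASSERT);       /* Pending */
    s = irq_next_state(s, EV_ACKNOWLEDGE);  /* Active */
    s = irq_next_state(s, EV_ASSERT);       /* Active and Pending */
    s = irq_next_state(s, EV_DEACTIVATE);   /* Pending again */
    assert(s == IRQ_PENDING);
}
```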

Affinity routing

The affinity of a PE is represented as four 8-bit fields:
<affinity level 3>.<affinity level 2>.<affinity level 1>.<affinity level 0>
<group of groups>.<group of processors>.<processor>.<core>
For example:
0.0.0.[0:3] Cores 0 to 3 of a Cortex-A53 processor
0.0.1.[0:1] Cores 0 to 1 of a Cortex-A57 processor
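
On an AArch64 PE, software normally reads its own affinity from MPIDR_EL1. The sketch below unpacks the four fields from a raw MPIDR_EL1 value, assuming the usual layout (Aff0 in bits [7:0], Aff1 in [15:8], Aff2 in [23:16], Aff3 in [39:32]). The struct and function names are hypothetical, and the MPIDR value in the example is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The four 8-bit affinity fields, written Aff3.Aff2.Aff1.Aff0. */
struct gic_affinity {
    uint8_t aff3;
    uint8_t aff2;
    uint8_t aff1;
    uint8_t aff0;
};

/* Unpack the affinity fields from a raw MPIDR_EL1 value. */
static inline struct gic_affinity affinity_from_mpidr(uint64_t mpidr)
{
    struct gic_affinity a = {
        .aff3 = (uint8_t)((mpidr >> 32) & 0xff),
        .aff2 = (uint8_t)((mpidr >> 16) & 0xff),
        .aff1 = (uint8_t)((mpidr >> 8) & 0xff),
        .aff0 = (uint8_t)(mpidr & 0xff),
    };
    return a;
}

/* Usage example: an illustrative MPIDR value for PE 0.0.1.1. */
static inline void affinity_example(void)
{
    struct gic_affinity a = affinity_from_mpidr(UINT64_C(0x80000101));

    assert(a.aff3 == 0 && a.aff2 == 0 && a.aff1 == 1 && a.aff0 == 1);
}
```

In the example above, 0.0.1.1 would be core 1 of the Cortex-A57 processor.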

Programmers’ model

The register interface of a GICv3 interrupt controller is split into three groups:

  • Distributor (GICD_*)
  • Redistributors (GICR_*)
  • CPU interfaces (ICC_*_ELn)

Configuring the GIC

Global settings

  • Enable affinity routing (ARE bits in GICD_CTLR)
  • Group enables
    GICD_CTLR contains separate enable bits for Group 0, Secure Group 1 and Non-secure Group 1.

Individual PE settings

  • Redistributor configuration
  • CPU interface configuration
    1. Enable System register access (ICC_SRE_ELn).
    2. Set the priority mask (ICC_PMR_EL1) and binary point registers (ICC_BPRn_EL1).
    3. Set the EOI mode (priority drop only, or priority drop and deactivation).
    4. Enable signaling of each interrupt group.
  • PE configuration
    1. Routing controls (SCR_EL3 and HCR_EL2 of the PE).
    2. Interrupt masks (mask bits in PSTATE).
    3. Vector table (VBAR_ELn).

SPI, PPI and SGI configuration

SPIs are configured through the Distributor, using the GICD_* registers. PPIs and SGIs are configured through the individual Redistributors, using the GICR_* registers.
For each INTID, software must configure the following:

  • Priority (GICD_IPRIORITYRn, GICR_IPRIORITYRn)
  • Group (GICD_IGROUPRn, GICD_IGRPMODRn, GICR_IGROUPR0, GICR_IGRPMODR0)
  • Edge-triggered/level-sensitive (GICD_ICFGRn, GICR_ICFGRn)
  • Enable (GICD_ISENABLERn, GICD_ICENABLERn, GICR_ISENABLER0, GICR_ICENABLER0)
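
Each of these register arrays is indexed by INTID, so configuring an SPI mostly means computing the right register offset and bit position. The sketch below does that arithmetic for the Distributor. The offsets are the ones I understand the GICv3 architecture to define and should be checked against the specification; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Distributor register array offsets (assumed from the GICv3 specification). */
#define GICD_IGROUPR     0x0080u
#define GICD_ISENABLER   0x0100u
#define GICD_IPRIORITYR  0x0400u
#define GICD_ICFGR       0x0C00u

/* Location of one INTID's field: register byte offset plus bit shift. */
struct gicd_slot {
    uint32_t offset;
    uint32_t shift;
};

/* Arrays with one bit per INTID (IGROUPR, ISENABLER, ...): 32 INTIDs per register. */
static inline struct gicd_slot gicd_bit_slot(uint32_t base, uint32_t intid)
{
    struct gicd_slot s;

    assert(intid >= 32 && intid <= 1019);   /* SPIs only */
    s.offset = base + 4u * (intid / 32u);
    s.shift = intid % 32u;
    return s;
}

/* GICD_IPRIORITYR holds one byte per INTID. */
static inline uint32_t gicd_priority_offset(uint32_t intid)
{
    assert(intid >= 32 && intid <= 1019);
    return GICD_IPRIORITYR + intid;
}

/* GICD_ICFGR holds two bits per INTID; the upper bit of the pair selects edge-triggered. */
static inline struct gicd_slot gicd_cfg_slot(uint32_t intid)
{
    struct gicd_slot s;

    assert(intid >= 32 && intid <= 1019);
    s.offset = GICD_ICFGR + 4u * (intid / 16u);
    s.shift = 2u * (intid % 16u);
    return s;
}
```

For INTID 42, for example, the enable bit is bit 10 of the register at offset 0x104, and the priority byte is at offset 0x42A.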

Setting the target PE for SPIs

For SPIs, the target of the interrupt must additionally be configured. This is controlled by GICD_IROUTERn. There is a GICD_IROUTERn register per SPI, and the Interrupt_Routing_Mode bit controls the routing policy. The options are:

  • GICD_IROUTERn.Interrupt_Routing_Mode == 0
    The SPI is to be delivered to the PE A.B.C.D, the affinity co-ordinates specified in the register.
  • GICD_IROUTERn.Interrupt_Routing_Mode == 1
    The SPI can be delivered to any connected PE that is participating in distribution of the interrupt group.
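
A sketch of how the 64-bit GICD_IROUTERn value could be composed for the two routing modes. It assumes Interrupt_Routing_Mode is bit 31, that the affinity fields use the same positions as in MPIDR_EL1, and that the register array starts at offset 0x6000 with 8 bytes per INTID. Function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GICD_IROUTER_BASE  0x6000u                /* assumed array base offset */
#define GICD_IROUTER_IRM   (UINT64_C(1) << 31)    /* Interrupt_Routing_Mode */

/* Byte offset of GICD_IROUTERn for a given SPI. */
static inline uint32_t gicd_irouter_offset(uint32_t intid)
{
    assert(intid >= 32 && intid <= 1019);         /* only SPIs have a router */
    return GICD_IROUTER_BASE + 8u * intid;
}

/*
 * Compose the register value.
 * any_pe == false: route to the PE aff3.aff2.aff1.aff0.
 * any_pe == true:  route to any participating PE; affinity fields are left zero.
 */
static inline uint64_t gicd_irouter_value(uint8_t aff3, uint8_t aff2,
                                          uint8_t aff1, uint8_t aff0,
                                          bool any_pe)
{
    if (any_pe)
        return GICD_IROUTER_IRM;
    return ((uint64_t)aff3 << 32) | ((uint64_t)aff2 << 16) |
           ((uint64_t)aff1 << 8) | (uint64_t)aff0;
}
```

Routing an SPI to core 1 of the second cluster (0.0.1.1) would then use gicd_irouter_value(0, 0, 1, 1, false), which is 0x101.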

Handling Interrupts

When an interrupt becomes pending

When an interrupt becomes pending, the interrupt controller decides whether to send the interrupt to one of the connected PEs. The decision depends on the following settings:

  • Group enables
  • Interrupt enables
  • Routing controls
  • Interrupt priority & priority mask
  • Running priority
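
The checks in this list can be read as one predicate. The sketch below is a simplified model: it ignores the binary point (priority grouping) and security states, and relies on the GIC convention that a numerically lower value means a higher priority, with 0xFF as the idle running priority. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-interrupt settings relevant to the forwarding decision. */
struct irq_cfg {
    bool group_enabled;      /* group enable for this interrupt's group */
    bool irq_enabled;        /* individual enable bit */
    bool routed_to_this_pe;  /* routing controls select this PE */
    uint8_t priority;        /* lower value = higher priority */
};

/* Per-PE state seen by the CPU interface. */
struct pe_state {
    uint8_t pmr;               /* priority mask */
    uint8_t running_priority;  /* 0xFF when no interrupt is active */
};

/* True if the pending interrupt would be signalled to this PE. */
static inline bool irq_is_signalled(const struct irq_cfg *irq,
                                    const struct pe_state *pe)
{
    if (!irq->group_enabled || !irq->irq_enabled || !irq->routed_to_this_pe)
        return false;
    if (irq->priority >= pe->pmr)          /* masked by the PMR */
        return false;
    return irq->priority < pe->running_priority;  /* must beat the running priority */
}

/* Usage example: priority 0xA0 passes an idle PE with PMR 0xF0. */
static inline void irq_is_signalled_example(void)
{
    struct irq_cfg irq = { true, true, true, 0xA0 };
    struct pe_state pe = { 0xF0, 0xFF };

    assert(irq_is_signalled(&irq, &pe));
}
```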

Interrupt acknowledge

The CPU interface has two IARs. Reading the IAR returns the INTID, and advances the interrupt state machine.

  • ICC_IAR0_EL1 used to acknowledge Group 0 interrupts
  • ICC_IAR1_EL1 used to acknowledge Group 1 interrupts

Running priority & preemption

The PMR sets the minimum priority that an interrupt must have to be forwarded to a particular PE. The GICv3 architecture has the concept of a running priority. When a PE acknowledges an interrupt, its running priority becomes that of the interrupt. The running priority returns to its former value when the PE writes to one of the EOI registers.
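
One way to picture the running priority is as a stack of active priorities: an acknowledge pushes, a priority drop pops. The sketch below is such a model and is not how the hardware is implemented (the architecture uses active priority registers). It ignores the binary point and assumes 0xFF as the idle priority; the names and the nesting limit are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PRIO_MAX_NESTING  8      /* arbitrary limit for this sketch */
#define PRIO_IDLE         0xFF   /* running priority with nothing active */

struct prio_stack {
    uint8_t active[PRIO_MAX_NESTING];
    int depth;
    uint8_t pmr;
};

/* Current running priority of the PE. */
static inline uint8_t prio_running(const struct prio_stack *s)
{
    return s->depth > 0 ? s->active[s->depth - 1] : PRIO_IDLE;
}

/* Acknowledge an interrupt; returns false if it is masked or cannot preempt. */
static inline bool prio_acknowledge(struct prio_stack *s, uint8_t prio)
{
    assert(s->depth >= 0 && s->depth <= PRIO_MAX_NESTING);
    if (prio >= s->pmr || prio >= prio_running(s))
        return false;
    if (s->depth == PRIO_MAX_NESTING)
        return false;
    s->active[s->depth++] = prio;   /* running priority becomes prio */
    return true;
}

/* Priority drop (EOI write): the running priority returns to its former value. */
static inline void prio_drop(struct prio_stack *s)
{
    if (s->depth > 0)
        s->depth--;
}
```

With PMR 0xF0, acknowledging priority 0xA0 and then 0x40 nests two interrupts; two priority drops bring the running priority back to 0xA0 and then to idle.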

End of interrupt

  • Priority drop
    This means dropping the running priority back to the value that it had before the interrupt was taken.
  • Deactivation
    In the GICv3 architecture, priority drop and deactivation can happen together or separately, depending on ICC_CTLR_ELn.EOImode.
    • mode 0
      A write to ICC_EOIR0_EL1 for Group 0 interrupts, or ICC_EOIR1_EL1 for Group 1 interrupts, performs both the priority drop and deactivation.
    • mode 1
      A write to ICC_EOIR0_EL1 for Group 0 interrupts, or ICC_EOIR1_EL1 for Group 1 interrupts, results in a priority drop only. A separate write to ICC_DIR_EL1 is required for deactivation.
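
The two modes explain why gic_handle_irq() writes the EOIR and then, when supports_deactivate_key is set, also writes the DIR. The sketch below models the effect of both writes on one acknowledged interrupt. It is a toy model with hypothetical names; treating a DIR write in mode 0 as a no-op is a modelling choice of this sketch, not a statement about the architecture.

```c
#include <assert.h>
#include <stdbool.h>

/* State of one acknowledged interrupt on a CPU interface. */
struct eoi_model {
    int eoimode;            /* ICC_CTLR_ELn.EOImode: 0 or 1 */
    bool priority_dropped;  /* running priority restored */
    bool active;            /* interrupt still in the Active state */
};

/* Start a model for an interrupt that has just been acknowledged. */
static inline struct eoi_model eoi_model_acknowledged(int eoimode)
{
    struct eoi_model m = { eoimode, false, true };

    assert(eoimode == 0 || eoimode == 1);
    return m;
}

/* Write to ICC_EOIR0_EL1 / ICC_EOIR1_EL1. */
static inline void eoi_model_write_eoir(struct eoi_model *m)
{
    m->priority_dropped = true;
    if (m->eoimode == 0)
        m->active = false;   /* mode 0: the same write also deactivates */
}

/* Write to ICC_DIR_EL1. */
static inline void eoi_model_write_dir(struct eoi_model *m)
{
    if (m->eoimode == 1)
        m->active = false;   /* mode 1: explicit deactivation */
}
```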

Virtualization support

Interface

[Figure: Register interfaces without legacy support (GICv3 only)]
The GIC architecture does not provide features for virtualizing the Distributor, Redistributors or ITSs. Virtualization of these interfaces must be handled by software. The CPU interface registers are split into three groups: physical CPU interface (ICC_*), virtualization control (ICH_*) and virtual CPU interface (ICV_*).
[Figure: CPU interface registers with virtualization]
The ICV and ICC registers have the same instruction encodings. At EL2, EL3 and Secure EL1, the ICC registers are always accessed. At Non-secure EL1, whether the ICC or the ICV registers are accessed is determined by the routing bits in HCR_EL2.

Example of forwarding a physical interrupt to a vPE

  1. A physical Non-secure Group 1 interrupt is forwarded to the physical CPU interface from the Redistributor.
  2. The physical CPU interface checks whether the physical interrupt can be forwarded to the PE (group enable, interrupt enable, routing controls, interrupt priority and priority mask, running priority).
  3. The interrupt is taken to EL2. The hypervisor reads the IAR, which returns the pINTID. The pINTID is now in the Active state. The hypervisor determines that the interrupt is to be forwarded to the currently running vPE. The hypervisor writes the pINTID to ICC_EOIR1_EL1. With ICC_CTLR_EL1.EOImode==1, this only performs a priority drop without deactivating the physical interrupt.
  4. The hypervisor writes one of the List registers (ICH_LR<n>_EL2, n = 0 to 15) in order to register a virtual interrupt as pending. The List register entry specifies the vINTID that is to be sent and the original pINTID. The hypervisor then performs an exception return, returning execution to the vPE.
  5. The virtual CPU interface checks whether the virtual interrupt can be forwarded to the vPE. These checks are the same as for physical interrupts, other than that they use the ICV registers. In this instance, the checks pass and a virtual exception is asserted.
  6. The virtual exception is taken to Non-secure EL1. When software reads the IAR, the vINTID will be returned and the virtual interrupt is now in the Active state.
  7. The Guest OS handles the interrupt. When it has finished handling the interrupt, it writes the EOIR to perform a priority drop and deactivation. As the List register recorded the pINTID, this deactivates both the vINTID and pINTID.
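
Step 4 comes down to composing a 64-bit List register value. The sketch below encodes a pending, hardware-backed, Group 1 virtual interrupt. The bit positions (State at [63:62], HW at bit 61, Group at bit 60, Priority at [55:48], pINTID from bit 32, vINTID at [31:0]) are my reading of the ICH_LR<n>_EL2 layout and should be checked against the specification. Limiting the pINTID to 10 bits is an assumption that covers PPIs and SPIs. All macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LR_STATE_PENDING   (UINT64_C(1) << 62)  /* State = 0b01 */
#define LR_HW              (UINT64_C(1) << 61)  /* backed by a physical interrupt */
#define LR_GROUP1          (UINT64_C(1) << 60)
#define LR_PRIORITY_SHIFT  48
#define LR_PINTID_SHIFT    32
#define LR_PINTID_MASK     UINT64_C(0x3ff)      /* assumption: 10-bit pINTID */

/* Encode a pending Group 1 virtual interrupt linked to a physical interrupt. */
static inline uint64_t lr_make_pending_hw(uint32_t vintid, uint32_t pintid,
                                          uint8_t priority)
{
    assert(pintid <= LR_PINTID_MASK);
    return LR_STATE_PENDING | LR_HW | LR_GROUP1 |
           ((uint64_t)priority << LR_PRIORITY_SHIFT) |
           ((uint64_t)pintid << LR_PINTID_SHIFT) |
           (uint64_t)vintid;
}

/* Decode helpers, useful when the hypervisor later inspects the List register. */
static inline uint32_t lr_vintid(uint64_t lr)
{
    return (uint32_t)(lr & UINT64_C(0xffffffff));
}

static inline uint32_t lr_pintid(uint64_t lr)
{
    return (uint32_t)((lr >> LR_PINTID_SHIFT) & LR_PINTID_MASK);
}
```

Because the HW bit links the vINTID to the pINTID, the guest's deactivation in step 7 can also deactivate the physical interrupt without another trap to the hypervisor.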