ARM SMMU v2

1 Registers
Refer to the SMMU v2 datasheet.

2 SA8155 32bit SMMU v2
15-bit Stream ID; supports 95 Stream Match Registers (SMRs).

2.1 FT4232
The third port of the 4-port FT4232 is used for the SoC console, and SW4 selects EDL mode.
reboot -f

2.2 Page 0
4KB Page
SMMU_GR0_BASE = 0x15000000
IDR0 = SMMU_GR0_BASE + 0x20
IDR1 = SMMU_GR0_BASE + 0x24

SMMU_SMR0 = SMMU_GR0_BASE + 0x800
SMMU_SMR1 = SMMU_GR0_BASE + 0x804
...
SMMU_SMR127 = SMMU_GR0_BASE + 0x9FC

SMMU_S2CR0 = SMMU_GR0_BASE + 0xC00
SMMU_S2CR1 = SMMU_GR0_BASE + 0xC04
...
SMMU_S2CR127 = SMMU_GR0_BASE + 0xDFC
SMMU_S2CRn.bit[17:16] (the TYPE field) selects how the stream is handled; 0b01 enables bypass mode.
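
The Page 0 addresses above follow a base-plus-stride pattern, which can be sketched in Python. This is only an illustration: the helper names `smr_addr`, `s2cr_addr`, and `s2cr_type` are hypothetical, and the TYPE encodings (0b00 translation, 0b01 bypass, 0b10 fault) are assumed from the SMMU v2 architecture rather than taken from SA8155 documentation.

```python
SMMU_GR0_BASE = 0x15000000  # base address from the notes above

def smr_addr(n: int) -> int:
    """Address of SMMU_SMRn: 4-byte registers starting at GR0 + 0x800."""
    return SMMU_GR0_BASE + 0x800 + 4 * n

def s2cr_addr(n: int) -> int:
    """Address of SMMU_S2CRn: 4-byte registers starting at GR0 + 0xC00."""
    return SMMU_GR0_BASE + 0xC00 + 4 * n

# Assumed encoding of S2CRn.bit[17:16] (TYPE field).
S2CR_TYPES = {0b00: "translation", 0b01: "bypass", 0b10: "fault", 0b11: "reserved"}

def s2cr_type(value: int) -> str:
    """Decode the TYPE field of a raw S2CR register value."""
    return S2CR_TYPES[(value >> 16) & 0x3]

print(hex(smr_addr(127)))     # 0x150009fc
print(hex(s2cr_addr(127)))    # 0x15000dfc
print(s2cr_type(0x00010000))  # bypass
```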

2.3 Page 1
4KB Page
SMMU_GR0_BASE = 0x15000000
SMMU_CBA2R0 = SMMU_GR0_BASE + PAGESIZE + 0x800
SMMU_CBA2R1 = SMMU_GR0_BASE + PAGESIZE + 0x804
...
SMMU_CBA2R127 = SMMU_GR0_BASE + PAGESIZE + 0x9FC

2.4 Context Banks
4KB Page; in SMMU v2 each context bank has its own IRQ pin.
For comparison, x86 PCIe uses the bus number to find the matching root_entry in the Root table, then uses devfn to find the matching context_entry in the Context table (256 entries).
SMMU_GR0_BASE = 0x15000000
SMMU_CB_BASE = SMMU_GR0_BASE + (NUMPAGE x PAGESIZE), where
NUMPAGE = 2^(IDR1.bit[30:28] + 1) and PAGESIZE is selected by IDR1.bit31 (0 = 4KB, 1 = 64KB), so
SMMU_CB_BASE = 0x15000000 + 128 x 4KB = 0x15080000
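
The context bank base calculation can be checked with a short Python sketch. It assumes the SMMU v2 encoding in which IDR1 bit 31 selects the page size (0 = 4KB, 1 = 64KB) and NUMPAGE = 2^(IDR1[30:28] + 1). The function name `cb_base` and the sample IDR1 value are hypothetical; the sample was chosen only so that NUMPAGE comes out as 128, matching the figure above.

```python
SMMU_GR0_BASE = 0x15000000  # base address from the notes above

def cb_base(idr1: int, gr0_base: int = SMMU_GR0_BASE) -> int:
    """Compute SMMU_CB_BASE from a raw IDR1 value."""
    # bit31 (PAGESIZE): 0 = 4KB pages, 1 = 64KB pages (assumed encoding)
    pagesize = 0x10000 if (idr1 >> 31) & 0x1 else 0x1000
    # bit[30:28] (NUMPAGENDXB): NUMPAGE = 2^(NUMPAGENDXB + 1)
    numpage = 1 << (((idr1 >> 28) & 0x7) + 1)
    return gr0_base + numpage * pagesize

# Hypothetical IDR1 with bit31 = 0 and bit[30:28] = 6, i.e. 128 pages of 4KB.
sample_idr1 = 6 << 28
print(hex(cb_base(sample_idr1)))  # 0x15080000
```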

SMMU_CBn_TTBR0 = SMMU_CB_BASE + n x PAGESIZE + 0x20
SMMU_CBn_TTBR1 = SMMU_CB_BASE + n x PAGESIZE + 0x28
SMMU_CBn_TCR = SMMU_CB_BASE + n x PAGESIZE + 0x30
SMMU_CBn_FSYNR0 = SMMU_CB_BASE + n x PAGESIZE + 0x68; bit4 (WNR) identifies whether the faulting DMA access was a read from memory (0) or a write to memory (1)
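
A minimal Python sketch of the per-context-bank addressing above, assuming the SMMU_CB_BASE and 4KB page size derived in this section. The helper names `cb_reg_addr` and `is_write_fault` are hypothetical, and reading bit4 of FSYNR0 as "1 = write" is an assumption based on the SMMU v2 WNR (write-not-read) field.

```python
SMMU_CB_BASE = 0x15080000  # derived above
PAGESIZE = 0x1000          # 4KB per context bank

# Register offsets inside one context bank page, as listed above.
CB_REG_OFFSETS = {"TTBR0": 0x20, "TTBR1": 0x28, "TCR": 0x30, "FSYNR0": 0x68}

def cb_reg_addr(n: int, reg: str) -> int:
    """Address of register `reg` in context bank n."""
    return SMMU_CB_BASE + n * PAGESIZE + CB_REG_OFFSETS[reg]

def is_write_fault(fsynr0: int) -> bool:
    """True if FSYNR0 bit4 is set (assumed: the faulting access was a write)."""
    return bool(fsynr0 & (1 << 4))

print(hex(cb_reg_addr(32, "FSYNR0")))  # 0x150a0068
print(is_write_fault(0x10))            # True
```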

2.5 SMMU crash debugging
2.5.1 overlayFS
mount -t overlay -o lowerdir=/,upperdir=/tmp/upper,workdir=/tmp/workdir none /

2.5.2 showcase
arm-smmu 15000000.apps-smmu: FAR = 0x00000000284d3b0f
arm-smmu 15000000.apps-smmu: PAR = 0x0000000000000000
arm-smmu 15000000.apps-smmu: FSR = 0x40000402 [TF W SS ]
cb=32, SID=0x3c0 (SA8155)
cb=33, SID=0x7c0 (SA8195)
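
To read an FSR value like the one in this log, a small decoder helps. The sketch below is a rough aid, not the driver's own logic: the bit names are an assumption based on the definitions in the upstream Linux arm-smmu driver (TF = bit1, SS = bit30, and so on), `decode_fsr` is a hypothetical helper, and flags such as W in the log line are not covered by this table.

```python
# Assumed FSR bit positions (after the upstream Linux arm-smmu driver).
FSR_BITS = {
    1: "TF",       # translation fault
    2: "AFF",      # access flag fault
    3: "PF",       # permission fault
    4: "EF",       # external fault
    5: "TLBMCF",   # TLB match conflict fault
    6: "TLBLKF",   # TLB lock fault
    30: "SS",      # stalled status
    31: "MULTI",   # multiple faults
}

def decode_fsr(fsr: int) -> list:
    """Return the names of the known flag bits set in a raw FSR value."""
    return [name for bit, name in sorted(FSR_BITS.items()) if fsr & (1 << bit)]

print(decode_fsr(0x40000402))  # ['TF', 'SS']
```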

2.5.3 panic_notifier_list
Use register_die_notifier() to register a crash notifier for ARM64 SMMU faults.

2.5.4 objdump
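In ARM64 assembly, x0 - x7 pass the first through eighth function arguments; any further arguments are passed on the stack.
arm_smmu_context_fault+0xcf4/0xcf8
The first value, 0xcf4, is the offset of the instruction from the entry of arm_smmu_context_fault; the second value, 0xcf8, is the total size of the function.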
aarch64-poky-linux-objdump -d -S vmlinux > vmlinux.asm
aarch64-poky-linux-objdump -d -S xxx.ko > xxx.asm
gdb vmlinux
l *func_name+offset_addr (l is short for list)
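
Turning a backtrace frame into the gdb command above can be automated. This is a minimal Python sketch that assumes frames have the form symbol+0xoffset/0xsize; the helper name `parse_frame` is hypothetical.

```python
import re

# Matches frames such as "arm_smmu_context_fault+0xcf4/0xcf8".
FRAME_RE = re.compile(r"([\w.]+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)")

def parse_frame(frame: str):
    """Split a frame into (symbol, offset, function size)."""
    m = FRAME_RE.fullmatch(frame.strip())
    if m is None:
        raise ValueError(f"unrecognized frame: {frame!r}")
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)

symbol, offset, size = parse_frame("arm_smmu_context_fault+0xcf4/0xcf8")
print(f"l *{symbol}+{offset:#x}")  # l *arm_smmu_context_fault+0xcf4
```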

3 ARM64 memory barrier
ARM64 cores hold the data of Store instructions in a Store Buffer (not a FIFO), which is separate from the L1 Data Cache. Caches integrated within the same CPU cluster are called Inner Shareable; caches shared by all CPU clusters are called Outer Shareable.
LD: load-load/load-store
ST: store-store
SY: Full system, reads and writes
ISH: Inner Shareable, reads and writes
ISHLD: Inner Shareable, loads only
ISHST: Inner Shareable, stores only
OSH: Outer Shareable, reads and writes

4 Linux ARM64 39-bit VA
4.1 39-bit VA Layout
For kernel addresses the upper 25 bits are all 1s, giving the prefix 0xFFFF_FF8 (translated through TTBR1).
User space:
0x0000_0000_0000_0000
0x0000_007F_FFFF_FFFF

Kernel space:
0xFFFF_FF80_0000_0000
0xFFFF_FFFF_FFFF_FFFF
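
The layout above can be expressed as a check on the upper 25 bits of a 64-bit address. This is a small Python sketch of that rule; the function name `va_region` is hypothetical.

```python
VA_BITS = 39  # CONFIG_ARM64_VA_BITS_39

def va_region(va: int) -> str:
    """Classify a 64-bit virtual address under the 39-bit VA layout."""
    top = va >> VA_BITS                  # the upper 25 bits
    all_ones = (1 << (64 - VA_BITS)) - 1
    if top == 0:
        return "user"      # translated through TTBR0
    if top == all_ones:
        return "kernel"    # translated through TTBR1
    return "invalid"       # neither all 0s nor all 1s

print(va_region(0x0000_007F_FFFF_FFFF))  # user
print(va_region(0xFFFF_FF80_0000_0000))  # kernel
print(va_region(0x0000_0080_0000_0000))  # invalid
```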

4.2 MMU
CONFIG_ARM64_PA_BITS_48
CONFIG_ARM64_VA_BITS_39
CONFIG_ARM64_VA_BITS_48
CONFIG_PGTABLE_LEVELS
A 3-level page table is used. Each page table is 4KB and holds 512 entries of 8 bytes each, so each level uses 9 bits of the VA to index an entry; VA[11:0] is the byte offset within the 4KB page.

entry offset address = entry_index * 8; entry bit[1:0] identifies the descriptor type: 0b11 for a Table descriptor (or a page entry at the last level), 0b01 for a Block entry.
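
The index arithmetic for a 39-bit VA with 4KB pages can be sketched in Python. The level names pgd/pmd/pte follow the usual Linux naming for a 3-level table and are an assumption here, as are the helper names `split_va` and `entry_offset` and the sample address, which was picked only for illustration.

```python
def split_va(va: int) -> dict:
    """Split a 39-bit VA into three 9-bit table indices and a 12-bit offset."""
    return {
        "pgd": (va >> 30) & 0x1FF,  # VA[38:30]
        "pmd": (va >> 21) & 0x1FF,  # VA[29:21]
        "pte": (va >> 12) & 0x1FF,  # VA[20:12]
        "offset": va & 0xFFF,       # VA[11:0], byte offset within the page
    }

def entry_offset(index: int) -> int:
    """Byte offset of an entry inside its 4KB table (8 bytes per entry)."""
    return index * 8

parts = split_va(0xFFFFFF8008081234)  # hypothetical kernel VA
print(parts)  # {'pgd': 0, 'pmd': 64, 'pte': 129, 'offset': 564}
print(hex(entry_offset(parts["pmd"])))  # 0x200
```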

4.3 crash
https://www.kernel.org/
mainline - summary - Clone
git log --pretty=oneline
When an access to a VA causes a crash, the kernel prints the 8-byte entry value (which holds a physical address) found at each of the 3 page table levels for that VA.

5 Abbreviations
CBAR: Context Bank Attribute Registers
S2CR: Stream-to-Context Register
SMR: Stream Match Register
TCR: Translation Control Register
TTBR: Translation Table Base Registers, per-CPU registers; the x86_64 equivalent is CR3
PGD: Page Global Directory; VA[38:30] is the index of the PGD entry. TTBR0 is used if bit63 = 0 (user space), TTBR1 if bit63 = 1 (kernel space)
SMMU PRI: Page Request Interface

