The IOMMU is also known as the system MMU (SMMU). It performs memory management functions on behalf of masters that do not have their own MMU, for example, nonprocessor masters. The IOMMU hardware block allows virtually contiguous memory to be backed by pages that are not physically contiguous. The memory translation logic in the IOMMU is the same as the logic in the CPU MMU.
One of the most commonly seen IOMMU issues is the IOMMU page fault. An IOMMU page fault results when a master accesses an address that the IOMMU cannot translate, for example, because the page table entry for that address is invalid or because the access violates the permissions of the mapping. The fault handler gets the context bank instance of the IOMMU and dumps the registers for this context.
The following log indicates an IOMMU page fault:
[ 47.228992] msm_iommu_v1: Unexpected IOMMU page fault!
[ 47.233115] msm_iommu_v1: name = mdp_iommu
[ 47.237238] msm_iommu_v1: context = mdp_0 (0)
[ 47.241507] msm_iommu_v1: Interesting registers:
[ 47.246149] msm_iommu_v1: FAR = 0000000000000000
[ 47.250970] msm_iommu_v1: PAR = 0000000000000000
[ 47.255834] msm_iommu_v1: FSR = 00000002 [TF ]
[ 47.260540] msm_iommu_v1: FSYNR0 = 000005a1 FSYNR1 = 00030005
[ 47.266528] msm_iommu_v1: TTBR0 = 0000000071a28000
[ 47.271370] msm_iommu_v1: TTBR1 = 0000000000000000
[ 47.276248] msm_iommu_v1: SCTLR = 00001043 ACTLR = 70000000
[ 47.282221] msm_iommu_v1: CBAR = 00000000 CBFRSYNRA = 00000000
[ 47.288521] msm_iommu_v1: PRRR = ff0a81a8 NMRR = 40e040e0
[ 47.294461] msm_iommu_v1: NOTE: Value actually unknown for CBAR
[ 47.300394] msm_iommu_v1: NOTE: Value actually unknown for CBFRSYNRA
[ 47.306717] msm_iommu_v1: Page table in DDR shows PA = 0
The log message contains the following details:
- Name - Hardware block that took the fault
- FAR - Address at which the fault occurred
- FSR - Fault status flags, such as translation fault (TF), access permission fault (APF), or stalled status (SS)
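As a rough illustration of pulling these fields out of a captured kernel log, the following Python sketch extracts the name, context, FAR, and FSR values from a few lines of the fault log above. The regular expression and the `parse_fault` helper are hypothetical names introduced here for illustration, and the sketch assumes the log format stays exactly as shown.

```python
import re

# A subset of the lines from the fault log above.
LOG = """\
[ 47.228992] msm_iommu_v1: Unexpected IOMMU page fault!
[ 47.233115] msm_iommu_v1: name = mdp_iommu
[ 47.237238] msm_iommu_v1: context = mdp_0 (0)
[ 47.246149] msm_iommu_v1: FAR = 0000000000000000
[ 47.255834] msm_iommu_v1: FSR = 00000002 [TF ]
"""

# Hypothetical pattern: "<field> = <value>" right after the driver prefix.
FIELD_RE = re.compile(r"msm_iommu_v1:\s+(name|context|FAR|FSR)\s+=\s+(\S+)")


def parse_fault(log_text):
    """Return a dict with the name, context, FAR, and FSR fields of a fault log."""
    fields = {}
    for line in log_text.splitlines():
        match = FIELD_RE.search(line)
        if match:
            key, value = match.groups()
            # FAR and FSR are printed as bare hexadecimal digits.
            fields[key] = int(value, 16) if key in ("FAR", "FSR") else value
    return fields


fault = parse_fault(LOG)
print(fault["name"], fault["context"], hex(fault["FAR"]), hex(fault["FSR"]))
```

Converting FAR and FSR to integers up front makes it straightforward to test individual FSR bits or to look up the faulting address in a page table dump afterwards.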
One of the most important registers in IOMMU debugging is the fault status register (FSR). This register has read/write-clear access: a read returns the value in the register, while a write clears the bits corresponding to 1s in the written data and leaves the bits corresponding to 0s unchanged. This prevents inadvertent clearing of new faults when writing the register to clear an old fault. Some of the useful bits in this register are:
- [Bit 1]: TF – Translation fault (invalid page table entry)
- [Bit 2]: AFF – Access fault
- [Bit 3]: APF – Permission fault (write to read-only region, and so on)
- [Bit 4]: TLBMF – TLB miss fault
- [Bit 5]: HTWDEEF – Hardware table walk decode error external fault
- [Bit 6]: HTWSEEF – Hardware table walk slave error external fault
- [Bit 7]: MHF – Multiple hits in TLB
- [Bit 16]: SL – Second-level fault (fault occurred in second level of page table)
- [Bit 30]: SS – Stalled status
- [Bit 31]: MULTI – Multiple faults
The TF, APF, and SL flags are seen during normal IOMMU operation, whereas the TLBMF, HTW*EEF, or MHF flags indicate that there is an issue.
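The bit list and the write-clear behavior described above can be sketched in Python as follows. This is only a software model for reading log values, not driver code; the bit positions come from the list above, and the function names (`decode_fsr`, `has_problem_flag`, `write_one_to_clear`) are hypothetical.

```python
# Bit positions taken from the FSR bit list above.
FSR_BITS = {
    1: "TF",
    2: "AFF",
    3: "APF",
    4: "TLBMF",
    5: "HTWDEEF",
    6: "HTWSEEF",
    7: "MHF",
    16: "SL",
    30: "SS",
    31: "MULTI",
}

# Flags that the text describes as indicating an issue.
PROBLEM_FLAGS = {"TLBMF", "HTWDEEF", "HTWSEEF", "MHF"}


def decode_fsr(fsr):
    """Return the names of the flags set in a 32-bit FSR value, lowest bit first."""
    return [name for bit, name in sorted(FSR_BITS.items()) if fsr & (1 << bit)]


def has_problem_flag(fsr):
    """Return True if any flag listed in PROBLEM_FLAGS is set."""
    return any(flag in PROBLEM_FLAGS for flag in decode_fsr(fsr))


def write_one_to_clear(current, written):
    """Model a write to a read/write-clear register.

    Bits written as 1 are cleared; bits written as 0 keep their current value.
    """
    return current & ~written & 0xFFFFFFFF


print(decode_fsr(0x00000002))        # FSR value from the sample log
print(has_problem_flag(0x00000002))
```

For instance, if both TF and APF are pending (`0x0000000A`) and only the TF bit is written back, `write_one_to_clear(0x0000000A, 0x00000002)` leaves APF set, which mirrors how a newer fault survives the clearing of an older one.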
IOMMU clocks
The IOMMU clocks must be turned on before the hardware registers are modified, which happens in the following cases:
- When attaching or detaching the MMU hardware
- When flushing TLBs
IOMMU page table
The faulting address is reported in the FAR, which is part of the register dump in the kernel log. This faulting address is a virtual address; the corresponding physical address can be looked up in the IOMMU page table dump. The page table dump also shows whether the address being accessed is mapped or not mapped.
Each IOMMU domain has a page table. The dump includes page tables for each of the domains. There are currently six domains.
The following is an example of a page table dump for domain 2:
Domain: 2 [L2 cache redirect for page tables is OFF]
0x00000000--0x0001ffff [0x00020000] [UNMAPPED]
0x00020000--0x01807fff [0x017e8000] A:0x82a8e000--0x84275fff [0x017e8000] [R/W][4K]
0x01808000--0x01939fff [0x00132000] A:0xf0c24000--0xf0d55fff [0x00132000] [R/W][4K]
0x0193a000--0x0199ffff [0x00066000] A:0xf13fa000--0xf145ffff [0x00066000] [R/W][4K]
0x019a0000--0x01e85fff [0x004e6000] A:0xf966e000--0xf9b53fff [0x004e6000] [R/W][4K]
0x01e86000--0x01ffffff [0x0017a000] [UNMAPPED]
0x02000000--0x02feffff [0x00ff0000] A:0xf5a22000--0xf6a11fff [0x00ff0000] [R/W][4K]
In this example, the first column is the virtual address range, the second column is the size of that region in bytes, and the third column (prefixed with A:) is the range of contiguous physical addresses that backs the region. The access permissions and the page size are also listed for each mapped region; regions without a mapping are marked UNMAPPED.
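To show how a faulting address from the FAR could be checked against such a dump, here is a minimal Python sketch that parses a few lines of the example and looks up a virtual address. It assumes that each mapped line covers a virtual range and a physical range of equal size, so the physical address is the start of the physical range plus the offset into the virtual range. The regular expression and the `parse_dump` and `translate` helpers are hypothetical names, not part of any tool.

```python
import re

# A subset of the lines from the domain 2 dump above.
DUMP = """\
0x00000000--0x0001ffff [0x00020000] [UNMAPPED]
0x00020000--0x01807fff [0x017e8000] A:0x82a8e000--0x84275fff [0x017e8000] [R/W][4K]
0x01e86000--0x01ffffff [0x0017a000] [UNMAPPED]
0x02000000--0x02feffff [0x00ff0000] A:0xf5a22000--0xf6a11fff [0x00ff0000] [R/W][4K]
"""

# Virtual range and size, optionally followed by the "A:" physical range.
LINE_RE = re.compile(
    r"(0x[0-9a-f]+)--(0x[0-9a-f]+)\s+\[(0x[0-9a-f]+)\]"
    r"(?:\s+A:(0x[0-9a-f]+)--(0x[0-9a-f]+))?"
)


def parse_dump(text):
    """Return a list of (va_start, va_end, pa_start) tuples; pa_start is None if unmapped."""
    regions = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        va_start, va_end, _size, pa_start, _pa_end = match.groups()
        regions.append((
            int(va_start, 16),
            int(va_end, 16),
            int(pa_start, 16) if pa_start else None,
        ))
    return regions


def translate(regions, va):
    """Return the physical address for va, or None if it is unmapped or not listed."""
    for va_start, va_end, pa_start in regions:
        if va_start <= va <= va_end:
            if pa_start is None:
                return None
            # Assumed linear offset within a region of equal VA and PA size.
            return pa_start + (va - va_start)
    return None


regions = parse_dump(DUMP)
print(translate(regions, 0x00000000))       # address 0 lies in an UNMAPPED region
print(hex(translate(regions, 0x00021000)))  # offset 0x1000 into the first mapped region
```

A result of `None` for the faulting address suggests that the master accessed a region with no mapping in that domain, which matches a translation fault (TF) in the FSR.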