arm64 Page Tables and Mapping Analysis

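This article first describes the memory configuration of Linux 6.10 on a Xilinx platform: 48-bit virtual and physical addresses with a 4KB page granule. It then examines the arm64 page table layouts for the 4KB, 16KB, and 64KB granules and how the descriptors differ between table levels. Finally, it walks through the Linux arm64 page table mapping path, starting at __create_pgd_mapping and populating the PGD, PUD, PMD, and PTE tables level by level.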

1 Memory configuration of the Linux 6.10 Xilinx kernel

The .config file contains the following settings. They configure 48-bit virtual and physical addresses with a 4KB page granule.

  • Page size set to 4KB
CONFIG_ARM64_4K_PAGES=y
  • 48-bit virtual addresses
CONFIG_ARM64_VA_BITS_48=y
CONFIG_ARM64_VA_BITS=48
  • 48-bit physical addresses
CONFIG_ARM64_PA_BITS_48=y
CONFIG_ARM64_PA_BITS=48
  • Page table depth: 4 levels
CONFIG_PGTABLE_LEVELS=4
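
The 4-level setting follows from the other two values: a table occupies one page of 8-byte descriptors, so each level resolves (page shift - 3) address bits. The sketch below reproduces that arithmetic in user space, in the same form as the kernel's ARM64_HW_PGTABLE_LEVELS() macro. The function name pgtable_levels is made up for this illustration.

```c
#include <assert.h>

/*
 * Number of translation levels needed for a given VA width and page shift.
 * Equivalent to rounding up (va_bits - page_shift) / (page_shift - 3).
 * pgtable_levels() is a hypothetical helper, not a kernel function.
 */
static int pgtable_levels(int va_bits, int page_shift)
{
	assert(page_shift > 3 && va_bits > page_shift);
	return (va_bits - 4) / (page_shift - 3);
}
```

With the configuration above, pgtable_levels(48, 12) evaluates to 4, which matches CONFIG_PGTABLE_LEVELS=4.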

2 arm64 page tables for each page granule

There are three page granule sizes: 4KB, 16KB, and 64KB.

2.1 Page table layout with the 4KB granule

When you use a 4kB granule size, the hardware can use a 4-level look up process. The 48-bit address has nine address bits translated per level, that is 512 entries each, with the final 12 bits selecting a byte within the 4kB page coming directly from the original address.
Bits 47:39 of the Virtual Address index into the 512 entry L0 table. Each of these table entries spans a 512 GB range and points to an L1 table. Within that 512 entry L1 table, bits 38:30 are used as an index to select an entry and each entry points to either a 1GB block or an L2 table. Bits 29:21 index into a 512 entry L2 table and each entry points to a 2MB block or the next table level. At the last level, bits 20:12 index into a 512 entry L3 table and each entry points to a 4kB page.
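
The bit ranges above can be checked with a few shifts and masks. The following is a minimal user-space sketch for the 4KB granule only; the helper names va_index_4k and va_page_offset_4k are invented for this example and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_4K   12
#define BITS_PER_LEVEL  9	/* 512 entries per table */

/* Index into the table at `level` (0..3) for a 48-bit VA, 4KB granule. */
static unsigned int va_index_4k(uint64_t va, int level)
{
	assert(level >= 0 && level <= 3);
	/* level 0 -> shift 39, level 1 -> 30, level 2 -> 21, level 3 -> 12 */
	int shift = PAGE_SHIFT_4K + (3 - level) * BITS_PER_LEVEL;
	return (unsigned int)((va >> shift) & ((1u << BITS_PER_LEVEL) - 1));
}

/* Byte offset inside the 4KB page: VA bits 11:0. */
static unsigned int va_page_offset_4k(uint64_t va)
{
	return (unsigned int)(va & ((1u << PAGE_SHIFT_4K) - 1));
}
```

Calling va_index_4k(va, 0) through va_index_4k(va, 3) yields the L0 to L3 indices, which in Linux terms are the PGD, PUD, PMD, and PTE indices.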

2.2 Page table layout with the 16KB granule

When you use a 16kB granule size, the hardware can use a 4-level look up process. The 48-bit address has 11 address bits translated per level, that is 2048 entries each, with the final 14 bits selecting a byte within the 16kB page coming directly from the original address. The level 0 table contains only two entries.
Bit 47 of the Virtual Address selects a descriptor from the two entry L0 table. Each of these table entries spans a 128 TB range and points to an L1 table. Within that 2048 entry L1 table, bits 46:36 are used as an index to select an entry and each entry points to an L2 table. Bits 35:25 index into a 2048 entry L2 table and each entry points to a 32 MB block or the next table level. At the final translation stage, bits 24:14 index into a 2048 entry L3 table and each entry points to a 16kB page.

2.3 Page table layout with the 64KB granule

When you use a 64kB granule size, the hardware can use a 3-level look up process. The level 1 table contains only 64 entries.
Bits 47:42 of the Virtual Address select a descriptor from the 64 entry L1 table. Each of these table entries spans a 4TB range and points to an L2 table. Within that 8192 entry L2 table, bits 41:29 are used as an index to select an entry and each entry points to either a 512 MB block or an L3 table. At the final translation stage, bits 28:16 index into an 8192 entry L3 table and each entry points to a 64kB page.
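
The unusually small top-level tables (two entries for 16KB, 64 entries for 64KB) are simply what is left of the 48 address bits after the page offset and the full lower-level tables are accounted for. The sketch below computes that remainder; bits_per_table and top_level_entries are illustrative names, not kernel APIs.

```c
#include <assert.h>

/* A table is one page of 8-byte descriptors, so it resolves page_shift - 3 bits. */
static int bits_per_table(int page_shift)
{
	return page_shift - 3;
}

/* Entries in the top-level table for a VA width, page shift, and level count. */
static unsigned long top_level_entries(int va_bits, int page_shift, int levels)
{
	int top_bits = va_bits - page_shift
		       - (levels - 1) * bits_per_table(page_shift);

	/* The top table may be partial, but never larger than a full table. */
	assert(top_bits > 0 && top_bits <= bits_per_table(page_shift));
	return 1UL << top_bits;
}
```

For a 48-bit address this gives 512 entries with the 4KB granule (4 levels), 2 entries with 16KB (4 levels), and 64 entries with 64KB (3 levels), which agrees with the three descriptions above.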

3 Page table descriptors

In the VMSAv8-64 translation table format, the difference in the formats of the level 0, level 1 and level 2 descriptors is:

  • Whether a Block descriptor is permitted.
  • If a Block descriptor is permitted, the size of the memory region described by that entry.
  • The maximum OA size, depending on whether ARMv8.2-LPA is implemented.

These differences depend on the translation granule, as follows:

  • 4KB granule: Level 0 translation tables do not support Block descriptors.
    • A Block descriptor:
      • In a level 1 table describes the mapping of the associated 1GB input address range.
      • In a level 2 table describes the mapping of the associated 2MB input address range.
    • The maximum OA size of a lookup is 48 bits.
  • 16KB granule: Level 0 and level 1 translation tables do not support Block descriptors.
    • A Block descriptor in a level 2 table describes the mapping of the associated 32MB input address range.
    • The maximum OA size of a lookup is 48 bits.
  • 64KB granule: Level 0 lookup is not supported.
    • A Block descriptor:
      • In a level 1 table describes the mapping of the associated 4TB input address range. This is permitted only if ARMv8.2-LPA is implemented.
      • In a level 2 table describes the mapping of the associated 512MB input address range.
    • The maximum OA size of a lookup is 48 bits, or 52 bits if ARMv8.2-LPA is implemented.
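
The block sizes in this list are not independent constants. Each one is the input address range covered by a single entry at that level, which depends only on the page shift and the level. A small sketch of that calculation follows; entry_span is a name chosen for this example. It does not check whether a Block descriptor is actually permitted at the given level.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of input address range covered by one entry at `level` (0..3).
 * A level 3 entry covers one page; each level above covers
 * 2^(page_shift - 3) times more.
 */
static uint64_t entry_span(int page_shift, int level)
{
	assert(level >= 0 && level <= 3);
	return (uint64_t)1 << (page_shift + (3 - level) * (page_shift - 3));
}
```

For example, entry_span(12, 1) is 1GB and entry_span(12, 2) is 2MB for the 4KB granule, while entry_span(16, 1) is 4TB for the 64KB granule.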

3.1 Invalid descriptors

If bit[0] of a descriptor is 0, the descriptor is invalid. This applies to descriptors at every level, L0 through L3.

3.2 L0 ~ L2 descriptors

For a valid L0 ~ L2 descriptor (bit[0] is 1), bit[1] determines whether the entry outputs a block address or the address of the next-level table:

  • 0: Block descriptor; the output is a block address.
    • The descriptor gives the base address of a block of memory, and the attributes for that memory region.
  • 1: Table descriptor; the output is the address of the next-level table.
    • The descriptor gives the address of the next level of translation table, and for a stage 1 translation, some attributes for that translation.
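
The two low bits are enough to classify an L0 ~ L2 descriptor. The sketch below shows that decoding in user space. The enum and the function classify_desc are hypothetical names, and the code does not check whether a Block descriptor is permitted at the given level for the granule in use.

```c
#include <assert.h>
#include <stdint.h>

enum desc_kind { DESC_INVALID, DESC_BLOCK, DESC_TABLE };

/* Classify a level 0-2 descriptor from bits[1:0]. */
static enum desc_kind classify_desc(uint64_t desc, int level)
{
	assert(level >= 0 && level <= 2);	/* level 3 uses the Page format */
	if ((desc & 1) == 0)
		return DESC_INVALID;		/* bit[0] == 0: invalid at any level */
	return (desc & 2) ? DESC_TABLE : DESC_BLOCK;
}
```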


3.3 L3 descriptors

The L3 descriptor format differs slightly depending on the configured page size.

  • For the 4KB granule size, each entry in a level 3 table describes the mapping of the associated 4KB input address range.
  • For the 16KB granule size, each entry in a level 3 table describes the mapping of the associated 16KB input address range.
  • For the 64KB granule size, each entry in a level 3 table describes the mapping of the associated 64KB input address range.
Descriptor bit[1] identifies the descriptor type, and is encoded as:
  • 0, Reserved, invalid
    • Behaves identically to encodings with bit[0] set to 0.
    • This encoding must not be used in level 3 translation tables.
  • 1, Page
    • Gives the address and attributes of a 4KB, 16KB, or 64KB page of memory.
    • At this level, the only valid format is the Page descriptor, which gives the output address of a page of memory as follows:
      • 4KB translation granule
        • Bits[47:12] are bits[47:12] of the output address for a page of memory.
      • 16KB translation granule
        • Bits[47:14] are bits[47:14] of the output address for a page of memory.
      • 64KB translation granule
        • Bits[47:16] are bits[47:16] of the output address for a page of memory.
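
Extracting the output address from a Page descriptor is therefore a single mask whose low edge depends on the granule. The following sketch assumes a 48-bit output address, as in the configuration discussed in this article; page_output_addr is an illustrative helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Output address held in a level 3 Page descriptor: bits[47:page_shift]. */
static uint64_t page_output_addr(uint64_t desc, int page_shift)
{
	assert(page_shift == 12 || page_shift == 14 || page_shift == 16);
	assert((desc & 3) == 3);	/* bits[1:0] == 0b11: valid Page descriptor */

	uint64_t below_48 = ((uint64_t)1 << 48) - 1;
	uint64_t in_page = ((uint64_t)1 << page_shift) - 1;

	/* Keep bits 47:page_shift, drop attribute bits above and below. */
	return desc & below_48 & ~in_page;
}
```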

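4 Linux arm64 page table mapping

In Linux, arm64 page table mappings are created by the __create_pgd_mapping function. Linux names the page table levels PGD, PUD, PMD, and PTE, which correspond to the arm64 L0, L1, L2, and L3 tables respectively.
The analysis below uses 4KB pages with 4-level page tables as the example.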

4.1 __create_pgd_mapping

__create_pgd_mapping takes fixmap_lock and calls __create_pgd_mapping_locked, which carries out the actual page table mapping.

static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
				 unsigned long virt, phys_addr_t size,
				 pgprot_t prot,
				 phys_addr_t (*pgtable_alloc)(int),
				 int flags)
{
	mutex_lock(&fixmap_lock);
	__create_pgd_mapping_locked(pgdir, phys, virt, size, prot,
				    pgtable_alloc, flags);
	mutex_unlock(&fixmap_lock);
}

4.2 __create_pgd_mapping_locked

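  • pgd_t *pgdir is the base address of the PGD table
  • pgd_t *pgdp = pgd_offset_pgd(pgdir, virt); gets the address of the PGD entry that covers virt
  • next = pgd_addr_end(addr, end); gets the end of the address range covered by the current PGD entry, or end if that comes first; each PGD entry covers 512GB (2^39)
  • alloc_init_pud(pgdp, addr, next, phys, prot, pgtable_alloc, flags); allocates and populates the PUD-level mappings under the current PGD entry
  • phys += next - addr; advances the physical address by the size of the range just mapped
  • In while (pgdp++, addr = next, addr != end), pgdp++ moves to the next PGD entry and addr = next sets the start address of that entry's range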
static void __create_pgd_mapping_locked(pgd_t *pgdir, phys_addr_t phys,
					unsigned long virt, phys_addr_t size,
					pgprot_t prot,
					phys_addr_t (*pgtable_alloc)(int),
					int flags)
{
	unsigned long addr, end, next;
	pgd_t *pgdp = pgd_offset_pgd(pgdir, virt);

	/*
	 * If the virtual and physical address don't have the same offset
	 * within a page, we cannot map the region as the caller expects.
	 */
	if (WARN_ON((phys ^ virt) & ~PAGE_MASK))
		return;

	phys &= PAGE_MASK;
	addr = virt & PAGE_MASK;
	end = PAGE_ALIGN(virt + size);

	do {
		next = pgd_addr_end(addr, end);
		alloc_init_pud(pgdp, addr, next, phys, prot, pgtable_alloc,
			       flags);
		phys += next - addr;
	} while (pgdp++, addr = next, addr != end);
}
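
The do/while loop above visits one PGD entry per iteration, with pgd_addr_end clamping each step to the next 512GB boundary or to end. The user-space sketch below mimics that walk to count how many PGD entries a range touches. It assumes the 4KB, 48-bit configuration from section 1; the names with a _sketch suffix and count_pgd_entries are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define PGDIR_SHIFT 39
#define PGDIR_SIZE  ((uint64_t)1 << PGDIR_SHIFT)
#define PGDIR_MASK  (~(PGDIR_SIZE - 1))

/* Next PGD boundary after addr, or end if that comes first. */
static uint64_t pgd_addr_end_sketch(uint64_t addr, uint64_t end)
{
	uint64_t boundary = (addr + PGDIR_SIZE) & PGDIR_MASK;

	/* The "- 1" comparisons keep the test correct if boundary wraps to 0. */
	return (boundary - 1 < end - 1) ? boundary : end;
}

/* Number of PGD entries visited when mapping [virt, virt + size). */
static unsigned int count_pgd_entries(uint64_t virt, uint64_t size)
{
	uint64_t addr = virt & ~(uint64_t)0xfff;		  /* virt & PAGE_MASK */
	uint64_t end = (virt + size + 0xfff) & ~(uint64_t)0xfff; /* PAGE_ALIGN() */
	unsigned int n = 0;

	assert(size > 0);
	do {
		uint64_t next = pgd_addr_end_sketch(addr, end);

		n++;		/* the kernel calls alloc_init_pud() here */
		addr = next;
	} while (addr != end);
	return n;
}
```

A range that straddles a 512GB boundary, such as 8KB starting 4KB below the boundary, is reported as two PGD entries.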

4.3 alloc_init_pud

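alloc_init_pud populates the PUD (L1) level entries.

  • p4d_none(p4d) checks whether the current PGD entry is empty (with 4-level page tables the P4D level is folded into the PGD, so p4dp refers to the PGD entry); if it is empty, the entry must be populated
  • pud_phys = pgtable_alloc(PUD_SHIFT); allocates a PUD table
  • __p4d_populate(p4dp, pud_phys, p4dval); writes the physical address of the new PUD table into the PGD entry
  • pudp = pud_set_fixmap_offset(p4dp, addr); gets the address of the PUD entry that covers addr
  • next = pud_addr_end(addr, end); gets the end of the range covered by the current PUD entry; each PUD entry covers 1GB (2^30)
  • pud_set_huge(pudp, phys, prot); if block mappings are allowed and addr, next, and phys are all 1GB aligned, the entry is written as a Block descriptor that maps a 1GB huge block
  • alloc_init_cont_pmd(pudp, addr, next, phys, prot, pgtable_alloc, flags); otherwise the range cannot be mapped as a single 1GB block, so the next-level PMD table is set up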
static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
			   phys_addr_t phys, pgprot_t prot,
			   phys_addr_t (*pgtable_alloc)(int),
			   int flags)
{
	unsigned long next;
	pud_t *pudp;
	p4d_t *p4dp = p4d_offset(pgdp, addr);
	p4d_t p4d = READ_ONCE(*p4dp);

	if (p4d_none(p4d)) {
		p4dval_t p4dval = P4D_TYPE_TABLE | P4D_TABLE_UXN;
		phys_addr_t pud_phys;

		if (flags & NO_EXEC_MAPPINGS)
			p4dval |= P4D_TABLE_PXN;
		BUG_ON(!pgtable_alloc);
		pud_phys = pgtable_alloc(PUD_SHIFT);
		__p4d_populate(p4dp, pud_phys, p4dval);
		p4d = READ_ONCE(*p4dp);
	}
	BUG_ON(p4d_bad(p4d));

	pudp = pud_set_fixmap_offset(p4dp, addr);
	do {
		pud_t old_pud = READ_ONCE(*pudp);

		next = pud_addr_end(addr, end);

		/*
		 * For 4K granule only, attempt to put down a 1GB block
		 */
		if (pud_sect_supported() &&
		   ((addr | next | phys) & ~PUD_MASK) == 0 &&
		    (flags & NO_BLOCK_MAPPINGS) == 0) {
			pud_set_huge(pudp, phys, prot);

			/*
			 * After the PUD entry has been populated once, we
			 * only allow updates to the permission attributes.
			 */
			BUG_ON(!pgattr_change_is_safe(pud_val(old_pud),
						      READ_ONCE(pud_val(*pudp))));
		} else {
			alloc_init_cont_pmd(pudp, addr, next, phys, prot,
					    pgtable_alloc, flags);

			BUG_ON(pud_val(old_pud) != 0 &&
			       pud_val(old_pud) != READ_ONCE(pud_val(*pudp)));
		}
		phys += next - addr;
	} while (pudp++, addr = next, addr != end);

	pud_clear_fixmap();
}
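
The test ((addr | next | phys) & ~PUD_MASK) == 0 is a compact way of requiring that the start of the range, the end of the range, and the physical address are all 1GB aligned. OR-ing the three values together lets one mask check all of them at once. A minimal sketch of the same idea, with block_aligned as a made-up helper name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when addr, next, and phys are all aligned to 2^shift.
 * shift = 30 corresponds to the PUD check, shift = 21 to the PMD check.
 */
static bool block_aligned(uint64_t addr, uint64_t next, uint64_t phys, int shift)
{
	assert(shift > 0 && shift < 64);
	uint64_t low_bits = ((uint64_t)1 << shift) - 1;	/* same bits as ~PUD_MASK */

	return ((addr | next | phys) & low_bits) == 0;
}
```

If any one of the three values has a low bit set, the OR has it set too, and the range falls through to the else branch that descends to the PMD level.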

4.4 alloc_init_cont_pmd

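alloc_init_cont_pmd sets up the PMD table.

  • pud_none(pud) checks whether the current PUD entry is empty; if it is, a PMD table is allocated and its base address is written into the PUD entry
  • next = pmd_cont_addr_end(addr, end); gets the end of the current contiguous PMD range, which spans CONT_PMD_SIZE (16 × 2MB = 32MB with 4KB pages); if addr, next, and phys are all aligned to that size and NO_CONT_MAPPINGS is not set, PTE_CONT is added to the protection bits
  • init_pmd(pudp, addr, next, phys, __prot, pgtable_alloc, flags); maps the PMD entries in that range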
static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
				unsigned long end, phys_addr_t phys,
				pgprot_t prot,
				phys_addr_t (*pgtable_alloc)(int), int flags)
{
	unsigned long next;
	pud_t pud = READ_ONCE(*pudp);

	/*
	 * Check for initial section mappings in the pgd/pud.
	 */
	BUG_ON(pud_sect(pud));
	if (pud_none(pud)) {
		pudval_t pudval = PUD_TYPE_TABLE | PUD_TABLE_UXN;
		phys_addr_t pmd_phys;

		if (flags & NO_EXEC_MAPPINGS)
			pudval |= PUD_TABLE_PXN;
		BUG_ON(!pgtable_alloc);
		pmd_phys = pgtable_alloc(PMD_SHIFT);
		__pud_populate(pudp, pmd_phys, pudval);
		pud = READ_ONCE(*pudp);
	}
	BUG_ON(pud_bad(pud));

	do {
		pgprot_t __prot = prot;

		next = pmd_cont_addr_end(addr, end);

		/* use a contiguous mapping if the range is suitably aligned */
		if ((((addr | next | phys) & ~CONT_PMD_MASK) == 0) &&
		    (flags & NO_CONT_MAPPINGS) == 0)
			__prot = __pgprot(pgprot_val(prot) | PTE_CONT);

		init_pmd(pudp, addr, next, phys, __prot, pgtable_alloc, flags);

		phys += next - addr;
	} while (addr = next, addr != end);
}
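
The contiguous hint works on groups of adjacent entries rather than single entries. The sketch below shows how the group size and the loop step could be computed. It assumes the 4KB-page defaults of 16 contiguous entries at both the PMD and PTE level (a shift of 4); that value and the helper names cont_size and cont_addr_end are assumptions of this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for 4KB pages: 2^4 = 16 contiguous entries per group. */
#define CONT_ENTRIES_SHIFT 4

/* Size of one contiguous group; entry_shift is 21 for PMDs, 12 for PTEs. */
static uint64_t cont_size(int entry_shift)
{
	assert(entry_shift == 12 || entry_shift == 21);
	return (uint64_t)1 << (entry_shift + CONT_ENTRIES_SHIFT);
}

/* Next contiguous-group boundary after addr, or end if that comes first. */
static uint64_t cont_addr_end(uint64_t addr, uint64_t end, int entry_shift)
{
	uint64_t size = cont_size(entry_shift);
	uint64_t boundary = (addr + size) & ~(size - 1);

	return (boundary - 1 < end - 1) ? boundary : end;
}
```

Under these assumptions a contiguous PMD group is 32MB and a contiguous PTE group is 64KB.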

4.5 init_pmd

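init_pmd populates the PMD entries.

  • pmdp = pmd_set_fixmap_offset(pudp, addr); gets the address of the PMD entry that covers addr
  • next = pmd_addr_end(addr, end); gets the end of the range covered by the current PMD entry; each entry covers 2MB (2^21)
  • alloc_init_cont_pte(pmdp, addr, next, phys, prot, pgtable_alloc, flags); when the range cannot be mapped as a 2MB block with pmd_set_huge, allocates and maps the PTE table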
static void init_pmd(pud_t *pudp, unsigned long addr, unsigned long end,
		     phys_addr_t phys, pgprot_t prot,
		     phys_addr_t (*pgtable_alloc)(int), int flags)
{
	unsigned long next;
	pmd_t *pmdp;

	pmdp = pmd_set_fixmap_offset(pudp, addr);
	do {
		pmd_t old_pmd = READ_ONCE(*pmdp);

		next = pmd_addr_end(addr, end);

		/* try section mapping first */
		if (((addr | next | phys) & ~PMD_MASK) == 0 &&
		    (flags & NO_BLOCK_MAPPINGS) == 0) {
			pmd_set_huge(pmdp, phys, prot);

			/*
			 * After the PMD entry has been populated once, we
			 * only allow updates to the permission attributes.
			 */
			BUG_ON(!pgattr_change_is_safe(pmd_val(old_pmd),
						      READ_ONCE(pmd_val(*pmdp))));
		} else {
			alloc_init_cont_pte(pmdp, addr, next, phys, prot,
					    pgtable_alloc, flags);

			BUG_ON(pmd_val(old_pmd) != 0 &&
			       pmd_val(old_pmd) != READ_ONCE(pmd_val(*pmdp)));
		}
		phys += next - addr;
	} while (pmdp++, addr = next, addr != end);

	pmd_clear_fixmap();
}
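
Taken together, the PUD and PMD levels pick the largest mapping that the alignment of each sub-range allows. The following user-space simulation counts how many 1GB blocks, 2MB blocks, and 4KB pages a range would end up with. It is a simplified sketch: it ignores NO_BLOCK_MAPPINGS, pud_sect_supported(), and the contiguous hint, and all names in it are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

struct map_counts { unsigned long blocks_1g, blocks_2m, pages_4k; };

static int aligned(uint64_t v, int shift)
{
	return (v & (((uint64_t)1 << shift) - 1)) == 0;
}

/* Next 2^shift boundary after addr, or end if that comes first. */
static uint64_t next_boundary(uint64_t addr, uint64_t end, int shift)
{
	uint64_t size = (uint64_t)1 << shift;
	uint64_t b = (addr + size) & ~(size - 1);

	return (b - 1 < end - 1) ? b : end;
}

static struct map_counts count_mappings(uint64_t virt, uint64_t phys, uint64_t size)
{
	struct map_counts c = {0, 0, 0};
	uint64_t addr = virt, end = virt + size;

	assert(size > 0 && aligned(virt | phys | size, 12));
	while (addr != end) {
		uint64_t pud_next = next_boundary(addr, end, 30);

		if (aligned(addr | pud_next | phys, 30)) {
			c.blocks_1g++;			/* pud_set_huge() path */
			phys += pud_next - addr;
			addr = pud_next;
			continue;
		}
		while (addr != pud_next) {		/* init_pmd() path */
			uint64_t pmd_next = next_boundary(addr, pud_next, 21);

			if (aligned(addr | pmd_next | phys, 21))
				c.blocks_2m++;		/* pmd_set_huge() path */
			else
				c.pages_4k += (pmd_next - addr) >> 12;
			phys += pmd_next - addr;
			addr = pmd_next;
		}
	}
	return c;
}
```

Note that a physical address offset by a single page relative to the virtual address is enough to force every entry down to 4KB pages, because the alignment test includes phys.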

4.6 alloc_init_cont_pte

alloc_init_cont_pte maps the PTE table.

  • pmd_none(pmd) checks whether the current PMD entry is empty; if it is, a PTE table is allocated and written into the PMD entry
  • init_pte(pmdp, addr, next, phys, __prot); populates the PTE entries
static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
				unsigned long end, phys_addr_t phys,
				pgprot_t prot,
				phys_addr_t (*pgtable_alloc)(int),
				int flags)
{
	unsigned long next;
	pmd_t pmd = READ_ONCE(*pmdp);

	BUG_ON(pmd_sect(pmd));
	if (pmd_none(pmd)) {
		pmdval_t pmdval = PMD_TYPE_TABLE | PMD_TABLE_UXN;
		phys_addr_t pte_phys;

		if (flags & NO_EXEC_MAPPINGS)
			pmdval |= PMD_TABLE_PXN;
		BUG_ON(!pgtable_alloc);
		pte_phys = pgtable_alloc(PAGE_SHIFT);
		__pmd_populate(pmdp, pte_phys, pmdval);
		pmd = READ_ONCE(*pmdp);
	}
	BUG_ON(pmd_bad(pmd));

	do {
		pgprot_t __prot = prot;

		next = pte_cont_addr_end(addr, end);

		/* use a contiguous mapping if the range is suitably aligned */
		if ((((addr | next | phys) & ~CONT_PTE_MASK) == 0) &&
		    (flags & NO_CONT_MAPPINGS) == 0)
			__prot = __pgprot(pgprot_val(prot) | PTE_CONT);

		init_pte(pmdp, addr, next, phys, __prot);

		phys += next - addr;
	} while (addr = next, addr != end);
}

4.7 init_pte

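init_pte writes the individual PTE entries.

  • ptep = pte_set_fixmap_offset(pmdp, addr); gets the address of the PTE entry that covers addr
  • set_pte(ptep, pfn_pte(__phys_to_pfn(phys), prot)); writes the PTE entry
  • phys += PAGE_SIZE; advances by one page on each iteration, so the next entry maps the next page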
static void init_pte(pmd_t *pmdp, unsigned long addr, unsigned long end,
		     phys_addr_t phys, pgprot_t prot)
{
	pte_t *ptep;

	ptep = pte_set_fixmap_offset(pmdp, addr);
	do {
		pte_t old_pte = READ_ONCE(*ptep);

		set_pte(ptep, pfn_pte(__phys_to_pfn(phys), prot));

		/*
		 * After the PTE entry has been populated once, we
		 * only allow updates to the permission attributes.
		 */
		BUG_ON(!pgattr_change_is_safe(pte_val(old_pte),
					      READ_ONCE(pte_val(*ptep))));

		phys += PAGE_SIZE;
	} while (ptep++, addr += PAGE_SIZE, addr != end);

	pte_clear_fixmap();
}
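
The value written by set_pte is the page's physical address combined with the attribute bits. The sketch below illustrates that composition for 4KB pages and a 48-bit physical address, in the spirit of __phys_to_pfn() and pfn_pte(). The helpers phys_to_pfn and make_pte are stand-ins written for this example and do not reproduce the kernel macros exactly.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12

/* Physical address to page frame number. */
static uint64_t phys_to_pfn(uint64_t phys)
{
	return phys >> SKETCH_PAGE_SHIFT;
}

/* Combine a PFN with attribute bits: address in bits 47:12, attributes elsewhere. */
static uint64_t make_pte(uint64_t pfn, uint64_t prot)
{
	uint64_t addr_mask = (((uint64_t)1 << 48) - 1)
			     & ~(((uint64_t)1 << SKETCH_PAGE_SHIFT) - 1);

	assert((prot & addr_mask) == 0);	/* attributes must not overlap the address */
	return (pfn << SKETCH_PAGE_SHIFT) | prot;
}
```

With prot set to 0x3 (valid Page descriptor bits only), a physical address of 0x80001000 produces the entry value 0x80001003.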