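ARM32 Linux kernel boot-time memory layout printout
At boot, the Linux kernel prints a map of its virtual memory layout. The following is the layout printed on the ARM Vexpress platform: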
Virtual kernel memory layout:
vector : 0xffff0000 - 0xffff1000 ( 4 kB)
fixmap : 0xffc00000 - 0xfff00000 (3072 kB)
vmalloc : 0xf0800000 - 0xff800000 ( 240 MB)
lowmem : 0xc0000000 - 0xf0000000 ( 768 MB)
pkmap : 0xbfe00000 - 0xc0000000 ( 2 MB)
modules : 0xbf000000 - 0xbfe00000 ( 14 MB)
.text : 0xc0008000 - 0xc0600000 (6112 kB)
.init : 0xc0800000 - 0xc0900000 (1024 kB)
.data : 0xc0900000 - 0xc092a7c0 ( 170 kB)
.bss : 0xc092c000 - 0xc0958438 ( 178 kB)
This output is produced by the mem_init function:
pr_notice("Virtual kernel memory layout:\n"
" vector : 0x%08lx - 0x%08lx (%4ld kB)\n"
#ifdef CONFIG_HAVE_TCM
" DTCM : 0x%08lx - 0x%08lx (%4ld kB)\n"
" ITCM : 0x%08lx - 0x%08lx (%4ld kB)\n"
#endif
" fixmap : 0x%08lx - 0x%08lx (%4ld kB)\n"
" vmalloc : 0x%08lx - 0x%08lx (%4ld MB)\n"
" lowmem : 0x%08lx - 0x%08lx (%4ld MB)\n"
#ifdef CONFIG_HIGHMEM
" pkmap : 0x%08lx - 0x%08lx (%4ld MB)\n"
#endif
#ifdef CONFIG_MODULES
" modules : 0x%08lx - 0x%08lx (%4ld MB)\n"
#endif
" .text : 0x%p" " - 0x%p" " (%4td kB)\n"
" .init : 0x%p" " - 0x%p" " (%4td kB)\n"
" .data : 0x%p" " - 0x%p" " (%4td kB)\n"
" .bss : 0x%p" " - 0x%p" " (%4td kB)\n",
MLK(UL(CONFIG_VECTORS_BASE), UL(CONFIG_VECTORS_BASE) +
(PAGE_SIZE)),
#ifdef CONFIG_HAVE_TCM
MLK(DTCM_OFFSET, (unsigned long) dtcm_end),
MLK(ITCM_OFFSET, (unsigned long) itcm_end),
#endif
MLK(FIXADDR_START, FIXADDR_END),
MLM(VMALLOC_START, VMALLOC_END),
MLM(PAGE_OFFSET, (unsigned long)high_memory),
#ifdef CONFIG_HIGHMEM
MLM(PKMAP_BASE, (PKMAP_BASE) + (LAST_PKMAP) *
(PAGE_SIZE)),
#endif
#ifdef CONFIG_MODULES
MLM(MODULES_VADDR, MODULES_END),
#endif
MLK_ROUNDUP(_text, _etext),
MLK_ROUNDUP(__init_begin, __init_end),
MLK_ROUNDUP(_sdata, _edata),
MLK_ROUNDUP(__bss_start, __bss_stop));
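The size column of each line is computed by the MLK, MLM and MLK_ROUNDUP helper macros used in the call above. The user-space sketch below reproduces what I understand their arithmetic to be (a right shift by 10 or 20 bits, or a division by 1 kB rounded up). The function names are hypothetical and the addresses in the usage note are taken from the boot log at the top.

```c
#include <stdint.h>

/* Hypothetical helpers mirroring the arithmetic of MLK, MLM and MLK_ROUNDUP. */

/* Region size in kB, truncated (like MLK: shift right by 10). */
static inline unsigned long size_kb(uint32_t start, uint32_t end)
{
    return (unsigned long)(end - start) >> 10;
}

/* Region size in MB, truncated (like MLM: shift right by 20). */
static inline unsigned long size_mb(uint32_t start, uint32_t end)
{
    return (unsigned long)(end - start) >> 20;
}

/* Region size in kB, rounded up (like MLK_ROUNDUP). */
static inline unsigned long size_kb_roundup(uint32_t start, uint32_t end)
{
    return ((unsigned long)(end - start) + 1023UL) / 1024UL;
}
```

For example, size_mb(0xf0800000, 0xff800000) gives the 240 MB shown for vmalloc, and size_kb_roundup(0xc0900000, 0xc092a7c0) gives the 170 kB shown for .data.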
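Memory occupied by the kernel image
Once the compiler has built the object files and the linker has linked them, the final size of the kernel image is known, and the image is then packaged into a binary file. The link step is controlled by arch/arm/kernel/vmlinux.lds.S, which also defines the kernel's memory layout. The kernel image itself occupies the memory from _text to _end and is divided into the following sections.
- Text section: _text and _etext are the start and end addresses of the text section, which contains the compiled kernel code.
- init section: __init_begin and __init_end are the start and end addresses of the init section, which contains most of the code and data used for initialization (for example, by built-in modules).
- Data section: _sdata and _edata are the start and end addresses of the data section, which holds most of the kernel's initialized variables.
- BSS section: __bss_start and __bss_stop are the start and end addresses of the BSS section, which contains the static and global variables that are zero-initialized.
All of these section symbols are defined in the linker script vmlinux.lds.S. After the kernel is built, a System.map file is generated, and the concrete values of these addresses can be looked up there.
Text section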
c0008000 T _text
c0008000 T stext
c0600000 R _etext
init section
c0800000 T __init_begin
c0900000 D __init_end
Data section
c0900000 D _sdata
c092a7c0 D _edata
BSS section
c092c000 B __bss_start
c0958438 B __bss_stop
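These virtual addresses are identical to the ones printed by the kernel at boot.
Kernel modules virtual address space
Kernel modules use the 14 MB region of virtual addresses from MODULES_VADDR to MODULES_END.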
arch/arm/include/asm/memory.h
/*
* The module space lives between the addresses given by TASK_SIZE
* and PAGE_OFFSET - it must be within 32MB of the kernel text.
*/
#ifndef CONFIG_THUMB2_KERNEL /* CONFIG_THUMB2_KERNEL is not defined in this configuration */
#define MODULES_VADDR (PAGE_OFFSET - SZ_16M)
#else
/* smaller range for Thumb-2 symbols relocation (2^24)*/
#define MODULES_VADDR (PAGE_OFFSET - SZ_8M)
#endif
/*
* The highmem pkmap virtual space shares the end of the module area.
*/
#ifdef CONFIG_HIGHMEM /* CONFIG_HIGHMEM is defined in this configuration */
#define MODULES_END (PAGE_OFFSET - PMD_SIZE)
#else
#define MODULES_END (PAGE_OFFSET)
#endif
#define PMD_SIZE (1UL << PMD_SHIFT) /* 2 MB */
When user space and kernel space are split 3:1, the kernel space is only 1 GB and PAGE_OFFSET is 0xc0000000, so:
MODULES_VADDR = PAGE_OFFSET - SZ_16M = 0xbf000000
MODULES_END = PAGE_OFFSET - PMD_SIZE = 0xbfe00000
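The two results above can be checked with a small user-space sketch. It restates the macros from memory.h as functions so that both configuration branches can be exercised; the function names and the boolean parameters are made up for illustration, and PMD_SHIFT is assumed to be 21 (matching the 2 MB PMD_SIZE noted above).

```c
#define SZ_8M     0x00800000UL
#define SZ_16M    0x01000000UL
#define PMD_SHIFT 21                  /* assumed: gives the 2 MB PMD_SIZE from the text */
#define PMD_SIZE  (1UL << PMD_SHIFT)

/* Start of the module area; thumb2 stands in for CONFIG_THUMB2_KERNEL. */
static inline unsigned long modules_vaddr(unsigned long page_offset, int thumb2)
{
    return page_offset - (thumb2 ? SZ_8M : SZ_16M);
}

/* End of the module area; highmem stands in for CONFIG_HIGHMEM.
 * With highmem, the last PMD below PAGE_OFFSET is left for pkmap. */
static inline unsigned long modules_end(unsigned long page_offset, int highmem)
{
    return highmem ? page_offset - PMD_SIZE : page_offset;
}
```

With page_offset = 0xc0000000, thumb2 = 0 and highmem = 1, this yields 0xbf000000 and 0xbfe00000, a span of 14 MB.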
vmalloc virtual address space
vmalloc uses the virtual address range VMALLOC_START ~ VMALLOC_END:
#define VMALLOC_OFFSET (8*1024*1024)
#define VMALLOC_START (((unsigned long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1))
#define VMALLOC_END 0xff800000UL
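As the macros show, the start of the vmalloc area depends on high_memory: it is high_memory plus an 8 MB gap, rounded down to an 8 MB boundary.
high_memory is assigned in the function adjust_lowmem_bounds: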
void __init adjust_lowmem_bounds(void)
{
phys_addr_t memblock_limit = 0;
u64 vmalloc_limit;
struct memblock_region *reg;
phys_addr_t lowmem_limit = 0;
/*
* Let's use our own (unoptimized) equivalent of __pa() that is
* not affected by wrap-arounds when sizeof(phys_addr_t) == 4.
* The result is used as the upper bound on physical memory address
* and may itself be outside the valid range for which phys_addr_t
* and therefore __pa() is defined.
*/
vmalloc_limit = (u64)(uintptr_t)vmalloc_min - PAGE_OFFSET + PHYS_OFFSET;
for_each_memblock(memory, reg) {
phys_addr_t block_start = reg->base;
phys_addr_t block_end = reg->base + reg->size;
if (reg->base < vmalloc_limit) {
if (block_end > lowmem_limit)
/*
* Compare as u64 to ensure vmalloc_limit does
* not get truncated. block_end should always
* fit in phys_addr_t so there should be no
* issue with assignment.
*/
lowmem_limit = min_t(u64,
vmalloc_limit,
block_end);
/*
* Find the first non-pmd-aligned page, and point
* memblock_limit at it. This relies on rounding the
* limit down to be pmd-aligned, which happens at the
* end of this function.
*
* With this algorithm, the start or end of almost any
* bank can be non-pmd-aligned. The only exception is
* that the start of the bank 0 must be section-
* aligned, since otherwise memory would need to be
* allocated when mapping the start of bank 0, which
* occurs before any free memory is mapped.
*/
if (!memblock_limit) {
if (!IS_ALIGNED(block_start, PMD_SIZE))
memblock_limit = block_start;
else if (!IS_ALIGNED(block_end, PMD_SIZE))
memblock_limit = lowmem_limit;
}
}
}
arm_lowmem_limit = lowmem_limit;
high_memory = __va(arm_lowmem_limit - 1) + 1;
if (!memblock_limit)
memblock_limit = arm_lowmem_limit;
/*
* Round the memblock limit down to a pmd size. This
* helps to ensure that we will allocate memory from the
* last full pmd, which should be mapped.
*/
memblock_limit = round_down(memblock_limit, PMD_SIZE);
if (!IS_ENABLED(CONFIG_HIGHMEM) || cache_is_vipt_aliasing()) {
if (memblock_end_of_DRAM() > arm_lowmem_limit) {
phys_addr_t end = memblock_end_of_DRAM();
pr_notice("Ignoring RAM at %pa-%pa\n",
&memblock_limit, &end);
pr_notice("Consider using a HIGHMEM enabled kernel.\n");
memblock_remove(memblock_limit, end - memblock_limit);
}
}
memblock_set_current_limit(memblock_limit);
}
The key lines are the following:
1. vmalloc_limit = (u64)(uintptr_t)vmalloc_min - PAGE_OFFSET + PHYS_OFFSET; // PHYS_OFFSET is defined as 0x60000000 on the Vexpress platform, i.e. the start address of physical memory
2. static void * __initdata vmalloc_min =
(void *)(VMALLOC_END - (240 << 20) - VMALLOC_OFFSET);
3. lowmem_limit = min_t(u64, vmalloc_limit, block_end);
4. arm_lowmem_limit = lowmem_limit;
5. high_memory = __va(arm_lowmem_limit - 1) + 1;
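From this we get:
vmalloc_limit = 0xff800000UL - (240 << 20) - (8*1024*1024) - 0xc0000000 + 0x60000000 = 0xff800000 - 0x0f800000 - 0xc0000000 + 0x60000000 = 0x90000000
arm_lowmem_limit = lowmem_limit = vmalloc_limit (the memory bank ends at or above 0x90000000 on this platform, so min_t() picks vmalloc_limit)
high_memory = 0x90000000 - 0x60000000 + 0xc0000000 = 0xf0000000
Therefore: VMALLOC_START = (0xf0000000 + 8*1024*1024) & ~(8*1024*1024 - 1) = 0xf0800000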
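The chain of calculations above can be reproduced in user space. The sketch below is a simplification, not the kernel code: it assumes a single memory bank starting at PHYS_OFFSET, uses 64-bit arithmetic throughout to sidestep wrap-around, and turns kernel variables such as vmalloc_min and high_memory into hypothetical functions.

```c
#include <stdint.h>

#define PAGE_OFFSET    0xc0000000ULL
#define PHYS_OFFSET    0x60000000ULL            /* Vexpress RAM base, from the text */
#define VMALLOC_END    0xff800000ULL
#define VMALLOC_OFFSET (8ULL * 1024 * 1024)

/* Lowest virtual address at which the vmalloc gap may begin (240 MB default). */
static inline uint64_t vmalloc_min(void)
{
    return VMALLOC_END - (240ULL << 20) - VMALLOC_OFFSET;
}

/* Physical address corresponding to vmalloc_min. */
static inline uint64_t vmalloc_limit(void)
{
    return vmalloc_min() - PAGE_OFFSET + PHYS_OFFSET;
}

/* Lowmem ends at the end of RAM or at vmalloc_limit, whichever is lower. */
static inline uint64_t lowmem_limit(uint64_t ram_end)
{
    return ram_end < vmalloc_limit() ? ram_end : vmalloc_limit();
}

/* Same as __va(limit - 1) + 1 in the kernel code. */
static inline uint64_t high_memory(uint64_t limit)
{
    return (limit - 1) - PHYS_OFFSET + PAGE_OFFSET + 1;
}

/* high_memory plus the 8 MB gap, rounded down to an 8 MB boundary. */
static inline uint64_t vmalloc_start(uint64_t high_mem)
{
    return (high_mem + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET - 1);
}
```

With 1 GB of RAM (ram_end = 0xa0000000), the functions give 0x90000000, 0xf0000000 and 0xf0800000 in turn, matching the values derived above. With only 512 MB (ram_end = 0x80000000), lowmem ends earlier and vmalloc_start moves down to 0xe0800000.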
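__va converts a physical address to a virtual address. When the kernel space is only 1 GB, part of it is used to map physical memory directly; this region is called the linear mapping area. In the linear mapping area, virtual and physical addresses differ by a constant offset, PAGE_OFFSET - PHYS_OFFSET (this equals PAGE_OFFSET, i.e. 3 GB, only when physical memory starts at address 0). The kernel provides macros that translate between virtual and physical addresses in the linear mapping area, namely __pa(x) and __va(x).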
static inline phys_addr_t __virt_to_phys(unsigned long x)
{
return (phys_addr_t)x - PAGE_OFFSET + PHYS_OFFSET;
}
static inline unsigned long __phys_to_virt(phys_addr_t x)
{
return x - PHYS_OFFSET + PAGE_OFFSET;
}
/*
* Drivers should NOT use these either.
*/
#define __pa(x) __virt_to_phys((unsigned long)(x))
#define __va(x) ((void *)__phys_to_virt((phys_addr_t)(x)))
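To see the two conversions at work, here is a user-space sketch with the Vexpress values plugged in. The function names are illustrative stand-ins for __virt_to_phys and __phys_to_virt, and 32-bit unsigned arithmetic is used to mimic a 32-bit phys_addr_t.

```c
#include <stdint.h>

#define PAGE_OFFSET 0xc0000000u   /* start of the kernel linear mapping */
#define PHYS_OFFSET 0x60000000u   /* Vexpress RAM base, from the text */

/* Linear-map virtual address -> physical address (stand-in for __pa). */
static inline uint32_t virt_to_phys(uint32_t va)
{
    return va - PAGE_OFFSET + PHYS_OFFSET;
}

/* Physical address -> linear-map virtual address (stand-in for __va). */
static inline uint32_t phys_to_virt(uint32_t pa)
{
    return pa - PHYS_OFFSET + PAGE_OFFSET;
}
```

For instance, the kernel text start 0xc0008000 corresponds to physical address 0x60008000, and phys_to_virt(0x90000000 - 1) + 1 gives 0xf0000000, which is how high_memory is obtained from arm_lowmem_limit.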
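lowmem virtual address space
This virtual address space spans PAGE_OFFSET ~ high_memory. From the derivation above, that is 0xc0000000 - 0xf0000000 (768 MB).
Its size depends on the amount of physical memory, for example as passed in through the mem boot parameter: if physical memory is smaller than 768 MB, lowmem is as large as the physical memory; if physical memory is larger than 768 MB, lowmem is capped at 768 MB.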
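The rule can be written as a tiny sketch. The 768 MB cap is the value derived for this Vexpress configuration, the function names are hypothetical, and the "remainder" is only usable as high memory when CONFIG_HIGHMEM is enabled (otherwise adjust_lowmem_bounds drops it with the "Ignoring RAM" message).

```c
#define LOWMEM_MAX_MB 768u   /* cap for this configuration, from the text */

/* Size of the linearly mapped lowmem region for a given amount of RAM. */
static inline unsigned int lowmem_mb(unsigned int ram_mb)
{
    return ram_mb < LOWMEM_MAX_MB ? ram_mb : LOWMEM_MAX_MB;
}

/* RAM left over above lowmem: high memory, or ignored without CONFIG_HIGHMEM. */
static inline unsigned int remainder_mb(unsigned int ram_mb)
{
    return ram_mb - lowmem_mb(ram_mb);
}
```

A board with 512 MB of RAM gets 512 MB of lowmem and no remainder, while a 1 GB board gets 768 MB of lowmem and 256 MB left over.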
pkmap virtual address space
This virtual address space exists only when CONFIG_HIGHMEM is enabled. Its range is PKMAP_BASE ~ PKMAP_BASE + LAST_PKMAP * PAGE_SIZE.
arch/arm/include/asm/highmem.h
#define PKMAP_BASE (PAGE_OFFSET - PMD_SIZE) /* 0xbfe00000 */
#define LAST_PKMAP PTRS_PER_PTE /* 512 */
So the pkmap address space is 2 MB in size (512 entries of 4 kB each) and ends at 0xc0000000.
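A quick sketch of that arithmetic, assuming 4 kB pages, a 2 MB PMD_SIZE and 512 PTEs per page table as stated above. The two function names are made up for illustration.

```c
#define PAGE_SHIFT   12
#define PAGE_SIZE    (1UL << PAGE_SHIFT)   /* 4 kB pages */
#define PMD_SIZE     (1UL << 21)           /* 2 MB, from the text */
#define PTRS_PER_PTE 512UL
#define LAST_PKMAP   PTRS_PER_PTE

/* Start of the pkmap window: one PMD below PAGE_OFFSET. */
static inline unsigned long pkmap_base(unsigned long page_offset)
{
    return page_offset - PMD_SIZE;
}

/* End of the pkmap window: LAST_PKMAP pages above its base. */
static inline unsigned long pkmap_end(unsigned long page_offset)
{
    return pkmap_base(page_offset) + LAST_PKMAP * PAGE_SIZE;
}
```

With page_offset = 0xc0000000, the window runs from 0xbfe00000 up to exactly 0xc0000000, which is why MODULES_END stops at PAGE_OFFSET - PMD_SIZE when CONFIG_HIGHMEM is set.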
fixmap virtual address space
This is the fixed-mapping region. Its virtual address range is FIXADDR_START ~ FIXADDR_END:
#define FIXADDR_START 0xffc00000UL
#define FIXADDR_END 0xfff00000UL
#define FIXADDR_TOP (FIXADDR_END - PAGE_SIZE)
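As a small illustration of these three defines, the sketch below computes FIXADDR_TOP and the number of page-sized slots in the region, assuming 4 kB pages. The function names are hypothetical.

```c
#define PAGE_SHIFT    12
#define PAGE_SIZE     (1UL << PAGE_SHIFT)   /* 4 kB pages (assumed) */
#define FIXADDR_START 0xffc00000UL
#define FIXADDR_END   0xfff00000UL

/* Address of the last page in the fixmap region. */
static inline unsigned long fixaddr_top(void)
{
    return FIXADDR_END - PAGE_SIZE;
}

/* Number of page-sized slots between FIXADDR_START and FIXADDR_END. */
static inline unsigned long fixmap_pages(void)
{
    return (FIXADDR_END - FIXADDR_START) >> PAGE_SHIFT;
}
```

The region is 3072 kB, i.e. 768 pages, and FIXADDR_TOP comes out as 0xffeff000.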
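vector virtual address space
Its virtual address range is CONFIG_VECTORS_BASE ~ CONFIG_VECTORS_BASE + PAGE_SIZE.
CONFIG_VECTORS_BASE is defined in the kernel configuration file; its value is CONFIG_VECTORS_BASE=0xffff0000.
The end address is therefore 0xffff1000.
Why does the kernel linearly map only 768 MB? What is the remaining 256 MB of virtual address space used for?
It is reserved for vmalloc, fixmap, the high vector table and similar uses. Many kernel drivers use vmalloc to allocate memory that is contiguous in virtual address space, because they do not need physically contiguous memory; in addition, the vmalloc area can be used for temporary mappings of high memory. A 32-bit system may have more physical memory than the kernel's linear mapping can cover, yet the kernel must still be able to address all of it.
The Linux kernel virtual address space ranges are as follows: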