The Linux System Boot Process

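First, the Makefile in the U-Boot root directory (the version examined here is u-boot-2013.04-rc2) drives the U-Boot build. In that file:
1) The following statements locate the linker script (LDSCRIPT):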

# load other configuration
include $(TOPDIR)/config.mk

# If board code explicitly specified LDSCRIPT or CONFIG_SYS_LDSCRIPT, use
# that (or fail if absent).  Otherwise, search for a linker script in a
# standard location.

LDSCRIPT_MAKEFILE_DIR = $(dir $(LDSCRIPT))

ifndef LDSCRIPT
	#LDSCRIPT := $(TOPDIR)/board/$(BOARDDIR)/u-boot.lds.debug
	ifdef CONFIG_SYS_LDSCRIPT
		# need to strip off double quotes
		LDSCRIPT := $(subst ",,$(CONFIG_SYS_LDSCRIPT))
	endif
endif

# If there is no specified link script, we look in a number of places for it
ifndef LDSCRIPT
	ifeq ($(CONFIG_NAND_U_BOOT),y)
		LDSCRIPT := $(TOPDIR)/board/$(BOARDDIR)/u-boot-nand.lds
		ifeq ($(wildcard $(LDSCRIPT)),)
			LDSCRIPT := $(TOPDIR)/$(CPUDIR)/u-boot-nand.lds
		endif
	endif
	ifeq ($(wildcard $(LDSCRIPT)),)
		LDSCRIPT := $(TOPDIR)/board/$(BOARDDIR)/u-boot.lds
	endif
	ifeq ($(wildcard $(LDSCRIPT)),)
		LDSCRIPT := $(TOPDIR)/$(CPUDIR)/u-boot.lds
	endif
	ifeq ($(wildcard $(LDSCRIPT)),)
		LDSCRIPT := $(TOPDIR)/arch/$(ARCH)/cpu/u-boot.lds
		# We don't expect a Makefile here
		LDSCRIPT_MAKEFILE_DIR =
	endif
	ifeq ($(wildcard $(LDSCRIPT)),)
$(error could not find linker script)
	endif
endif

For NUC970-series devices on the ARM architecture, this resolves to arch/arm/cpu/u-boot.lds.
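The fallback order used by the Makefile can be paraphrased in C. This is only an illustrative sketch and not part of U-Boot: pick_ldscript() is a hypothetical helper, and it uses POSIX access() where the Makefile uses $(wildcard ...).

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the first candidate path that exists,
 * or NULL if none does (the Makefile raises $(error ...) in that case).
 */
const char *pick_ldscript(const char *const candidates[], size_t count)
{
    for (size_t i = 0; i < count; i++) {
        /* access(..., F_OK) returns 0 when the path exists */
        if (access(candidates[i], F_OK) == 0)
            return candidates[i];
    }
    return NULL;
}
```

Called with the candidates in the Makefile's order (the board directory first, then $(CPUDIR), then arch/$(ARCH)/cpu), it would return the architecture-level script for a board that ships no script of its own.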

2) The following statements specify the first object to be linked, namely start.o, which is built from start.S in the CPU subdirectory:

#########################################################################
# U-Boot objects....order is important (i.e. start must be first)

OBJS  = $(CPUDIR)/start.o
ifeq ($(CPU),ppc4xx)
OBJS += $(CPUDIR)/resetvec.o
endif
ifeq ($(CPU),mpc85xx)
OBJS += $(CPUDIR)/resetvec.o
endif

For devices based on ARM926-series chips, this is arch/arm/cpu/arm926ejs/start.S.

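In arch/arm/cpu/arm926ejs/start.S, the first instruction is a branch to reset (when CONFIG_SYS_DV_NOR_BOOT_CFG is defined, a boot configuration word precedes it):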

#ifdef CONFIG_SYS_DV_NOR_BOOT_CFG
.globl _start
_start:
.globl _NOR_BOOT_CFG
_NOR_BOOT_CFG:
	.word	CONFIG_SYS_DV_NOR_BOOT_CFG
	b	reset
#else
.globl _start
_start:
	b	reset
#endif

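The branch is followed by the remaining exception vectors (the addresses of the exception handlers). The reset routine that comes after them is implemented as follows: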

/*
 * the actual reset code
 */

reset:
	/*
	 * set the cpu to SVC32 mode
	 */
	mrs	r0,cpsr
	bic	r0,r0,#0x1f
	orr	r0,r0,#0xd3
	msr	cpsr,r0

	/*
	 * we do sys-critical inits only at reboot,
	 * not when booting from ram!
	 */
#ifndef CONFIG_SKIP_LOWLEVEL_INIT
	bl	cpu_init_crit
#endif

	bl	_main

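The code first switches the CPU to SVC32 mode with IRQ and FIQ masked. It then calls cpu_init_crit(), which is skipped when CONFIG_SKIP_LOWLEVEL_INIT is defined (for example, when U-Boot is started from RAM and the low-level hardware is already initialized), and finally calls _main().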
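The bic/orr pair at the top of reset is easier to read once the constant 0xd3 is broken into its fields. The sketch below mirrors the two instructions in C; the macro names and the function cpsr_enter_svc32() are made up for illustration and do not appear in U-Boot.

```c
#include <stdint.h>

/* Illustrative names for the low CPSR bits (not U-Boot macros) */
#define CPSR_MODE_MASK 0x1fu /* M[4:0], processor mode field */
#define CPSR_MODE_SVC  0x13u /* supervisor mode */
#define CPSR_F_BIT     0x40u /* FIQ disabled */
#define CPSR_I_BIT     0x80u /* IRQ disabled */

/* Mirrors: bic r0,r0,#0x1f ; orr r0,r0,#0xd3 */
uint32_t cpsr_enter_svc32(uint32_t cpsr)
{
    cpsr &= ~CPSR_MODE_MASK;                         /* clear the old mode */
    cpsr |= CPSR_I_BIT | CPSR_F_BIT | CPSR_MODE_SVC; /* 0x80|0x40|0x13 = 0xd3 */
    return cpsr;
}
```

The condition flags in the upper bits pass through unchanged; only the mode field and the two interrupt mask bits are affected.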

cpu_init_crit() is also defined in start.S, further down, just before the interrupt stack-frame handling code:

/*
 *************************************************************************
 *
 * CPU_init_critical registers
 *
 * setup important registers
 * setup memory timing
 *
 *************************************************************************
 */
#ifndef CONFIG_SKIP_LOWLEVEL_INIT
cpu_init_crit:
	/*
	 * flush D cache before disabling it
	 */
	mov	r0, #0
flush_dcache:
	mrc	p15, 0, r15, c7, c10, 3
	bne	flush_dcache

	mcr	p15, 0, r0, c8, c7, 0	/* invalidate TLB */
	mcr	p15, 0, r0, c7, c5, 0	/* invalidate I Cache */

	/*
	 * disable MMU and D cache
	 * enable I cache if CONFIG_SYS_ICACHE_OFF is not defined
	 */
	mrc	p15, 0, r0, c1, c0, 0
	bic	r0, r0, #0x00000300	/* clear bits 9:8 (---- --RS) */
	bic	r0, r0, #0x00000087	/* clear bits 7, 2:0 (B--- -CAM) */
#ifdef CONFIG_SYS_EXCEPTION_VECTORS_HIGH
	orr	r0, r0, #0x00002000	/* set bit 13 (--V- ----) */
#else
	bic	r0, r0, #0x00002000	/* clear bit 13 (--V- ----) */
#endif
	orr	r0, r0, #0x00000002	/* set bit 1 (A) Align */
#ifndef CONFIG_SYS_ICACHE_OFF
	orr	r0, r0, #0x00001000	/* set bit 12 (I) I-Cache */
#endif
	mcr	p15, 0, r0, c1, c0, 0

	/*
	 * Go setup Memory and board specific bits prior to relocation.
	 */
	mov	ip, lr		/* preserve link reg across call */
	bl	lowlevel_init	/* go setup pll,mux,memory */
	mov	lr, ip		/* restore link */
	mov	pc, lr		/* back to my caller */
#endif /* CONFIG_SKIP_LOWLEVEL_INIT */

As shown, it ends by calling lowlevel_init(), which performs the board-specific setup (PLL, pin mux, memory).
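The sequence of bic/orr operations on the CP15 control register in cpu_init_crit can be restated in C to make the individual bits explicit. This is a sketch for reading purposes only; the CR_* names and cp15_ctrl_after_init() are illustrative, and the two boolean parameters stand in for the CONFIG_SYS_EXCEPTION_VECTORS_HIGH and CONFIG_SYS_ICACHE_OFF build options.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names for the control register bits touched by cpu_init_crit */
#define CR_M (1u << 0)  /* MMU enable */
#define CR_A (1u << 1)  /* alignment fault checking */
#define CR_C (1u << 2)  /* D-cache enable */
#define CR_B (1u << 7)  /* big-endian */
#define CR_S (1u << 8)  /* system protection */
#define CR_R (1u << 9)  /* ROM protection */
#define CR_I (1u << 12) /* I-cache enable */
#define CR_V (1u << 13) /* high exception vectors */

/* Compute the value written back to the control register. */
uint32_t cp15_ctrl_after_init(uint32_t cr, bool vectors_high, bool icache_on)
{
    cr &= ~(CR_R | CR_S);               /* bic #0x00000300 */
    cr &= ~(CR_B | CR_C | CR_A | CR_M); /* bic #0x00000087 */
    if (vectors_high)
        cr |= CR_V;                     /* orr #0x00002000 */
    else
        cr &= ~CR_V;                    /* bic #0x00002000 */
    cr |= CR_A;                         /* orr #0x00000002 */
    if (icache_on)
        cr |= CR_I;                     /* orr #0x00001000 */
    return cr;
}
```

Starting from an all-zero register with low vectors and the I-cache enabled, the result is 0x1002: MMU and D-cache off, alignment checking and I-cache on.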

The _main() routine that reset calls last is defined in arch/arm/lib/crt0.S:

/*
 * entry point of crt0 sequence
 */

.global _main

_main:

/*
 * Set up initial C runtime environment and call board_init_f(0).
 */

#if defined(CONFIG_NAND_SPL)
	/* deprecated, use instead CONFIG_SPL_BUILD */
	ldr	sp, =(CONFIG_SYS_INIT_SP_ADDR)
#elif defined(CONFIG_SPL_BUILD) && defined(CONFIG_SPL_STACK)
	ldr	sp, =(CONFIG_SPL_STACK)
#else
	ldr	sp, =(CONFIG_SYS_INIT_SP_ADDR)
#endif
	bic	sp, sp, #7	/* 8-byte alignment for ABI compliance */
	sub	sp, #GD_SIZE	/* allocate one GD above SP */
	bic	sp, sp, #7	/* 8-byte alignment for ABI compliance */
	mov	r8, sp		/* GD is above SP */
	mov	r0, #0
	bl	board_init_f

#if ! defined(CONFIG_SPL_BUILD)

/*
 * Set up intermediate environment (new sp and gd) and call
 * relocate_code(addr_sp, gd, addr_moni). Trick here is that
 * we'll return 'here' but relocated.
 */

	ldr	sp, [r8, #GD_START_ADDR_SP]	/* r8 = gd->start_addr_sp */
	bic	sp, sp, #7	/* 8-byte alignment for ABI compliance */
	ldr	r8, [r8, #GD_BD]		/* r8 = gd->bd */
	sub	r8, r8, #GD_SIZE		/* new GD is below bd */

	adr	lr, here
	ldr	r0, [r8, #GD_RELOC_OFF]		/* r0 = gd->reloc_off */
	add	lr, lr, r0
	ldr	r0, [r8, #GD_START_ADDR_SP]	/* r0 = gd->start_addr_sp */
	mov	r1, r8				/* r1 = gd */
	ldr	r2, [r8, #GD_RELOCADDR]		/* r2 = gd->relocaddr */
	b	relocate_code
here:

/* Set up final (full) environment */

	bl	c_runtime_cpu_setup	/* we still call old routine here */

	ldr	r0, =__bss_start	/* this is auto-relocated! */
	ldr	r1, =__bss_end		/* this is auto-relocated! */

	mov	r2, #0x00000000		/* prepare zero to clear BSS */

clbss_l:cmp	r0, r1			/* while not at end of BSS */
	strlo	r2, [r0]		/* clear 32-bit BSS word */
	addlo	r0, r0, #4		/* move to next */
	blo	clbss_l

	bl coloured_LED_init
	bl red_led_on

#if defined(CONFIG_NAND_SPL)

	/* call _nand_boot() */
	ldr     pc, =nand_boot

#else

	/* call board_init_r(gd_t *id, ulong dest_addr) */
	mov	r0, r8			/* gd_t */
	ldr	r1, [r8, #GD_RELOCADDR]	/* dest_addr */
	/* call board_init_r */
	ldr	pc, =board_init_r	/* this is auto-relocated! */

#endif

	/* we should not return here. */

#endif

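The main job of _main() is to set up the runtime environment for C code. It first sets the initial stack pointer and reserves space at the top of the stack for the global data structure (GD), then calls board_init_f(0). Next it runs the relocation routine relocate_code(). Note that everything after the label "here" executes from the relocated copy of U-Boot! It then clears the BSS segment. Finally, depending on the build configuration, it jumps to nand_boot() if CONFIG_NAND_SPL is defined, and to board_init_r() otherwise.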
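The first few instructions of _main, which place the global data just above the initial stack, amount to a small address calculation. The C sketch below reproduces it; initial_gd_addr() is a hypothetical name, and the sizes used in any call are placeholders, because the real GD_SIZE is generated at build time.

```c
#include <stdint.h>

/*
 * Mirrors the start of _main:
 *   bic sp, sp, #7 ; sub sp, #GD_SIZE ; bic sp, sp, #7 ; mov r8, sp
 * The returned address is both the GD pointer (r8) and the initial
 * stack pointer, so the stack grows downward from just below the GD.
 */
uint32_t initial_gd_addr(uint32_t init_sp_addr, uint32_t gd_size)
{
    uint32_t sp = init_sp_addr;

    sp &= ~7u;     /* 8-byte alignment for ABI compliance */
    sp -= gd_size; /* reserve room for one GD */
    sp &= ~7u;     /* realign in case gd_size is not a multiple of 8 */
    return sp;
}
```

With an assumed CONFIG_SYS_INIT_SP_ADDR of 0x1003 and an assumed GD size of 0x7c, the result is 0xf80.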

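board_init_r() is implemented in common/board_r.c. That file defines an array of initialization function pointers, init_sequence_r[], and board_init_r() calls these functions in order to complete U-Boot's initialization. board_init_r() is implemented as follows: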

void board_init_r(gd_t *new_gd, ulong dest_addr)
{
#ifndef CONFIG_X86
	gd = new_gd;
#endif
	if (initcall_run_list(init_sequence_r))
		hang();

	/* NOTREACHED - run_main_loop() does not return */
	hang();
}

The first entry of init_sequence_r[] is initr_reloc() and the last is run_main_loop(); all of these initialization functions are implemented in common/board_r.c. run_main_loop() is simply an infinite loop:

static int run_main_loop(void)
{
	/* main_loop() can return to retry autoboot, if so just run it again */
	for (;;)
		main_loop();
	return 0;
}

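main_loop() itself is implemented in common/main.c. After initializing the basic boot environment parameters, it enters another infinite loop that waits for the user to type a command, then parses and executes it. That completes the U-Boot initialization sequence.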
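The way board_init_r() walks init_sequence_r[] can be sketched as a loop over a NULL-terminated array of function pointers that stops at the first failure. The code below is a simplified stand-in, not U-Boot's initcall_run_list(): run_init_list(), demo_step_ok(), demo_step_fail() and steps_run are all invented for the example.

```c
#include <stddef.h>

typedef int (*init_fnc_t)(void);

int steps_run; /* demo counter: how many steps completed successfully */

/* Demo steps standing in for entries such as initr_reloc() */
int demo_step_ok(void)   { steps_run++; return 0; }
int demo_step_fail(void) { return 1; }

/*
 * Call each function in a NULL-terminated list in order.
 * Returns 0 if all succeed, -1 as soon as one returns non-zero.
 */
int run_init_list(const init_fnc_t list[])
{
    for (const init_fnc_t *fn = list; *fn != NULL; fn++) {
        if ((*fn)() != 0)
            return -1; /* the caller would hang() here */
    }
    return 0;
}
```

In this model the last entry never returns in practice, which matches run_main_loop() sitting at the end of the real array.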

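The general machinery for parsing and executing user commands lives in common/command.c, while each individual command is implemented in a common/cmd_xxx.c file, where xxx relates fairly directly to what the command operates on. The commands currently supported are: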

0       - do nothing, unsuccessfully
1       - do nothing, successfully
?       - alias for 'help'
base    - print or set address offset
bdinfo  - print Board Info structure
boot    - boot default, i.e., run 'bootcmd'
bootd   - boot default, i.e., run 'bootcmd'
bootm   - boot application image from memory
bootp   - boot image via network using BOOTP/TFTP protocol
chpart  - change active partition
cls     - clear screen
cmp     - memory compare
coninfo - print console devices and information
cp      - memory copy
crc32   - checksum calculation
decrypt - Decrypt image(kernel)
dhcp    - boot image via network using DHCP/TFTP protocol
echo    - echo args to console
editenv - edit environment variable
env     - environment handling commands
exit    - exit script
go      - start application at address 'addr'
gpio    - input/set/clear/toggle gpio pins
help    - print command description/usage
iminfo  - print header information for application image
imxtract- extract a part of a multi-image
itest   - return true/false on integer compare
lcdecho - echo args to console
loadb   - load binary file over serial line (kermit mode)
loady   - load binary file over serial line (ymodem mode)
loop    - infinite loop on address range
md      - memory display
mm      - memory modify (auto-incrementing address)
msleep  - msleep execution for some time
mtdparts- define flash/nand partitions
mtest   - simple RAM read/write test
mw      - memory write (fill)
nand    - NAND sub-system
nboot   - boot from NAND device
nfs     - boot image via network using NFS protocol
nm      - memory modify (constant address)
ping    - send ICMP ECHO_REQUEST to network host
printenv- print environment variables
reset   - Perform RESET of the CPU
run     - run commands in an environment variable
saveenv - save environment variables to persistent storage
setenv  - set environment variables
sf      - SPI flash sub-system
showvar - print local hushshell variables
sleep   - delay execution for some time
sspi    - SPI utility command
test    - minimal test like /bin/sh
tftpboot- boot image via network using TFTP protocol
tftpput - TFTP put command, for uploading files to a server
timer   - access the system timer
version - print monitor, compiler and linker version
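Conceptually, executing a command means looking its name up in a table and calling the associated handler. The following is a minimal sketch of that idea under simplifying assumptions (exact-match lookup only, a plain array as the table); struct cmd_entry, lookup_cmd(), dispatch_cmd() and demo_cmds are hypothetical and do not reproduce the code in common/command.c. The two demo handlers imitate the "0" and "1" commands from the list above.

```c
#include <stddef.h>
#include <string.h>

typedef int (*cmd_handler_t)(int argc, char *const argv[]);

struct cmd_entry {
    const char *name;
    cmd_handler_t handler;
};

/* "0": do nothing, unsuccessfully; "1": do nothing, successfully */
static int do_false(int argc, char *const argv[]) { (void)argc; (void)argv; return 1; }
static int do_true(int argc, char *const argv[])  { (void)argc; (void)argv; return 0; }

const struct cmd_entry demo_cmds[] = {
    { "0", do_false },
    { "1", do_true },
};

/* Find a command by exact name; returns NULL if it is not in the table. */
const struct cmd_entry *lookup_cmd(const struct cmd_entry *table, size_t count,
                                   const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    }
    return NULL;
}

/* Run argv[0] with its arguments; returns -1 for an unknown command. */
int dispatch_cmd(const struct cmd_entry *table, size_t count,
                 int argc, char *const argv[])
{
    const struct cmd_entry *entry;

    if (argc < 1)
        return -1;
    entry = lookup_cmd(table, count, argv[0]);
    if (entry == NULL)
        return -1;
    return entry->handler(argc, argv);
}
```

The handler receives the full argument vector, including the command name, which is the same convention as do_bootm(cmdtp, flag, argc, argv) later in this article.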

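Of all these commands, the most important are the ones that load and run the operating system, that is, the boot* family (boot, bootd, bootm, bootp, and so on). bootm boots an image that is already in memory and is implemented in common/cmd_bootm.c. The other boot commands read the kernel image and the rootfs image into memory from some source (SPI flash, NAND flash, the network, etc.) and then run bootm, which loads the kernel from memory (decompressing it if necessary), starts it, and tells it where in memory to find the rootfs image.

Once the kernel is running and has mounted the root filesystem, all system resources can be addressed uniformly (through their drivers). The /etc/rcS.d directory in the rootfs holds a series of initialization scripts that bring up peripherals, set up the system environment, and start basic services. Finally the login prompt is started and waits for a user to log in.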

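Taking bootm as an example, the function that runs is do_bootm().
1) It first calls bootm_start(), which in turn calls boot_get_kernel() and boot_get_ramdisk() to obtain the storage addresses of the kernel image and the ramdisk image from the bootm command-line arguments. It reads each image header at the given address to obtain the data start address, load address, length, compression type, and so on. The kernel and ramdisk can be packaged together or built as separate files, and either may be compressed or uncompressed. The ramdisk image is in fact the rootfs image. The results are stored in the global variable images, of type bootm_headers_t.
2) do_bootm() then disables interrupts and shuts down the network interface, USB, and other interactive peripherals (including resetting the boot flash). It calls bootm_load_os() to load the kernel from the start address just obtained. The compression type matters here: an uncompressed image is copied directly from its data start address to the load address, whereas a compressed image is decompressed with the matching algorithm from the start address to the load address.
3) Finally, for a STANDALONE image, it calls bootm_start_standalone() directly to run it. Otherwise it looks up the boot function registered for the image's OS type and calls it. That function is not expected to return; if it does, the boot has failed and do_reset() is called to restart. The code is as follows: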

int do_bootm(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[])
{
......
	boot_fn = boot_os[images.os.os];

	if (boot_fn == NULL) {
		if (iflag)
			enable_interrupts();
		printf("ERROR: booting os '%s' (%d) is not supported\n",
			genimg_get_os_name(images.os.os), images.os.os);
		bootstage_error(BOOTSTAGE_ID_CHECK_BOOT_OS);
		return 1;
	}

	arch_preboot_os();

	boot_fn(0, argc, argv, &images);

	bootstage_error(BOOTSTAGE_ID_BOOT_OS_RETURNED);
#ifdef DEBUG
	puts("\n## Control returned to monitor - resetting...\n");
#endif
	do_reset(cmdtp, flag, argc, argv);
......
}
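Step 1 above depends on reading the image header. Assuming the legacy uImage format (a 64-byte header with big-endian fields, starting with the magic number 0x27051956), extracting the load address, entry point, length and compression type could look like the sketch below. struct image_info and parse_legacy_header() are invented names, the sketch skips the CRC checks that real code would perform, and FIT images are laid out differently.

```c
#include <stddef.h>
#include <stdint.h>

#define IH_MAGIC       0x27051956u /* legacy uImage magic number */
#define IH_HEADER_SIZE 64u

/* Subset of the header fields that bootm needs (illustrative struct) */
struct image_info {
    uint32_t size; /* payload length in bytes */
    uint32_t load; /* load address */
    uint32_t ep;   /* entry point address */
    uint8_t os;    /* OS type, e.g. 5 for Linux */
    uint8_t arch;  /* CPU architecture, e.g. 2 for ARM */
    uint8_t type;  /* image type, e.g. 2 for kernel */
    uint8_t comp;  /* compression, e.g. 0 for none, 1 for gzip */
};

/* Read a big-endian 32-bit value regardless of host byte order. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is too short or the magic is wrong. */
int parse_legacy_header(const uint8_t *buf, size_t len, struct image_info *out)
{
    if (len < IH_HEADER_SIZE || be32(buf) != IH_MAGIC)
        return -1;
    out->size = be32(buf + 12);
    out->load = be32(buf + 16);
    out->ep   = be32(buf + 20);
    out->os   = buf[28];
    out->arch = buf[29];
    out->type = buf[30];
    out->comp = buf[31];
    return 0;
}
```

The payload follows immediately after the 64-byte header, which is why the data start address differs from the address given on the bootm command line.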

The boot_os array of boot functions used above is defined statically:
static boot_os_fn *boot_os[] = {
#ifdef CONFIG_BOOTM_LINUX
	[IH_OS_LINUX] = do_bootm_linux,
#endif
#ifdef CONFIG_BOOTM_NETBSD
	[IH_OS_NETBSD] = do_bootm_netbsd,
#endif
#ifdef CONFIG_LYNXKDI
	[IH_OS_LYNXOS] = do_bootm_lynxkdi,
#endif
#ifdef CONFIG_BOOTM_RTEMS
	[IH_OS_RTEMS] = do_bootm_rtems,
#endif
#if defined(CONFIG_BOOTM_OSE)
	[IH_OS_OSE] = do_bootm_ose,
#endif
#if defined(CONFIG_BOOTM_PLAN9)
	[IH_OS_PLAN9] = do_bootm_plan9,
#endif
#if defined(CONFIG_CMD_ELF)
	[IH_OS_VXWORKS] = do_bootm_vxworks,
	[IH_OS_QNX] = do_bootm_qnxelf,
#endif
#ifdef CONFIG_INTEGRITY
	[IH_OS_INTEGRITY] = do_bootm_integrity,
#endif
};
For a Linux kernel the entry is do_bootm_linux(), which is platform-specific. The ARM implementation is in arch/arm/lib/bootm.c:
int do_bootm_linux(int flag, int argc, char *argv[], bootm_headers_t *images)
{
	/* No need for those on ARM */
	if (flag & BOOTM_STATE_OS_BD_T || flag & BOOTM_STATE_OS_CMDLINE)
		return -1;

	if (flag & BOOTM_STATE_OS_PREP) {
		boot_prep_linux(images);
		return 0;
	}

	if (flag & BOOTM_STATE_OS_GO) {
		boot_jump_linux(images);
		return 0;
	}

	boot_prep_linux(images);
	boot_jump_linux(images);
	return 0;
}

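The implementation is simple. Because do_bootm() passes flag = 0, it just runs boot_prep_linux() followed by boot_jump_linux(), both with the image information structure read earlier. boot_prep_linux() builds the list of Linux boot parameters (struct tag *params) from the bootargs environment variable and the board information held in U-Boot's global data gd. boot_jump_linux() calls the kernel's entry point (images->ep) with the arguments the kernel expects: r0 = 0, r1 = the machine ID, and r2 = the address of the parameter list or device tree. The code is as follows: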

static void boot_jump_linux(bootm_headers_t *images)
{
	unsigned long machid = gd->bd->bi_arch_number;
	char *s;
	void (*kernel_entry)(int zero, int arch, uint params);
	unsigned long r2;

	kernel_entry = (void (*)(int, int, uint))images->ep;

	s = getenv("machid");
	if (s) {
		strict_strtoul(s, 16, &machid);
		printf("Using machid 0x%lx from environment\n", machid);
	}

	debug("## Transferring control to Linux (at address %08lx)" \
		"...\n", (ulong) kernel_entry);
	bootstage_mark(BOOTSTAGE_ID_RUN_OS);
	announce_and_cleanup();

#ifdef CONFIG_OF_LIBFDT
	if (images->ft_len)
		r2 = (unsigned long)images->ft_addr;
	else
#endif
		r2 = gd->bd->bi_boot_params;

	kernel_entry(0, machid, r2);
}
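When no device tree is passed, r2 points at a tagged list (ATAGs) that begins with an ATAG_CORE entry and ends with ATAG_NONE. The sketch below builds a minimal list containing only the kernel command line. build_atags() is a hypothetical function, and its size rounding is chosen for the example rather than copied from U-Boot; only the three tag identifiers are the standard values.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ATAG_NONE    0x00000000u
#define ATAG_CORE    0x54410001u
#define ATAG_CMDLINE 0x54410009u

/*
 * Write CORE + CMDLINE + NONE tags into buf.
 * Each tag starts with two words: its size in 32-bit words (header
 * included) and its identifier. Returns the number of words written,
 * or 0 if buf is too small.
 */
size_t build_atags(uint32_t *buf, size_t cap_words, const char *cmdline)
{
    size_t len = strlen(cmdline) + 1;     /* include the NUL */
    size_t cmd_words = 2 + (len + 3) / 4; /* header + padded string */
    size_t total = 5 + cmd_words + 2;     /* CORE is 5 words, NONE is 2 */
    uint32_t *p = buf;

    if (cap_words < total)
        return 0;
    memset(buf, 0, total * sizeof(*buf));

    p[0] = 5;                             /* header + flags, pagesize, rootdev */
    p[1] = ATAG_CORE;
    p += 5;

    p[0] = (uint32_t)cmd_words;
    p[1] = ATAG_CMDLINE;
    memcpy(&p[2], cmdline, len);
    p += cmd_words;

    p[0] = 0;                             /* ATAG_NONE has size 0 */
    p[1] = ATAG_NONE;
    return total;
}
```

The kernel walks this list by adding each tag's size to the current position until it reaches the terminating ATAG_NONE.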

For a STANDALONE image, bootm_start_standalone() is simpler still. It also calls the image's entry point (images.ep), passing it the bootm command-line arguments minus the command name itself:

static int bootm_start_standalone(ulong iflag, int argc, char * const argv[])
{
	char  *s;
	int   (*appl)(int, char * const []);

	/* Don't start if "autostart" is set to "no" */
	if (((s = getenv("autostart")) != NULL) && (strcmp(s, "no") == 0)) {
		setenv_hex("filesize", images.os.image_len);
		return 0;
	}
	appl = (int (*)(int, char * const []))(ulong)ntohl(images.ep);
	(*appl)(argc-1, &argv[1]);
	return 0;
}

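On ARM, the first Linux kernel source file in link order is the assembly file linux/arch/arm/kernel/head.S, and its entry point is stext (when booting a compressed zImage, the decompressor in arch/arm/boot/compressed/ runs first and then jumps here):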

__HEAD
ENTRY(stext)

 THUMB(	adr	r9, BSYM(1f)	)	@ Kernel is always entered in ARM.
 THUMB(	bx	r9		)	@ If this is a Thumb-2 kernel,
 THUMB(	.thumb			)	@ switch to Thumb now.
 THUMB(1:			)

#ifdef CONFIG_ARM_VIRT_EXT
	bl	__hyp_stub_install
#endif
	@ ensure svc mode and all interrupts masked
	safe_svcmode_maskall r9

	mrc	p15, 0, r9, c0, c0		@ get processor id
	bl	__lookup_processor_type		@ r5=procinfo r9=cpuid
	movs	r10, r5				@ invalid processor (r5=0)?
 THUMB( it	eq )		@ force fixup-able long branch encoding
	beq	__error_p			@ yes, error 'p'

#ifdef CONFIG_ARM_LPAE
	mrc	p15, 0, r3, c0, c1, 4		@ read ID_MMFR0
	and	r3, r3, #0xf			@ extract VMSA support
	cmp	r3, #5				@ long-descriptor translation table format?
 THUMB( it	lo )				@ force fixup-able long branch encoding
	blo	__error_p			@ only classic page table format
#endif

#ifndef CONFIG_XIP_KERNEL
	adr	r3, 2f
	ldmia	r3, {r4, r8}
	sub	r4, r3, r4			@ (PHYS_OFFSET - PAGE_OFFSET)
	add	r8, r8, r4			@ PHYS_OFFSET
#else
	ldr	r8, =PLAT_PHYS_OFFSET		@ always constant in this case
#endif

	/*
	 * r1 = machine no, r2 = atags or dtb,
	 * r8 = phys_offset, r9 = cpuid, r10 = procinfo
	 */
	bl	__vet_atags
#ifdef CONFIG_SMP_ON_UP
	bl	__fixup_smp
#endif
#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
	bl	__fixup_pv_table
#endif
	bl	__create_page_tables

	/*
	 * The following calls CPU specific code in a position independent
	 * manner.  See arch/arm/mm/proc-*.S for details.  r10 = base of
	 * xxx_proc_info structure selected by __lookup_processor_type
	 * above.  On return, the CPU will be ready for the MMU to be
	 * turned on, and r0 will hold the CPU control register value.
	 */
	ldr	r13, =__mmap_switched		@ address to jump to after
						@ mmu has been enabled
	adr	lr, BSYM(1f)			@ return (PIC) address
	mov	r8, r4				@ set TTBR1 to swapper_pg_dir
 ARM(	add	pc, r10, #PROCINFO_INITFUNC	)
 THUMB(	add	r12, r10, #PROCINFO_INITFUNC	)
 THUMB(	mov	pc, r12				)
1:	b	__enable_mmu
ENDPROC(stext)
	.ltorg
#ifndef CONFIG_XIP_KERNEL
2:	.long	.
	.long	PAGE_OFFSET
#endif

In this routine, the instruction
ldr r13, =__mmap_switched
determines what happens once the MMU is enabled: execution jumps to __mmap_switched(), which is implemented in linux/arch/arm/kernel/head-common.S:

/*
 * The following fragment of code is executed with the MMU on in MMU mode,
 * and uses absolute addresses; this is not position independent.
 *
 *  r0  = cp#15 control register
 *  r1  = machine ID
 *  r2  = atags/dtb pointer
 *  r9  = processor ID
 */
	__INIT
__mmap_switched:
	adr	r3, __mmap_switched_data

	ldmia	r3!, {r4, r5, r6, r7}
	cmp	r4, r5				@ Copy data segment if needed
1:	cmpne	r5, r6
	ldrne	fp, [r4], #4
	strne	fp, [r5], #4
	bne	1b

	mov	fp, #0				@ Clear BSS (and zero fp)
1:	cmp	r6, r7
	strcc	fp, [r6],#4
	bcc	1b

 ARM(	ldmia	r3, {r4, r5, r6, r7, sp})
 THUMB(	ldmia	r3, {r4, r5, r6, r7}	)
 THUMB(	ldr	sp, [r3, #16]		)
	str	r9, [r4]			@ Save processor ID
	str	r1, [r5]			@ Save machine type
	str	r2, [r6]			@ Save atags pointer
	cmp	r7, #0
	bicne	r4, r0, #CR_A			@ Clear 'A' bit
	stmneia	r7, {r0, r4}			@ Save control register values
	b	start_kernel
ENDPROC(__mmap_switched)

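As shown, it ends by branching to start_kernel(), the first C function in the Linux kernel, implemented in linux/init/main.c. At its end, start_kernel() calls rest_init(), which creates the first kernel thread, init, and the second, kthreadd. The boot path that has run up to this point then becomes the idle thread: it calls cpu_startup_entry() to enter the idle loop, the scheduler takes over, and Linux is now a multitasking environment.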

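The init thread function, kernel_init(), is also implemented in linux/init/main.c. Its logic is straightforward:
1) It first calls kernel_init_freeable(), which runs the one-time initialization code. The best-known part is the call to do_basic_setup(), which calls driver_init(), do_initcalls(), and others. driver_init() sets up the driver model core, and do_initcalls() runs the registered initcalls, which is where built-in drivers and subsystems are initialized. One of those initcalls is populate_rootfs(), implemented in linux/init/initramfs.c. It unpacks the initial root filesystem image that was loaded into memory earlier into the in-memory rootfs and then frees the memory the image occupied. This is the first usable root filesystem. kernel_init_freeable() then opens /dev/console and checks whether the global variable ramdisk_execute_command has been set. If not, it defaults to "/init", so the init program in the root directory performs user-space initialization. This is often an executable script.
2) It then calls free_initmem() to release the memory used by the one-time initialization code.
3) It then tries ramdisk_execute_command and execute_command in turn. If either starts successfully, kernel_init() returns.
4) If both fail, it tries /sbin/init, /etc/init, /bin/init, and /bin/sh in that order, returning as soon as one succeeds. If all of them fail, it calls panic(), which prints an error message and halts. This is the source of the well-known "No init found" error!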

static int __ref kernel_init(void *unused)
{
	kernel_init_freeable();
	/* need to finish all async __init code before freeing the memory */
	async_synchronize_full();
	free_initmem();
	mark_rodata_ro();
	system_state = SYSTEM_RUNNING;
	numa_default_policy();

	flush_delayed_fput();

	if (ramdisk_execute_command) {
		if (!run_init_process(ramdisk_execute_command))
			return 0;
		pr_err("Failed to execute %s\n", ramdisk_execute_command);
	}

	/*
	 * We try each of these until one succeeds.
	 *
	 * The Bourne shell can be used instead of init if we are
	 * trying to recover a really broken machine.
	 */
	if (execute_command) {
		if (!run_init_process(execute_command))
			return 0;
		pr_err("Failed to execute %s.  Attempting defaults...\n",
			execute_command);
	}
	if (!run_init_process("/sbin/init") ||
	    !run_init_process("/etc/init") ||
	    !run_init_process("/bin/init") ||
	    !run_init_process("/bin/sh"))
		return 0;

	panic("No init found.  Try passing init= option to kernel. "
	      "See Linux Documentation/init.txt for guidance.");
}

From here on it is up to the init program. On some systems init is a compiled binary; on others it is just a shell script. The following is a typical /init script:

#!/bin/sh
#
# preinit
#

export PATH=/usr/sbin:/usr/bin:/sbin:/bin

function switch_olroot()
{
    local upath=/mnt/userfs
    local opath=/mnt/olroot
  
    mount -t proc proc /proc
    mount -o remount,rw /
    mount -t sysfs sysfs /sys
    echo /sbin/mdev > /proc/sys/kernel/hotplug
    mdev -s
    
    mtdid=`grep "NDISK" /proc/mtd | cut -d ":" -f 0 | sed -e 's/mtd//g'`
    if [ -z "$mtdid" ]; then
        echo "preinit: not found mtd:NDISK"
        return 1
    fi
    /usr/sbin/ubiattach /dev/ubi_ctrl -m $mtdid > /tmp/preinit.log
    if [ $? != 0 ]; then
        echo "preinit: ubiattach NDISK($mtdid) failed"
        return 1
    fi
    sleep 1

    if [ ! -e /dev/ubi0_0 ]; then
        echo "preinit: ubifs setup volumes"
        /usr/sbin/ubimkvol /dev/ubi0 -N userfs -s 64MiB >> /tmp/preinit.log
        /usr/sbin/ubimkvol /dev/ubi0 -N data -m >> /tmp/preinit.log
    fi

    test -d $upath || mkdir -p $upath
    test -d $opath || mkdir -p $opath

    mount -t ubifs ubi0:userfs $upath >> /tmp/preinit.log
    if [ $? != 0 ]; then
        echo "preinit: mount ubifs(ubi0:userfs=>$upath) failed"
        return 1
    fi
    
    mount -t overlayfs overlayfs -o rw,noatime,lowerdir=/,upperdir=$upath $opath
    if [ $? != 0 ]; then
        echo "preinit: mount overlay($upath=>$opath) failed"
        return 1
    fi
    
    umount -f /proc
    umount -f /sys
    
    echo "mount overlay ... success"    
    cd $opath
    test -d mnt/userfs || mkdir -p mnt/userfs
    mount -o noatime,--move $upath mnt/userfs
    umount $upath
    
    exec chroot . /sbin/init
}


switch_olroot

echo "starting rootfs..."
exec /sbin/init

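Its main job is to attach the UBI device on NAND flash, mount the UBIFS volume userfs, and stack it on top of the root filesystem / as an overlay, which is how the rootfs is customized. It then executes /sbin/init. On most embedded systems /sbin/init is part of the BusyBox package.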

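One point to note about this /init script: if anything goes wrong inside switch_olroot(), the function returns 1, and the script goes on to print "starting rootfs..." and executes /sbin/init from the original root directory. If switch_olroot() succeeds, it also ends by running /sbin/init via exec, which replaces the current process. By then, however, the filesystems have been remounted and chroot has switched to the new root, so /sbin/init runs in the new root. The /sbin/init executed in these two cases may therefore not be the same program!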

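In BusyBox, /sbin/init is implemented in busybox/init/init.c, with init_main() as its entry point. After setting up some environment variables and initializing the console, it tries to read the entries configured in /etc/inittab; if that file does not exist, it creates a few essential default entries. It then runs the entries in the order SYSINIT -> WAIT -> ONCE -> RESPAWN. RESPAWN entries are handled inside a while loop, so they are restarted every time they exit and the loop never returns.
The following is a typical /etc/inittab: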

# Startup the system
::sysinit:/bin/mount -t proc proc /proc
::sysinit:/bin/mount -o remount,rw /
::sysinit:/bin/mount -t tmpfs -o size=64k,mode=0755 tmpfs /dev
::sysinit:/bin/mkdir -p /dev/pts
::sysinit:/bin/mkdir -p /dev/shm
::sysinit:/bin/mount -a
::sysinit:/bin/hostname -F /etc/hostname
::sysinit:/bin/echo /sbin/mdev >/proc/sys/kernel/hotplug
::sysinit:/sbin/mdev -s

# now run any rc scripts
::sysinit:/etc/init.d/rcS

# Put a getty on the serial port
ttyS0::respawn:/sbin/getty -L ttyS0 115200 vt100 # GENERIC_SERIAL

# Stuff to do for the 3-finger salute
::ctrlaltdel:/sbin/reboot

# Stuff to do before rebooting
::shutdown:/etc/init.d/rcK
::shutdown:/sbin/swapoff -a
::shutdown:/bin/umount -a -r

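As shown, there is only one respawn entry: /sbin/getty, the login program on the console terminal, which is also a BusyBox applet. One important sysinit entry is ::sysinit:/etc/init.d/rcS. It runs the scripts in /etc/rcS.d/ one by one in order. Each script initializes an important system service or resource, and all of them run before the first user logs in. These scripts are symbolic links to the corresponding scripts in /etc/init.d/. Similarly, at shutdown (reboot, shutdown, halt, etc.), the ::shutdown entries in /etc/inittab are executed, which includes cleaning up system services and resources in the order of the scripts in /etc/rcK.d.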
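Each inittab entry has the form id:runlevels:action:process, with the runlevels field left empty in the listing above. A minimal parser for one such line might look like the sketch below. It is not BusyBox's parser: struct inittab_entry and parse_inittab_line() are hypothetical, and trailing "#" comments such as the one on the getty line are not stripped.

```c
#include <stddef.h>
#include <string.h>

/* One parsed entry; the pointers refer into the caller's line buffer. */
struct inittab_entry {
    char *id;        /* controlling tty, may be empty */
    char *runlevels; /* part of the format, empty in the listing above */
    char *action;    /* sysinit, wait, once, respawn, ... */
    char *process;   /* command line to run */
};

/*
 * Split "id:runlevels:action:process" in place at the first three colons.
 * Returns 0 on success, -1 for blank lines, comment lines, or lines
 * with fewer than three colons.
 */
int parse_inittab_line(char *line, struct inittab_entry *entry)
{
    char *field[3];
    char *p = line;

    if (*p == '#' || *p == '\n' || *p == '\0')
        return -1;

    for (int i = 0; i < 3; i++) {
        char *colon = strchr(p, ':');
        if (colon == NULL)
            return -1;
        *colon = '\0';
        field[i] = p;
        p = colon + 1;
    }
    p[strcspn(p, "\n")] = '\0'; /* drop the trailing newline, if any */

    entry->id = field[0];
    entry->runlevels = field[1];
    entry->action = field[2];
    entry->process = p;
    return 0;
}
```

Only the first three colons are treated as separators, so a process command line that itself contains a colon is kept intact.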
