The Creation of Process 0 & 1 on ARM Architecture

Article link: https://blog.csdn.net/TechMax/article/details/50434619

This post describes how processes 0 and 1 are created, based on the TCC8900
Linux BSP; the kernel version is 2.6.28.

Many details are still vague to me; I have marked them with the symbol ‘[MARK]’,
which means they need further study.

This post also quotes ‘Understanding the Linux Kernel, 3rd Edition’ (English
version) extensively. These references are marked with the symbol ‘[QUOTE Chapter-Number]’.

Let’s begin.

PROCESS 0

[QUOTE 3.4.2.2]:

The ancestor of all processes, called process 0, the idle process, or, for
historical reasons, the swapper process, is a kernel thread created from
scratch during the initialization phase of Linux (see Appendix A). This
ancestor process uses the following statically allocated data structures
(data structures for all other processes are dynamically allocated).

Process 0’s task descriptor, init_task, is statically defined in
arch/arm/kernel/init_task.c. The structure is initialized by the macro
INIT_TASK, which is defined in include/linux/init_task.h.
Some complicated or important fields of the structure, such as mm/fs/files/…,
are initialized by the macros INIT_MM/INIT_FS/INIT_FILES/… These macros are
defined in separate files within the kernel.
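To make the idea of a statically initialized descriptor concrete, here is a minimal user-space sketch. It is not the kernel's code: the toy_* names, the reduced field set and the values are my own assumptions for illustration. It only shows the pattern of a macro that takes the variable's own name so that self-referencing fields can be filled in at compile time.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct toy_mm { int mm_users; };

struct toy_task {
    int pid;
    int prio;
    char comm[16];
    struct toy_mm *active_mm;
    struct toy_task *parent;
};

/* Illustrative values only; the real macros initialize far more fields. */
#define TOY_INIT_MM { .mm_users = 2 }

#define TOY_INIT_TASK(tsk) {      \
    .pid       = 0,               \
    .prio      = 120,             \
    .comm      = "swapper",       \
    .active_mm = &toy_init_mm,    \
    .parent    = &tsk,            \
}

/* No allocation happens at run time: both objects live in the data segment. */
static struct toy_mm toy_init_mm = TOY_INIT_MM;
static struct toy_task toy_init_task = TOY_INIT_TASK(toy_init_task);

/* Accessor with a sanity check that the static image is self-consistent. */
const struct toy_task *toy_get_init_task(void)
{
    assert(toy_init_task.parent == &toy_init_task);
    assert(strcmp(toy_init_task.comm, "swapper") == 0);
    return &toy_init_task;
}
```

The point of the sketch is that the descriptor exists before any code runs, so nothing has to "create" it in the usual fork-like sense.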

So far, I have no idea where or how process 0 was created [MARK]. But,

[QUOTE A.4]

The second startup_32( ) function sets up the execution environment for the
first Linux process (process 0). The function performs the following operations:
…
Jumps to the start_kernel( ) function.

On the ARM architecture, start_kernel() is invoked not by startup_32() as on
the x86 platform, but by the procedure __mmap_switched in
arch/arm/kernel/head-common.S:

__mmap_switched:
    adr r3, __switch_data + 4
    ldmia   r3!, {r4, r5, r6, r7}
    cmp r4, r5              @ Copy data segment if needed
    ...
    ldmia   r3, {r4, r5, r6, r7, sp}
    str r9, [r4]            @ Save processor ID
    str r1, [r5]            @ Save machine type
    str r2, [r6]            @ Save atags pointer
    bic r4, r0, #CR_A           @ Clear 'A' bit
    stmia   r7, {r0, r4}            @ Save control register values
    b   start_kernel
    ENDPROC(__mmap_switched)
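For readers who are not fluent in ARM assembly, the visible tail of __mmap_switched can be paraphrased in C roughly as below. This is only a sketch under assumptions: the struct, the function name and the TOY_CR_A constant (I assume CR_A is bit 1 of the control register) are mine, and in the kernel the targets are separate variables reached through the pointer table, not one struct. The elided part of the listing is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value: the alignment-fault enable ('A') bit of the control register. */
#define TOY_CR_A (1u << 1)

/* Hypothetical container for the variables that r4..r7 point at. */
struct toy_boot_vars {
    uint32_t processor_id;
    uint32_t machine_type;
    uint32_t atags_pointer;
    uint32_t cr_values[2];   /* [0] = original value, [1] = value with 'A' cleared */
};

void toy_save_boot_state(struct toy_boot_vars *v, uint32_t r0_cr,
                         uint32_t r1_mach, uint32_t r2_atags, uint32_t r9_cpuid)
{
    assert(v != NULL);
    v->processor_id  = r9_cpuid;        /* str r9, [r4] */
    v->machine_type  = r1_mach;         /* str r1, [r5] */
    v->atags_pointer = r2_atags;        /* str r2, [r6] */
    uint32_t r4 = r0_cr & ~TOY_CR_A;    /* bic r4, r0, #CR_A */
    v->cr_values[0] = r0_cr;            /* stmia r7, {r0, r4}: r0 first ... */
    v->cr_values[1] = r4;               /* ... then r4 at the next word */
}
```

After these stores the assembly simply branches to start_kernel(); no new task is created at this point.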

What we have learned so far is that the execution environment for the first
Linux process (process 0) is ready. Let’s proceed.

[QUOTE A.5]

The start_kernel( ) function completes the initialization of the Linux
kernel. Nearly every kernel component is initialized by this function;
we mention just a few of them:
…
The kernel thread for process 1 is created by invoking the kernel_thread( )
function. In turn, this kernel thread creates the other kernel threads and
executes the /sbin/init program.

To my surprise, the authors do not mention process 0 again here; they only tell
us that start_kernel() eventually creates process 1 via kernel_thread().

[MARK]
Where is the body (main function) of process 0?
I GUESS the purpose of creating process 0 is just to create and initialize
the object init_task, which will be used as a template by its descendants.

Anyway, now we can move on to explore how process 1 is created.

PROCESS 1

start_kernel(void) is quite HUGE since it is responsible for initializing
almost all of the kernel components: memory, scheduler, timers, signals,
slab, irq, system time and so forth. It is essentially a collection of
xxxx_init() function calls. Since this post focuses on process
creation, we will look only at the relevant subroutines below:

//init/main.c
asmlinkage void __init start_kernel(void)
{
    ...
    setup_arch(&command_line); // kernel page tables are created here
    ...
    rest_init();
}

The definition of rest_init():

// init/main.c
static void noinline __init_refok rest_init(void)
    __releases(kernel_lock)
{
    int pid;

    kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);
    numa_default_policy();
    pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);
    kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns);
    unlock_kernel();

    /*
    * The boot idle thread must execute schedule()
    * at least once to get things moving:
    */
    init_idle_bootup_task(current);
    preempt_enable_no_resched();
    schedule();
    preempt_disable();

    /* Call into cpu_idle with preempt disabled */
    cpu_idle();
}

Let’s take a look at cpu_idle() first. The function is defined in
arch/arm/kernel/process.c. Basically, it is a while(1) loop.

[QUOTE 3.4.2.2]

After having created the init process, process 0 executes the cpu_idle( )
function, which essentially consists of repeatedly executing the hlt
assembly language instruction with the interrupts enabled (see Chapter 4).
Process 0 is selected by the scheduler only when there are no other
processes in the TASK_RUNNING state
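The last sentence of the quote can be pictured with a toy selection routine. This is a sketch, not the real scheduler: the toy_* types and the linear scan are my own simplifications, and the only property it demonstrates is that the idle task is a fallback chosen when nothing else is runnable.

```c
#include <assert.h>
#include <stddef.h>

enum toy_state { TOY_RUNNING, TOY_SLEEPING };

/* Hypothetical, minimal process descriptor. */
struct toy_proc {
    int pid;
    enum toy_state state;
};

/*
 * Return the first runnable task from the list, or the idle task
 * (process 0) if no other task is in the TOY_RUNNING state.
 */
const struct toy_proc *toy_pick_next(const struct toy_proc *tasks, size_t n,
                                     const struct toy_proc *idle)
{
    assert(idle != NULL);
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].state == TOY_RUNNING)
            return &tasks[i];
    }
    return idle;
}
```

With every other task asleep, toy_pick_next() hands back the idle descriptor; as soon as one task becomes runnable, the idle task is no longer chosen.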

[MARK]
Given that ‘process 0 executes the cpu_idle( )’, is it true that rest_init(), or even start_kernel() itself, is process 0?
- I GUESS process 0 is just a conceptual process (with no function body) whose
job is to create the init_task object, and that’s all.

Before cpu_idle() is reached, two kernel threads are created within rest_init():
1. static int __init kernel_init() and
2. int kthreadd(void *unused)

kthreadd (kernel/kthread.c) is a ‘common’ kernel thread built around a for() loop.

kernel_init(), on the other hand, looks quite weird to me: unlike any other
thread or process I’ve seen, it is neither a while(1) nor a for(;;) loop; it
returns! How come?

//init/main.c:  
static int __init kernel_init(void * unused)
{
    lock_kernel();

    printk("[1]\n");

    init_pid_ns.child_reaper = current;
    cad_pid = task_pid(current);
    smp_prepare_cpus(setup_max_cpus);

    do_pre_smp_initcalls();
    start_boot_trace();
    smp_init();
    sched_init_smp();
    cpuset_init_smp();
    do_basic_setup();

    printk("[2]\n");

    /*
     * check if there is an early userspace init.  If yes, let it do all
     * the work
     */
    if (!ramdisk_execute_command)
        ramdisk_execute_command = "/init";

    if (sys_access((const char __user *) ramdisk_execute_command, 0) != 0) {
        ramdisk_execute_command = NULL;
        prepare_namespace();
    }

    /*
     * Ok, we have completed the initial bootup, and
     * we're essentially up and running. Get rid of the
     * initmem segments and start the user-mode stuff..
     */
    stop_boot_trace();
    init_post();
    return 0;
}
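One possible explanation for the ‘return 0’, as far as I can tell from reading init_post() in the 2.6.28 sources: init_post() ends by trying to exec the init program, and a successful exec never returns to its caller, so the return statement is not reached in the normal case. Treat that reading as my assumption. The user-space sketch below (all toy_* names are hypothetical, and /bin/sh stands in for the init program) shows the same shape: a function that is declared to return int but only returns if the exec fails.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Looks like an ordinary function, but on success control never comes back. */
static int toy_init_body(int code)
{
    char cmd[32];

    assert(code >= 0 && code <= 125);
    snprintf(cmd, sizeof cmd, "exit %d", code);
    execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
    return 127;   /* reached only if execl() failed */
}

/* Run toy_init_body() in a child and report the exit code of whatever it became. */
int toy_spawn_init(int code)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(toy_init_body(code));
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The exit code seen by the parent comes from the exec'ed shell, not from the ‘return 127’ line, which is the user-space analogue of kernel_init() never reaching its own return.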

According to the log (printk), after it was created via
kernel_thread(), it was scheduled immediately (printing ‘[1]’). But the
scheduler soon removed it from the RUNNING task list (before ‘[2]’ was printed),
returned to rest_init(), and eventually went on to cpu_idle().

[MARK]
I GUESS the reason is that some operation between [1] and [2] BLOCKED.

The scheduler ran kernel_init() again quite a while later, after a lot of
device drivers had been loaded. It could NOT access the file “/init” on the
file system, so it set the variable ramdisk_execute_command back to NULL and
jumped into prepare_namespace().
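The check-then-fall-back step described above can be mimicked in user space with access(2). This is a minimal sketch with hypothetical names; it only models the decision (use the early init if it exists, otherwise return NULL so the caller takes the prepare_namespace() route) and nothing about mounting.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Decide which early-userspace init to use.
 * requested: path given by the user, or NULL to use the fallback name.
 * fallback:  default name, "/init" in the kernel code above.
 * Returns the path if it exists, or NULL if the caller should go on
 * to prepare the real root file system instead.
 */
const char *toy_pick_early_init(const char *requested, const char *fallback)
{
    assert(fallback != NULL);
    const char *cmd = requested ? requested : fallback;

    /* F_OK (value 0) only tests for existence, like the mode 0 in the kernel call. */
    if (access(cmd, F_OK) != 0)
        return NULL;
    return cmd;
}
```

A NULL result here corresponds to the branch in which ramdisk_execute_command is cleared and prepare_namespace() runs.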

[MARK]
I GUESS the rootfs has not been loaded yet.

/*
 * Prepare the namespace - decide what/where to mount, load ramdisks, etc.
 */
// init/do_mounts.c
void __init prepare_namespace(void)
{
    int is_floppy;

    if (root_delay) {
        printk(KERN_INFO "Waiting %dsec before mounting root device...\n",
               root_delay);
        ssleep(root_delay);
    }

    /* wait for the known devices to complete their probing */
    while (driver_probe_done() != 0)
        msleep(100);

    md_run_setup();

    if (saved_root_name[0]) {
        root_device_name = saved_root_name;
        if (!strncmp(root_device_name, "mtd", 3) ||
            !strncmp(root_device_name, "ubi", 3)) {
            mount_block_root(root_device_name, root_mountflags);
            goto out;
        }
        ROOT_DEV = name_to_dev_t(root_device_name);
        if (strncmp(root_device_name, "/dev/", 5) == 0)
            root_device_name += 5;
    }

    if (initrd_load())
        goto out;

    /* wait for any asynchronous scanning to complete */
    if ((ROOT_DEV == 0) && root_wait) {
        printk(KERN_INFO "Waiting for root device %s...\n",
            saved_root_name);
        while (driver_probe_done() != 0 ||
                    (ROOT_DEV = name_to_dev_t(saved_root_name)) == 0)
            msleep(100);
    }

    is_floppy = MAJOR(ROOT_DEV) == FLOPPY_MAJOR;

    if (is_floppy && rd_doload && rd_load_disk(0))
        ROOT_DEV = Root_RAM0;

    mount_root();
out:
    sys_mount(".", "/", NULL, MS_MOVE, NULL);
    sys_chroot(".");
}
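The ‘while (driver_probe_done() != 0) msleep(100);’ loops in prepare_namespace() are plain polling with a sleep between checks. Below is a small user-space sketch of that pattern. The names are hypothetical, nanosleep() stands in for msleep(), and the max_polls limit is my own addition (the kernel loop has no such limit).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <time.h>

/* User-space stand-in for msleep(): sleep for ms milliseconds. */
static void toy_msleep(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/*
 * Poll done(ctx) until it returns 0, sleeping interval_ms between polls.
 * Returns the number of unsuccessful polls, or -1 if max_polls ran out.
 */
int toy_wait_until_done(int (*done)(void *), void *ctx,
                        long interval_ms, int max_polls)
{
    assert(done != NULL && max_polls > 0);
    for (int polls = 0; polls < max_polls; polls++) {
        if (done(ctx) == 0)
            return polls;
        toy_msleep(interval_ms);
    }
    return -1;
}

/* Example predicate: pretends that probing finishes after *ctx more polls. */
int toy_probe_countdown(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return 1;   /* still busy */
    }
    return 0;       /* done */
}
```

While the caller sits in such a loop it is asleep most of the time, so other tasks get to run in between.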

The code is unfamiliar to me, so I just followed the logs. It turns out that it calls
initrd_load() and NEVER returns.

// init/do_mounts_initrd.c
int __init initrd_load(void)
{
    if (mount_initrd) {
        create_dev("/dev/ram", Root_RAM0);
        /*
         * Load the initrd data into /dev/ram0. Execute it as initrd
         * unless /dev/ram0 is supposed to be our actual root device,
         * in that case the ram disk is just set up here, and gets
         * mounted in the normal path.
         */
        if (rd_load_image("/initrd.image") && ROOT_DEV != Root_RAM0) {
            sys_unlink("/initrd.image");
            handle_initrd();
            return 1;
        }
    }
    sys_unlink("/initrd.image");
    return 0;
}

initrd_load() loads the rootfs from NAND flash; the log shows:
RAMDISK: ext2 filesystem found at block 0
RAMDISK: Loading 32768KiB [1 disk] into ram disk… done.

Then it invokes handle_initrd().

// init/do_mounts_initrd.c
static void __init handle_initrd(void)
{
    int error;
    int pid;

    real_root_dev = new_encode_dev(ROOT_DEV);
    create_dev("/dev/root.old", Root_RAM0);
    /* mount initrd on rootfs' /root */
    mount_block_root("/dev/root.old", root_mountflags & ~MS_RDONLY);
    sys_mkdir("/old", 0700);
    root_fd = sys_open("/", 0, 0);
    old_fd = sys_open("/old", 0, 0);
    /* move initrd over / and chdir/chroot in initrd root */
    sys_chdir("/root");
    sys_mount(".", "/", NULL, MS_MOVE, NULL);
    sys_chroot(".");

    /*
     * In case that a resume from disk is carried out by linuxrc or one of
     * its children, we need to tell the freezer not to wait for us.
     */
    current->flags |= PF_FREEZER_SKIP;

    pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
    if (pid > 0)
        while (pid != sys_wait4(-1, NULL, 0, NULL))
            yield();

    ...
}

The point of this routine is that it creates another kernel thread,
do_linuxrc. (The log shows that its pid is 489, and the ps command shows
that its name is ‘init’.)
Like kernel_init(), do_linuxrc() is put on the run list and scheduled immediately.
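The spawn-then-wait part of handle_initrd() has a close user-space cousin. The sketch below uses fork() and waitpid() in place of kernel_thread() and sys_wait4(); it is an analogy only (a forked process does not share the kernel's address space the way a kernel thread does), and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Start fn(arg) in a child, then keep reaping children until that
 * particular child has exited. Returns its exit code, or -1 on error.
 */
int toy_run_and_reap(int (*fn)(void *), void *arg)
{
    int status = 0;

    assert(fn != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(fn(arg) & 0xff);

    /* Same shape as: while (pid != sys_wait4(-1, NULL, 0, NULL)) yield(); */
    for (;;) {
        pid_t got = waitpid(-1, &status, 0);
        if (got == pid)
            break;
        if (got < 0 && errno != EINTR)
            return -1;   /* extra exit that the kernel loop does not have */
        sched_yield();
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Example child body: exit with the integer that arg points to. */
int toy_child_body(void *arg)
{
    return *(int *)arg;
}
```

The parent does nothing useful until the child it started has finished, which matches what the log suggests about handle_initrd() staying put while linuxrc runs.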

// init/do_mounts_initrd.c
static int __init do_linuxrc(void * shell)
{
    static char *argv[] = { "linuxrc", NULL, };
    extern char * envp_init[];

    sys_close(old_fd);sys_close(root_fd);
    sys_close(0);sys_close(1);sys_close(2);
    sys_setsid();
    (void) sys_open("/dev/console",O_RDWR,0);
    (void) sys_dup(0);
    (void) sys_dup(0);
    return kernel_execve(shell, argv, envp_init);
}
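The close/open/dup/dup sequence in do_linuxrc() relies on the rule that a newly opened descriptor gets the lowest free number, so the console ends up as descriptors 0, 1 and 2. Here is a user-space sketch that checks this. The names are hypothetical, the work is done in a forked child so the caller's own stdio is left alone, and any openable path (for example /dev/null) can stand in for /dev/console.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * In a child process: drop stdin/stdout/stderr, start a new session,
 * open console_path and duplicate it twice.
 * Returns 0 if the three descriptors came out as 0, 1 and 2.
 */
int toy_rewire_stdio(const char *console_path)
{
    int status = 0;

    assert(console_path != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        close(0);
        close(1);
        close(2);
        setsid();
        int in  = open(console_path, O_RDWR);  /* lowest free fd: 0 */
        int out = dup(0);                      /* next free fd: 1 */
        int err = dup(0);                      /* next free fd: 2 */
        _exit((in == 0 && out == 1 && err == 2) ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Whatever program is exec'ed afterwards inherits these three descriptors, which is why the program started by do_linuxrc() can print to the console right away.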

The input parameter, shell, is “/linuxrc”, which is a symbolic link
to busybox: /linuxrc -> bin/busybox.
The kernel finally executes it via kernel_execve(shell,…).
Right after that, the log shows:
init started: BusyBox v1.4.2 ……
Starting pid 494, console /dev/console: ‘/etc/init.d/rcS’
TCC8900_NAND output
…
Starting pid 524, console /dev/console: ‘/usr/etc/rc.local’
Starting pid 527, console /dev/ttySAC0: ‘bin/sh’
dm9k output
…
The shell prompt appears!