Linux Kernel :: Boot Process

Article link: https://blog.csdn.net/lionxingyuanchen/article/details/79617159

bootloader
Today the Linux kernel is normally booted by a bootloader such as GRUB2. These bootloaders all follow the Linux boot protocol.
header.S
header.S contains the legacy boot sector and the setup code. The bootloader jumps to offset 0x200 (just past the legacy part), which is labeled _start.

_start:
        # Explicitly enter this as bytes, or the assembler
        # tries to generate a 3-byte jump here, which causes
        # everything else to push off to the wrong offset.
        .byte   0xeb        # short (2-byte) jump
        .byte   start_of_setup-1f
1:
        # rest of header.S

The next instruction executed is therefore the one at start_of_setup.

Note: 1f refers to the next local label 1, searching forward. A short jump consists of an opcode (0xeb) and a signed 8-bit offset relative to the address of the next instruction (pc).
In our case, the offset is start_of_setup - 1f, and the value of pc is the address of label 1.
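
The displacement arithmetic in the note can be checked with a small Python model. This is only a sketch: the offset 0x268 for start_of_setup is a made-up value for illustration, and the real one depends on the kernel version.

```python
def short_jump_target(jump_addr: int, disp_byte: int) -> int:
    """Return the target of a 2-byte short jump (opcode 0xeb + rel8)."""
    # Sign-extend the 8-bit displacement.
    disp = disp_byte - 0x100 if disp_byte & 0x80 else disp_byte
    next_instruction = jump_addr + 2  # this is the address of local label 1
    return (next_instruction + disp) & 0xFFFF


START = 0x200            # offset of _start, from the article
START_OF_SETUP = 0x268   # hypothetical offset, for illustration only
LABEL_1 = START + 2

# What ".byte start_of_setup-1f" would emit for these offsets.
disp = (START_OF_SETUP - LABEL_1) & 0xFF
print(hex(short_jump_target(START, disp)))  # 0x268
```

Because the displacement is a single signed byte, this trick only works while start_of_setup is within 127 bytes of label 1.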

Now we get to start_of_setup.

start_of_setup:
    movw    %ds, %ax
    movw    %ax, %es
    cld

This snippet copies %ds into %es (through %ax) and clears the direction flag with cld, so that later string instructions move upward through memory.

    movw    %ss, %dx
    cmpw    %ax, %dx    # %ds == %ss?
    movw    %sp, %dx
    je  2f      # -> assume %sp is reasonably set

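After the assignment, the CPU compares ss with ds (whose value is still in ax) and copies sp into dx. movw does not change the flags, so je still acts on the result of cmpw. If ss equals ds, the CPU jumps to 2: and the existing sp is assumed to be reasonable. If not, ss is invalid (some old versions of LILO left it that way) and a new stack has to be set up. Now, let’s dig into the hardest part of header.S.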

    movw    $_end, %dx
    testb   $CAN_USE_HEAP, loadflags
    jz  1f
    movw    heap_end_ptr, %dx
1:  addw    $STACK_SIZE, %dx
    jnc 2f
    xorw    %dx, %dx    # Prevent wraparound

When ss is invalid, the kernel puts the value of _end (the address of the end of the setup code, which is declared in setup.ld) into dx and uses the testb instruction to check the loadflags header field and see whether the heap can be used. The loadflags bits are defined as:

#define LOADED_HIGH (1<<0)
#define QUIET_FLAG (1<<5)
#define KEEP_SEGMENTS (1<<6)
#define CAN_USE_HEAP (1<<7)

and, as we can read in the boot protocol:

Field name: loadflags
This field is a bitmask.
Bit 7 (write): CAN_USE_HEAP
Set this bit to 1 to indicate that the value entered in the
heap_end_ptr is valid. If this field is clear, some setup code
functionality will be disabled.

So, if the CAN_USE_HEAP bit of loadflags is set, testb gives a non-zero result, jz is not taken, and heap_end_ptr is loaded into dx before execution falls through to 1:. If the bit is clear, we jump straight to 1: with dx still equal to _end.

1:  addw    $STACK_SIZE, %dx
    jnc 2f
    xorw    %dx, %dx    # Prevent wraparound

There are two scenarios:

1. dx = heap_end_ptr + $STACK_SIZE (CAN_USE_HEAP is set)
2. dx = _end + $STACK_SIZE (CAN_USE_HEAP is not set)

In header.S, heap_end_ptr defaults to _end+STACK_SIZE-512, and _end is the address of the end of the setup code.

Note: the default value of heap_end_ptr seems to have changed several times, and a bootloader may overwrite it, 
so check the header.S of your kernel version when you read this article.
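
The two scenarios, plus the wraparound guard, can be modeled in a few lines of Python. This is a sketch under assumptions: STACK_SIZE is taken to be 1024 (check boot.h for your kernel version), and the addresses passed in are arbitrary example values.

```python
CAN_USE_HEAP = 1 << 7
STACK_SIZE = 1024  # assumed value; verify against your kernel's boot.h


def stack_end(end: int, loadflags: int, heap_end_ptr: int,
              stack_size: int = STACK_SIZE) -> int:
    """Model the value left in %dx when execution reaches label 2."""
    dx = end                          # movw $_end, %dx
    if loadflags & CAN_USE_HEAP:      # testb $CAN_USE_HEAP, loadflags / jz 1f
        dx = heap_end_ptr             # movw heap_end_ptr, %dx
    total = dx + stack_size           # addw $STACK_SIZE, %dx
    if total > 0xFFFF:                # carry set, so jnc 2f is not taken
        return 0                      # xorw %dx, %dx
    return total


print(hex(stack_end(0x4000, 0, 0x9000)))             # 0x4400
print(hex(stack_end(0x4000, CAN_USE_HEAP, 0x9000)))  # 0x9400
print(hex(stack_end(0x4000, CAN_USE_HEAP, 0xFE00)))  # 0x0
```

A result of 0 is not used as-is: the alignment code that follows turns it into 0xfffc.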

Finally, we have an end address for the stack, but it still has to be aligned.

2:  # Now %dx should point to the end of our stack space
    andw    $~3, %dx  # dword align (might as well...)
    jnz 3f
    movw    $0xfffc, %dx   # Make sure we're not zero
3:  movw    %ax, %ss
    movzwl  %dx, %esp   # Clear upper half of %esp
    sti         # Now we should have a working stack

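$~3 is the bitwise complement of 3: as a 16-bit value it is 0xfffc (1111 1111 1111 1100b), not 0. ANDing dx with it clears the lowest 2 bits of dx, so dx becomes a multiple of 4, which is what 4-byte (dword) alignment means. If the result is zero, jnz is not taken and dx is set to 0xfffc instead, so the stack pointer is never zero. At 3:, ax still holds the value of ds, so ss is set to ds, and movzwl copies dx into esp with the upper 16 bits cleared. sti means "set interrupt flag": it re-enables maskable interrupts now that the stack is usable.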

Tip: If you wonder why an address that is divisible by 4 is called 4-byte aligned,
let's do some calculation. If address A is N-byte aligned, so is A + N. 
Since memory is addressed in bytes, there are exactly N bytes between A and A + N. 
That is your answer. As for why alignment is needed at all, it is a little bit complicated, 
so just google it, xD.
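
The alignment step is easy to get wrong on paper, so here is a minimal Python model of the three instructions at label 2. The input values are arbitrary examples.

```python
def align_stack_pointer(dx: int) -> int:
    """Model 'andw $~3, %dx' followed by the zero check."""
    dx &= ~3 & 0xFFFF   # ~3 is 0xfffc in 16 bits: clear the lowest two bits
    if dx == 0:         # jnz 3f is not taken
        dx = 0xFFFC     # movw $0xfffc, %dx
    return dx           # movzwl then zero-extends this into %esp


print(hex(align_stack_pointer(0x9403)))  # 0x9400
print(hex(align_stack_pointer(0x0000)))  # 0xfffc
```

Note that the mask rounds down, never up, so the aligned stack top stays inside the space that was reserved.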

Following the Linux boot protocol, the bootloader (GRUB2 in this snippet) initializes the segment registers with these values:

segment = grub_linux_real_target >> 4;
state.gs = state.fs = state.es = state.ds = state.ss = segment;
state.cs = segment + 0x20;

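When cs is (segment + 0x20) and ip is 0, the first instruction executed is located 0x200 bytes from the start of the real-mode code, because a segment value is multiplied by 16 to form an address (0x20 * 16 = 0x200). In other words, we skip the legacy boot sector and execute the short jump instruction at _start. Since cs now differs from the other segment registers, the setup code normalizes it next.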
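
The 0x200 figure follows from real-mode address arithmetic, which can be sketched in Python. The load address 0x90000 is only an example value, not something GRUB is guaranteed to use.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode address translation: segment * 16 + offset (20-bit)."""
    return ((segment << 4) + offset) & 0xFFFFF


real_target = 0x90000            # hypothetical load address of the real-mode code
segment = real_target >> 4       # value given to ds, es, fs, gs and ss
cs = segment + 0x20              # value given to cs

entry = linear_address(cs, 0)    # where execution starts (cs:0)
print(hex(entry))                # 0x90200
print(hex(entry - real_target))  # 0x200
```

The same byte can be reached as segment:0x200, which is why cs can later be normalized without moving any code.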

# We will have entered with %cs = %ds+0x20, normalize %cs so
# it is on par with the other segments.
    pushw   %ds
    pushw   $6f
    lretw

lretw takes no operands. It pops the return offset (ip) from the stack first and then the segment (cs). Since ds was pushed first and $6f second, after lretw cs = ds and ip = the offset of label 6.
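
The push/pop order is the part that is easy to mix up, so here is a tiny Python model of the far return. The segment value and the offset of label 6 are hypothetical numbers chosen for illustration.

```python
def lretw(stack: list) -> tuple:
    """Model a 16-bit far return: pop ip first, then cs."""
    ip = stack.pop()
    cs = stack.pop()
    return cs, ip


DS = 0x9000        # hypothetical data segment set by the bootloader
LABEL_6 = 0x02A0   # hypothetical offset of label 6 within the setup code

stack = []
stack.append(DS)       # pushw %ds
stack.append(LABEL_6)  # pushw $6f
cs, ip = lretw(stack)
print(hex(cs), hex(ip))  # 0x9000 0x2a0

# Same physical byte as before, when cs was DS + 0x20 and ip was 0x200 lower.
old_view = ((DS + 0x20) << 4) + (LABEL_6 - 0x200)
new_view = (cs << 4) + ip
print(old_view == new_view)  # True
```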

6:
# Check signature at end of setup
    cmpl    $0x5a5aaa55, setup_sig
    jne setup_bad

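The signature is a magic number that the linker script (setup.ld) places at the end of the setup image under the symbol setup_sig. If the value found there is not 0x5a5aaa55, the setup code was presumably loaded incompletely or corrupted, so execution jumps to setup_bad.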

# Zero the bss
    movw    $__bss_start, %di
    movw    $_end+3, %cx #for 4 bytes alignment
    xorl    %eax, %eax
    subw    %di, %cx
    shrw    $2, %cx
    rep; stosl

“subw %di, %cx” puts (cx - di) into cx, which is the size of the bss section plus 3. “shrw $2, %cx” then divides cx by 4, giving the number of dwords, rounded up. “rep; stosl” repeatedly stores the value of eax (zero) at es:di, adding 4 to di each time (because cld cleared the direction flag earlier) and decrementing cx until it reaches zero. The net effect is that zeros are written to every dword from __bss_start to _end.
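
The rounding in the dword count can be modeled in Python on a plain bytearray. This is a sketch: the buffer and the bss boundaries are invented example values, not real kernel addresses.

```python
def zero_bss(memory: bytearray, bss_start: int, end: int) -> int:
    """Model the bss-clearing loop; returns the number of dwords written."""
    cx = (end + 3) & 0xFFFF        # movw $_end+3, %cx
    cx = (cx - bss_start) & 0xFFFF # subw %di, %cx
    cx >>= 2                       # shrw $2, %cx
    di = bss_start
    for _ in range(cx):            # rep; stosl with eax = 0, direction flag clear
        memory[di:di + 4] = b"\x00\x00\x00\x00"
        di += 4
    return cx


memory = bytearray(b"\xff" * 64)
count = zero_bss(memory, 16, 30)   # a 14-byte bss, not a multiple of 4
print(count)                       # 4
print(memory[16:32] == bytes(16))  # True
```

With a 14-byte bss the loop writes 4 dwords (16 bytes), so up to 3 bytes past _end can be cleared as well. That is the price of rounding up instead of handling the tail separately.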

# Jump to C code (should not return)
    calll   main

Finally, we reach the C code! The rest of header.S is:

# Setup corrupt somehow...
setup_bad:
    movl    $setup_corrupt, %eax
    calll   puts
    # Fall through...

    .globl  die
    .type   die, @function
die:
    hlt
    jmp die

    .size   die, .-die

    .section ".initdata", "a"
setup_corrupt:
    .byte   7
    .string "No setup signature found...\n"