Let us write a Kernel

啵啵j

Hello World,

Let us write a simple kernel that can be loaded with the GRUB bootloader on an x86 system. This kernel will display a message on the screen and then hang.

How does an x86 machine boot

Before we think about writing a kernel, let’s see how the machine boots up and transfers control to the kernel:

The x86 CPU is hardwired to begin execution at the physical address [0xFFFFFFF0], which lies 16 bytes below the top of the 32-bit address space. This address just contains a jump instruction to the address in memory where the BIOS has copied itself.

Thus, the BIOS code starts its execution. The BIOS first searches for a bootable device in the configured boot device order. It checks for a certain magic number (the bytes 0x55 and 0xAA at offsets 510 and 511 of the device's first sector) to determine whether the device is bootable.

Once the BIOS has found a bootable device, it copies the contents of the device's first sector into RAM starting at physical address [0x7c00], then jumps to that address and executes the code it just loaded. This code is called the bootloader.

The bootloader then loads the kernel at the physical address [0x100000] (the 1 MiB mark). The address [0x100000] is the conventional start address for all big kernels on x86 machines.
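The magic number check in the second step can be sketched in C. This is only an illustration of the test a BIOS performs on the first sector, not BIOS code; the function name is_bootable_sector is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Hypothetical helper: returns 1 if the 512-byte sector ends with the
 * boot signature 0x55 0xAA (bytes 510 and 511), otherwise 0. */
int is_bootable_sector(const uint8_t *sector)
{
        assert(sector != NULL);
        return sector[SECTOR_SIZE - 2] == 0x55 &&
               sector[SECTOR_SIZE - 1] == 0xAA;
}
```

You could feed it the first 512 bytes read from a disk image to see whether a BIOS would consider that image bootable.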

What all do we need?

* An x86 computer (of course)
* Linux
* NASM assembler
* gcc
* ld (GNU Linker)
* grub

Source Code

Source code is available at my GitHub repository - mkernel

The entry point using assembly

We like to write everything in C, but we cannot avoid a little bit of assembly. We will write a small file in x86 assembly-language that serves as the starting point for our kernel. All our assembly file will do is invoke an external function which we will write in C, and then halt the program flow.

How do we make sure that this assembly code will serve as the starting point of the kernel?

We will use a linker script that links the object files to produce the final kernel executable (explained in more detail later). In this linker script, we will explicitly specify that we want our binary to be loaded at the address [0x100000]. This address, as I said earlier, is where the kernel is expected to be. Thus, the bootloader will take care of jumping to the kernel's entry point.

Here’s the assembly code:

;;kernel.asm
bits 32
section .text
        ;multiboot spec
        align 4
        dd 0x1BADB002              	;magic
        dd 0x00                    	;flags
        dd - (0x1BADB002 + 0x00)   	;checksum. m+f+c should be zero

global start
extern kmain 				;this is defined in the c file

start:
	cli 				;block interrupts
	mov esp, stack_space		;set stack pointer
	call kmain
	hlt 				;halt the CPU

section .bss
resb 8192				;8KB for stack
stack_space:

The first line, bits 32, is not an x86 assembly instruction. It's a directive to the NASM assembler that specifies it should generate code to run on a processor operating in 32 bit mode. It is not strictly required in our example, but it is included here because it's good practice to be explicit.

The second line begins the text section (aka code section). This is where we put all our code.

global is another NASM directive; it makes a symbol from our source code visible to other object files. By doing so, the linker knows where the symbol start is, which happens to be our entry point.

kmain is our function that will be defined in our kernel.c file. extern declares that the function is defined elsewhere.

Then, we have the start function. It first points esp at stack_space, the label placed right after the 8 KB reserved in the .bss section; since the x86 stack grows downward, that label marks the top of our stack. It then calls the kmain function and halts the CPU using the hlt instruction. Interrupts can wake the CPU from an hlt instruction, so we disable interrupts beforehand using the cli instruction. cli is short for clear-interrupts.

The kernel in C

In kernel.asm, we made a call to the function kmain(). So our C code will start executing at kmain():

/*
*  kernel.c
*/

void kmain(void)
{
        const char *str = "my first kernel";
        /* video memory begins at address 0xb8000 */
        char *vidptr = (char*)0xb8000;
        unsigned int i = 0;
        unsigned int j = 0;
        unsigned int screensize;

        /* this loop clears the screen
         * there are 25 lines each of 80 columns; each element takes 2 bytes */

        screensize = 80 * 25 * 2;
        while(j < screensize) {
                /* blank character */
                vidptr[j] = ' ';
                /* attribute-byte */
                vidptr[j+1] = 0x07;
                j = j + 2;
        }

        j = 0;

        /* this loop writes the string to video memory */
        while(str[j] != '\0') {
                /* the character's ascii */
                vidptr[i] = str[j];
                /* attribute-byte: give character black bg and light grey fg */
                vidptr[i+1] = 0x07;
                ++j;
                i = i + 2;

        }

        return;

}

All our kernel will do is clear the screen and write to it the string “my first kernel”.

First we make a pointer vidptr that points to the address [0xb8000]. This address is the start of the text-mode video memory, and it remains accessible in protected mode. The screen's text memory is simply a chunk of memory in our address space. The memory mapped input/output for the screen starts at [0xb8000] and supports 25 lines, each containing 80 ASCII characters.

Each character element in this text memory is represented by 16 bits (2 bytes), rather than 8 bits (1 byte) which we are used to. The first byte should have the representation of the character as in ASCII. The second byte is the attribute-byte. This describes the formatting of the character including attributes such as color.
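Since every cell takes 2 bytes and every line has 80 cells, the byte offset of any screen position follows from simple arithmetic. Here is a small sketch; the helper name vga_cell_offset and the constants are my own, not part of the kernel above.

```c
#include <assert.h>
#include <stddef.h>

#define VGA_COLS 80u
#define VGA_ROWS 25u
#define BYTES_PER_CELL 2u

/* Hypothetical helper: byte offset, relative to 0xb8000, of the cell at
 * (row, col). The character lives at this offset and its attribute-byte
 * at offset + 1. */
size_t vga_cell_offset(unsigned int row, unsigned int col)
{
        assert(row < VGA_ROWS && col < VGA_COLS);
        return ((size_t)row * VGA_COLS + col) * BYTES_PER_CELL;
}
```

For example, the first cell of the second line sits at offset 160, and the last cell of the screen at offset 3998, which is why the whole text buffer is 80 * 25 * 2 = 4000 bytes.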

To print the character s in green on a black background, we store the character s in the first byte of the cell and the value [0x02] in the second byte. The high nibble of the attribute-byte is the background color and the low nibble is the foreground color: 0 represents the black background and 2 represents the green foreground.

Have a look at the table below for the different colors:

0 - Black, 1 - Blue, 2 - Green, 3 - Cyan, 4 - Red, 
5 - Magenta, 6 - Brown, 7 - Light Grey, 8 - Dark Grey, 9 - Light Blue, 
10/a - Light Green, 11/b - Light Cyan, 12/c - Light Red, 
13/d - Light Magenta, 14/e - Light Brown, 15/f – White.

In our kernel, we will use light grey characters on a black background. So our attribute-byte must have the value [0x07].
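Composing an attribute-byte from the color table can be sketched like this. The enum and the function vga_attribute are names I made up for illustration. Note also that, depending on how the display is configured, the top bit of the background nibble may be treated as a blink flag rather than as a color bit, so treat background values above 7 as an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* A few entries from the color table (names are illustrative) */
enum vga_color {
        VGA_BLACK = 0,
        VGA_BLUE = 1,
        VGA_GREEN = 2,
        VGA_LIGHT_GREY = 7,
        VGA_WHITE = 15
};

/* Hypothetical helper: background goes in the high nibble,
 * foreground in the low nibble. */
uint8_t vga_attribute(uint8_t fg, uint8_t bg)
{
        assert(fg <= 0x0F && bg <= 0x0F);
        return (uint8_t)((bg << 4) | fg);
}
```

With this, green on black gives 0x02 and light grey on black gives 0x07, matching the values used in this article.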

In the first while loop, the program writes the blank character with the [0x07] attribute to every one of the 80 columns on each of the 25 lines. This clears the screen.

In the second while loop, characters of the null terminated string “my first kernel” are written to the chunk of video memory with each character holding an attribute-byte of [0x07].

This should display the string on the screen.
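If you want to try the two loops without booting anything, the same logic can be run against an ordinary buffer on your Linux machine. The sketch below is a rearrangement for testing, not the kernel code itself: the function names are invented, the buffer stands in for the memory at 0xb8000, and I added a bounds check that the original loop does not have.

```c
#include <assert.h>
#include <stddef.h>

#define SCREEN_BYTES (80u * 25u * 2u)
#define ATTR_LIGHT_GREY_ON_BLACK 0x07

/* Fill every cell with a blank character and the 0x07 attribute */
void clear_cells(char *buf)
{
        assert(buf != NULL);
        for (size_t j = 0; j < SCREEN_BYTES; j += 2) {
                buf[j] = ' ';
                buf[j + 1] = ATTR_LIGHT_GREY_ON_BLACK;
        }
}

/* Write str starting at the first cell and return the number of
 * characters written. Stops at the end of the buffer (an addition
 * that is not in the original loop). */
size_t write_cells(char *buf, const char *str)
{
        assert(buf != NULL && str != NULL);
        size_t i = 0;
        for (size_t j = 0; str[j] != '\0' && i < SCREEN_BYTES; ++j, i += 2) {
                buf[i] = str[j];
                buf[i + 1] = ATTR_LIGHT_GREY_ON_BLACK;
        }
        return i / 2;
}
```

Calling clear_cells and then write_cells on a 4000-byte array leaves the characters at the even offsets and 0x07 at the odd offsets, which is the layout the screen expects.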

The linking part

We will assemble kernel.asm with NASM to an object file, and then use GCC to compile kernel.c to another object file. Now, our job is to get these objects linked into an executable bootable kernel.

For that, we use an explicit linker script, which can be passed as an argument to ld (our linker).

/*
 *  link.ld
*/

OUTPUT_FORMAT(elf32-i386)
ENTRY(start)
SECTIONS
{
        . = 0x100000;
        .text : { *(.text) }
        .data : { *(.data) }
        .bss  : { *(.bss) }
}

First, we set the output format of our output executable to be 32 bit Executable and Linkable Format (ELF). ELF is the standard binary file format for Unix-like systems on x86 architecture.

ENTRY takes one argument. It specifies the symbol name that should be the entry point of our executable.

SECTIONS is the most important part for us. Here, we define the layout of our executable. We can specify how the different sections are to be merged and at what location each of them is to be placed.

Within the braces that follow the SECTIONS statement, the period character (.) represents the location counter.
The location counter is always initialized to [0x0] at the beginning of the SECTIONS block. It can be modified by assigning a new value to it.

Remember, earlier I told you that the kernel's code should start at the address [0x100000]. So, we set the location counter to [0x100000].

Have a look at the next line: .text : { *(.text) }

The asterisk (*) is a wildcard character that matches any file name. The expression *(.text) thus means all .text input sections from all input files.

So, the linker merges all text sections of the object files to the executable’s text section, at the address stored in the location counter. Thus, the code section of our executable begins at [0x100000].

After the linker places the text output section, the value of the location counter will become
0x100000 + the size of the text output section.

Similarly, the data and bss sections are merged and placed at whatever value the location counter holds at that point.
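To see how the location counter moves, here is a toy model in C. It only mimics the bookkeeping described above; it ignores the alignment padding and any extra sections a real linker may add, and the struct and function names are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Start address of each output section, plus the final counter value */
struct layout {
        uint32_t text_start;
        uint32_t data_start;
        uint32_t bss_start;
        uint32_t end;
};

/* Toy model of the location counter: each section is placed at the
 * current value of dot, and dot then advances by the section size. */
void place_sections(uint32_t base, uint32_t text_size,
                    uint32_t data_size, uint32_t bss_size,
                    struct layout *out)
{
        assert(out != NULL);
        uint32_t dot = base;        /* . = 0x100000; in link.ld */

        out->text_start = dot;
        dot += text_size;
        out->data_start = dot;
        dot += data_size;
        out->bss_start = dot;
        dot += bss_size;
        out->end = dot;
}
```

With a base of 0x100000 and a text section of 0x200 bytes, the data section would start at 0x100200 in this simplified model.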

Grub and Multiboot

Now, we have all our files ready to build the kernel. But, since we want to boot our kernel with the GRUB bootloader, there is one step left.

There is a standard for loading various x86 kernels using a boot loader, called the Multiboot specification.

GRUB will only load our kernel if it complies with the Multiboot spec.

According to the spec, the kernel must contain a header (known as Multiboot header) within its first 8 KiloBytes.

Further, this Multiboot header must contain 3 fields that are 4 byte aligned, namely:

* a magic field: contains the magic number [0x1BADB002], to identify the header.
* a flags field: we will not care about this field. We will simply set it to zero.
* a checksum field: when added to the fields 'magic' and 'flags', it must give zero.

;;kernel.asm
bits 32       ;nasm directive - 32 bit 
section .text
        ;multiboot spec
        align 4
        dd 0x1BADB002               ;magic
        dd 0x00                     ;flags
        dd - (0x1BADB002 + 0x00)    ;checksum. m+f+c should be zero 
global start
extern kmain                        ;this is defined in the c file

start:
        cli                         ;block interrupts
        mov esp, stack_space        ;set stack pointer
        call kmain
        hlt                         ;halt the CPU

section .bss
resb 8192                           ;8KB for stack
stack_space:

The dd defines a double word of size 4 bytes.
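The checksum rule and the "within the first 8 KiloBytes" rule can be checked with a few lines of C. This is a rough sketch of the kind of search a Multiboot bootloader does, not GRUB's actual code; the function names are invented, and it reads the fields in the byte order of the machine it runs on, so it assumes a little-endian host such as x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MULTIBOOT_MAGIC  0x1BADB002u
#define MULTIBOOT_SEARCH 8192u   /* header must lie within the first 8 KB */

/* Value that makes magic + flags + checksum wrap around to zero */
uint32_t multiboot_checksum(uint32_t flags)
{
        return (uint32_t)(0u - (MULTIBOOT_MAGIC + flags));
}

/* Hypothetical helper: returns the byte offset of the first valid
 * 3-field header at a 4-byte aligned position, or -1 if none is found. */
long find_multiboot_header(const uint8_t *image, size_t len)
{
        assert(image != NULL);
        size_t limit = len < MULTIBOOT_SEARCH ? len : MULTIBOOT_SEARCH;

        for (size_t off = 0; off + 12 <= limit; off += 4) {
                uint32_t magic, flags, checksum;

                memcpy(&magic, image + off, 4);
                memcpy(&flags, image + off + 4, 4);
                memcpy(&checksum, image + off + 8, 4);
                if (magic == MULTIBOOT_MAGIC &&
                    (uint32_t)(magic + flags + checksum) == 0)
                        return (long)off;
        }
        return -1;
}
```

For our header with flags set to zero, the checksum works out to 0xE4524FFE, which is the value NASM emits for dd - (0x1BADB002 + 0x00).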

Building the kernel

We will now create object files from kernel.asm and kernel.c and then link them using our linker script.

nasm -f elf32 kernel.asm -o kasm.o

will run the assembler to create the object file kasm.o in ELF-32 bit format.

gcc -m32 -c kernel.c -o kc.o

The ‘-c ’ option makes sure that after compiling, linking doesn’t implicitly happen.

ld -m elf_i386 -T link.ld -o kernel kasm.o kc.o

will run the linker with our linker script and generate the executable named kernel.

Configure your grub and run your kernel

GRUB requires your kernel's file name to follow the pattern kernel-<version>. So, rename the kernel. I renamed my kernel executable to kernel-701.

Now place it in the /boot directory. You will require superuser privileges to do so.

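In your GRUB configuration file (the entry below uses legacy GRUB syntax, whose configuration file is usually menu.lst or grub.conf rather than grub.cfg) you should add an entry, something like: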

title myKernel
	root (hd0,0)
	kernel /boot/kernel-701 ro

Don’t forget to remove the directive hiddenmenu if it exists.

Reboot your computer, and you’ll get a list selection with the name of your kernel listed.

Select it and you should see the string "my first kernel" at the top of an otherwise blank screen.

That’s your kernel!!

PS:
* It’s always advisable to get yourself a virtual machine for all kinds of kernel hacking.

* To run this on grub2 which is the default bootloader for newer distros, your config should look like this (Thanks to Rubén Laguna from comments for the config):

menuentry 'kernel 7001' {
	set root='hd0,msdos1'
	multiboot /boot/kernel-7001 ro
}

* Also, if you want to run the kernel on the qemu emulator instead of booting with GRUB, you can do so by:

qemu-system-i386 -kernel kernel

Also, see the next article in the Kernel series:
Kernel 201 - Let’s write a Kernel with keyboard and screen support
