Interpretations of Linux Kernel Codes

Richard1905

Last modified: 2023-11-07 23:53:23

First published: 2023-10-29 17:56:51

Article link: https://blog.csdn.net/qq_38624569/article/details/134104651

Linux 0.00

boot.s

jmpi go, #BOOTSEG

jmpi offset, segment is the as86 mnemonic for an inter-segment (far) jump in x86 real mode. When it executes, BOOTSEG is loaded into CS and the offset of the label go is loaded into IP (real mode uses the 16-bit IP rather than EIP), so the processor continues at BOOTSEG:go, that is, at CS:IP.
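As a rough illustration of how the processor turns BOOTSEG:go into a memory address, the following Python sketch models real-mode address formation. The value 0x07C0 for BOOTSEG and the offset 5 for go are assumptions made for the example (the BIOS conventionally loads the boot sector at 0x7C00); they are not taken from the listing above.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Linear address formed from a real-mode segment:offset pair."""
    # The segment value is shifted left by 4 bits (multiplied by 16)
    # and the offset is added to it.
    return (segment << 4) + offset


BOOTSEG = 0x07C0    # assumed value of BOOTSEG
GO_OFFSET = 0x0005  # hypothetical offset of the label "go"

# After the far jump, CS = BOOTSEG and IP = GO_OFFSET.
print(hex(real_mode_address(BOOTSEG, GO_OFFSET)))  # 0x7c05
```

The same rule explains why the buffer 0x1000:0x0000 used later corresponds to linear address 0x10000.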

int 0x13

int 0x13 is a software interrupt that calls the BIOS disk service. Before it executes, the registers are set up as follows:
ah: the function number. Here it is 0x02, the read-sectors function, which is used to read the head.s code from disk into memory.
al: the number of sectors to read, here 17 (decimal).
dh: the head number to read from.
dl: the BIOS drive number of the disk to read (0x00 for the first floppy drive, 0x80 for the first hard disk).
ch: the low 8 bits of the track (cylinder) number, which is 10 bits wide in total.
cl: bits 7 and 6 hold the high 2 bits of the track number; bits 5 to 0 hold the starting sector number (counting from 1).
es:bx: the memory buffer that receives the data, here 0x1000:0x0000.
CF: on return, the carry flag is set to 1 if an error occurred; CF = 0 means the read succeeded.
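The way the 10-bit track number and the 6-bit sector number share ch and cl is easy to get wrong, so here is a small Python sketch of the packing. The function name pack_chs and the sample values are hypothetical and only illustrate the register layout described above.

```python
def pack_chs(cylinder: int, head: int, sector: int, drive: int) -> dict:
    """Pack disk coordinates into the CX and DX values expected by int 0x13."""
    if not 0 <= cylinder < 1024:
        raise ValueError("cylinder must fit in 10 bits")
    if not 1 <= sector <= 63:
        raise ValueError("sector numbers start at 1 and fit in 6 bits")
    ch = cylinder & 0xFF                                   # low 8 bits of the track
    cl = (((cylinder >> 8) & 0x3) << 6) | (sector & 0x3F)  # high 2 bits + sector
    dh = head & 0xFF
    dl = drive & 0xFF
    return {"cx": (ch << 8) | cl, "dx": (dh << 8) | dl}


# Illustrative values: cylinder 0, head 0, sector 2 on the first floppy drive.
regs = pack_chs(cylinder=0, head=0, sector=2, drive=0x00)
print(hex(regs["cx"]), hex(regs["dx"]))  # 0x2 0x0
```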

jnc ok_load

If CF = 0, jump to ok_load; otherwise fall through to the next instruction.

rep movw

movw: as86's name for the string instruction movsw. It copies one word from DS:SI to ES:DI and then adjusts SI and DI by the word size, 2 bytes: both are incremented when the direction flag DF is 0 and decremented when DF is 1. With DF = 0, SI = SI + 2 and DI = DI + 2 after each copy.
rep: repeats the movw instruction cx times, decrementing cx after each repetition until it reaches 0.
Before the rep movw instruction, the registers are set up as follows:
ds: the source segment, which holds the words to be copied.
si: the starting offset within the source segment. Because si counts bytes, it advances by 2 after each word is copied.
es: the destination segment, which receives the words.
di: the starting offset within the destination segment. Because di counts bytes, it also advances by 2 after each word is copied.
cx: the number of words to copy, that is, the repeat count for rep.
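The register roles listed above can be modelled in a few lines of Python. This is only a sketch of the semantics: the function name rep_movsw, the flat bytearray standing in for memory, and the tiny sample sizes are all assumptions for illustration.

```python
def rep_movsw(mem: bytearray, ds: int, si: int, es: int, di: int,
              cx: int, df: int = 0) -> tuple:
    """Copy cx words from DS:SI to ES:DI, returning the final (si, di, cx)."""
    step = -2 if df else 2          # DF selects the direction of travel
    while cx != 0:
        src = (ds << 4) + si        # real-mode linear address of the source
        dst = (es << 4) + di        # real-mode linear address of the destination
        mem[dst:dst + 2] = mem[src:src + 2]
        si = (si + step) & 0xFFFF   # 16-bit registers wrap around
        di = (di + step) & 0xFFFF
        cx -= 1
    return si, di, cx


memory = bytearray(0x200)
memory[0x100:0x108] = bytes(range(1, 9))   # four words at linear 0x100
si, di, cx = rep_movsw(memory, ds=0x10, si=0, es=0, di=0, cx=4)
print(si, di, cx, memory[:8].hex())        # 8 8 0 0102030405060708
```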

lidt idt_48

idt_48: .word 0
.word 0, 0

lidt loads the IDT register IDTR. Its operand is 6 bytes long: the first 2 bytes give the table limit (the table size in bytes minus 1), and the next 4 bytes give the linear base address of the table. Here both fields are 0; head.s installs the real IDT later with its own lidt.

lgdt gdt_48

gdt_48: .word 0x7ff
.word 0x7c00+gdt, 0
somewhere in boot.s is the following:
gdt: .word 0, 0, 0, 0 ! the first item of the segment descriptor table, and is not used
… ! other items

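lgdt loads the GDT register GDTR. Its operand has the same 6-byte layout: a 2-byte limit (the table size in bytes minus 1) followed by a 4-byte linear base address. Here the limit 0x7ff describes a 2048-byte table, room for 256 descriptors of 8 bytes each, and the base is 0x7c00 + gdt because the boot sector sits at linear address 0x7c00 and gdt is only an offset within it.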
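To make the 6-byte operand layout concrete, the following Python sketch packs the gdt_48 words and decodes them again. GDT_OFFSET is a hypothetical offset for the label gdt, chosen only for the example; the helper name decode_table_operand is also an assumption.

```python
import struct

GDT_OFFSET = 0x0052  # hypothetical offset of the label gdt within boot.s


def decode_table_operand(raw: bytes) -> dict:
    """Split a 6-byte lgdt/lidt operand into its limit and base fields."""
    limit, base = struct.unpack("<HI", raw)   # 16-bit limit, 32-bit base
    size = limit + 1                          # the limit is size minus 1
    return {"limit": limit, "base": base,
            "size_bytes": size, "entries": size // 8}


# gdt_48: .word 0x7ff / .word 0x7c00+gdt, 0
gdt_48 = struct.pack("<HHH", 0x7FF, 0x7C00 + GDT_OFFSET, 0)
info = decode_table_operand(gdt_48)
print(info["entries"], hex(info["base"]))     # 256 0x7c52
```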

.org 510

.org 510 is an assembler directive that moves the location counter to offset 510, so the bytes that follow are placed there. In boot.s the boot signature 0xAA55 goes at this position, that is,
.org 510
.word 0xAA55
which fills the last two bytes of the 512-byte sector and makes the BIOS accept boot.s as a valid boot sector.
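The effect of these two directives can be reproduced in Python as a quick check of the byte layout. The helper make_boot_sector and the two-byte stand-in for the boot code are hypothetical.

```python
def make_boot_sector(code: bytes) -> bytes:
    """Pad code to 512 bytes and place the 0xAA55 signature at offset 510."""
    if len(code) > 510:
        raise ValueError("boot code must leave room for the 2-byte signature")
    sector = bytearray(512)                            # zero padding
    sector[:len(code)] = code
    sector[510:512] = (0xAA55).to_bytes(2, "little")   # stored as 55 AA
    return bytes(sector)


sector = make_boot_sector(b"\xeb\xfe")   # placeholder code: jmp to itself
print(len(sector), sector[510:].hex())   # 512 55aa
```

Because x86 is little-endian, the word 0xAA55 appears on disk as the byte 0x55 followed by 0xAA.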

head.s

head.s is written in AT&T assembler syntax.
At the beginning of the code, several symbolic constants define the segment selectors of the screen (video memory) segment and of the task state segment (TSS) and local descriptor table (LDT) segments of task0 and task1.

#  head.s contains the 32-bit startup code.
#  Two L3 task multitasking. The code of tasks are in kernel area, 
#  just like the Linux. The kernel code is located at 0x10000. 
SCRN_SEL	= 0x18
TSS0_SEL	= 0x20
LDT0_SEL	= 0x28
TSS1_SEL	= 0X30
LDT1_SEL	= 0x38

.code32
.global startup_32
.text
startup_32:
	movl $0x10,%eax
	mov %ax,%ds
	lss init_stack,%esp

...
end_gdt:
	.fill 128,4,0
init_stack:                          # Will be used as user stack for task0.
	.long init_stack
	.word 0x10

.code32

.code32 is an assembler directive that tells the as assembler to generate 32-bit code for the instructions that follow.

.global startup_32

This assembler directive makes the symbol startup_32 visible to the linker ld. If startup_32 is defined in this object file, its value can be used from other object files; if it is not defined here, its value is taken from the symbol of the same name in another object file. This is somewhat like the keyword extern in C.

.text

In an object file, the text section starts at address 0, followed by the data section and then the bss section. A section (also called a segment or part) represents an address range whose contents the operating system treats in the same way. The .text directive says that the code that follows belongs to the text section, which is normally read-only.

.fill 128, 4, 0

.fill repeat, size, value emits repeat copies of value, each size bytes wide. Here it emits 128 zero-filled 4-byte units, reserving 512 bytes of stack space for this program.
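A short Python model of .fill shows how the two uses in head.s translate into byte counts. It is a sketch that assumes a little-endian target and follows the GAS rule that only the low 4 bytes of value are used; the function name fill is hypothetical.

```python
def fill(repeat: int, size: int, value: int) -> bytes:
    """Bytes emitted by `.fill repeat, size, value` on a little-endian target."""
    size = min(size, 8)   # GAS treats sizes above 8 as 8
    # Each unit is taken from an 8-byte number whose low 4 bytes hold value.
    unit = (value & 0xFFFFFFFF).to_bytes(8, "little")[:size]
    return unit * repeat


stack_area = fill(128, 4, 0)   # .fill 128,4,0
idt_area = fill(256, 8, 0)     # .fill 256,8,0
print(len(stack_area), len(idt_area))   # 512 2048
```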

lss init_stack, %esp

– “LSS (load far pointer using SS) load a far pointer from memory into a segment register and a general-purpose general register. The segment selector part of the far pointer is loaded into the selected segment register and the offset is loaded into the selected general-purpose register.”
In this example, the first 32 bits at init_stack hold the offset, which is the address of the label init_stack itself. That address is the bottom (base) of the stack, immediately above the 512 bytes reserved by .fill, and the stack grows downward from it, toward lower addresses. The following 16 bits, .word 0x10, are loaded into SS as the stack segment selector.
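The 6-byte far pointer that lss reads can be sketched in Python as follows. The memory image is hypothetical: it assumes, purely for illustration, that the 512-byte stack area starts at address 0, so the label init_stack falls at address 512.

```python
import struct

STACK_BYTES = 128 * 4      # space reserved by .fill 128,4,0
INIT_STACK = STACK_BYTES   # hypothetical address of the label init_stack

# Stack area followed by ".long init_stack" and ".word 0x10".
memory = bytes(STACK_BYTES) + struct.pack("<IH", INIT_STACK, 0x10)


def lss(mem: bytes, addr: int) -> tuple:
    """Read a far pointer: a 32-bit offset followed by a 16-bit selector."""
    offset, selector = struct.unpack_from("<IH", mem, addr)
    return offset, selector


esp, ss = lss(memory, INIT_STACK)
print(hex(esp), hex(ss))   # 0x200 0x10
print(ss >> 3)             # 2, the GDT index encoded in selector 0x10
```

The first push after this lss would store its value just below init_stack, inside the reserved area.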

# setup base fields of descriptors.
	call setup_idt
	call setup_gdt
	movl $0x10,%eax		# reload all the segment registers
	mov %ax,%ds			# after changing gdt. 
	mov %ax,%es
	mov %ax,%fs
	mov %ax,%gs
	lss init_stack,%esp

...
setup_gdt:
	lgdt lgdt_opcode
	ret

setup_idt:
	lea ignore_int,%edx
	movl $0x00080000,%eax
	movw %dx,%ax		/* selector = 0x0008 = cs */
	movw $0x8E00,%dx	/* interrupt gate - dpl=0, present */
	lea idt,%edi
	mov $256,%ecx
rp_sidt:
	movl %eax,(%edi)
	movl %edx,4(%edi)
	addl $8,%edi
	dec %ecx
	jne rp_sidt
	lidt lidt_opcode
	ret

write_char:
	push %gs
	pushl %ebx
#	pushl %eax
	mov $SCRN_SEL, %ebx
	mov %bx, %gs
	movl scr_loc, %ebx
	shl $1, %ebx
	movb %al, %gs:(%ebx)
	shr $1, %ebx
	incl %ebx
	cmpl $2000, %ebx
	jb 1f
	movl $0, %ebx
1:	movl %ebx, scr_loc	
#	popl %eax
	popl %ebx
	pop %gs
	ret

.align 2
ignore_int:
	push %ds
	pushl %eax
	movl $0x10, %eax
	mov %ax, %ds
	movl $67, %eax            /* print 'C' */
	call write_char
	popl %eax
	pop %ds
	iret

...
.align 2
lidt_opcode:
	.word 256*8-1		# idt contains 256 entries
	.long idt		# This will be rewrite by code. 
lgdt_opcode:
	.word (end_gdt-gdt)-1	# so does gdt 
	.long gdt		# This will be rewrite by code.

.align 8
idt:	.fill 256,8,0		# idt is uninitialized

gdt:	.quad 0x0000000000000000	/* NULL descriptor */
	.quad 0x00c09a00000007ff	/* 8Mb 0x08, base = 0x00000 */
	.quad 0x00c09200000007ff	/* 8Mb 0x10 */
	.quad 0x00c0920b80000002	/* screen 0x18 - for display */

	.word 0x0068, tss0, 0xe900, 0x0	# TSS0 descr 0x20
	.word 0x0040, ldt0, 0xe200, 0x0	# LDT0 descr 0x28
	.word 0x0068, tss1, 0xe900, 0x0	# TSS1 descr 0x30
	.word 0x0040, ldt1, 0xe200, 0x0	# LDT1 descr 0x38
end_gdt:
	.fill 128,4,0
init_stack:                          # Will be used as user stack for task0.
	.long init_stack
	.word 0x10

lea ignore_int, %edx

"The LEA (load effective address) instruction computes the effective address in memory (offset within a segment) of a source operand and places it in a general-purpose register. This instruction can interpret any of the processor’s addressing modes and can perform any indexing or scaling that may be needed. "
Here lea loads the address of the procedure ignore_int (its offset within the code segment whose selector is 0x0008) into the general-purpose register edx. This procedure serves as the default handler for all 256 interrupt gates.

movl $0x00080000, %eax

movl moves a long operand (4 bytes, 32 bits). Here it loads the constant 0x00080000 into eax; only the high 16 bits, 0x0008, matter, because they become the segment selector field of the interrupt gate descriptor.

movw %dx, %ax

movw moves a word operand (2 bytes, 16 bits). Here it copies the low 16 bits of edx over the low 16 bits of eax, so eax now holds the lower-address 32 bits of the 64-bit interrupt gate descriptor: the selector in the high half and bits 15:0 of the handler offset in the low half.

movw $0x8E00, %dx

This instruction moves the constant 0x8E00 into the low 16 bits of edx, so edx now holds the higher-address 32 bits of the interrupt gate descriptor; its high 16 bits still contain bits 31:16 of the handler offset. The 16 bits 0x8E00 are the attribute field: P = 1 (present), DPL = 0, and type 0xE, which marks the descriptor as a 32-bit interrupt gate.
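The register shuffling in the last four instructions amounts to splitting the handler offset across two 32-bit halves. The Python sketch below mirrors that; the handler offset 0x00012345 is a made-up value, and the function name interrupt_gate is an assumption.

```python
def interrupt_gate(offset: int, selector: int = 0x0008,
                   attr: int = 0x8E00) -> tuple:
    """Return (eax, edx): the low and high 32 bits of an interrupt gate."""
    eax = (selector << 16) | (offset & 0xFFFF)   # selector : offset 15..0
    edx = (offset & 0xFFFF0000) | attr           # offset 31..16 : attributes
    return eax, edx


def decode_attr(attr: int) -> dict:
    """Pull the P, DPL and type fields out of the attribute word."""
    return {"present": (attr >> 15) & 1,
            "dpl": (attr >> 13) & 3,
            "type": (attr >> 8) & 0xF}


eax, edx = interrupt_gate(0x00012345)   # hypothetical offset of ignore_int
print(hex(eax), hex(edx))               # 0x82345 0x18e00
print(decode_attr(0x8E00))              # {'present': 1, 'dpl': 0, 'type': 14}
```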

lea idt, %edi

lea idt, %edi
.align 8
idt: .fill 256, 8, 0
The lea instruction loads the address of the label idt, the location of the Interrupt Descriptor Table, into edi. The IDT starts on an 8-byte boundary and consists of 256 entries of 8 bytes each, initialized to 0 by the directive .fill repeat, size, value.
------------------------------------------------------- about .align 8 --------------------------------------------------------------
.align abs-expr, abs-expr, abs-expr
Pad the location counter (in the current subsection) to a particular storage boundary.
The first expression (which must be absolute) is the alignment required, as described below.
The second expression (also absolute) gives the fill value to be stored in the padding bytes. It (and the comma) may be omitted. If it is omitted, the padding bytes are normally zero. However, on some systems, if the section is marked as containing code and the fill value is omitted, the space is filled with no-op instructions.
The third expression is also absolute, and is also optional. If it is present, it is the maximum number of bytes that should be skipped by this alignment directive. If doing the alignment would require skipping more bytes than the specified maximum, then the alignment is not done at all. You can omit the fill value (the second argument) entirely by simply using two commas after the required alignment; this can be useful if you want the alignment to be filled with no-op instructions when appropriate.
The way the required alignment is specified varies from system to system.
For the a29k, hppa, m68k, m88k, w65, sparc, and Hitachi SH, and i386 using ELF format, the first expression is the alignment request in bytes. For example ‘.align 8’ advances the location counter until it is a multiple of 8. If the location counter is already a multiple of 8, no change is needed.
For other systems, including the i386 using a.out format, and the arm and strongarm, it is the number of low-order zero bits the location counter must have after advancement. For example ‘.align 3’ advances the location counter until it is a multiple of 8. If the location counter is already a multiple of 8, no change is needed.
This inconsistency is due to the different behaviors of the various native assemblers for these systems which GAS must emulate. GAS also provides .balign and .p2align directives, described later, which have a consistent behavior across all architectures (but are specific to GAS).
------------------------------------------------------- about .align 8 --------------------------------------------------------------
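The two interpretations of the .align argument described in the quoted manual text can be compared with a small Python sketch. It models only the rounding of the location counter, not the fill value or the maximum skip count, and the function names are hypothetical.

```python
def align_bytes(location: int, boundary: int) -> int:
    """`.align 8` style: boundary is a byte count (a power of two)."""
    return (location + boundary - 1) & ~(boundary - 1)


def align_bits(location: int, zero_bits: int) -> int:
    """`.align 3` style: the argument is the number of low-order zero bits."""
    return align_bytes(location, 1 << zero_bits)


print(align_bytes(13, 8), align_bits(13, 3))   # 16 16
print(align_bytes(16, 8))                      # 16, already aligned
```

Both calls describe the same 8-byte boundary, which is why the manual recommends .balign or .p2align when the intent must be unambiguous.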

mov $256, %ecx

Sets the counter for the loop that begins at rp_sidt: 256 iterations, one per IDT entry.

rp_sidt:

rp_sidt:
movl %eax, (%edi)
movl %edx, 4(%edi)
addl $8, %edi
dec %ecx
jne rp_sidt

– movl %eax, (%edi)
Stores the 32 bits in eax to the memory address held in edi; (%edi) denotes register indirect addressing. eax holds the lower-address 32 bits of the interrupt gate descriptor. The next instruction,
– movl %edx, 4(%edi)
stores the other half of the descriptor, the 32 bits in edx, at address edi + 4, that is, 4 bytes past the address held in edi. 4(%edi) denotes base-plus-displacement addressing: edi supplies the base address, the leading 4 is the displacement, and the effective address is their sum.
– addl $8, %edi
addl adds long (32-bit) operands.
At this point one 8-byte IDT entry has been filled in, so 8 is added to edi to point at the next entry.
– dec %ecx
Decrements the loop counter in ecx by 1.
– jne rp_sidt
jne (jump if not equal) tests the zero flag ZF, which is set to 1 when the previous instruction produces a zero result. While dec has not yet brought ecx to 0, ZF stays 0 and jne, which jumps when ZF == 0, branches back to rp_sidt. When ecx reaches 0, ZF becomes 1 and execution falls through to the lidt instruction.
jne and jnz are two names for the same instruction. Both can be read as "jump if the last instruction did not produce zero", without thinking about ZF explicitly.
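The whole rp_sidt loop can be traced in Python to confirm that it fills every entry with the same gate. This is a sketch: the eax and edx values are hypothetical (they assume a handler at offset 0x00012345), and the table is modelled as a bytearray that starts at edi = 0.

```python
import struct


def fill_idt(eax: int, edx: int, entries: int = 256) -> bytes:
    """Mimic the rp_sidt loop: write the same 8-byte gate into every entry."""
    idt = bytearray(entries * 8)
    edi = 0                                      # lea idt,%edi (table-relative)
    ecx = entries                                # mov $256,%ecx
    while True:
        struct.pack_into("<I", idt, edi, eax)      # movl %eax,(%edi)
        struct.pack_into("<I", idt, edi + 4, edx)  # movl %edx,4(%edi)
        edi += 8                                   # addl $8,%edi
        ecx -= 1                                   # dec %ecx
        if ecx == 0:                               # ZF = 1: jne falls through
            break
    return bytes(idt)


idt = fill_idt(0x00082345, 0x00018E00)
print(len(idt), idt[:8].hex())   # 2048 45230800008e0100
```

The table length of 2048 bytes agrees with the limit 256*8-1 stored in lidt_opcode.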

lidt lidt_opcode

lidt_opcode:
.word 256*8-1
.long idt
…
idt: .fill 256,8,0

“Loads the values in the source operand into the global descriptor table register (GDTR) or the interrupt descriptor table register (IDTR). The source operand specifies a 6-byte memory location that contains the base address (a linear address) and the limit (size of table in bytes) of the global descriptor table (GDT) or the interrupt descriptor table (IDT). If operand-size attribute is 32 bits, a 16-bit limit (lower 2 bytes of the 6-byte data operand) and a 32-bit base address (upper 4 bytes of the data operand) are loaded into the register. If the operand-size attribute is 16 bits, a 16-bit limit (lower 2 bytes) and a 24-bit base address (third, fourth, and fifth byte) are loaded. Here, the high-order byte of the operand is not used and the high-order byte of the base address in the GDTR or IDTR is filled with zeros.
The LGDT and LIDT instructions are used only in operating-system software; they are not used in application programs. They are the only instructions that directly load a linear address (that is, not a segment-relative address) and a limit in protected mode. They are commonly executed in real-address mode to allow processor initialization prior to switching to protected mode.”

ret

Returns control to the instruction that follows the call.

The above is the call setup_idt part, and the following is the call setup_gdt part.

lgdt lgdt_opcode

setup_gdt:
	lgdt    lgdt_opcode
	ret
...
lgdt_opcode:
 	.word (end_gdt - gdt) - 1	 # so does gdt 
 	.long  gdt		 # This will be rewrite by code.
...
gdt:	.quad 0x0000000000000000	/* NULL descriptor */
	.quad 0x00c09a00000007ff	/* 8Mb 0x08, base = 0x00000 */
	.quad 0x00c09200000007ff	/* 8Mb 0x10 */
	.quad 0x00c0920b80000002	/* screen 0x18 - for display */

	.word 0x0068, tss0, 0xe900, 0x0	# TSS0 descr 0x20
	.word 0x0040, ldt0, 0xe200, 0x0	# LDT0 descr 0x28
	.word 0x0068, tss1, 0xe900, 0x0	# TSS1 descr 0x30
	.word 0x0040, ldt1, 0xe200, 0x0	# LDT1 descr 0x38
end_gdt:

...
tss0:	.long 0 			/* back link */
	.long krn_stk0, 0x10		/* esp0, ss0 */
	.long 0, 0, 0, 0, 0		/* esp1, ss1, esp2, ss2, cr3 */
	.long 0, 0, 0, 0, 0		/* eip, eflags, eax, ecx, edx */
	.long 0, 0, 0, 0, 0		/* ebx esp, ebp, esi, edi */
	.long 0, 0, 0, 0, 0, 0 		/* es, cs, ss, ds, fs, gs */
	.long LDT0_SEL, 0x8000000	/* ldt, trace bitmap */

The lgdt instruction itself is explained above in the lidt lidt_opcode section; this section looks at the contents of the GDT (global descriptor table). Take the TSS0 descriptor at offset 0x20, which is also the selector of this TSS (Task-State Segment). The following picture shows the structure of a TSS descriptor:
[Figure: TSS descriptor format]
The first word of the TSS0 descriptor, 0x0068, is stored in the Segment Limit 15:00 field and sets the segment limit to 0x68, that is, 104.
The second item, tss0, is a 32-bit offset within head.s. Because all of head.s ends up in memory starting at address 0, the high 16 bits of this offset must be zero, so only the low 16 bits are stored, in the Base Address 15:00 field.
The third word, 0xe900, is 1110 1001 0000 0000 in binary. The leading 1 is P = 1, meaning the segment is present in memory. The next 2 bits give DPL = 3, the least privileged level. The following 5 bits, 0 1001, mark a system descriptor of type 9, an available 32-bit TSS. The low 8 bits of the word, which sit at the lower address, are Base Address 23:16 and are zero for the same reason as above, as is the Base Address 31:24 field in the fourth word.
The fourth word, 0x0, includes G = 0, which means the segment limit is counted in bytes.
Now look at the contents labeled by tss0, which form a Task-State Segment. There are 26 long fields in total, occupying 26 x 4 = 104 bytes of memory, which matches the value 104 in the limit field of the TSS0 descriptor.
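The field-by-field reading above can be checked with a small descriptor decoder in Python. It takes the four 16-bit words in the order they appear in the .word directive. The tss0 offset 0x0BF8 is a hypothetical value, and decode_descriptor is an illustrative helper rather than anything from head.s.

```python
def decode_descriptor(w0: int, w1: int, w2: int, w3: int) -> dict:
    """Decode a segment descriptor given as four little-endian 16-bit words."""
    return {
        "limit": w0 | ((w3 & 0x000F) << 16),               # limit 19..0
        "base": w1 | ((w2 & 0x00FF) << 16) | ((w3 & 0xFF00) << 16),
        "type": (w2 >> 8) & 0xF,
        "s": (w2 >> 12) & 1,            # 0 = system descriptor
        "dpl": (w2 >> 13) & 3,
        "present": (w2 >> 15) & 1,
        "granularity": (w3 >> 7) & 1,   # 0 = bytes, 1 = 4 KiB units
    }


TSS0_OFFSET = 0x0BF8   # hypothetical offset of the label tss0
tss0 = decode_descriptor(0x0068, TSS0_OFFSET, 0xE900, 0x0000)
print(tss0["limit"], tss0["type"], tss0["dpl"], tss0["granularity"])  # 104 9 3 0

# The code segment .quad 0x00c09a00000007ff, split into words low to high.
code = decode_descriptor(0x07FF, 0x0000, 0x9A00, 0x00C0)
size = (code["limit"] + 1) * (4096 if code["granularity"] else 1)
print(size // (1024 * 1024))   # 8, matching the "8Mb" comment in the listing
```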

References

Intel® 64 and IA-32 Architectures Software Developer’s Manual
