Building a 64-bit Operating System: Multi-core

raindayinrain

Last modified 2023-09-29 08:44:16

Category: Building a 64-bit Operating System. Tags: multi-core

First published 2023-03-15 21:57:22

Article link: https://blog.csdn.net/x13262608581/article/details/129570514

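1. Discussion

1.1. Terminology

Hyper-Threading:

Places two logical processors in one processor core. The two logical processors have mostly independent registers and each has its own Local APIC, but they share the core's execution engine, caches, and bus interface.

Because they share one execution engine, tasks running on the two logical processors of the same core execute concurrently rather than fully in parallel.

Multi-core:

Increases the number of processor cores to speed up multitasking.

Multi-threading:

Combines Hyper-Threading and multi-core technology to process multiple tasks in parallel at the hardware level.

After power-on or reset, the hardware dynamically selects one logical processor as the BSP (bootstrap processor); the remaining logical processors act as APs (application processors).

BSP:

The first processor started after the hardware platform powers on. It runs the boot code that configures the APIC environment, sets up the system runtime environment, and initializes and starts the APs. Once the BSP is selected, only the BSP has the BSP flag (bit 8) of IA32_APIC_BASE set.

AP:

After power-on or reset, each AP performs a minimal self-configuration and then waits for a Start-up IPI from the BSP. On receiving the Start-up IPI, the AP begins executing at the boot code start address carried by that message.

Every logical processor, BSP or AP, is assigned a unique APIC ID at power-on or reset.

In x2APIC mode the APIC ID is 32 bits wide.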
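
The BSP flag described above can be tested with a single bit check on the value read from IA32_APIC_BASE (MSR 0x1B). The following is a minimal sketch; the helper name `apic_base_is_bsp` is hypothetical, and the MSR value is passed in as a plain integer so the logic can be tried outside a kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 8 of IA32_APIC_BASE (MSR 0x1B) is the BSP flag. */
#define APIC_BASE_BSP_BIT 8u

/* Hypothetical helper: true if the given IA32_APIC_BASE value has the BSP flag set. */
static bool apic_base_is_bsp(uint64_t apic_base)
{
	return ((apic_base >> APIC_BASE_BSP_BIT) & 1u) != 0;
}
```

For example, a value of 0xFEE00900 (bits 8 and 11 set) would be reported as the BSP, while 0xFEE00800 would be reported as an AP.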

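1.2. Inter-processor communication with IPIs

Inter-processor interrupt (IPI) communication on a multi-core system is carried by the Local APIC and I/O APIC: one processor communicates with another by delivering an interrupt to it.

The Local APIC provides the ICR (Interrupt Command Register) to configure how an IPI is delivered.

Depending on the ICR configuration, the processor selectively uses other auxiliary registers to send the IPI to the target processor.

ICR:

Access:

In x2APIC mode, through MSR 0x830.

Fields:

Bits [0,7] hold the interrupt vector.

Bits [8,10] hold the delivery mode.

Bit [11] holds the destination mode: 0 is physical, 1 is logical.

Bit [14] holds the level.

Bit [15] holds the trigger mode.

Bits [18,19] hold the destination shorthand.

In x2APIC mode,

bits [32,63] hold the destination.

Level: must be 0 for an INIT level de-assert IPI, and 1 in all other cases.

Delivery mode:

000 Fixed

010 SMI

100 NMI

101 INIT

110 Start Up

In Start Up mode, the BSP uses the IPI to pass the boot code start address to the target processor.

Boot code start address:

If the vector field is 0xXY,

the boot code starts at physical address 0x000XY000.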
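
The vector-to-address rule above is a 12-bit shift. A small sketch, with hypothetical helper names, that converts in both directions and checks that a candidate boot address is usable (4KB aligned and below 1MB):

```c
#include <assert.h>
#include <stdint.h>

/* Start address an AP jumps to for a given Start-up IPI vector: 0xXY -> 0x000XY000. */
static uint32_t sipi_start_address(uint8_t vector)
{
	return (uint32_t)vector << 12;
}

/* Inverse mapping. The address must be 4KB aligned and below 1MB,
 * because the vector field is only 8 bits wide. */
static uint8_t sipi_vector_for(uint32_t phys_addr)
{
	assert((phys_addr & 0xFFFu) == 0);
	assert(phys_addr < 0x100000u);
	return (uint8_t)(phys_addr >> 12);
}
```

With vector 0x20, the value used later in this article, `sipi_start_address` gives 0x00020000.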

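Destination shorthand:

00 No shorthand

01 Send to self only

10 Send to all processors, including self

11 Send to all processors except self

In xAPIC mode the 64-bit ICR is made up of two 32-bit registers, and writing the low 32 bits sends the IPI immediately, so write the high 32 bits first and the low 32 bits last. In x2APIC mode the ICR is a single 64-bit MSR that is written with one WRMSR.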
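
To make the field layout concrete, here is a sketch that packs the fields listed above into the 64-bit value written to MSR 0x830 in x2APIC mode. The function and enum names are hypothetical and do not come from the kernel source shown later; only the bit positions are taken from the text.

```c
#include <stdint.h>

/* Delivery modes, bits [8,10]. */
enum { DM_FIXED = 0, DM_SMI = 2, DM_NMI = 4, DM_INIT = 5, DM_STARTUP = 6 };
/* Destination shorthands, bits [18,19]. */
enum { SH_NONE = 0, SH_SELF = 1, SH_ALL_INCL_SELF = 2, SH_ALL_EXCL_SELF = 3 };

/* Hypothetical helper: pack ICR fields into a 64-bit x2APIC ICR value. */
static uint64_t icr_encode(uint8_t vector, unsigned delivery_mode,
			   unsigned logical_dest, unsigned level,
			   unsigned level_trigger, unsigned shorthand,
			   uint32_t destination)
{
	uint64_t v = vector;                            /* bits [0,7]   */
	v |= (uint64_t)(delivery_mode & 0x7u) << 8;     /* bits [8,10]  */
	v |= (uint64_t)(logical_dest & 0x1u) << 11;     /* bit 11       */
	v |= (uint64_t)(level & 0x1u) << 14;            /* bit 14       */
	v |= (uint64_t)(level_trigger & 0x1u) << 15;    /* bit 15       */
	v |= (uint64_t)(shorthand & 0x3u) << 18;        /* bits [18,19] */
	v |= (uint64_t)destination << 32;               /* bits [32,63] */
	return v;
}
```

An asserted INIT to all processors except self encodes as 0xC4500, and a Start Up IPI with vector 0x20 to APIC ID 1 encodes as 0x0000000100004620.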

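SELF-IPI:

Available in x2APIC mode.

Bits [0,7] hold the interrupt vector.

Software only needs to write a vector to SELF-IPI to send an edge-triggered interrupt to the processor it is running on. An IPI sent through SELF-IPI is recorded in the IRR, ISR, and TMR.

SELF-IPI is write-only.

AP startup sequence in an SMP system: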
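
One way to picture the sequence is as a short table of ICR values sent in order: INIT, then Start Up twice. The sketch below builds that table for one target. The 10 ms and 200 µs waits follow the INIT-SIPI-SIPI procedure recommended in Intel's multiprocessor initialization documentation and are an assumption here, since the code in this article sends the IPIs without explicit delays; `ipi_step` and `ap_startup_steps` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* One step of the startup sequence: the ICR value to write and how long to wait afterwards. */
struct ipi_step {
	uint64_t icr;
	uint32_t delay_us;
};

/* Fill out[] with INIT, SIPI, SIPI for one target APIC ID (x2APIC ICR layout). */
static size_t ap_startup_steps(uint32_t apic_id, uint8_t sipi_vector,
			       struct ipi_step out[3])
{
	const uint64_t dest = (uint64_t)apic_id << 32;  /* bits [32,63]: destination */
	const uint64_t level_assert = 1ull << 14;       /* bit 14: level = assert    */

	out[0].icr = dest | level_assert | (5ull << 8);               /* INIT     */
	out[0].delay_us = 10000;                                      /* 10 ms    */
	out[1].icr = dest | level_assert | (6ull << 8) | sipi_vector; /* Start Up */
	out[1].delay_us = 200;                                        /* 200 us   */
	out[2] = out[1];                                              /* second Start Up */
	return 3;
}
```

A caller would write each `icr` value to MSR 0x830 and wait `delay_us` before the next step.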

2. Practice

void IPI_0x200(unsigned long nr, unsigned long parameter, struct pt_regs * regs)
{
	color_printk(WHITE, BLACK, "IPI_0x200_%#010x\n", SMP_cpu_id());
	/*switch(current->priority)
	{
		case 0:
		case 1:
			task_schedule[SMP_cpu_id()].CPU_exec_task_jiffies--;
			current->vrun_time += 1;
			break;
		case 2:
		default:
			task_schedule[SMP_cpu_id()].CPU_exec_task_jiffies -= 2;
			current->vrun_time += 2;
			break;
	}

	if(task_schedule[SMP_cpu_id()].CPU_exec_task_jiffies <= 0)
		current->flags |= NEED_SCHEDULE;*/
}

void IPI_0x201(unsigned long nr, unsigned long parameter, struct pt_regs * regs)
{
	color_printk(WHITE, BLACK, "IPI_0x201_%#010x\n", SMP_cpu_id());
}

void SMP_init()
{
	int i;
	unsigned int a,b,c,d;
    // Print the processor topology information
	//get local APIC ID
	for(i = 0; ;i++)
	{
		get_cpuid(0xb,i,&a,&b,&c,&d);
		if((c >> 8 & 0xff) == 0)
			break;
		color_printk(WHITE,BLACK,"local APIC ID Package_../Core_2/SMT_1,type(%x) Width:%#010x,num of logical processor(%x)\n",
			c >> 8 & 0xff,a & 0x1f,b & 0xff);
	}
	
	color_printk(WHITE,BLACK,"x2APIC ID level:%#010x\tx2APIC ID the current logical processor:%#010x\n",c & 0xff,d);
	color_printk(WHITE,BLACK,"SMP copy byte:%#010x\n",
		(unsigned long)&_APU_boot_end - (unsigned long)&_APU_boot_start);
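	// Linear address 0xffff800000020000 lies in the first physical page, which is already mapped in the page table, so it can be used directly.
	// Copy the AP boot code, starting at linear address _APU_boot_start, to linear address 0xffff800000020000.
	// Note: the call passes (source, destination, size), which differs from the standard C memcpy argument order.
	memcpy(_APU_boot_start, (unsigned char *)0xffff800000020000, (unsigned long)&_APU_boot_end - (unsigned long)&_APU_boot_start);
	// Initialize the spinlock
	spin_init(&SMP_lock);
	// Build interrupt gate descriptors for vectors 200 to 209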
	for(i = 200; i < 210; i++)
	{
		set_intr_gate(i , 0 , SMP_interrupt[i - 200]);
	}

	memset(SMP_IPI_desc, 0, sizeof(irq_desc_T) * 10);
	register_IPI(200, NULL, &IPI_0x200, NULL, NULL, "IPI 0x200");
	register_IPI(201, NULL, &IPI_0x201, NULL, NULL, "IPI 0x201");
}

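Before the AP logical processors can start running, we need to prepare boot code for them.

Above, we copied the contents at _APU_boot_start to 0xffff800000020000.

Then we installed descriptors in the IDT for interrupt vectors 200 to 209.

Finally, we registered the handlers IPI_0x200 and IPI_0x201 for vectors 200 and 201.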
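
The registration step amounts to filling a small table indexed by `vector - 200`. The sketch below illustrates that idea in isolation. It is not the kernel's `register_IPI`, whose implementation is not shown in this article; `ipi_register`, `ipi_dispatch`, and `record_ipi` are hypothetical names.

```c
#include <stddef.h>

#define IPI_VECTOR_BASE  200ul
#define IPI_VECTOR_COUNT 10ul

typedef void (*ipi_handler_t)(unsigned long nr);

/* One slot per IPI vector in [200, 209]. */
static ipi_handler_t ipi_table[IPI_VECTOR_COUNT];

/* Install a handler for an IPI vector; returns -1 if the vector is out of range. */
static int ipi_register(unsigned long nr, ipi_handler_t handler)
{
	if (nr < IPI_VECTOR_BASE || nr >= IPI_VECTOR_BASE + IPI_VECTOR_COUNT)
		return -1;
	ipi_table[nr - IPI_VECTOR_BASE] = handler;
	return 0;
}

/* Call the handler for a vector; returns -1 if none is installed. */
static int ipi_dispatch(unsigned long nr)
{
	if (nr < IPI_VECTOR_BASE || nr >= IPI_VECTOR_BASE + IPI_VECTOR_COUNT)
		return -1;
	if (ipi_table[nr - IPI_VECTOR_BASE] == NULL)
		return -1;
	ipi_table[nr - IPI_VECTOR_BASE](nr);
	return 0;
}

/* Sample handler used for illustration: remembers the last vector seen. */
static unsigned long last_ipi;
static void record_ipi(unsigned long nr)
{
	last_ipi = nr;
}
```

Registering `record_ipi` for vector 200 and dispatching 200 leaves `last_ipi` equal to 200, while dispatching an unregistered vector such as 201 returns -1.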

// Occupies 64 bits
struct INT_CMD_REG icr_entry;
// Interrupt vector
icr_entry.vector = 0x00;
// INIT delivery mode
icr_entry.deliver_mode =  APIC_ICR_IOAPIC_INIT;
// Physical destination mode
icr_entry.dest_mode = ICR_IOAPIC_DELV_PHYSICAL;
icr_entry.deliver_status = APIC_ICR_IOAPIC_Idle;
icr_entry.res_1 = 0;
// Signal level
icr_entry.level = ICR_LEVEL_DE_ASSERT;
// Edge triggered
icr_entry.trigger = APIC_ICR_IOAPIC_Edge;
icr_entry.res_2 = 0;
// Destination: all logical processors except self
icr_entry.dest_shorthand = ICR_ALL_EXCLUDE_Self;
icr_entry.res_3 = 0;
// Ignored here because a shorthand is used.
icr_entry.destination.x2apic_destination = 0x00;
// This delivers INIT to every logical processor except the sender
wrmsr(0x830, *(unsigned long *)&icr_entry);	//INIT IPI

Above, the BSP sends an INIT IPI to every logical processor other than itself. Each logical processor that receives the message initializes itself.

for(global_i = 1; global_i < 4; global_i++)
{
	// Take the spinlock
	spin_lock(&SMP_lock);
	color_printk(RED, BLACK, "for1 %#018lx\n", global_i);
	// Dynamically allocate 32KB of memory
	ptr = (unsigned char *)kmalloc(STACK_SIZE, 0);
	// Stack base address (just past the end of the region)
	_stack_start = (unsigned long)ptr + STACK_SIZE;
	// The start of the region holds the task_struct; cpu_id records the index of the owning CPU
	((struct task_struct *)ptr)->cpu_id = global_i;
	// Each CPU has its own TSS
	memset(&init_tss[global_i], 0, sizeof(struct tss_struct));
	// Set rsp0/rsp1/rsp2 in the TSS. When an interrupt or exception changes privilege level, the processor takes its stack from these fields.
	init_tss[global_i].rsp0 = _stack_start;
	init_tss[global_i].rsp1 = _stack_start;
	init_tss[global_i].rsp2 = _stack_start;
	// Allocate another 32KB region; its end serves as a second stack base
	ptr = (unsigned char *)kmalloc(STACK_SIZE, 0) + STACK_SIZE;
	// The start of the new region also holds a task_struct whose cpu_id is the index of the owning CPU
	((struct task_struct *)(ptr - STACK_SIZE))->cpu_id = global_i;
		
	// Use the address just past the end of the new region as the stack base for every IST entry
	init_tss[global_i].ist1 = (unsigned long)ptr;
	init_tss[global_i].ist2 = (unsigned long)ptr;
	init_tss[global_i].ist3 = (unsigned long)ptr;
	init_tss[global_i].ist4 = (unsigned long)ptr;
	init_tss[global_i].ist5 = (unsigned long)ptr;
	init_tss[global_i].ist6 = (unsigned long)ptr;
	init_tss[global_i].ist7 = (unsigned long)ptr;
	// Install a TSS descriptor for this TSS
	// cpu0 uses GDT index 10, cpu1 uses 12, and so on, because each TSS descriptor is 16 bytes (two GDT slots).
	set_tss_descriptor(10 + global_i * 2, &init_tss[global_i]);

	// Interrupt vector 0x20
	icr_entry.vector = 0x20;
	// Start-Up delivery mode
	icr_entry.deliver_mode = ICR_Start_up;
	// Do not use a shorthand to select the destination
	icr_entry.dest_shorthand = ICR_No_Shorthand;
	// Destination: the APIC ID global_i
	icr_entry.destination.x2apic_destination = global_i;
	// Send the Start-up IPI twice in a row
	wrmsr(0x830, *(unsigned long *)&icr_entry);	//Start-up IPI
	wrmsr(0x830, *(unsigned long *)&icr_entry);	//Start-up IPI
	//color_printk(RED, BLACK, "for2 %#018lx\n", global_i);
	// On receiving the Start-up IPI, the target logical processor starts executing at physical address 0x00020000 (vector 0x20 shifted left by 12)
}

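Our environment has three AP logical processors, which is why the loop runs over indices 1 to 3.

For each AP we dynamically allocate two 32KB regions that serve as stacks for the kernel main thread on that AP.

Each AP also gets its own TSS, which is registered in the GDT. rsp0/rsp1/rsp2 of that TSS point just past the end of the first region, and ist1 through ist7 point just past the end of the second.

Then we send a Start-up IPI to the target AP, telling it to start executing at physical address 0x00020000.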
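
The arithmetic behind the per-AP setup is small enough to check by hand. The sketch below reproduces it with hypothetical helper names; the constants (32KB stacks, first TSS at GDT index 10, two GDT slots per TSS descriptor) are taken from the loop above, and the selector formula assumes a GDT selector with RPL 0.

```c
#include <stdint.h>

#define STACK_SIZE          32768ul  /* 32KB per stack region, as in the loop above */
#define FIRST_TSS_GDT_INDEX 10u      /* GDT index of the TSS descriptor for cpu 0 */

/* A 64-bit TSS descriptor is 16 bytes, so each CPU consumes two 8-byte GDT slots. */
static unsigned tss_gdt_index(unsigned cpu)
{
	return FIRST_TSS_GDT_INDEX + cpu * 2u;
}

/* Segment selector for that descriptor: index * 8 (TI = 0, RPL = 0). */
static uint16_t tss_selector(unsigned cpu)
{
	return (uint16_t)(tss_gdt_index(cpu) * 8u);
}

/* Stacks grow downward, so the initial stack pointer is one past the end of the region. */
static uintptr_t stack_top(uintptr_t region_base)
{
	return region_base + STACK_SIZE;
}
```

For cpu 0 this gives GDT index 10 and selector 0x50, which matches the byte offset 80 at which the boot code later stores the BSP's TSS descriptor.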

#include "linkage.h"

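# The AP executes this code to switch from real mode to IA-32e mode.
# Along the way it sets up the GDT, IDT, and page table, with external interrupts disabled.
# The assembly uses AT&T syntax.
# Align the following code to a 0x1000 boundary
.balign	 0x1000
# Code section
.text
.code16
ENTRY(_APU_boot_start)
# _APU_boot_base is an assembler symbol set to the current location, i.e. the start of the boot code
_APU_boot_base = .
	# Disable interrupts
	cli
	# Write back and invalidate the processor caches
	wbinvd

	# Load ds, es, ss, fs, gs from %cs; the processor is in real mode here
	mov	%cs,	%ax
	mov	%ax,	%ds
	mov	%ax,	%es
	mov	%ax,	%ss
	mov	%ax,	%fs
	mov	%ax,	%gs
	#	set sp
	# _APU_boot_tmp_Stack_end is a label and stands for its location
	# _APU_boot_base is a constant symbol
	# _APU_boot_tmp_Stack_end - _APU_boot_base is the distance between the two locations
	# The $ prefix on the first operand of movl makes it an immediate:
	# the value itself is used, not the memory contents at that address.
	movl	$(_APU_boot_tmp_Stack_end - _APU_boot_base),	%esp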

	#	get base address
	mov	%cs,	%ax
	movzx	%ax,	%esi
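	# After the shift, %esi holds the real-mode segment base address
	shll	$4,	%esi

	#	set gdt and 32&64 code address
	# _APU_Code32 is a label and stands for its location
	# _APU_boot_base is a constant symbol
	# The source operand is the address %esi + (_APU_Code32 - _APU_boot_base), i.e. the linear address of _APU_Code32
	leal	(_APU_Code32 - _APU_boot_base)(%esi),	%eax
	# A % register as the first operand of movl means the source is the register value
	# _APU_Code32_vector - _APU_boot_base is a number: the offset of _APU_Code32_vector from _APU_boot_base
	# ds has already been loaded,
	# so the second operand is an offset within the ds segment and names a memory location;
	# the source value is stored at that location
	movl	%eax,	_APU_Code32_vector - _APU_boot_base

	leal	(_APU_Code64 - _APU_boot_base)(%esi),	%eax
	# Likewise for _APU_Code64_vector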
	movl	%eax,	_APU_Code64_vector - _APU_boot_base

	leal	(_APU_tmp_GDT - _APU_boot_base)(%esi),	%eax
	movl	%eax,	(_APU_tmp_GDT + 2 - _APU_boot_base)
	
	#	load idt gdt
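	# Load 6 bytes (limit and base) from the given location
	lidtl	_APU_tmp_IDT - _APU_boot_base
	# Load 6 bytes (limit and base) from the given location
	lgdtl	_APU_tmp_GDT - _APU_boot_base

	#	enable protected mode
	# Enable protected mode. The GDT, and the IDT if interrupts are to be handled, must be ready beforehand.
	# If paging is enabled in protected mode, the page table must be ready as well.
	smsw	%ax
	bts	$0	,%ax
	lmsw	%ax

	#	go to 32 code
	# ljmp is a far jump, also called an inter-segment jump
	# The l suffix in ljmpl means the operand size is 32 bits
	# ljmpl *(xx) reads the target from location xx: 4+2 bytes (protected-mode offset, then segment selector)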
	ljmpl	*(_APU_Code32_vector - _APU_boot_base)

.code32
.balign 4
_APU_Code32:
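	# Reload the data segment registers with a selector that is interpreted in protected mode
	#	go to 64 code
	mov	$0x10,	%ax
	mov	%ax,	%ds
	mov	%ax,	%es
	mov	%ax,	%ss
	mov	%ax,	%fs
	mov	%ax,	%gs
	# Put the address of _APU_boot_tmp_Stack_end into eax
	leal	(_APU_boot_tmp_Stack_end - _APU_boot_base)(%esi),	%eax
	# Then into esp (paging is still off in protected mode, so linear addresses equal physical addresses)
	movl	%eax,	%esp

	#	open PAE
	# Enable PAE (CR4 bit 5)
	movl	%cr4,	%eax
	bts	$5,	%eax
	movl	%eax,	%cr4

	#	set page table
	# Page table: the table at 0x90000, shared with the kernel main thread running on the BSP
	movl	$0x90000,	%eax
	movl	%eax,	%cr3

	#	enable long mode
	# Enable long mode (bit 8 of MSR 0xC0000080)
	movl	$0xC0000080,	%ecx
	rdmsr
	bts	$8,	%eax
	wrmsr

	#	enable PE & paging
	# Enable protected mode and paging (CR0 bits 0 and 31)
	# From here on the processor runs in IA-32e mode
	movl	%cr0,	%eax
	bts	$0,	%eax
	bts	$31,	%eax
	movl	%eax,	%cr0
	# Far jump: reads 4+2 bytes (offset, then segment selector) from _APU_Code64_vector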
	ljmp	*(_APU_Code64_vector - _APU_boot_base)(%esi)

.code64
.balign 4
_APU_Code64:
	#	go to head.S
	# Reload the segment registers with the IA-32e mode data segment
	movq	$0x20,	%rax
	movq	%rax,	%ds
	movq	%rax,	%es
	movq	%rax,	%fs
	movq	%rax,	%gs
	movq	%rax,	%ss
	movq	$0x100000,	%rax
	# Jump to 0x100000; this is a near (intra-segment) jump
	jmpq	*%rax
	hlt

.balign 4
_APU_tmp_IDT:
	.word	0 # The IDT is empty
	.word	0,0

.balign 4
# GDT used for protected mode
_APU_tmp_GDT:
	.short	_APU_tmp_GDT_end - _APU_tmp_GDT - 1 # First two bytes hold the limit: actual size minus 1
	.long	_APU_tmp_GDT - _APU_boot_base # Base address of the GDT; patched at run time with the linear address of _APU_tmp_GDT
	.short	0
	.quad	0x00cf9a000000ffff # Protected-mode 32-bit code segment
	.quad	0x00cf92000000ffff # Protected-mode 32-bit data segment
	.quad	0x0020980000000000 # IA-32e mode 64-bit code segment
	.quad	0x0000920000000000 # IA-32e mode data segment
_APU_tmp_GDT_end:

.balign 4
_APU_Code32_vector:
	.long	_APU_Code32 - _APU_boot_base # Location of _APU_Code32
	.word	0x08,0	# Code segment selector

.balign 4
_APU_Code64_vector:
	.long	_APU_Code64 - _APU_boot_base # Location of _APU_Code64
	.word	0x18,0	

.balign 4
_APU_boot_tmp_Stack_start:
	# Places _APU_boot_tmp_Stack_end at offset 0x400 of the current section.
	# The space between _APU_boot_tmp_Stack_start and _APU_boot_tmp_Stack_end is filled with zeros.
	.org	0x400 
_APU_boot_tmp_Stack_end:

ENTRY(_APU_boot_end)

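The AP logical processor runs the boot code above to switch from real mode to protected mode and then to IA-32e mode.

Note that the page table used here lives at 0x90000 and was already built by the Loader.

Once the mode switch is complete, execution jumps to 0x100000.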
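
The four descriptors in the temporary GDT differ only in a few bits. A small decoder, sketched here with hypothetical names, makes the difference between the 32-bit and 64-bit code segments visible. The field positions follow the standard x86 segment descriptor layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of an 8-byte code or data segment descriptor. */
struct seg_desc {
	uint32_t base;
	uint32_t limit;          /* 20-bit limit, in bytes or 4KB units */
	uint8_t  access;         /* P, DPL, S, type */
	bool     long_mode;      /* L bit: 64-bit code segment */
	bool     default_32;     /* D/B bit: 32-bit default operand size */
	bool     granularity_4k; /* G bit: limit counted in 4KB units */
};

static struct seg_desc decode_descriptor(uint64_t raw)
{
	struct seg_desc d;
	d.limit = (uint32_t)(raw & 0xFFFFu) | ((uint32_t)((raw >> 48) & 0xFu) << 16);
	d.base = (uint32_t)((raw >> 16) & 0xFFFFFFu) | ((uint32_t)((raw >> 56) & 0xFFu) << 24);
	d.access = (uint8_t)((raw >> 40) & 0xFFu);
	d.long_mode = ((raw >> 53) & 1u) != 0;
	d.default_32 = ((raw >> 54) & 1u) != 0;
	d.granularity_4k = ((raw >> 55) & 1u) != 0;
	return d;
}
```

Decoding 0x00cf9a000000ffff yields base 0, limit 0xFFFFF with 4KB granularity, and the D/B bit set, while 0x0020980000000000 has only the L bit set among the flags, which is what marks it as a 64-bit code segment.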

// loader.asm
;=======	init template page table 0x90000 make sure there is not dirty data
	mov	dword	[0x90000],	0x91007
	mov	dword	[0x90004],	0x00000
	mov	dword	[0x90800],	0x91007
	mov	dword	[0x90804],	0x00000

	mov	dword	[0x91000],	0x92007
	mov	dword	[0x91004],	0x00000

	mov	dword	[0x92000],	0x000083
	mov	dword	[0x92004],	0x000000
	mov	dword	[0x92008],	0x200083
	mov	dword	[0x9200c],	0x000000
	mov	dword	[0x92010],	0x400083
	mov	dword	[0x92014],	0x000000
	mov	dword	[0x92018],	0x600083
	mov	dword	[0x9201c],	0x000000
	mov	dword	[0x92020],	0x800083
	mov	dword	[0x92024],	0x000000
	mov	dword	[0x92028],	0xa00083
	mov	dword	[0x9202c],	0x000000
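
The loader fragment above fills PML4 entries at 0x90000 and 0x90800, one PDPT entry at 0x91000, and 2MB page entries at 0x92000. Splitting a linear address into its table indices shows why those two PML4 slots are the ones that matter. This is a sketch with hypothetical names, assuming 4-level paging with 2MB pages as the entries ending in 0x83 suggest.

```c
#include <stdint.h>

/* Table indices of a linear address under 4-level paging with 2MB pages. */
struct va_parts {
	unsigned pml4;   /* bits 39..47 */
	unsigned pdpt;   /* bits 30..38 */
	unsigned pd;     /* bits 21..29 */
	uint32_t offset; /* bits 0..20, offset inside the 2MB page */
};

static struct va_parts split_2m(uint64_t va)
{
	struct va_parts p;
	p.pml4 = (unsigned)((va >> 39) & 0x1FFu);
	p.pdpt = (unsigned)((va >> 30) & 0x1FFu);
	p.pd = (unsigned)((va >> 21) & 0x1FFu);
	p.offset = (uint32_t)(va & 0x1FFFFFu);
	return p;
}
```

The address 0xffff800000020000 used for the AP boot code has PML4 index 256, whose entry sits at 0x90000 + 256 * 8 = 0x90800, and offset 0x20000 in the first 2MB page. The identity-mapped address 0x100000 has PML4 index 0. Both PML4 entries point to the same PDPT, so the two addresses reach the same physical memory.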

0x100000 is also the address at which the BSP first enters the kernel.

#include "linkage.h"
.section .text
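// This is physical address 0x100000.
// It is also linear address 0x100000 and linear address 0xffff800000000000 + 0x100000
ENTRY(_start)
	// These selector values do not suit an AP: in its temporary GDT, 0x10 is the 32-bit data segment
	mov	$0x10,	%ax
	mov	%ax,	%ds
	mov	%ax,	%es
	mov	%ax,	%fs
	mov	%ax,	%ss
	mov	$0x7E00,	%esp
	// Reload the GDT and IDT so that the AP and the BSP use the same tables
	lgdt	GDT_POINTER(%rip)
	lidt	IDT_POINTER(%rip)

	// Load the segment registers according to the new GDT
	mov	$0x10,	%ax
	mov	%ax,	%ds
	mov	%ax,	%es
	mov	%ax,	%fs
	mov	%ax,	%gs
	mov	%ax,	%ss

	// For an AP, _stack_start points into the dynamically allocated 32KB region
	movq	_stack_start(%rip),	%rsp
	// Set the page table again
	movq	$0x101000,	%rax
	movq	%rax,		%cr3

	movq	switch_seg(%rip),	%rax
	pushq	$0x08
	pushq	%rax
	// lretq pops the offset and then the segment selector, and performs a far jump to them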
	lretq

switch_seg:
	.quad	entry64

entry64:
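	// Set the IA-32e mode segment registers again
	movq	$0x10,	%rax
	movq	%rax,	%ds
	movq	%rax,	%es
	movq	%rax,	%gs
	movq	%rax,	%ss
	// Set rsp again
	movq	_stack_start(%rip),	%rsp		/* rsp address */
	// Test whether this is an AP: read IA32_APIC_BASE and branch if the BSP flag (bit 8) is clear
	movq	$0x1b,	%rcx					//if APU
	rdmsr
	bt	$8,	%rax
	jnc	start_smp

	// Executed by the BSP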
setup_IDT:							
	leaq	ignore_int(%rip),	%rdx
	movq	$(0x08 << 16),	%rax
	movw	%dx,	%ax
	movq	$(0x8E00 << 32),	%rcx		
	addq	%rcx,	%rax
	movl	%edx,	%ecx
	shrl	$16,	%ecx
	shlq	$48,	%rcx
	addq	%rcx,	%rax
	shrq	$32,	%rdx
	leaq	IDT_Table(%rip),	%rdi
	mov	$256,	%rcx
rp_sidt:
	movq	%rax,	(%rdi)
	movq	%rdx,	8(%rdi)
	addq	$0x10,	%rdi
	dec	%rcx
	jne	rp_sidt

setup_TSS64:
	leaq	init_tss(%rip),	%rdx
	xorq	%rax,	%rax
	xorq	%rcx,	%rcx
	movq	$0x89,	%rax
	shlq	$40,	%rax
	movl	%edx,	%ecx
	shrl	$24,	%ecx
	shlq	$56,	%rcx
	addq	%rcx,	%rax
	xorq	%rcx,	%rcx
	movl	%edx,	%ecx
	andl	$0xffffff,	%ecx
	shlq	$16,	%rcx
	addq	%rcx,	%rax
	addq	$103,	%rax
	leaq	GDT_Table(%rip),	%rdi
	movq	%rax,	80(%rdi)	//tss segment offset
	shrq	$32,	%rdx
	movq	%rdx,	88(%rdi)	//tss+1 segment offset
	
	movq	go_to_kernel(%rip),	%rax		/* movq address */
	pushq	$0x08
	pushq	%rax
	lretq

go_to_kernel:
	.quad	Start_Kernel

start_smp:
	movq	go_to_smp_kernel(%rip),	%rax		/* movq address */
	pushq	$0x08
	pushq	%rax
	lretq

go_to_smp_kernel:
	// Executed by an AP
	.quad	Start_SMP

ignore_int:
	cld
	pushq	%rax
	pushq	%rbx
	pushq	%rcx
	pushq	%rdx
	pushq	%rbp
	pushq	%rdi
	pushq	%rsi
	pushq	%r8
	pushq	%r9
	pushq	%r10
	pushq	%r11
	pushq	%r12
	pushq	%r13
	pushq	%r14
	pushq	%r15
	movq	%es,	%rax
	pushq	%rax
	movq	%ds,	%rax
	pushq	%rax
	movq	$0x10,	%rax
	movq	%rax,	%ds
	movq	%rax,	%es
	leaq	int_msg(%rip),	%rax			/* leaq get address */
	pushq	%rax
	movq	%rax,	%rdx
	movq	$0x00000000,	%rsi
	movq	$0x00ff0000,	%rdi
	movq	$0,	%rax
	callq	color_printk
	addq	$0x8,	%rsp
Loop:
	jmp	Loop	
	popq	%rax
	movq	%rax,	%ds
	popq	%rax
	movq	%rax,	%es
	popq	%r15
	popq	%r14
	popq	%r13
	popq	%r12
	popq	%r11
	popq	%r10
	popq	%r9
	popq	%r8
	popq	%rsi
	popq	%rdi
	popq	%rbp
	popq	%rdx
	popq	%rcx
	popq	%rbx
	popq	%rax
	iretq

int_msg:
	.asciz "Unknown interrupt or fault at RIP\n"

ENTRY(_stack_start)
	.quad	init_task_union + 32768

.align 8
.org	0x1000
__PML4E:
	.quad	0x102003
	.fill	255,8,0
	.quad	0x102003
	.fill	255,8,0

.org	0x2000
__PDPTE:
	.quad	0x103003	/* 0x103003 */
	.fill	511,8,0

.org	0x3000
__PDE:
	.quad	0x000083	
	.quad	0x200083
	.quad	0x400083
	.quad	0x600083
	.quad	0x800083		/* 0x800083 */
	.quad	0xa00083
	.quad	0xc00083
	.quad	0xe00083
	.quad	0x1000083
	.quad	0x1200083
	.quad	0x1400083
	.quad	0x1600083
	.quad	0x1800083
	.quad	0x1a00083
	.quad	0x1c00083
	.quad	0x1e00083
	.quad	0x2000083
	.quad	0x2200083
	.quad	0x2400083
	.quad	0x2600083
	.quad	0x2800083
	.quad	0x2a00083
	.quad	0x2c00083
	.quad	0x2e00083
	
	.quad	0xe0000083		/*0x 3000000*/
	.quad	0xe0200083
	.quad	0xe0400083
	.quad	0xe0600083
	.quad	0xe0800083
	.quad	0xe0a00083
	.quad	0xe0c00083
	.quad	0xe0e00083
	.fill	480,8,0

//=======	GDT_Table
.section .data
.globl GDT_Table

GDT_Table:
	.quad	0x0000000000000000			/*0	NULL descriptor		       	00*/
	.quad	0x0020980000000000			/*1	KERNEL	Code	64-bit	Segment	08*/
	.quad	0x0000920000000000			/*2	KERNEL	Data	64-bit	Segment	10*/
	.quad	0x0000000000000000			/*3	USER	Code	32-bit	Segment 18*/
	.quad	0x0000000000000000			/*4	USER	Data	32-bit	Segment 20*/
	.quad	0x0020f80000000000			/*5	USER	Code	64-bit	Segment	28*/
	.quad	0x0000f20000000000			/*6	USER	Data	64-bit	Segment	30*/
	.quad	0x00cf9a000000ffff			/*7	KERNEL	Code	32-bit	Segment	38*/
	.quad	0x00cf92000000ffff			/*8	KERNEL	Data	32-bit	Segment	40*/
	.fill	100,8,0					/*10 ~ 11 TSS (jmp one segment <9>) in long-mode 128-bit 50*/
GDT_END:

GDT_POINTER:
GDT_LIMIT:	.word	GDT_END - GDT_Table - 1
GDT_BASE:	.quad	GDT_Table

//=======	IDT_Table
.globl IDT_Table

IDT_Table:
	.fill  512,8,0
IDT_END:

IDT_POINTER:
IDT_LIMIT:	.word	IDT_END - IDT_Table - 1
IDT_BASE:	.quad	IDT_Table

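As a result, the AP and BSP logical processors use the same page table at 0x101000.

They also share the same GDT and IDT.

The stack base of the AP's kernel main thread, however, is the address just past the end of the 32KB region allocated earlier.

Execution then jumps to Start_SMP.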

void Start_SMP()
{
	unsigned int x,y;
	color_printk(RED, YELLOW, "APU starting......\n");
	//enable xAPIC & x2APIC
	__asm__ __volatile__(	"movq 	$0x1b,	%%rcx	\n\t"
				"rdmsr	\n\t"
				"bts	$10,	%%rax	\n\t"
				"bts	$11,	%%rax	\n\t"
				"wrmsr	\n\t"
				"movq 	$0x1b,	%%rcx	\n\t"
				"rdmsr	\n\t"
				:"=a"(x),"=d"(y)
				:
				:"memory");
	// Both bit 10 (x2APIC enable) and bit 11 (APIC global enable) must be set
	if((x & 0xc00) == 0xc00)
		color_printk(RED,YELLOW,"xAPIC & x2APIC enabled\n");
	//enable SVR[8] SVR[12]
	__asm__ __volatile__(	"movq 	$0x80f,	%%rcx	\n\t"
				"rdmsr	\n\t"
				"bts	$8,	%%rax	\n\t"
				"bts	$12,	%%rax\n\t"
				"wrmsr	\n\t"
				"movq 	$0x80f,	%%rcx	\n\t"
				"rdmsr	\n\t"
				:"=a"(x),"=d"(y)
				:
				:"memory");
	if(x & 0x100)
		color_printk(RED, YELLOW, "SVR[8] enabled\n");
	if(x & 0x1000)
		color_printk(RED, YELLOW, "SVR[12] enabled\n");
	//get local APIC ID
	__asm__ __volatile__(	"movq $0x802,	%%rcx	\n\t"
				"rdmsr	\n\t"
				:"=a"(x),"=d"(y)
				:
				:"memory");
	color_printk(RED, YELLOW, "x2APIC ID:%#010x\t", x);

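	// The rsp prepared for this AP points just past the end of a dynamically allocated 32KB region. The start of that region holds a task_struct instance
	// that represents the kernel main thread of this AP.
	// Fill in the task_struct of this AP's kernel main thread
	current->state = TASK_RUNNING;
	current->flags = PF_KTHREAD;
	current->mm = &init_mm;
	list_init(&current->list);
	current->addr_limit = 0xffff800000000000;
	// The AP's kernel main thread gets priority 2
	current->priority = 2;
	current->vrun_time = 0;
	current->thread = (struct thread_struct *)(current + 1);
	memset(current->thread, 0, sizeof(struct thread_struct));
	// Live state of the kernel main thread: base of the privilege level 0 stack
	current->thread->rsp0 = _stack_start;
	// Current rsp
	current->thread->rsp = _stack_start;
	// Current fs, gs
	current->thread->fs = KERNEL_DS;
	current->thread->gs = KERNEL_DS;
	// Record the task_struct pointer for this CPU
	init_task[SMP_cpu_id()] = current;
	// The TSS descriptor for this AP was installed before the Start-up IPI was sent.
	// Load it into the task register here. The index uses global_i - 1, presumably because the BSP loop has already incremented global_i by the time this runs.
	load_TR(10 + (global_i -1) * 2);
	// Release the spinlock
	spin_unlock(&SMP_lock);
	// Number of spinlocks currently held by this logical processor
	current->preempt_count = 0;
	// With everything ready on the AP, enable interrupts.
	// The TSS, the IDT with its interrupt and trap gate descriptors,
	// the GDT, and the associated handlers are all in place,
	// so the AP can now respond to and handle exceptions and external interrupts
	sti();
	// Idle loop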
	while(1)
	{
		hlt();
	}
}

In the code above, the AP first enables its own Local APIC.
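
The bit manipulation behind that step can be written out as plain functions on the MSR values. This is a sketch with hypothetical names: it operates on integers rather than executing RDMSR/WRMSR, and uses the bit positions from Start_SMP (bits 10 and 11 of MSR 0x1B, bits 8 and 12 of the SVR at MSR 0x80F). Note that a test of the form `value & 0xc00` succeeds when either bit is set, so the check here compares against the full mask.

```c
#include <stdbool.h>
#include <stdint.h>

#define APIC_BASE_X2APIC_ENABLE    (1ull << 10) /* IA32_APIC_BASE: x2APIC mode enable */
#define APIC_BASE_GLOBAL_ENABLE    (1ull << 11) /* IA32_APIC_BASE: APIC global enable */
#define SVR_APIC_SOFTWARE_ENABLE   (1ull << 8)  /* SVR: APIC software enable */
#define SVR_SUPPRESS_EOI_BROADCAST (1ull << 12) /* SVR: suppress EOI broadcast */

/* New IA32_APIC_BASE value with xAPIC and x2APIC enabled. */
static uint64_t apic_base_enable_x2apic(uint64_t apic_base)
{
	return apic_base | APIC_BASE_GLOBAL_ENABLE | APIC_BASE_X2APIC_ENABLE;
}

/* True only if both enable bits are set. */
static bool apic_base_x2apic_enabled(uint64_t apic_base)
{
	const uint64_t mask = APIC_BASE_GLOBAL_ENABLE | APIC_BASE_X2APIC_ENABLE;
	return (apic_base & mask) == mask;
}

/* New SVR value with bits 8 and 12 set, as Start_SMP does. */
static uint64_t svr_enable(uint64_t svr)
{
	return svr | SVR_APIC_SOFTWARE_ENABLE | SVR_SUPPRESS_EOI_BROADCAST;
}
```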

It then fills in the task_struct instance of this AP's kernel main thread. The instance sits at the start of the 32KB stack region allocated earlier.
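
Placing the task_struct at the start of the stack region allows it to be found from the stack pointer alone. The article does not show how `current` is derived; one common approach, assumed here, is to mask the stack pointer down to the region boundary. This only works if each region is aligned to its 32KB size, which is also an assumption in this sketch, and `task_base_from_sp` is a hypothetical name.

```c
#include <stdint.h>

#define STACK_SIZE 32768ul /* 32KB, must be a power of two for the mask to work */

/* Round a stack pointer down to the start of its STACK_SIZE-aligned region.
 * Valid for any address inside the region. The initial stack top, which is one
 * past the end, already belongs to the next region until something is pushed. */
static uintptr_t task_base_from_sp(uintptr_t sp)
{
	return sp & ~(uintptr_t)(STACK_SIZE - 1);
}
```

For a region starting at 0x100000, a stack pointer of 0x107FF8 (one 8-byte push below the top) maps back to 0x100000.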

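Next it loads the task register so that the processor knows where its TSS is.

At this point interrupts are enabled, so the AP can respond to and handle external interrupts.

Finally, the AP loops executing hlt().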