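In the kernel, several bitmap variables record the number and state of the CPUs:
Variable | Meaning | Iteration macro |
cpu_possible_mask | CPU cores that could ever be populated in this system | for_each_possible_cpu |
cpu_present_mask | CPU cores that are actually populated and can be brought up | for_each_present_cpu |
cpu_online_mask | CPU cores that are currently online and doing work | for_each_online_cpu |
cpu_active_mask | CPU cores that are currently active (available for task migration) |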
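To make the "one bit per CPU" idea concrete, here is a minimal user-space sketch of such a bitmap together with a for-each style iterator. It is not the kernel's struct cpumask implementation: every model_* name and the 64-CPU limit are assumptions made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 64   /* hypothetical upper bound for this sketch */

/* One bit per CPU: a user-space stand-in for a kernel CPU mask. */
typedef struct { uint64_t bits; } model_cpumask;

/* Set or clear the bit for one CPU (modelled on the set_cpu_*() helpers). */
static void model_set_cpu(model_cpumask *m, unsigned cpu, bool on)
{
    if (cpu >= MODEL_NR_CPUS)
        return;
    if (on)
        m->bits |= (UINT64_C(1) << cpu);
    else
        m->bits &= ~(UINT64_C(1) << cpu);
}

/* True if the bit for this CPU is set. */
static bool model_test_cpu(const model_cpumask *m, unsigned cpu)
{
    return cpu < MODEL_NR_CPUS && ((m->bits >> cpu) & 1u);
}

/* Next set bit after prev (pass -1 to start); MODEL_NR_CPUS if there is none. */
static unsigned model_next_cpu(const model_cpumask *m, int prev)
{
    for (unsigned cpu = (unsigned)(prev + 1); cpu < MODEL_NR_CPUS; cpu++)
        if (model_test_cpu(m, cpu))
            return cpu;
    return MODEL_NR_CPUS;
}

/* Iterate over every CPU whose bit is set, in the spirit of for_each_*_cpu. */
#define model_for_each_cpu(cpu, m) \
    for ((cpu) = model_next_cpu((m), -1); (cpu) < MODEL_NR_CPUS; \
         (cpu) = model_next_cpu((m), (int)(cpu)))

/* Count the CPUs in a mask by walking it with the iterator. */
static unsigned model_weight(const model_cpumask *m)
{
    unsigned cpu, n = 0;
    model_for_each_cpu(cpu, m)
        n++;
    return n;
}
```

With bits 0, 3 and 63 set, model_weight() returns 3 and the iterator visits exactly those three CPU numbers. The four kernel masks differ only in which events set and clear the bits.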
This post focuses on how cpu_possible_mask and cpu_present_mask are initialized.
1. Initialization of cpu_possible_mask:
start_kernel
  --> setup_arch
    --> smp_init_cpus
      --> acpi_parse_and_init_cpus
      --> smp_cpu_setup
        --> set_cpu_possible
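As the call chain above shows, the kernel learns from ACPI which CPUs can run, and marks each of them in cpu_possible_mask.
2. Initialization of cpu_present_mask: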
start_kernel
  --> rest_init
    --> kernel_thread(kernel_init)
      kernel_init (thread)
        --> kernel_init_freeable
          --> smp_prepare_cpus
            --> cpu_prepare
              --> cpu_psci_cpu_prepare
            --> set_cpu_present
          --> smp_init
            --> cpu_up
              --> cpu_psci_cpu_boot
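We start from start_kernel again. By this point cpu0 has finished booting, and the kernel creates a thread, kernel_init; all the remaining CPUs are initialized inside this function.
The kernel first walks cpu_possible_mask and calls cpu_prepare for every possible CPU. Once the firmware has answered, it sets the corresponding bit for that CPU in cpu_present_mask. This gives the system the bitmap of CPUs that can actually be brought up.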
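The loop described above can be modelled in user space as follows. This is a simplified sketch, not the kernel's smp_prepare_cpus: the function names, the 64-bit mask, and the fake firmware callback are assumptions made up for illustration, and it assumes the boot CPU number is below 64.

```c
#include <stdint.h>

/* Hypothetical stand-in for the cpu_prepare callback: 0 means success. */
typedef int (*model_cpu_prepare_fn)(unsigned cpu);

/* Walk the possible mask and build the present mask from firmware answers. */
static uint64_t model_smp_prepare_cpus(uint64_t possible, unsigned boot_cpu,
                                       model_cpu_prepare_fn cpu_prepare)
{
    /* The boot CPU is already running, so it starts out present. */
    uint64_t present = UINT64_C(1) << boot_cpu;

    for (unsigned cpu = 0; cpu < 64; cpu++) {
        if (!((possible >> cpu) & 1u))
            continue;   /* not a possible CPU */
        if (cpu == boot_cpu)
            continue;   /* already up, nothing to prepare */
        if (cpu_prepare(cpu) != 0)
            continue;   /* firmware refused: the CPU stays non-present */
        present |= UINT64_C(1) << cpu;
    }
    return present;
}

/* Hypothetical firmware stub: pretends that CPU 2 cannot be prepared. */
static int model_fake_prepare(unsigned cpu)
{
    return cpu == 2 ? -1 : 0;
}
```

Calling model_smp_prepare_cpus(0xF, 0, model_fake_prepare) returns 0xB: CPUs 0, 1 and 3 end up present, while CPU 2 remains possible but not present.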
The kernel then calls smp_init --> cpu_up to bring these cores up one by one. For the boot flow of these cores, see my other blog post:
smp_init过程解析_slab_prepare_cpu-CSDN博客
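Why did I look into how these two variables are initialized? The project I have been working on recently is a dual-CPU server with 32 cores per CPU, yet both the GUI system information and dmidecode report a single 64-core CPU. The dmidecode output is as follows: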
dmidecode -t processor
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 3.3.0 present.
# SMBIOS implementations newer than version 3.2.0 are not
# fully supported by this version of dmidecode.
Handle 0x0008, DMI type 4, 48 bytes
Processor Information
Socket Designation: SOCKET 0
Type: Central Processor
Family: ARM
Manufacturer: PHYTIUM LTD
ID: 10 08 00 00 00 00 00 00
Signature: Implementor 0x00, Variant 0x0, Architecture 0, Part 0x081, Revision 0
Version: Phytium S5000C 64 Core
Voltage: 0.9 V
External Clock: Unknown
Max Speed: 2300 MHz
Current Speed: 2300 MHz
Status: Populated, Enabled
Upgrade: None
L1 Cache Handle: 0x0005
L2 Cache Handle: 0x0006
L3 Cache Handle: 0x0007
Serial Number: KAP8160405050000
Asset Tag: Not Set
Part Number: Not Set
Core Count: 64
Core Enabled: 64
Thread Count: 64
Characteristics:
64-bit capable
Multi-Core
Execute Protection
Enhanced Virtualization
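At first I suspected that cpu_possible_mask or cpu_present_mask was wrong. After reading the dmidecode source, however, I found that this information is read from the kernel-provided file /sys/firmware/dmi/tables/smbios_entry_point. The relevant dmidecode code, from dmidecode.c, is: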
static void dmi_decode(const struct dmi_header *h, u16 ver)
{
    const u8 *data = h->data;
    /*
     * Note: DMI types 37 and 42 are untested
     */
    switch (h->type)
    {
        case 0: /* 7.1 BIOS Information */
        ......
        case 4: /* 7.5 Processor Information */
            printf("Processor Information\n");
            if (h->length < 0x1A) break;
            printf("\tSocket Designation: %s\n",
                dmi_string(h, data[0x04]));
            printf("\tType: %s\n",
                dmi_processor_type(data[0x05]));
            printf("\tFamily: %s\n",
                dmi_processor_family(h, ver));
            printf("\tManufacturer: %s\n",
                dmi_string(h, data[0x07]));
            dmi_processor_id(h, "\t");
            printf("\tVersion: %s\n",
                dmi_string(h, data[0x10]));
            printf("\tVoltage:");
            dmi_processor_voltage(data[0x11]);
            printf("\n");
            printf("\tExternal Clock: ");
            dmi_processor_frequency(data + 0x12);
            printf("\n");
            printf("\tMax Speed: ");
            dmi_processor_frequency(data + 0x14);
            printf("\n");
            printf("\tCurrent Speed: ");
            dmi_processor_frequency(data + 0x16);
            printf("\n");
            if (data[0x18] & (1 << 6))
                printf("\tStatus: Populated, %s\n",
                    dmi_processor_status(data[0x18] & 0x07));
            else
                printf("\tStatus: Unpopulated\n");
            printf("\tUpgrade: %s\n",
                dmi_processor_upgrade(data[0x19]));
            if (h->length < 0x20) break;
            if (!(opt.flags & FLAG_QUIET))
            {
                printf("\tL1 Cache Handle:");
                dmi_processor_cache(WORD(data + 0x1A), "L1", ver);
                printf("\n");
                printf("\tL2 Cache Handle:");
                dmi_processor_cache(WORD(data + 0x1C), "L2", ver);
                printf("\n");
                printf("\tL3 Cache Handle:");
                dmi_processor_cache(WORD(data + 0x1E), "L3", ver);
                printf("\n");
            }
            if (h->length < 0x23) break;
            printf("\tSerial Number: %s\n",
                dmi_string(h, data[0x20]));
            printf("\tAsset Tag: %s\n",
                dmi_string(h, data[0x21]));
            printf("\tPart Number: %s\n",
                dmi_string(h, data[0x22]));
            if (h->length < 0x28) break;
            if (data[0x23] != 0)
                printf("\tCore Count: %u\n",
                    h->length >= 0x2C && data[0x23] == 0xFF ?
                    WORD(data + 0x2A) : data[0x23]);
            if (data[0x24] != 0)
                printf("\tCore Enabled: %u\n",
                    h->length >= 0x2E && data[0x24] == 0xFF ?
                    WORD(data + 0x2C) : data[0x24]);
            if (data[0x25] != 0)
                printf("\tThread Count: %u\n",
                    h->length >= 0x30 && data[0x25] == 0xFF ?
                    WORD(data + 0x2E) : data[0x25]);
            printf("\tCharacteristics:");
            dmi_processor_characteristics(WORD(data + 0x26), "\t\t");
            break;
        ......
So all of the CPU information, including the socket, comes from the DMI data.
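The Core Count handling in the dmidecode excerpt is the part that produced the "64" in my output, so it is worth isolating. The sketch below mirrors only that branch: a one-byte field at offset 0x23, with a 16-bit field at offset 0x2A used when the byte holds 0xFF and the structure is long enough. The model_* names are made up for this illustration, and the buffer is assumed to start at the structure header, the way h->data does in dmidecode.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 16-bit value, like dmidecode's WORD() macro. */
static unsigned model_word(const uint8_t *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/*
 * data:   start of a type 4 (Processor Information) structure, header included
 * length: formatted length of the structure (h->length in dmidecode)
 * Returns 0 when the field is missing or holds 0, which is the case
 * where dmidecode prints no "Core Count" line at all.
 */
static unsigned model_core_count(const uint8_t *data, size_t length)
{
    if (length < 0x28)
        return 0;                        /* structure too short for the field */
    if (data[0x23] == 0xFF && length >= 0x2C)
        return model_word(data + 0x2A);  /* use the 16-bit "Core Count 2" */
    return data[0x23];                   /* plain one-byte core count */
}
```

For the 48-byte record shown earlier, the byte at 0x23 would simply hold 64. In other words, the value is whatever the firmware wrote into the table; the kernel's CPU masks play no part in it.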
On the kernel side, the DMI information is handled in drivers/firmware/dmi_scan.c. The code is:
void __init dmi_scan_machine(void)
{
    char __iomem *p, *q;
    char buf[32];
    if (efi_enabled(EFI_CONFIG_TABLES)) {
        if (efi.smbios3 != EFI_INVALID_TABLE_ADDR) {
            p = dmi_early_remap(efi.smbios3, 32);
            ......
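The information that dmidecode prints is written by the EFI firmware into memory at the address held in efi.smbios3. The kernel maps that address into its own virtual address space (the dmi_early_remap call above), reads the data out, and exposes it at /sys/firmware/dmi/tables/smbios_entry_point, where the dmidecode tool reads and parses it.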
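To show what sits at that address, here is a rough sketch of validating and decoding a 64-bit SMBIOS 3.x entry point from a byte buffer. The field offsets follow my reading of the SMBIOS specification (the "_SM3_" anchor, the checksum over the entry point length, table maximum size at 0x0C, table address at 0x10). The struct and function names are invented for this example and are not kernel or dmidecode APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded view of an SMBIOS 3.x entry point. */
struct model_smbios3_ep {
    uint8_t  major, minor, docrev;
    uint32_t table_max_size;   /* maximum size of the structure table */
    uint64_t table_address;    /* physical address of the structure table */
};

/* Read an n-byte little-endian unsigned value (n <= 8). */
static uint64_t model_le(const uint8_t *p, unsigned n)
{
    uint64_t v = 0;
    for (unsigned i = 0; i < n; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Returns true and fills *out if buf holds a valid "_SM3_" entry point. */
static bool model_parse_smbios3(const uint8_t *buf, size_t len,
                                struct model_smbios3_ep *out)
{
    if (len < 24 || memcmp(buf, "_SM3_", 5) != 0)
        return false;                 /* anchor string missing */

    uint8_t ep_len = buf[6];          /* entry point length field */
    if (ep_len < 24 || ep_len > len)
        return false;

    uint8_t sum = 0;                  /* all bytes must sum to 0 mod 256 */
    for (size_t i = 0; i < ep_len; i++)
        sum = (uint8_t)(sum + buf[i]);
    if (sum != 0)
        return false;

    out->major = buf[7];
    out->minor = buf[8];
    out->docrev = buf[9];
    out->table_max_size = (uint32_t)model_le(buf + 0x0C, 4);
    out->table_address = model_le(buf + 0x10, 8);
    return true;
}
```

On a live system the buffer would come from reading /sys/firmware/dmi/tables/smbios_entry_point (root is usually required), and table_address would point at the structure table that contains the type 4 records.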
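So the root cause was incorrect SMBIOS information. Once the SMBIOS data was updated, the correct information was displayed.
The dmidecode source code is available at the following address: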