Building page-types
root@curtis-Aspire-E5-471G:/home/curtis/Downloads/rlk/runninglinuxkernel_5.0/tools# make -C vm/
make: Entering directory '/home/curtis/Downloads/rlk/runninglinuxkernel_5.0/tools/vm'
make -C ../lib/api
make[1]: Entering directory '/home/curtis/Downloads/rlk/runninglinuxkernel_5.0/tools/lib/api'
make -C /home/curtis/Downloads/rlk/runninglinuxkernel_5.0/tools/build CFLAGS= LDFLAGS= fixdep
HOSTCC fixdep.o
HOSTLD fixdep-in.o
LINK fixdep
CC fd/array.o
LD fd/libapi-in.o
CC fs/fs.o
CC fs/tracing_path.o
LD fs/libapi-in.o
CC cpu.o
CC debug.o
CC str_error_r.o
LD libapi-in.o
AR libapi.a
make[1]: Leaving directory '/home/curtis/Downloads/rlk/runninglinuxkernel_5.0/tools/lib/api'
cc -Wall -Wextra -I../lib/ -o page-types page-types.c ../lib/api/libapi.a
cc -Wall -Wextra -I../lib/ -o slabinfo slabinfo.c ../lib/api/libapi.a
cc -Wall -Wextra -I../lib/ -o page_owner_sort page_owner_sort.c ../lib/api/libapi.a
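What is hwpoison (Documentation/vm/hwpoison.rst)
Intel CPUs support recovering from certain memory errors (MCA recovery). Normally a fatal memory error brings the whole system down. To support this feature, the OS has to mark the affected page as poisoned, kill the processes associated with it, and avoid using the page in the future.
What is MCA
MCA Recovery (Machine Check Architecture Recovery) builds on the hardware self-check mechanism that Intel introduced in 2010.
How to inspect page flags from user space: use the page-types binary built above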
root@curtis-Aspire-E5-471G:/home/curtis/Downloads/rlk/runninglinuxkernel_5.0/tools/vm# ./page-types
flags page-count MB symbolic-flags long-symbolic-flags
0x0000000000000000 149576 584 ___________________________________________
0x0000000004000000 6255 24 __________________________g________________ pgtable
0x0000000001000000 1 0 ________________________z__________________ zero_page
0x0000000000100000 411648 1608 ____________________n______________________ nopage
0x0000000000000008 2 0 ___U_______________________________________ uptodate
0x0000000000000018 1 0 ___UD______________________________________ uptodate,dirty
0x0000000000000028 96511 376 ___U_l_____________________________________ uptodate,lru
0x0000000000044028 305576 1193 ___U_l________b___u________________________ uptodate,lru,swapbacked,unevictable
0x000000000000002c 463086 1808 __RU_l_____________________________________ referenced,uptodate,lru
0x0000000000004030 4922 19 ____Dl________b____________________________ dirty,lru,swapbacked
0x0000000000000038 53 0 ___UDl_____________________________________ uptodate,dirty,lru
0x000000000000403c 136 0 __RUDl________b____________________________ referenced,uptodate,dirty,lru,swapbacked
0x000000000000003c 6 0 __RUDl_____________________________________ referenced,uptodate,dirty,lru
0x0000000000000060 16176 63 _____lA____________________________________ lru,active
0x0000000000000064 17356 67 __R__lA____________________________________ referenced,lru,active
0x0000000000000068 23855 93 ___U_lA____________________________________ uptodate,lru,active
0x000000000000006c 74634 291 __RU_lA____________________________________ referenced,uptodate,lru,active
0x0000000000000074 22 0 __R_DlA____________________________________ referenced,dirty,lru,active
0x0000000000004078 11 0 ___UDlA_______b____________________________ uptodate,dirty,lru,active,swapbacked
0x000000000000007c 8 0 __RUDlA____________________________________ referenced,uptodate,dirty,lru,active
0x000000000000407c 358 1 __RUDlA_______b____________________________ referenced,uptodate,dirty,lru,active,swapbacked
0x0000000000000080 67916 265 _______S___________________________________ slab
0x0000000000000228 4618 18 ___U_l___I_________________________________ uptodate,lru,reclaim
0x000000000000022c 7 0 __RU_l___I_________________________________ referenced,uptodate,lru,reclaim
0x0000000000000400 397166 1551 __________B________________________________ buddy
0x0000000000000800 95 0 ___________M_______________________________ mmap
0x0000000000000810 4 0 ____D______M_______________________________ dirty,mmap
0x0000000000000828 85757 334 ___U_l_____M_______________________________ uptodate,lru,mmap
0x000000000000082c 15909 62 __RU_l_____M_______________________________ referenced,uptodate,lru,mmap
0x0000000000004838 14770 57 ___UDl_____M__b____________________________ uptodate,dirty,lru,mmap,swapbacked
0x0000000000000838 812 3 ___UDl_____M_______________________________ uptodate,dirty,lru,mmap
0x000000000000483c 114 0 __RUDl_____M__b____________________________ referenced,uptodate,dirty,lru,mmap,swapbacked
0x0000000000000868 9427 36 ___U_lA____M_______________________________ uptodate,lru,active,mmap
0x000000000000086c 41448 161 __RU_lA____M_______________________________ referenced,uptodate,lru,active,mmap
0x0000000000004878 29 0 ___UDlA____M__b____________________________ uptodate,dirty,lru,active,mmap,swapbacked
0x000000000000487c 58 0 __RUDlA____M__b____________________________ referenced,uptodate,dirty,lru,active,mmap,swapbacked
0x000000000000087c 3 0 __RUDlA____M_______________________________ referenced,uptodate,dirty,lru,active,mmap
0x0000000000000a28 4 0 ___U_l___I_M_______________________________ uptodate,lru,reclaim,mmap
0x0000000000005808 10 0 ___U_______Ma_b____________________________ uptodate,mmap,anonymous,swapbacked
0x0000000000005828 278367 1087 ___U_l_____Ma_b____________________________ uptodate,lru,mmap,anonymous,swapbacked
0x0000000000001828 3346 13 ___U_l_____Ma______________________________ uptodate,lru,mmap,anonymous
0x000000000000582c 167 0 __RU_l_____Ma_b____________________________ referenced,uptodate,lru,mmap,anonymous,swapbacked
0x000000000004582c 12 0 __RU_l_____Ma_b___u________________________ referenced,uptodate,lru,mmap,anonymous,swapbacked,unevictable
0x000000000000583c 6 0 __RUDl_____Ma_b____________________________ referenced,uptodate,dirty,lru,mmap,anonymous,swapbacked
0x0000000000005868 4 0 ___U_lA____Ma_b____________________________ uptodate,lru,active,mmap,anonymous,swapbacked
0x000000000000586c 126 0 __RU_lA____Ma_b____________________________ referenced,uptodate,lru,active,mmap,anonymous,swapbacked
total 2490368 9728
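The flags column above is a bitmask, and the symbolic names are just its set bits spelled out. The following is a minimal sketch of that decoding, assuming the bit positions listed in include/uapi/linux/kernel-page-flags.h and a 4 KiB page size. The names decode_flags and pages_to_mb are made up for this illustration and are not part of page-types; only the bits that show up in the output above (plus hwpoison) are included, so any other bit is silently ignored.

```python
# Bit index -> name, following include/uapi/linux/kernel-page-flags.h.
# Deliberately incomplete: only bits seen in the output above, plus hwpoison.
KPF_BITS = {
    2: "referenced",
    3: "uptodate",
    4: "dirty",
    5: "lru",
    6: "active",
    7: "slab",
    9: "reclaim",
    10: "buddy",
    11: "mmap",
    12: "anonymous",
    14: "swapbacked",
    18: "unevictable",
    19: "hwpoison",
    20: "nopage",
    24: "zero_page",
    26: "pgtable",
}

PAGE_SIZE = 4096  # assumption: 4 KiB pages, which matches the MB column above


def decode_flags(flags):
    """Return the names of the known bits set in a flags value, lowest bit first."""
    return [name for bit, name in sorted(KPF_BITS.items()) if flags & (1 << bit)]


def pages_to_mb(page_count):
    """Convert a page count to whole MiB, rounding down."""
    return page_count * PAGE_SIZE // (1024 * 1024)


print(decode_flags(0x44028))
print(pages_to_mb(2490368))
```

Feeding it any flags value from the first column should reproduce the long-symbolic-flags column of the same row, and the page count of the total row should give the MB figure next to it.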
Definition of the physical page flags (/include/linux/page-flags.h)
# After a memory ECC fault, the page's flags are marked with PG_hwpoison
PG_hwpoison indicates that a page got corrupted in hardware and contains data with incorrect ECC bits that triggered a machine check. Accessing is not safe since it may cause another machine check. Don't touch!
# How to use page-types to list pages whose flags include hwpoison
$ ./page-types -b hwpoison
flags page-count MB symbolic-flags long-symbolic-flags
total 0 0
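page-types takes its data from /proc/kpageflags, which holds one 64-bit flags word per page frame number (PFN). The sketch below checks the hwpoison bit for a single PFN. It assumes bit 19 for KPF_HWPOISON (as in include/uapi/linux/kernel-page-flags.h) and native byte order; the function names are hypothetical, and reading the real file needs root.

```python
import struct

KPF_HWPOISON = 19  # bit index, assumed from include/uapi/linux/kernel-page-flags.h
ENTRY_SIZE = 8     # one 64-bit flags word per PFN


def read_kpageflags(pfn, path="/proc/kpageflags"):
    """Return the 64-bit flags word for one page frame number."""
    with open(path, "rb") as f:
        f.seek(pfn * ENTRY_SIZE)
        raw = f.read(ENTRY_SIZE)
    if len(raw) != ENTRY_SIZE:
        raise ValueError(f"no entry for PFN {pfn}")
    return struct.unpack("=Q", raw)[0]


def is_hwpoisoned(pfn, path="/proc/kpageflags"):
    """True if the hwpoison bit is set for the given PFN."""
    return bool(read_kpageflags(pfn, path) & (1 << KPF_HWPOISON))
```

The path parameter exists only so the functions can be tried against an ordinary file; on a live system, calling is_hwpoisoned(pfn) as root reads the real interface.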
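Classification of memory hardware faults: CE and UCE
CE: corrected error, i.e. an error that the hardware can correct. For example, when the memory check detects a single-bit error, the ECC code lets the hardware fix it on the fly without affecting any process on the system. That case is a corrected error.
UCE: uncorrected error, i.e. an error that cannot be corrected. Continuing the example above, a multi-bit error produces an uncorrected error: the hardware cannot repair it on its own, and the system is affected to a greater or lesser degree.
UCE is further divided into UCE-non-fatal and UCE-fatal.
UCE-non-fatal: a hardware error occurred, but it is recoverable. Taking memory as the example again, the usual action is to find the processes using the faulty memory and kill them, without affecting the rest of the system.
UCE-fatal: a very severe error occurred, and the CPU has to go through a shutdown or reboot.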
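The difference between a single-bit CE and a multi-bit UCE can be shown with a toy SECDED code (single error correction, double error detection). This sketch uses an extended Hamming (8,4) code that protects 4 data bits; it is only an illustration, since real ECC memory typically uses wider codes, and the encode and decode names are invented here.

```python
def encode(nibble):
    """Encode a 4-bit value as an 8-bit extended Hamming codeword (list of bits)."""
    code = [0] * 8                 # index 0: overall parity, 1..7: Hamming positions
    code[3] = nibble & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # makes the parity of all 8 bits even
    return code


def decode(code):
    """Return (status, value): status is "OK", "CE" or "UCE"; value is None for UCE."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos        # XOR of the positions of all set bits
    overall = sum(code) % 2        # 1 means an odd number of bits flipped
    if syndrome == 0 and overall == 0:
        status = "OK"
    elif overall == 1:
        code[syndrome] ^= 1        # single-bit error; syndrome 0 is the parity bit itself
        status = "CE"
    else:
        return "UCE", None         # even number of flips with nonzero syndrome
    value = code[3] | code[5] << 1 | code[6] << 2 | code[7] << 3
    return status, value


word = encode(0b1011)
one_flip = list(word)
one_flip[5] ^= 1
two_flips = list(one_flip)
two_flips[2] ^= 1
print(decode(word), decode(one_flip), decode(two_flips))
```

A single flipped bit is located by the syndrome and repaired, which corresponds to a CE. Two flipped bits leave the overall parity even but the syndrome nonzero, so the decoder can only report the damage, which corresponds to a UCE.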
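Register banks can be understood as groups of registers. Every bank has registers with the same names, but each bank holds its own contents. For example, if A is a banked register, there is an A in bank 1 and an A in bank 2; reading or writing the A of bank 1 does not affect the A of bank 2.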
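The idea of banked registers can be sketched as a small model in which every bank has the same register names but its own storage. The BankedRegisters class and the register names used here are purely illustrative and do not talk to any hardware.

```python
class BankedRegisters:
    """Toy model: each bank has identically named registers with separate contents."""

    def __init__(self, num_banks, names):
        # One independent name -> value mapping per bank.
        self._banks = [{name: 0 for name in names} for _ in range(num_banks)]

    def write(self, bank, name, value):
        self._banks[bank][name] = value

    def read(self, bank, name):
        return self._banks[bank][name]


regs = BankedRegisters(num_banks=2, names=["CTL", "STATUS", "ADDR", "MISC"])
regs.write(0, "STATUS", 0xBEEF)      # touches bank 0 only
print(hex(regs.read(0, "STATUS")))   # value just written to bank 0
print(hex(regs.read(1, "STATUS")))   # bank 1 keeps its own, untouched value
```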