fio triggers the Linux OOM killer, which kills the process to free memory
Let me start with the dmesg output captured when the problem occurred:
root@unassigned:~/test/fio_test# dmesg -T
[Tue Jul 11 13:59:51 2023] fio invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
[Tue Jul 11 13:59:51 2023] fio cpuset=/ mems_allowed=0
[Tue Jul 11 13:59:51 2023] CPU: 4 PID: 8275 Comm: fio Not tainted 4.15.0-212-generic #223-Ubuntu
[Tue Jul 11 13:59:51 2023] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 0903 03/18/2019
[Tue Jul 11 13:59:51 2023] Call Trace:
[Tue Jul 11 13:59:51 2023] dump_stack+0x6d/0x8b
[Tue Jul 11 13:59:51 2023] dump_header+0x71/0x282
[Tue Jul 11 13:59:51 2023] ? ___ratelimit+0x9c/0x100
[Tue Jul 11 13:59:51 2023] oom_kill_process+0x21f/0x420
[Tue Jul 11 13:59:51 2023] out_of_memory+0x116/0x4e0
[Tue Jul 11 13:59:51 2023] __alloc_pages_slowpath+0xa3d/0xe70
[Tue Jul 11 13:59:51 2023] __alloc_pages_nodemask+0x29a/0x2c0
[Tue Jul 11 13:59:51 2023] alloc_pages_vma+0x88/0x1f0
[Tue Jul 11 13:59:51 2023] do_anonymous_page+0x11a/0x420
[Tue Jul 11 13:59:51 2023] __handle_mm_fault+0x7da/0xc50
[Tue Jul 11 13:59:51 2023] handle_mm_fault+0xe7/0x260
[Tue Jul 11 13:59:51 2023] __do_page_fault+0x281/0x4b0
[Tue Jul 11 13:59:51 2023] do_page_fault+0x2e/0xe0
[Tue Jul 11 13:59:51 2023] ? page_fault+0x2f/0x50
[Tue Jul 11 13:59:51 2023] page_fault+0x45/0x50
[Tue Jul 11 13:59:51 2023] RIP: 0033:0x55b758c04739
[Tue Jul 11 13:59:51 2023] RSP: 002b:00007fffb3ab6160 EFLAGS: 00010206
[Tue Jul 11 13:59:51 2023] RAX: 000000000003f4aa RBX: 00007fdd39ba6000 RCX: 00007fdca4c25000
[Tue Jul 11 13:59:51 2023] RDX: 00007fdd50b61c50 RSI: 00007fdca4636010 RDI: 00007fdd39ba6000
[Tue Jul 11 13:59:51 2023] RBP: 0000000000000001 R08: 00000014a9289000 R09: 0000000000000000
[Tue Jul 11 13:59:51 2023] R10: 0000000000056507 R11: 0000000000000001 R12: 0000000000000001
[Tue Jul 11 13:59:51 2023] R13: 00000000001153c6 R14: 00007fdd50b546d0 R15: 0000000000000038
[Tue Jul 11 13:59:51 2023] Mem-Info:
[Tue Jul 11 13:59:51 2023] active_anon:1718018 inactive_anon:215680 isolated_anon:0
active_file:0 inactive_file:119 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
slab_reclaimable:7509 slab_unreclaimable:13812
mapped:64558 shmem:64600 pagetables:6067 bounce:0
free:26587 free_pcp:0 free_cma:0
[Tue Jul 11 13:59:51 2023] Node 0 active_anon:6872072kB inactive_anon:862720kB active_file:0kB inactive_file:476kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:258232kB dirty:0kB writeback:0kB shmem:258400kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[Tue Jul 11 13:59:51 2023] Node 0 DMA free:15896kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15900kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 13:59:51 2023] lowmem_reserve[]: 0 2006 7732 7732 7732
[Tue Jul 11 13:59:51 2023] Node 0 DMA32 free:40404kB min:17504kB low:21880kB high:26256kB active_anon:2076676kB inactive_anon:0kB active_file:420kB inactive_file:348kB unevictable:0kB writepending:0kB present:2210564kB managed:2122928kB mlocked:0kB kernel_stack:0kB pagetables:4048kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 13:59:51 2023] lowmem_reserve[]: 0 0 5725 5725 5725
[Tue Jul 11 13:59:51 2023] Node 0 Normal free:50048kB min:49944kB low:62428kB high:74912kB active_anon:4795396kB inactive_anon:862720kB active_file:648kB inactive_file:192kB unevictable:0kB writepending:0kB present:5996544kB managed:5863252kB mlocked:0kB kernel_stack:3152kB pagetables:20220kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 13:59:51 2023] lowmem_reserve[]: 0 0 0 0 0
[Tue Jul 11 13:59:51 2023] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15896kB
[Tue Jul 11 13:59:51 2023] Node 0 DMA32: 355*4kB (UM) 147*8kB (UM) 55*16kB (UM) 14*32kB (UM) 3*64kB (UM) 1*128kB (U) 1*256kB (U) 1*512kB (M) 2*1024kB (UM) 1*2048kB (M) 8*4096kB (M) = 41876kB
[Tue Jul 11 13:59:51 2023] Node 0 Normal: 1102*4kB (UME) 902*8kB (UME) 375*16kB (UME) 151*32kB (UME) 42*64kB (UME) 12*128kB (UME) 3*256kB (UME) 2*512kB (UE) 2*1024kB (UM) 2*2048kB (UM) 4*4096kB (U) = 51000kB
[Tue Jul 11 13:59:51 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Tue Jul 11 13:59:51 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Tue Jul 11 13:59:51 2023] 65028 total pagecache pages
[Tue Jul 11 13:59:51 2023] 171 pages in swap cache
[Tue Jul 11 13:59:51 2023] Swap cache stats: add 524642, delete 524471, find 154/289
[Tue Jul 11 13:59:51 2023] Free swap = 0kB
[Tue Jul 11 13:59:51 2023] Total swap = 2097148kB
[Tue Jul 11 13:59:51 2023] 2055774 pages RAM
[Tue Jul 11 13:59:51 2023] 0 pages HighMem/MovableOnly
[Tue Jul 11 13:59:51 2023] 55254 pages reserved
[Tue Jul 11 13:59:51 2023] 0 pages cma reserved
[Tue Jul 11 13:59:51 2023] 0 pages hwpoisoned
[Tue Jul 11 13:59:51 2023] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[Tue Jul 11 13:59:51 2023] [ 482] 0 482 19586 66 176128 96 0 systemd-journal
[Tue Jul 11 13:59:51 2023] [ 502] 0 502 11741 110 118784 558 -1000 systemd-udevd
[Tue Jul 11 13:59:51 2023] [ 503] 0 503 24428 8 98304 35 0 lvmetad
[Tue Jul 11 13:59:51 2023] [ 696] 0 696 11902 40 135168 59 0 rpcbind
[Tue Jul 11 13:59:51 2023] [ 702] 62583 702 35447 0 180224 118 0 systemd-timesyn
[Tue Jul 11 13:59:51 2023] [ 710] 100 710 19981 78 176128 82 0 systemd-network
[Tue Jul 11 13:59:51 2023] [ 795] 101 795 17723 9 184320 224 0 systemd-resolve
[Tue Jul 11 13:59:51 2023] [ 901] 0 901 7577 23 102400 47 0 cron
[Tue Jul 11 13:59:51 2023] [ 908] 0 908 42973 293 237568 1747 0 networkd-dispat
[Tue Jul 11 13:59:51 2023] [ 911] 0 911 40271 0 90112 52 0 lxcfs
[Tue Jul 11 13:59:51 2023] [ 922] 0 922 72156 90 200704 115 0 accounts-daemon
[Tue Jul 11 13:59:51 2023] [ 925] 0 925 27623 38 118784 46 0 irqbalance
[Tue Jul 11 13:59:51 2023] [ 926] 103 926 12525 0 135168 124 -900 dbus-daemon
[Tue Jul 11 13:59:51 2023] [ 962] 0 962 7084 11 102400 41 0 atd
[Tue Jul 11 13:59:51 2023] [ 968] 102 968 66819 0 172032 321 0 rsyslogd
[Tue Jul 11 13:59:51 2023] [ 973] 0 973 15493 14 159744 129 0 systemd-logind
[Tue Jul 11 13:59:51 2023] [ 982] 0 982 45921 83 118784 943 0 python3.8
[Tue Jul 11 13:59:51 2023] [ 999] 0 999 72221 14 196608 183 0 polkitd
[Tue Jul 11 13:59:51 2023] [ 1046] 0 1046 3491 1 73728 73 0 bash
[Tue Jul 11 13:59:51 2023] [ 1086] 0 1086 38958 395 348160 8457 0 python
[Tue Jul 11 13:59:51 2023] [ 1196] 0 1196 47080 190 262144 1791 0 unattended-upgr
[Tue Jul 11 13:59:51 2023] [ 1276] 0 1276 18076 1 180224 188 -1000 sshd
[Tue Jul 11 13:59:51 2023] [ 1298] 0 1298 3800 5 77824 28 0 agetty
[Tue Jul 11 13:59:51 2023] [ 1312] 0 1312 124420 22 180224 153 0 automount
[Tue Jul 11 13:59:51 2023] [ 1374] 0 1374 18169 74 180224 182 0 sshd
[Tue Jul 11 13:59:51 2023] [ 1383] 0 1383 18091 0 184320 209 0 sshd
[Tue Jul 11 13:59:51 2023] [ 1385] 0 1385 4982 1 86016 456 0 bash
[Tue Jul 11 13:59:51 2023] [ 1402] 0 1402 3267 0 73728 33 0 sftp-server
[Tue Jul 11 13:59:51 2023] [ 8259] 0 8259 176363 64545 778240 1006 0 fio
[Tue Jul 11 13:59:51 2023] [ 8275] 0 8275 776463 467194 4980736 127214 0 fio
[Tue Jul 11 13:59:51 2023] [ 8276] 0 8276 776464 467431 4980736 127406 0 fio
[Tue Jul 11 13:59:51 2023] [ 8277] 0 8277 776465 465425 4960256 127090 0 fio
[Tue Jul 11 13:59:51 2023] [ 8278] 0 8278 776466 466949 4960256 127231 0 fio
[Tue Jul 11 13:59:51 2023] [ 8299] 0 8299 18161 88 184320 164 0 sshd
[Tue Jul 11 13:59:51 2023] [ 8301] 0 8301 18091 0 184320 210 0 sshd
[Tue Jul 11 13:59:51 2023] [ 8303] 0 8303 4922 1 86016 416 0 bash
[Tue Jul 11 13:59:51 2023] [ 8318] 0 8318 3267 1 73728 41 0 sftp-server
[Tue Jul 11 13:59:51 2023] [ 8331] 0 8331 1144 22 57344 8 0 iostat
[Tue Jul 11 13:59:51 2023] Out of memory: Kill process 8276 (fio) score 228 or sacrifice child
[Tue Jul 11 13:59:51 2023] Killed process 8276 (fio) total-vm:3105856kB, anon-rss:1869588kB, file-rss:0kB, shmem-rss:136kB
[Tue Jul 11 13:59:51 2023] oom_reaper: reaped process 8276 (fio), now anon-rss:0kB, file-rss:0kB, shmem-rss:136kB
[Tue Jul 11 14:01:35 2023] fio invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
[Tue Jul 11 14:01:35 2023] fio cpuset=/ mems_allowed=0
[Tue Jul 11 14:01:35 2023] CPU: 1 PID: 8275 Comm: fio Not tainted 4.15.0-212-generic #223-Ubuntu
[Tue Jul 11 14:01:35 2023] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 0903 03/18/2019
[Tue Jul 11 14:01:35 2023] Call Trace:
[Tue Jul 11 14:01:35 2023] dump_stack+0x6d/0x8b
[Tue Jul 11 14:01:35 2023] dump_header+0x71/0x282
[Tue Jul 11 14:01:35 2023] ? ___ratelimit+0x9c/0x100
[Tue Jul 11 14:01:35 2023] oom_kill_process+0x21f/0x420
[Tue Jul 11 14:01:35 2023] out_of_memory+0x116/0x4e0
[Tue Jul 11 14:01:35 2023] __alloc_pages_slowpath+0xa3d/0xe70
[Tue Jul 11 14:01:35 2023] __alloc_pages_nodemask+0x29a/0x2c0
[Tue Jul 11 14:01:35 2023] alloc_pages_current+0x6a/0xe0
[Tue Jul 11 14:01:35 2023] __page_cache_alloc+0x81/0xa0
[Tue Jul 11 14:01:35 2023] filemap_fault+0x42f/0x750
[Tue Jul 11 14:01:35 2023] ? filemap_map_pages+0x181/0x390
[Tue Jul 11 14:01:35 2023] ext4_filemap_fault+0x31/0x50
[Tue Jul 11 14:01:35 2023] __do_fault+0x34/0x100
[Tue Jul 11 14:01:35 2023] __handle_mm_fault+0x982/0xc50
[Tue Jul 11 14:01:35 2023] handle_mm_fault+0xe7/0x260
[Tue Jul 11 14:01:35 2023] __do_page_fault+0x281/0x4b0
[Tue Jul 11 14:01:35 2023] do_page_fault+0x2e/0xe0
[Tue Jul 11 14:01:35 2023] ? page_fault+0x2f/0x50
[Tue Jul 11 14:01:35 2023] page_fault+0x45/0x50
[Tue Jul 11 14:01:35 2023] RIP: 0033:0x55b758bf5900
[Tue Jul 11 14:01:35 2023] RSP: 002b:00007fffb3ab6158 EFLAGS: 00010246
[Tue Jul 11 14:01:35 2023] RAX: 000000000006fa47 RBX: 00007fdd39ba6000 RCX: 0000000000001000
[Tue Jul 11 14:01:35 2023] RDX: 0000000000000001 RSI: 00000000367d7cc8 RDI: 00007fdd50b54b90
[Tue Jul 11 14:01:35 2023] RBP: 0000000000000001 R08: 000000000006fa47 R09: 0000001baccbd000
[Tue Jul 11 14:01:35 2023] R10: 0000000000000008 R11: 00007fdd50b54b90 R12: 0000000000000000
[Tue Jul 11 14:01:35 2023] R13: 0000000000000001 R14: 00000000367d7cc8 R15: 0000001baccbd000
[Tue Jul 11 14:01:35 2023] Mem-Info:
[Tue Jul 11 14:01:35 2023] active_anon:1718174 inactive_anon:215707 isolated_anon:0
active_file:62 inactive_file:186 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
slab_reclaimable:7570 slab_unreclaimable:13847
mapped:64598 shmem:64597 pagetables:6019 bounce:0
free:26567 free_pcp:30 free_cma:0
[Tue Jul 11 14:01:35 2023] Node 0 active_anon:6872696kB inactive_anon:862828kB active_file:248kB inactive_file:744kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:258392kB dirty:0kB writeback:0kB shmem:258388kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[Tue Jul 11 14:01:35 2023] Node 0 DMA free:15896kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15900kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 14:01:35 2023] lowmem_reserve[]: 0 2006 7732 7732 7732
[Tue Jul 11 14:01:35 2023] Node 0 DMA32 free:40396kB min:17504kB low:21880kB high:26256kB active_anon:2077736kB inactive_anon:4kB active_file:0kB inactive_file:48kB unevictable:0kB writepending:0kB present:2210564kB managed:2122928kB mlocked:0kB kernel_stack:0kB pagetables:4052kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 14:01:35 2023] lowmem_reserve[]: 0 0 5725 5725 5725
[Tue Jul 11 14:01:35 2023] Node 0 Normal free:49976kB min:49944kB low:62428kB high:74912kB active_anon:4794960kB inactive_anon:862824kB active_file:80kB inactive_file:1160kB unevictable:0kB writepending:0kB present:5996544kB managed:5863252kB mlocked:0kB kernel_stack:3152kB pagetables:20024kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 14:01:35 2023] lowmem_reserve[]: 0 0 0 0 0
[Tue Jul 11 14:01:35 2023] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15896kB
[Tue Jul 11 14:01:35 2023] Node 0 DMA32: 162*4kB (UM) 128*8kB (UM) 43*16kB (UM) 9*32kB (UM) 6*64kB (UM) 3*128kB (UM) 1*256kB (U) 1*512kB (M) 2*1024kB (UM) 1*2048kB (M) 8*4096kB (M) = 41048kB
[Tue Jul 11 14:01:35 2023] Node 0 Normal: 892*4kB (UME) 1008*8kB (UME) 407*16kB (UME) 142*32kB (UME) 40*64kB (UME) 11*128kB (UE) 3*256kB (UME) 2*512kB (UE) 2*1024kB (UM) 2*2048kB (UM) 4*4096kB (U) = 50976kB
[Tue Jul 11 14:01:35 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Tue Jul 11 14:01:35 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Tue Jul 11 14:01:35 2023] 65174 total pagecache pages
[Tue Jul 11 14:01:35 2023] 304 pages in swap cache
[Tue Jul 11 14:01:35 2023] Swap cache stats: add 652012, delete 651708, find 652/1133
[Tue Jul 11 14:01:35 2023] Free swap = 0kB
[Tue Jul 11 14:01:35 2023] Total swap = 2097148kB
[Tue Jul 11 14:01:35 2023] 2055774 pages RAM
[Tue Jul 11 14:01:35 2023] 0 pages HighMem/MovableOnly
[Tue Jul 11 14:01:35 2023] 55254 pages reserved
[Tue Jul 11 14:01:35 2023] 0 pages cma reserved
[Tue Jul 11 14:01:35 2023] 0 pages hwpoisoned
[Tue Jul 11 14:01:35 2023] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[Tue Jul 11 14:01:35 2023] [ 482] 0 482 19586 84 176128 78 0 systemd-journal
[Tue Jul 11 14:01:35 2023] [ 502] 0 502 11741 112 118784 556 -1000 systemd-udevd
[Tue Jul 11 14:01:35 2023] [ 503] 0 503 24428 5 98304 38 0 lvmetad
[Tue Jul 11 14:01:35 2023] [ 696] 0 696 11902 52 135168 63 0 rpcbind
[Tue Jul 11 14:01:35 2023] [ 702] 62583 702 35447 5 180224 123 0 systemd-timesyn
[Tue Jul 11 14:01:35 2023] [ 710] 100 710 19981 65 176128 95 0 systemd-network
[Tue Jul 11 14:01:35 2023] [ 795] 101 795 17723 24 184320 216 0 systemd-resolve
[Tue Jul 11 14:01:35 2023] [ 901] 0 901 7577 23 102400 47 0 cron
[Tue Jul 11 14:01:35 2023] [ 908] 0 908 42973 118 237568 1922 0 networkd-dispat
[Tue Jul 11 14:01:35 2023] [ 911] 0 911 40271 0 90112 52 0 lxcfs
[Tue Jul 11 14:01:35 2023] [ 922] 0 922 72156 96 200704 113 0 accounts-daemon
[Tue Jul 11 14:01:35 2023] [ 925] 0 925 27623 38 118784 46 0 irqbalance
[Tue Jul 11 14:01:35 2023] [ 926] 103 926 12525 65 135168 96 -900 dbus-daemon
[Tue Jul 11 14:01:35 2023] [ 962] 0 962 7084 11 102400 41 0 atd
[Tue Jul 11 14:01:35 2023] [ 968] 102 968 66819 32 172032 281 0 rsyslogd
[Tue Jul 11 14:01:35 2023] [ 973] 0 973 15493 33 159744 111 0 systemd-logind
[Tue Jul 11 14:01:35 2023] [ 982] 0 982 45921 83 118784 943 0 python3.8
[Tue Jul 11 14:01:35 2023] [ 999] 0 999 72221 7 196608 190 0 polkitd
[Tue Jul 11 14:01:35 2023] [ 1046] 0 1046 3491 1 73728 73 0 bash
[Tue Jul 11 14:01:35 2023] [ 1086] 0 1086 38958 234 348160 8618 0 python
[Tue Jul 11 14:01:35 2023] [ 1196] 0 1196 47080 72 262144 1909 0 unattended-upgr
[Tue Jul 11 14:01:35 2023] [ 1276] 0 1276 18076 1 180224 188 -1000 sshd
[Tue Jul 11 14:01:35 2023] [ 1298] 0 1298 3800 5 77824 28 0 agetty
[Tue Jul 11 14:01:35 2023] [ 1312] 0 1312 124420 36 180224 143 0 automount
[Tue Jul 11 14:01:35 2023] [ 1374] 0 1374 18169 82 180224 174 0 sshd
[Tue Jul 11 14:01:35 2023] [ 1383] 0 1383 18091 0 184320 209 0 sshd
[Tue Jul 11 14:01:35 2023] [ 1385] 0 1385 4982 1 86016 456 0 bash
[Tue Jul 11 14:01:35 2023] [ 1402] 0 1402 3267 0 73728 33 0 sftp-server
[Tue Jul 11 14:01:35 2023] [ 8259] 0 8259 176363 64547 778240 1004 0 fio
[Tue Jul 11 14:01:35 2023] [ 8275] 0 8275 960813 623106 6561792 169332 0 fio
[Tue Jul 11 14:01:35 2023] [ 8277] 0 8277 960815 621310 6541312 169149 0 fio
[Tue Jul 11 14:01:35 2023] [ 8278] 0 8278 960816 622859 6561792 169294 0 fio
[Tue Jul 11 14:01:35 2023] [ 8299] 0 8299 18161 92 184320 160 0 sshd
[Tue Jul 11 14:01:35 2023] [ 8301] 0 8301 18091 0 184320 210 0 sshd
[Tue Jul 11 14:01:35 2023] [ 8303] 0 8303 4922 1 86016 416 0 bash
[Tue Jul 11 14:01:35 2023] [ 8318] 0 8318 3267 1 73728 41 0 sftp-server
[Tue Jul 11 14:01:35 2023] [ 8331] 0 8331 1144 22 57344 8 0 iostat
[Tue Jul 11 14:01:35 2023] Out of memory: Kill process 8275 (fio) score 305 or sacrifice child
[Tue Jul 11 14:01:35 2023] Killed process 8275 (fio) total-vm:3843252kB, anon-rss:2492284kB, file-rss:0kB, shmem-rss:140kB
[Tue Jul 11 14:01:35 2023] oom_reaper: reaped process 8275 (fio), now anon-rss:0kB, file-rss:0kB, shmem-rss:140kB
[Tue Jul 11 14:11:17 2023] fio invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
[Tue Jul 11 14:11:17 2023] fio cpuset=/ mems_allowed=0
[Tue Jul 11 14:11:17 2023] CPU: 0 PID: 8277 Comm: fio Not tainted 4.15.0-212-generic #223-Ubuntu
[Tue Jul 11 14:11:17 2023] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 0903 03/18/2019
[Tue Jul 11 14:11:17 2023] Call Trace:
[Tue Jul 11 14:11:17 2023] dump_stack+0x6d/0x8b
[Tue Jul 11 14:11:17 2023] dump_header+0x71/0x282
[Tue Jul 11 14:11:17 2023] ? ___ratelimit+0x9c/0x100
[Tue Jul 11 14:11:17 2023] oom_kill_process+0x21f/0x420
[Tue Jul 11 14:11:17 2023] out_of_memory+0x116/0x4e0
[Tue Jul 11 14:11:17 2023] __alloc_pages_slowpath+0xa3d/0xe70
[Tue Jul 11 14:11:17 2023] __alloc_pages_nodemask+0x29a/0x2c0
[Tue Jul 11 14:11:17 2023] alloc_pages_vma+0x88/0x1f0
[Tue Jul 11 14:11:17 2023] do_anonymous_page+0x11a/0x420
[Tue Jul 11 14:11:17 2023] __handle_mm_fault+0x7da/0xc50
[Tue Jul 11 14:11:17 2023] handle_mm_fault+0xe7/0x260
[Tue Jul 11 14:11:17 2023] __do_page_fault+0x281/0x4b0
[Tue Jul 11 14:11:17 2023] do_page_fault+0x2e/0xe0
[Tue Jul 11 14:11:17 2023] ? page_fault+0x2f/0x50
[Tue Jul 11 14:11:17 2023] page_fault+0x45/0x50
[Tue Jul 11 14:11:17 2023] RIP: 0033:0x55b758c04739
[Tue Jul 11 14:11:17 2023] RSP: 002b:00007fffb3ab6160 EFLAGS: 00010206
[Tue Jul 11 14:11:17 2023] RAX: 0000000000080eaa RBX: 00007fdd39bc5690 RCX: 00007fdc169ed000
[Tue Jul 11 14:11:17 2023] RDX: 00007fdd50b655d0 RSI: 00007fdc15dd7010 RDI: 00007fdd39bc5690
[Tue Jul 11 14:11:17 2023] RBP: 0000000000000000 R08: 0000015f28ceb000 R09: 0000000000000000
[Tue Jul 11 14:11:17 2023] R10: 00000000000fd8ff R11: 0000000000000001 R12: 0000000000000000
[Tue Jul 11 14:11:17 2023] R13: 000000000004b510 R14: 00007fdd50b56210 R15: 0000000000000000
[Tue Jul 11 14:11:17 2023] Mem-Info:
[Tue Jul 11 14:11:17 2023] active_anon:1718168 inactive_anon:215490 isolated_anon:0
active_file:162 inactive_file:67 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
slab_reclaimable:7671 slab_unreclaimable:13870
mapped:64607 shmem:64583 pagetables:6009 bounce:0
free:26673 free_pcp:58 free_cma:0
[Tue Jul 11 14:11:17 2023] Node 0 active_anon:6872672kB inactive_anon:861960kB active_file:648kB inactive_file:268kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:258428kB dirty:0kB writeback:0kB shmem:258332kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[Tue Jul 11 14:11:17 2023] Node 0 DMA free:15896kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15900kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 14:11:17 2023] lowmem_reserve[]: 0 2006 7732 7732 7732
[Tue Jul 11 14:11:17 2023] Node 0 DMA32 free:40836kB min:17504kB low:21880kB high:26256kB active_anon:1271540kB inactive_anon:805220kB active_file:408kB inactive_file:632kB unevictable:0kB writepending:0kB present:2210564kB managed:2122928kB mlocked:0kB kernel_stack:0kB pagetables:3984kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 14:11:17 2023] lowmem_reserve[]: 0 0 5725 5725 5725
[Tue Jul 11 14:11:17 2023] Node 0 Normal free:49960kB min:49944kB low:62428kB high:74912kB active_anon:5600952kB inactive_anon:56884kB active_file:684kB inactive_file:0kB unevictable:0kB writepending:0kB present:5996544kB managed:5863252kB mlocked:0kB kernel_stack:3120kB pagetables:20052kB bounce:0kB free_pcp:112kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 14:11:17 2023] lowmem_reserve[]: 0 0 0 0 0
[Tue Jul 11 14:11:17 2023] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15896kB
[Tue Jul 11 14:11:17 2023] Node 0 DMA32: 233*4kB (UM) 142*8kB (UM) 49*16kB (UM) 20*32kB (UM) 8*64kB (UM) 2*128kB (UM) 1*256kB (U) 1*512kB (M) 1*1024kB (U) 1*2048kB (M) 8*4096kB (M) = 40868kB
[Tue Jul 11 14:11:17 2023] Node 0 Normal: 721*4kB (UMEH) 892*8kB (UMEH) 390*16kB (UMEH) 142*32kB (UMEH) 43*64kB (UMEH) 12*128kB (UME) 5*256kB (UME) 3*512kB (UME) 2*1024kB (UM) 2*2048kB (UM) 4*4096kB (U) = 50436kB
[Tue Jul 11 14:11:17 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Tue Jul 11 14:11:17 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Tue Jul 11 14:11:17 2023] 65158 total pagecache pages
[Tue Jul 11 14:11:17 2023] 182 pages in swap cache
[Tue Jul 11 14:11:17 2023] Swap cache stats: add 821382, delete 821199, find 2044/2935
[Tue Jul 11 14:11:17 2023] Free swap = 0kB
[Tue Jul 11 14:11:17 2023] Total swap = 2097148kB
[Tue Jul 11 14:11:17 2023] 2055774 pages RAM
[Tue Jul 11 14:11:17 2023] 0 pages HighMem/MovableOnly
[Tue Jul 11 14:11:17 2023] 55254 pages reserved
[Tue Jul 11 14:11:17 2023] 0 pages cma reserved
[Tue Jul 11 14:11:17 2023] 0 pages hwpoisoned
[Tue Jul 11 14:11:17 2023] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[Tue Jul 11 14:11:17 2023] [ 482] 0 482 19586 89 176128 75 0 systemd-journal
[Tue Jul 11 14:11:17 2023] [ 502] 0 502 11741 58 118784 609 -1000 systemd-udevd
[Tue Jul 11 14:11:17 2023] [ 503] 0 503 24428 0 98304 44 0 lvmetad
[Tue Jul 11 14:11:17 2023] [ 696] 0 696 11902 38 135168 77 0 rpcbind
[Tue Jul 11 14:11:17 2023] [ 702] 62583 702 35447 16 180224 112 0 systemd-timesyn
[Tue Jul 11 14:11:17 2023] [ 710] 100 710 19981 50 176128 110 0 systemd-network
[Tue Jul 11 14:11:17 2023] [ 795] 101 795 17723 25 184320 215 0 systemd-resolve
[Tue Jul 11 14:11:17 2023] [ 901] 0 901 7577 29 102400 41 0 cron
[Tue Jul 11 14:11:17 2023] [ 908] 0 908 42973 37 237568 2003 0 networkd-dispat
[Tue Jul 11 14:11:17 2023] [ 911] 0 911 40271 0 90112 52 0 lxcfs
[Tue Jul 11 14:11:17 2023] [ 922] 0 922 72156 75 200704 134 0 accounts-daemon
[Tue Jul 11 14:11:17 2023] [ 925] 0 925 27623 43 118784 45 0 irqbalance
[Tue Jul 11 14:11:17 2023] [ 926] 103 926 12525 81 135168 80 -900 dbus-daemon
[Tue Jul 11 14:11:17 2023] [ 962] 0 962 7084 9 102400 43 0 atd
[Tue Jul 11 14:11:17 2023] [ 968] 102 968 66819 107 172032 263 0 rsyslogd
[Tue Jul 11 14:11:17 2023] [ 973] 0 973 15493 41 159744 103 0 systemd-logind
[Tue Jul 11 14:11:17 2023] [ 982] 0 982 45921 83 118784 943 0 python3.8
[Tue Jul 11 14:11:17 2023] [ 999] 0 999 72221 0 196608 197 0 polkitd
[Tue Jul 11 14:11:17 2023] [ 1046] 0 1046 3491 1 73728 73 0 bash
[Tue Jul 11 14:11:17 2023] [ 1086] 0 1086 38958 134 348160 8718 0 python
[Tue Jul 11 14:11:17 2023] [ 1196] 0 1196 47080 27 262144 1954 0 unattended-upgr
[Tue Jul 11 14:11:17 2023] [ 1276] 0 1276 18076 1 180224 188 -1000 sshd
[Tue Jul 11 14:11:17 2023] [ 1298] 0 1298 3800 5 77824 28 0 agetty
[Tue Jul 11 14:11:17 2023] [ 1312] 0 1312 124420 15 180224 164 0 automount
[Tue Jul 11 14:11:17 2023] [ 1374] 0 1374 18169 101 180224 155 0 sshd
[Tue Jul 11 14:11:17 2023] [ 1383] 0 1383 18091 0 184320 209 0 sshd
[Tue Jul 11 14:11:17 2023] [ 1385] 0 1385 4982 1 86016 456 0 bash
[Tue Jul 11 14:11:17 2023] [ 1402] 0 1402 3267 0 73728 33 0 sftp-server
[Tue Jul 11 14:11:17 2023] [ 8259] 0 8259 176363 64547 778240 1004 0 fio
[Tue Jul 11 14:11:17 2023] [ 8277] 0 8277 1360240 932940 9723904 253138 0 fio
[Tue Jul 11 14:11:17 2023] [ 8278] 0 8278 1360241 934277 9744384 253586 0 fio
[Tue Jul 11 14:11:17 2023] [ 8299] 0 8299 18161 100 184320 155 0 sshd
[Tue Jul 11 14:11:17 2023] [ 8301] 0 8301 18091 0 184320 210 0 sshd
[Tue Jul 11 14:11:17 2023] [ 8303] 0 8303 4952 230 86016 194 0 bash
[Tue Jul 11 14:11:17 2023] [ 8318] 0 8318 3267 1 73728 41 0 sftp-server
[Tue Jul 11 14:11:17 2023] [10893] 0 10893 2482 59 69632 0 0 bash
[Tue Jul 11 14:11:17 2023] [10894] 0 10894 2482 57 73728 0 0 bash
[Tue Jul 11 14:11:17 2023] [10898] 0 10898 2763 41 65536 0 0 dmesg
[Tue Jul 11 14:11:17 2023] [10926] 0 10926 1120 16 49152 0 0 tail
[Tue Jul 11 14:11:17 2023] Out of memory: Kill process 8278 (fio) score 457 or sacrifice child
[Tue Jul 11 14:11:17 2023] Killed process 8278 (fio) total-vm:5440964kB, anon-rss:3736832kB, file-rss:124kB, shmem-rss:152kB
[Tue Jul 11 14:11:17 2023] oom_reaper: reaped process 8278 (fio), now anon-rss:0kB, file-rss:0kB, shmem-rss:152kB
root@unassigned:~/test/fio_test#
The key message here is "Out of memory: Kill process 8278 (fio) score 457 or sacrifice child", which tells us the OOM killer terminated our fio process.
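To pull the kill events out of a long dmesg dump, a small script can help. This is a minimal sketch in Python; the function name killed_processes is my own, and the regular expression assumes the "Killed process ..." line format shown in the log above (other kernel versions may print different fields).

```python
import re

# Matches lines such as:
#   Killed process 8276 (fio) total-vm:3105856kB, anon-rss:1869588kB, ...
KILL_RE = re.compile(
    r"Killed process (\d+) \(([^)]+)\) total-vm:(\d+)kB, anon-rss:(\d+)kB"
)


def killed_processes(dmesg_text):
    """Return one dict per OOM kill found in the dmesg text, in log order."""
    return [
        {
            "pid": int(pid),
            "name": name,
            "total_vm_kb": int(total_vm),
            "anon_rss_kb": int(anon_rss),
        }
        for pid, name, total_vm, anon_rss in KILL_RE.findall(dmesg_text)
    ]
```

Run against the log above, this reports three fio kills whose anon-rss grows from 1869588 kB to 2492284 kB to 3736832 kB, which already hints that each job's memory keeps growing for as long as it runs.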
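First, OOM stands for out of memory. Put simply, a process keeps consuming memory without releasing it until the system runs out. To protect itself, the kernel kills a process with a high memory footprint so that the rest of the system can keep running.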
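The score printed in the "Out of memory: Kill process ..." lines can be reproduced from the process table in the dump. The sketch below follows my reading of the heuristic in 4.15-era kernels: resident pages plus swap entries plus page-table pages, a 3% discount for root-owned processes, scaled to 0-1000 against usable RAM plus swap. Treat the formula as an assumption rather than a specification, since newer kernels differ. The function name is my own; the input numbers come from the dmesg output above.

```python
PAGE_SIZE = 4096  # bytes per page on this x86-64 system


def oom_score(rss_pages, swapents, pgtables_bytes, total_pages, is_root=True):
    """Approximate the score the OOM killer prints (assumed 4.15-era heuristic)."""
    points = rss_pages + swapents + pgtables_bytes // PAGE_SIZE
    if is_root:
        # Assumption: root-owned processes get a 3% discount on this kernel.
        points -= points * 3 // 100
    return points * 1000 // total_pages


# Usable RAM ("pages RAM" minus "pages reserved") plus swap (2097148 kB) in pages.
TOTAL_PAGES = (2055774 - 55254) + 2097148 // 4

# rss, swapents and pgtables_bytes of the victim in each of the three dumps.
first = oom_score(467431, 127406, 4980736, TOTAL_PAGES)   # pid 8276
second = oom_score(623106, 169332, 6561792, TOTAL_PAGES)  # pid 8275
third = oom_score(934277, 253586, 9744384, TOTAL_PAGES)   # pid 8278
```

With these inputs the sketch gives 228, 305, and 457, matching the scores in the three kill messages. In each dump the victim is simply the fio process with the largest combined footprint.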
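So when this happens, it usually means memory usage is too high. These are the things to try:
1. Add more memory: if you run out of memory often, consider adding physical RAM, which gives the workload more headroom.
2. Tune the system configuration: while the task is running, stop unnecessary background processes or services to free up memory.
3. Monitor system resources: use system monitoring tools to watch resource usage and spot memory leaks or other resource bottlenecks.
4. Tune the fio job: check the job settings and parameters against the system's resources and capacity. Reducing the number of concurrent jobs, lowering the I/O queue depth, or using smaller block sizes can all reduce memory consumption.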
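For option 3, one lightweight way to watch a single process is to sample its VmRSS from /proc. This is a minimal sketch, assuming Linux with /proc mounted; the helper names are hypothetical, and the PID is whichever fio job process you want to watch.

```python
import re
import time


def parse_vm_rss_kb(status_text):
    """Return VmRSS in kB from the text of /proc/<pid>/status, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the current VmRSS of a process; None if it has exited."""
    try:
        with open(f"/proc/{pid}/status") as status_file:
            return parse_vm_rss_kb(status_file.read())
    except FileNotFoundError:
        return None


def watch_rss(pid, interval_s=5.0, samples=12):
    """Print VmRSS and its change between samples so steady growth stands out."""
    previous = None
    for _ in range(samples):
        rss_kb = read_rss_kb(pid)
        if rss_kb is None:
            print(f"pid {pid}: no VmRSS (process gone?)")
            break
        delta = "" if previous is None else f" ({rss_kb - previous:+d} kB)"
        print(f"pid {pid}: VmRSS {rss_kb} kB{delta}")
        previous = rss_kb
        time.sleep(interval_s)
```

Calling watch_rss with the PID of one fio job prints a sample every few seconds. A delta that stays positive for minutes on end is the pattern to look for.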
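I tried the first option first and increased memory from 8 GB to 32 GB. That only made the test run somewhat longer before failing: an improvement, but not a fix.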
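At that point I suspected the fio version or a compatibility issue with the system, so I tried other fio versions on the original system. The result was the same problem.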
Next I changed the operating system from the original Ubuntu 18.04.6 to CentOS 8 and Red Hat 7.6. Neither made any difference.
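With the environment ruled out, I turned to my own setup. Normally that is where you should start, with the fio configuration, but at first it had not occurred to me that anything in the configuration could be wrong.
Only after rechecking it several times did I find the problem in the fio configuration.
My original configuration: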
[JEDEC-219]
ioengine=libaio
direct=1
rw=randrw
norandommap
randrepeat=0
rwmixread=40
iodepth=128
numjobs=4
bssplit=512/4:1024/1:1536/1:2048/1:2560/1:3072/1:3584/1:4k/67:8k/10:16k/7:32k/3:64k/3
blockalign=4k
random_distribution=zoned:50/5:30/15:20/80
loops=10000
filename=/dev/nvme0n1
group_reporting
write_iops_log=iops.log
write_bw_log=bw.log
write_lat_log=lat.log
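As the last few lines show, I asked fio to save three kinds of logs (iops, bw, and lat), but I never set the logging interval, that is, the log_avg_msec parameter. I had assumed without checking that the default was one entry per second, i.e. log_avg_msec=1000. In fact the default is 0, which makes fio log an entry for every completed I/O. I only started to question this when I watched memory usage climb steadily while fio was running. The test did no data verification, so nothing else should have been holding memory; the only thing left was the result logging.
So I ran the job for a short while and looked at the log, which turned out to contain far more entries than expected: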
974, 1, 1, 4096
974, 1, 1, 4096
974, 1, 0, 4096
974, 1, 1, 4096
974, 1, 1, 4096
974, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 32768
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 16384
975, 1, 1, 4096
975, 1, 1, 512
975, 1, 1, 8192
975, 1, 1, 8192
975, 1, 1, 4096
975, 1, 0, 32768
975, 1, 1, 3072
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 0, 4096
975, 1, 1, 4096
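The first column is the timestamp in milliseconds, so fio was logging one or more I/Os every millisecond. The log data exploded, and that is what exhausted memory.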
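A quick way to confirm this on a log file is to count entries per timestamp. This is a minimal sketch, assuming the comma-separated layout shown above with the time in the first column; the function names and the threshold of two entries per timestamp (one read plus one write) are my own choices.

```python
from collections import Counter


def entries_per_timestamp(log_text):
    """Count how many log entries share each timestamp (first column, in ms)."""
    counts = Counter()
    for line in log_text.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # skip blank or malformed lines
        counts[int(fields[0])] += 1
    return counts


def looks_like_per_io_logging(log_text, max_per_timestamp=2):
    """True if any timestamp carries more entries than an averaged log would."""
    counts = entries_per_timestamp(log_text)
    return any(count > max_per_timestamp for count in counts.values())
```

On the excerpt above, the timestamps 974 and 975 each carry many entries, so the check returns True. A log averaged over a window has at most one read and one write entry per timestamp.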
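Per-I/O logging was not what I intended, so I added log_avg_msec=1000 to the fio configuration. With fio running again, memory usage stayed flat with no sign of a gradual climb, and when the test finished all the results had been saved correctly.
The log now looks like this: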
18000, 42756, 1, 0
19000, 29475, 0, 0
19000, 44417, 1, 0
20000, 30457, 0, 0
20000, 45424, 1, 0
21000, 29571, 0, 0
21000, 44551, 1, 0
22000, 24774, 0, 0
22000, 37544, 1, 0
23000, 26196, 0, 0
23000, 39118, 1, 0
24000, 29357, 0, 0
24000, 43485, 1, 0
25000, 22756, 0, 0
25000, 34414, 1, 0
26000, 23096, 0, 0
26000, 34759, 1, 0
27000, 27125, 0, 0
27000, 40415, 1, 0
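This time it looks right: for the mixed read/write workload there is one entry per second for each direction. Column 3 is the direction (1 = write, 0 = read) and column 2 is the value.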
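After a long verification run the problem did not recur, so it is solved.
Conclusion: when you are not sure how fio behaves, spell out the job parameters as completely as you can. The job file gets bulkier, but every setting is visible when you troubleshoot. Do not lean entirely on defaults: it is easy to misunderstand what they are, and then the cause is hard to spot.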
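One way to act on this is a small pre-flight check that flags jobs which enable a log without setting an averaging window. This is a minimal sketch using Python's configparser; the function name is hypothetical, and treating [global] as applying to every job is a simplification of how fio handles global sections.

```python
import configparser

LOG_OPTIONS = ("write_iops_log", "write_bw_log", "write_lat_log")


def jobs_logging_every_io(job_text):
    """Return the job sections that write a log but leave log_avg_msec unset or 0."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # fio flags such as norandommap have no value
        delimiters=("=",),     # values like bssplit contain ':' characters
        interpolation=None,
    )
    parser.read_string(job_text)
    # Simplification: apply [global] options to every job section.
    global_opts = dict(parser["global"]) if parser.has_section("global") else {}
    flagged = []
    for section in parser.sections():
        if section == "global":
            continue
        opts = {**global_opts, **dict(parser[section])}
        wants_log = any(name in opts for name in LOG_OPTIONS)
        avg = opts.get("log_avg_msec")
        if wants_log and (avg is None or int(avg) == 0):
            flagged.append(section)
    return flagged
```

Fed the original job file from this post, the check flags the JEDEC-219 job. Once log_avg_msec=1000 is added, it returns an empty list.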