fio test stops after hitting OOM kill (out of memory) errors

fio triggers the Linux OOM killer, which kills the process to free memory

First, here is what dmesg captured when the problem occurred:

root@unassigned:~/test/fio_test# dmesg -T
[Tue Jul 11 13:59:51 2023] fio invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
[Tue Jul 11 13:59:51 2023] fio cpuset=/ mems_allowed=0
[Tue Jul 11 13:59:51 2023] CPU: 4 PID: 8275 Comm: fio Not tainted 4.15.0-212-generic #223-Ubuntu
[Tue Jul 11 13:59:51 2023] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 0903 03/18/2019
[Tue Jul 11 13:59:51 2023] Call Trace:
[Tue Jul 11 13:59:51 2023]  dump_stack+0x6d/0x8b
[Tue Jul 11 13:59:51 2023]  dump_header+0x71/0x282
[Tue Jul 11 13:59:51 2023]  ? ___ratelimit+0x9c/0x100
[Tue Jul 11 13:59:51 2023]  oom_kill_process+0x21f/0x420
[Tue Jul 11 13:59:51 2023]  out_of_memory+0x116/0x4e0
[Tue Jul 11 13:59:51 2023]  __alloc_pages_slowpath+0xa3d/0xe70
[Tue Jul 11 13:59:51 2023]  __alloc_pages_nodemask+0x29a/0x2c0
[Tue Jul 11 13:59:51 2023]  alloc_pages_vma+0x88/0x1f0
[Tue Jul 11 13:59:51 2023]  do_anonymous_page+0x11a/0x420
[Tue Jul 11 13:59:51 2023]  __handle_mm_fault+0x7da/0xc50
[Tue Jul 11 13:59:51 2023]  handle_mm_fault+0xe7/0x260
[Tue Jul 11 13:59:51 2023]  __do_page_fault+0x281/0x4b0
[Tue Jul 11 13:59:51 2023]  do_page_fault+0x2e/0xe0
[Tue Jul 11 13:59:51 2023]  ? page_fault+0x2f/0x50
[Tue Jul 11 13:59:51 2023]  page_fault+0x45/0x50
[Tue Jul 11 13:59:51 2023] RIP: 0033:0x55b758c04739
[Tue Jul 11 13:59:51 2023] RSP: 002b:00007fffb3ab6160 EFLAGS: 00010206
[Tue Jul 11 13:59:51 2023] RAX: 000000000003f4aa RBX: 00007fdd39ba6000 RCX: 00007fdca4c25000
[Tue Jul 11 13:59:51 2023] RDX: 00007fdd50b61c50 RSI: 00007fdca4636010 RDI: 00007fdd39ba6000
[Tue Jul 11 13:59:51 2023] RBP: 0000000000000001 R08: 00000014a9289000 R09: 0000000000000000
[Tue Jul 11 13:59:51 2023] R10: 0000000000056507 R11: 0000000000000001 R12: 0000000000000001
[Tue Jul 11 13:59:51 2023] R13: 00000000001153c6 R14: 00007fdd50b546d0 R15: 0000000000000038
[Tue Jul 11 13:59:51 2023] Mem-Info:
[Tue Jul 11 13:59:51 2023] active_anon:1718018 inactive_anon:215680 isolated_anon:0
                            active_file:0 inactive_file:119 isolated_file:0
                            unevictable:0 dirty:0 writeback:0 unstable:0
                            slab_reclaimable:7509 slab_unreclaimable:13812
                            mapped:64558 shmem:64600 pagetables:6067 bounce:0
                            free:26587 free_pcp:0 free_cma:0
[Tue Jul 11 13:59:51 2023] Node 0 active_anon:6872072kB inactive_anon:862720kB active_file:0kB inactive_file:476kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:258232kB dirty:0kB writeback:0kB shmem:258400kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[Tue Jul 11 13:59:51 2023] Node 0 DMA free:15896kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15900kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 13:59:51 2023] lowmem_reserve[]: 0 2006 7732 7732 7732
[Tue Jul 11 13:59:51 2023] Node 0 DMA32 free:40404kB min:17504kB low:21880kB high:26256kB active_anon:2076676kB inactive_anon:0kB active_file:420kB inactive_file:348kB unevictable:0kB writepending:0kB present:2210564kB managed:2122928kB mlocked:0kB kernel_stack:0kB pagetables:4048kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 13:59:51 2023] lowmem_reserve[]: 0 0 5725 5725 5725
[Tue Jul 11 13:59:51 2023] Node 0 Normal free:50048kB min:49944kB low:62428kB high:74912kB active_anon:4795396kB inactive_anon:862720kB active_file:648kB inactive_file:192kB unevictable:0kB writepending:0kB present:5996544kB managed:5863252kB mlocked:0kB kernel_stack:3152kB pagetables:20220kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 13:59:51 2023] lowmem_reserve[]: 0 0 0 0 0
[Tue Jul 11 13:59:51 2023] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15896kB
[Tue Jul 11 13:59:51 2023] Node 0 DMA32: 355*4kB (UM) 147*8kB (UM) 55*16kB (UM) 14*32kB (UM) 3*64kB (UM) 1*128kB (U) 1*256kB (U) 1*512kB (M) 2*1024kB (UM) 1*2048kB (M) 8*4096kB (M) = 41876kB
[Tue Jul 11 13:59:51 2023] Node 0 Normal: 1102*4kB (UME) 902*8kB (UME) 375*16kB (UME) 151*32kB (UME) 42*64kB (UME) 12*128kB (UME) 3*256kB (UME) 2*512kB (UE) 2*1024kB (UM) 2*2048kB (UM) 4*4096kB (U) = 51000kB
[Tue Jul 11 13:59:51 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Tue Jul 11 13:59:51 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Tue Jul 11 13:59:51 2023] 65028 total pagecache pages
[Tue Jul 11 13:59:51 2023] 171 pages in swap cache
[Tue Jul 11 13:59:51 2023] Swap cache stats: add 524642, delete 524471, find 154/289
[Tue Jul 11 13:59:51 2023] Free swap  = 0kB
[Tue Jul 11 13:59:51 2023] Total swap = 2097148kB
[Tue Jul 11 13:59:51 2023] 2055774 pages RAM
[Tue Jul 11 13:59:51 2023] 0 pages HighMem/MovableOnly
[Tue Jul 11 13:59:51 2023] 55254 pages reserved
[Tue Jul 11 13:59:51 2023] 0 pages cma reserved
[Tue Jul 11 13:59:51 2023] 0 pages hwpoisoned
[Tue Jul 11 13:59:51 2023] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[Tue Jul 11 13:59:51 2023] [  482]     0   482    19586       66   176128       96             0 systemd-journal
[Tue Jul 11 13:59:51 2023] [  502]     0   502    11741      110   118784      558         -1000 systemd-udevd
[Tue Jul 11 13:59:51 2023] [  503]     0   503    24428        8    98304       35             0 lvmetad
[Tue Jul 11 13:59:51 2023] [  696]     0   696    11902       40   135168       59             0 rpcbind
[Tue Jul 11 13:59:51 2023] [  702] 62583   702    35447        0   180224      118             0 systemd-timesyn
[Tue Jul 11 13:59:51 2023] [  710]   100   710    19981       78   176128       82             0 systemd-network
[Tue Jul 11 13:59:51 2023] [  795]   101   795    17723        9   184320      224             0 systemd-resolve
[Tue Jul 11 13:59:51 2023] [  901]     0   901     7577       23   102400       47             0 cron
[Tue Jul 11 13:59:51 2023] [  908]     0   908    42973      293   237568     1747             0 networkd-dispat
[Tue Jul 11 13:59:51 2023] [  911]     0   911    40271        0    90112       52             0 lxcfs
[Tue Jul 11 13:59:51 2023] [  922]     0   922    72156       90   200704      115             0 accounts-daemon
[Tue Jul 11 13:59:51 2023] [  925]     0   925    27623       38   118784       46             0 irqbalance
[Tue Jul 11 13:59:51 2023] [  926]   103   926    12525        0   135168      124          -900 dbus-daemon
[Tue Jul 11 13:59:51 2023] [  962]     0   962     7084       11   102400       41             0 atd
[Tue Jul 11 13:59:51 2023] [  968]   102   968    66819        0   172032      321             0 rsyslogd
[Tue Jul 11 13:59:51 2023] [  973]     0   973    15493       14   159744      129             0 systemd-logind
[Tue Jul 11 13:59:51 2023] [  982]     0   982    45921       83   118784      943             0 python3.8
[Tue Jul 11 13:59:51 2023] [  999]     0   999    72221       14   196608      183             0 polkitd
[Tue Jul 11 13:59:51 2023] [ 1046]     0  1046     3491        1    73728       73             0 bash
[Tue Jul 11 13:59:51 2023] [ 1086]     0  1086    38958      395   348160     8457             0 python
[Tue Jul 11 13:59:51 2023] [ 1196]     0  1196    47080      190   262144     1791             0 unattended-upgr
[Tue Jul 11 13:59:51 2023] [ 1276]     0  1276    18076        1   180224      188         -1000 sshd
[Tue Jul 11 13:59:51 2023] [ 1298]     0  1298     3800        5    77824       28             0 agetty
[Tue Jul 11 13:59:51 2023] [ 1312]     0  1312   124420       22   180224      153             0 automount
[Tue Jul 11 13:59:51 2023] [ 1374]     0  1374    18169       74   180224      182             0 sshd
[Tue Jul 11 13:59:51 2023] [ 1383]     0  1383    18091        0   184320      209             0 sshd
[Tue Jul 11 13:59:51 2023] [ 1385]     0  1385     4982        1    86016      456             0 bash
[Tue Jul 11 13:59:51 2023] [ 1402]     0  1402     3267        0    73728       33             0 sftp-server
[Tue Jul 11 13:59:51 2023] [ 8259]     0  8259   176363    64545   778240     1006             0 fio
[Tue Jul 11 13:59:51 2023] [ 8275]     0  8275   776463   467194  4980736   127214             0 fio
[Tue Jul 11 13:59:51 2023] [ 8276]     0  8276   776464   467431  4980736   127406             0 fio
[Tue Jul 11 13:59:51 2023] [ 8277]     0  8277   776465   465425  4960256   127090             0 fio
[Tue Jul 11 13:59:51 2023] [ 8278]     0  8278   776466   466949  4960256   127231             0 fio
[Tue Jul 11 13:59:51 2023] [ 8299]     0  8299    18161       88   184320      164             0 sshd
[Tue Jul 11 13:59:51 2023] [ 8301]     0  8301    18091        0   184320      210             0 sshd
[Tue Jul 11 13:59:51 2023] [ 8303]     0  8303     4922        1    86016      416             0 bash
[Tue Jul 11 13:59:51 2023] [ 8318]     0  8318     3267        1    73728       41             0 sftp-server
[Tue Jul 11 13:59:51 2023] [ 8331]     0  8331     1144       22    57344        8             0 iostat
[Tue Jul 11 13:59:51 2023] Out of memory: Kill process 8276 (fio) score 228 or sacrifice child
[Tue Jul 11 13:59:51 2023] Killed process 8276 (fio) total-vm:3105856kB, anon-rss:1869588kB, file-rss:0kB, shmem-rss:136kB
[Tue Jul 11 13:59:51 2023] oom_reaper: reaped process 8276 (fio), now anon-rss:0kB, file-rss:0kB, shmem-rss:136kB
[Tue Jul 11 14:01:35 2023] fio invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
[Tue Jul 11 14:01:35 2023] fio cpuset=/ mems_allowed=0
[Tue Jul 11 14:01:35 2023] CPU: 1 PID: 8275 Comm: fio Not tainted 4.15.0-212-generic #223-Ubuntu
[Tue Jul 11 14:01:35 2023] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 0903 03/18/2019
[Tue Jul 11 14:01:35 2023] Call Trace:
[Tue Jul 11 14:01:35 2023]  dump_stack+0x6d/0x8b
[Tue Jul 11 14:01:35 2023]  dump_header+0x71/0x282
[Tue Jul 11 14:01:35 2023]  ? ___ratelimit+0x9c/0x100
[Tue Jul 11 14:01:35 2023]  oom_kill_process+0x21f/0x420
[Tue Jul 11 14:01:35 2023]  out_of_memory+0x116/0x4e0
[Tue Jul 11 14:01:35 2023]  __alloc_pages_slowpath+0xa3d/0xe70
[Tue Jul 11 14:01:35 2023]  __alloc_pages_nodemask+0x29a/0x2c0
[Tue Jul 11 14:01:35 2023]  alloc_pages_current+0x6a/0xe0
[Tue Jul 11 14:01:35 2023]  __page_cache_alloc+0x81/0xa0
[Tue Jul 11 14:01:35 2023]  filemap_fault+0x42f/0x750
[Tue Jul 11 14:01:35 2023]  ? filemap_map_pages+0x181/0x390
[Tue Jul 11 14:01:35 2023]  ext4_filemap_fault+0x31/0x50
[Tue Jul 11 14:01:35 2023]  __do_fault+0x34/0x100
[Tue Jul 11 14:01:35 2023]  __handle_mm_fault+0x982/0xc50
[Tue Jul 11 14:01:35 2023]  handle_mm_fault+0xe7/0x260
[Tue Jul 11 14:01:35 2023]  __do_page_fault+0x281/0x4b0
[Tue Jul 11 14:01:35 2023]  do_page_fault+0x2e/0xe0
[Tue Jul 11 14:01:35 2023]  ? page_fault+0x2f/0x50
[Tue Jul 11 14:01:35 2023]  page_fault+0x45/0x50
[Tue Jul 11 14:01:35 2023] RIP: 0033:0x55b758bf5900
[Tue Jul 11 14:01:35 2023] RSP: 002b:00007fffb3ab6158 EFLAGS: 00010246
[Tue Jul 11 14:01:35 2023] RAX: 000000000006fa47 RBX: 00007fdd39ba6000 RCX: 0000000000001000
[Tue Jul 11 14:01:35 2023] RDX: 0000000000000001 RSI: 00000000367d7cc8 RDI: 00007fdd50b54b90
[Tue Jul 11 14:01:35 2023] RBP: 0000000000000001 R08: 000000000006fa47 R09: 0000001baccbd000
[Tue Jul 11 14:01:35 2023] R10: 0000000000000008 R11: 00007fdd50b54b90 R12: 0000000000000000
[Tue Jul 11 14:01:35 2023] R13: 0000000000000001 R14: 00000000367d7cc8 R15: 0000001baccbd000
[Tue Jul 11 14:01:35 2023] Mem-Info:
[Tue Jul 11 14:01:35 2023] active_anon:1718174 inactive_anon:215707 isolated_anon:0
                            active_file:62 inactive_file:186 isolated_file:0
                            unevictable:0 dirty:0 writeback:0 unstable:0
                            slab_reclaimable:7570 slab_unreclaimable:13847
                            mapped:64598 shmem:64597 pagetables:6019 bounce:0
                            free:26567 free_pcp:30 free_cma:0
[Tue Jul 11 14:01:35 2023] Node 0 active_anon:6872696kB inactive_anon:862828kB active_file:248kB inactive_file:744kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:258392kB dirty:0kB writeback:0kB shmem:258388kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[Tue Jul 11 14:01:35 2023] Node 0 DMA free:15896kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15900kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 14:01:35 2023] lowmem_reserve[]: 0 2006 7732 7732 7732
[Tue Jul 11 14:01:35 2023] Node 0 DMA32 free:40396kB min:17504kB low:21880kB high:26256kB active_anon:2077736kB inactive_anon:4kB active_file:0kB inactive_file:48kB unevictable:0kB writepending:0kB present:2210564kB managed:2122928kB mlocked:0kB kernel_stack:0kB pagetables:4052kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 14:01:35 2023] lowmem_reserve[]: 0 0 5725 5725 5725
[Tue Jul 11 14:01:35 2023] Node 0 Normal free:49976kB min:49944kB low:62428kB high:74912kB active_anon:4794960kB inactive_anon:862824kB active_file:80kB inactive_file:1160kB unevictable:0kB writepending:0kB present:5996544kB managed:5863252kB mlocked:0kB kernel_stack:3152kB pagetables:20024kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 14:01:35 2023] lowmem_reserve[]: 0 0 0 0 0
[Tue Jul 11 14:01:35 2023] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15896kB
[Tue Jul 11 14:01:35 2023] Node 0 DMA32: 162*4kB (UM) 128*8kB (UM) 43*16kB (UM) 9*32kB (UM) 6*64kB (UM) 3*128kB (UM) 1*256kB (U) 1*512kB (M) 2*1024kB (UM) 1*2048kB (M) 8*4096kB (M) = 41048kB
[Tue Jul 11 14:01:35 2023] Node 0 Normal: 892*4kB (UME) 1008*8kB (UME) 407*16kB (UME) 142*32kB (UME) 40*64kB (UME) 11*128kB (UE) 3*256kB (UME) 2*512kB (UE) 2*1024kB (UM) 2*2048kB (UM) 4*4096kB (U) = 50976kB
[Tue Jul 11 14:01:35 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Tue Jul 11 14:01:35 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Tue Jul 11 14:01:35 2023] 65174 total pagecache pages
[Tue Jul 11 14:01:35 2023] 304 pages in swap cache
[Tue Jul 11 14:01:35 2023] Swap cache stats: add 652012, delete 651708, find 652/1133
[Tue Jul 11 14:01:35 2023] Free swap  = 0kB
[Tue Jul 11 14:01:35 2023] Total swap = 2097148kB
[Tue Jul 11 14:01:35 2023] 2055774 pages RAM
[Tue Jul 11 14:01:35 2023] 0 pages HighMem/MovableOnly
[Tue Jul 11 14:01:35 2023] 55254 pages reserved
[Tue Jul 11 14:01:35 2023] 0 pages cma reserved
[Tue Jul 11 14:01:35 2023] 0 pages hwpoisoned
[Tue Jul 11 14:01:35 2023] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[Tue Jul 11 14:01:35 2023] [  482]     0   482    19586       84   176128       78             0 systemd-journal
[Tue Jul 11 14:01:35 2023] [  502]     0   502    11741      112   118784      556         -1000 systemd-udevd
[Tue Jul 11 14:01:35 2023] [  503]     0   503    24428        5    98304       38             0 lvmetad
[Tue Jul 11 14:01:35 2023] [  696]     0   696    11902       52   135168       63             0 rpcbind
[Tue Jul 11 14:01:35 2023] [  702] 62583   702    35447        5   180224      123             0 systemd-timesyn
[Tue Jul 11 14:01:35 2023] [  710]   100   710    19981       65   176128       95             0 systemd-network
[Tue Jul 11 14:01:35 2023] [  795]   101   795    17723       24   184320      216             0 systemd-resolve
[Tue Jul 11 14:01:35 2023] [  901]     0   901     7577       23   102400       47             0 cron
[Tue Jul 11 14:01:35 2023] [  908]     0   908    42973      118   237568     1922             0 networkd-dispat
[Tue Jul 11 14:01:35 2023] [  911]     0   911    40271        0    90112       52             0 lxcfs
[Tue Jul 11 14:01:35 2023] [  922]     0   922    72156       96   200704      113             0 accounts-daemon
[Tue Jul 11 14:01:35 2023] [  925]     0   925    27623       38   118784       46             0 irqbalance
[Tue Jul 11 14:01:35 2023] [  926]   103   926    12525       65   135168       96          -900 dbus-daemon
[Tue Jul 11 14:01:35 2023] [  962]     0   962     7084       11   102400       41             0 atd
[Tue Jul 11 14:01:35 2023] [  968]   102   968    66819       32   172032      281             0 rsyslogd
[Tue Jul 11 14:01:35 2023] [  973]     0   973    15493       33   159744      111             0 systemd-logind
[Tue Jul 11 14:01:35 2023] [  982]     0   982    45921       83   118784      943             0 python3.8
[Tue Jul 11 14:01:35 2023] [  999]     0   999    72221        7   196608      190             0 polkitd
[Tue Jul 11 14:01:35 2023] [ 1046]     0  1046     3491        1    73728       73             0 bash
[Tue Jul 11 14:01:35 2023] [ 1086]     0  1086    38958      234   348160     8618             0 python
[Tue Jul 11 14:01:35 2023] [ 1196]     0  1196    47080       72   262144     1909             0 unattended-upgr
[Tue Jul 11 14:01:35 2023] [ 1276]     0  1276    18076        1   180224      188         -1000 sshd
[Tue Jul 11 14:01:35 2023] [ 1298]     0  1298     3800        5    77824       28             0 agetty
[Tue Jul 11 14:01:35 2023] [ 1312]     0  1312   124420       36   180224      143             0 automount
[Tue Jul 11 14:01:35 2023] [ 1374]     0  1374    18169       82   180224      174             0 sshd
[Tue Jul 11 14:01:35 2023] [ 1383]     0  1383    18091        0   184320      209             0 sshd
[Tue Jul 11 14:01:35 2023] [ 1385]     0  1385     4982        1    86016      456             0 bash
[Tue Jul 11 14:01:35 2023] [ 1402]     0  1402     3267        0    73728       33             0 sftp-server
[Tue Jul 11 14:01:35 2023] [ 8259]     0  8259   176363    64547   778240     1004             0 fio
[Tue Jul 11 14:01:35 2023] [ 8275]     0  8275   960813   623106  6561792   169332             0 fio
[Tue Jul 11 14:01:35 2023] [ 8277]     0  8277   960815   621310  6541312   169149             0 fio
[Tue Jul 11 14:01:35 2023] [ 8278]     0  8278   960816   622859  6561792   169294             0 fio
[Tue Jul 11 14:01:35 2023] [ 8299]     0  8299    18161       92   184320      160             0 sshd
[Tue Jul 11 14:01:35 2023] [ 8301]     0  8301    18091        0   184320      210             0 sshd
[Tue Jul 11 14:01:35 2023] [ 8303]     0  8303     4922        1    86016      416             0 bash
[Tue Jul 11 14:01:35 2023] [ 8318]     0  8318     3267        1    73728       41             0 sftp-server
[Tue Jul 11 14:01:35 2023] [ 8331]     0  8331     1144       22    57344        8             0 iostat
[Tue Jul 11 14:01:35 2023] Out of memory: Kill process 8275 (fio) score 305 or sacrifice child
[Tue Jul 11 14:01:35 2023] Killed process 8275 (fio) total-vm:3843252kB, anon-rss:2492284kB, file-rss:0kB, shmem-rss:140kB
[Tue Jul 11 14:01:35 2023] oom_reaper: reaped process 8275 (fio), now anon-rss:0kB, file-rss:0kB, shmem-rss:140kB
[Tue Jul 11 14:11:17 2023] fio invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
[Tue Jul 11 14:11:17 2023] fio cpuset=/ mems_allowed=0
[Tue Jul 11 14:11:17 2023] CPU: 0 PID: 8277 Comm: fio Not tainted 4.15.0-212-generic #223-Ubuntu
[Tue Jul 11 14:11:17 2023] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 0903 03/18/2019
[Tue Jul 11 14:11:17 2023] Call Trace:
[Tue Jul 11 14:11:17 2023]  dump_stack+0x6d/0x8b
[Tue Jul 11 14:11:17 2023]  dump_header+0x71/0x282
[Tue Jul 11 14:11:17 2023]  ? ___ratelimit+0x9c/0x100
[Tue Jul 11 14:11:17 2023]  oom_kill_process+0x21f/0x420
[Tue Jul 11 14:11:17 2023]  out_of_memory+0x116/0x4e0
[Tue Jul 11 14:11:17 2023]  __alloc_pages_slowpath+0xa3d/0xe70
[Tue Jul 11 14:11:17 2023]  __alloc_pages_nodemask+0x29a/0x2c0
[Tue Jul 11 14:11:17 2023]  alloc_pages_vma+0x88/0x1f0
[Tue Jul 11 14:11:17 2023]  do_anonymous_page+0x11a/0x420
[Tue Jul 11 14:11:17 2023]  __handle_mm_fault+0x7da/0xc50
[Tue Jul 11 14:11:17 2023]  handle_mm_fault+0xe7/0x260
[Tue Jul 11 14:11:17 2023]  __do_page_fault+0x281/0x4b0
[Tue Jul 11 14:11:17 2023]  do_page_fault+0x2e/0xe0
[Tue Jul 11 14:11:17 2023]  ? page_fault+0x2f/0x50
[Tue Jul 11 14:11:17 2023]  page_fault+0x45/0x50
[Tue Jul 11 14:11:17 2023] RIP: 0033:0x55b758c04739
[Tue Jul 11 14:11:17 2023] RSP: 002b:00007fffb3ab6160 EFLAGS: 00010206
[Tue Jul 11 14:11:17 2023] RAX: 0000000000080eaa RBX: 00007fdd39bc5690 RCX: 00007fdc169ed000
[Tue Jul 11 14:11:17 2023] RDX: 00007fdd50b655d0 RSI: 00007fdc15dd7010 RDI: 00007fdd39bc5690
[Tue Jul 11 14:11:17 2023] RBP: 0000000000000000 R08: 0000015f28ceb000 R09: 0000000000000000
[Tue Jul 11 14:11:17 2023] R10: 00000000000fd8ff R11: 0000000000000001 R12: 0000000000000000
[Tue Jul 11 14:11:17 2023] R13: 000000000004b510 R14: 00007fdd50b56210 R15: 0000000000000000
[Tue Jul 11 14:11:17 2023] Mem-Info:
[Tue Jul 11 14:11:17 2023] active_anon:1718168 inactive_anon:215490 isolated_anon:0
                            active_file:162 inactive_file:67 isolated_file:0
                            unevictable:0 dirty:0 writeback:0 unstable:0
                            slab_reclaimable:7671 slab_unreclaimable:13870
                            mapped:64607 shmem:64583 pagetables:6009 bounce:0
                            free:26673 free_pcp:58 free_cma:0
[Tue Jul 11 14:11:17 2023] Node 0 active_anon:6872672kB inactive_anon:861960kB active_file:648kB inactive_file:268kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:258428kB dirty:0kB writeback:0kB shmem:258332kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[Tue Jul 11 14:11:17 2023] Node 0 DMA free:15896kB min:132kB low:164kB high:196kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15900kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 14:11:17 2023] lowmem_reserve[]: 0 2006 7732 7732 7732
[Tue Jul 11 14:11:17 2023] Node 0 DMA32 free:40836kB min:17504kB low:21880kB high:26256kB active_anon:1271540kB inactive_anon:805220kB active_file:408kB inactive_file:632kB unevictable:0kB writepending:0kB present:2210564kB managed:2122928kB mlocked:0kB kernel_stack:0kB pagetables:3984kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 14:11:17 2023] lowmem_reserve[]: 0 0 5725 5725 5725
[Tue Jul 11 14:11:17 2023] Node 0 Normal free:49960kB min:49944kB low:62428kB high:74912kB active_anon:5600952kB inactive_anon:56884kB active_file:684kB inactive_file:0kB unevictable:0kB writepending:0kB present:5996544kB managed:5863252kB mlocked:0kB kernel_stack:3120kB pagetables:20052kB bounce:0kB free_pcp:112kB local_pcp:0kB free_cma:0kB
[Tue Jul 11 14:11:17 2023] lowmem_reserve[]: 0 0 0 0 0
[Tue Jul 11 14:11:17 2023] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15896kB
[Tue Jul 11 14:11:17 2023] Node 0 DMA32: 233*4kB (UM) 142*8kB (UM) 49*16kB (UM) 20*32kB (UM) 8*64kB (UM) 2*128kB (UM) 1*256kB (U) 1*512kB (M) 1*1024kB (U) 1*2048kB (M) 8*4096kB (M) = 40868kB
[Tue Jul 11 14:11:17 2023] Node 0 Normal: 721*4kB (UMEH) 892*8kB (UMEH) 390*16kB (UMEH) 142*32kB (UMEH) 43*64kB (UMEH) 12*128kB (UME) 5*256kB (UME) 3*512kB (UME) 2*1024kB (UM) 2*2048kB (UM) 4*4096kB (U) = 50436kB
[Tue Jul 11 14:11:17 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Tue Jul 11 14:11:17 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Tue Jul 11 14:11:17 2023] 65158 total pagecache pages
[Tue Jul 11 14:11:17 2023] 182 pages in swap cache
[Tue Jul 11 14:11:17 2023] Swap cache stats: add 821382, delete 821199, find 2044/2935
[Tue Jul 11 14:11:17 2023] Free swap  = 0kB
[Tue Jul 11 14:11:17 2023] Total swap = 2097148kB
[Tue Jul 11 14:11:17 2023] 2055774 pages RAM
[Tue Jul 11 14:11:17 2023] 0 pages HighMem/MovableOnly
[Tue Jul 11 14:11:17 2023] 55254 pages reserved
[Tue Jul 11 14:11:17 2023] 0 pages cma reserved
[Tue Jul 11 14:11:17 2023] 0 pages hwpoisoned
[Tue Jul 11 14:11:17 2023] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[Tue Jul 11 14:11:17 2023] [  482]     0   482    19586       89   176128       75             0 systemd-journal
[Tue Jul 11 14:11:17 2023] [  502]     0   502    11741       58   118784      609         -1000 systemd-udevd
[Tue Jul 11 14:11:17 2023] [  503]     0   503    24428        0    98304       44             0 lvmetad
[Tue Jul 11 14:11:17 2023] [  696]     0   696    11902       38   135168       77             0 rpcbind
[Tue Jul 11 14:11:17 2023] [  702] 62583   702    35447       16   180224      112             0 systemd-timesyn
[Tue Jul 11 14:11:17 2023] [  710]   100   710    19981       50   176128      110             0 systemd-network
[Tue Jul 11 14:11:17 2023] [  795]   101   795    17723       25   184320      215             0 systemd-resolve
[Tue Jul 11 14:11:17 2023] [  901]     0   901     7577       29   102400       41             0 cron
[Tue Jul 11 14:11:17 2023] [  908]     0   908    42973       37   237568     2003             0 networkd-dispat
[Tue Jul 11 14:11:17 2023] [  911]     0   911    40271        0    90112       52             0 lxcfs
[Tue Jul 11 14:11:17 2023] [  922]     0   922    72156       75   200704      134             0 accounts-daemon
[Tue Jul 11 14:11:17 2023] [  925]     0   925    27623       43   118784       45             0 irqbalance
[Tue Jul 11 14:11:17 2023] [  926]   103   926    12525       81   135168       80          -900 dbus-daemon
[Tue Jul 11 14:11:17 2023] [  962]     0   962     7084        9   102400       43             0 atd
[Tue Jul 11 14:11:17 2023] [  968]   102   968    66819      107   172032      263             0 rsyslogd
[Tue Jul 11 14:11:17 2023] [  973]     0   973    15493       41   159744      103             0 systemd-logind
[Tue Jul 11 14:11:17 2023] [  982]     0   982    45921       83   118784      943             0 python3.8
[Tue Jul 11 14:11:17 2023] [  999]     0   999    72221        0   196608      197             0 polkitd
[Tue Jul 11 14:11:17 2023] [ 1046]     0  1046     3491        1    73728       73             0 bash
[Tue Jul 11 14:11:17 2023] [ 1086]     0  1086    38958      134   348160     8718             0 python
[Tue Jul 11 14:11:17 2023] [ 1196]     0  1196    47080       27   262144     1954             0 unattended-upgr
[Tue Jul 11 14:11:17 2023] [ 1276]     0  1276    18076        1   180224      188         -1000 sshd
[Tue Jul 11 14:11:17 2023] [ 1298]     0  1298     3800        5    77824       28             0 agetty
[Tue Jul 11 14:11:17 2023] [ 1312]     0  1312   124420       15   180224      164             0 automount
[Tue Jul 11 14:11:17 2023] [ 1374]     0  1374    18169      101   180224      155             0 sshd
[Tue Jul 11 14:11:17 2023] [ 1383]     0  1383    18091        0   184320      209             0 sshd
[Tue Jul 11 14:11:17 2023] [ 1385]     0  1385     4982        1    86016      456             0 bash
[Tue Jul 11 14:11:17 2023] [ 1402]     0  1402     3267        0    73728       33             0 sftp-server
[Tue Jul 11 14:11:17 2023] [ 8259]     0  8259   176363    64547   778240     1004             0 fio
[Tue Jul 11 14:11:17 2023] [ 8277]     0  8277  1360240   932940  9723904   253138             0 fio
[Tue Jul 11 14:11:17 2023] [ 8278]     0  8278  1360241   934277  9744384   253586             0 fio
[Tue Jul 11 14:11:17 2023] [ 8299]     0  8299    18161      100   184320      155             0 sshd
[Tue Jul 11 14:11:17 2023] [ 8301]     0  8301    18091        0   184320      210             0 sshd
[Tue Jul 11 14:11:17 2023] [ 8303]     0  8303     4952      230    86016      194             0 bash
[Tue Jul 11 14:11:17 2023] [ 8318]     0  8318     3267        1    73728       41             0 sftp-server
[Tue Jul 11 14:11:17 2023] [10893]     0 10893     2482       59    69632        0             0 bash
[Tue Jul 11 14:11:17 2023] [10894]     0 10894     2482       57    73728        0             0 bash
[Tue Jul 11 14:11:17 2023] [10898]     0 10898     2763       41    65536        0             0 dmesg
[Tue Jul 11 14:11:17 2023] [10926]     0 10926     1120       16    49152        0             0 tail
[Tue Jul 11 14:11:17 2023] Out of memory: Kill process 8278 (fio) score 457 or sacrifice child
[Tue Jul 11 14:11:17 2023] Killed process 8278 (fio) total-vm:5440964kB, anon-rss:3736832kB, file-rss:124kB, shmem-rss:152kB
[Tue Jul 11 14:11:17 2023] oom_reaper: reaped process 8278 (fio), now anon-rss:0kB, file-rss:0kB, shmem-rss:152kB
root@unassigned:~/test/fio_test#

The key message here is "Out of memory: Kill process 8278 (fio) score 457 or sacrifice child", which shows that the OOM killer killed our fio process.
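To pull the kill records out of a long dmesg dump, a small Python sketch like the one below may help. The function name find_oom_kills and the regular expression are my own, not part of any tool. The pattern assumes the "Killed process ..." line format shown in the log above (kernel 4.15); other kernel versions may print it differently.

```python
import re

# Assumed line format, taken from the dmesg output above (kernel 4.15):
#   Killed process 8276 (fio) total-vm:3105856kB, anon-rss:1869588kB, ...
KILLED_RE = re.compile(
    r"Killed process (\d+) \(([^)]+)\) total-vm:(\d+)kB, anon-rss:(\d+)kB"
)


def find_oom_kills(dmesg_text):
    """Return one dict per 'Killed process' line found in dmesg output."""
    kills = []
    for line in dmesg_text.splitlines():
        match = KILLED_RE.search(line)
        if match is None:
            continue
        pid, name, total_vm, anon_rss = match.groups()
        kills.append(
            {
                "pid": int(pid),
                "name": name,
                "total_vm_kb": int(total_vm),
                "anon_rss_kb": int(anon_rss),
            }
        )
    return kills
```

Feeding it the saved output of dmesg -T gives one entry per killed process, which makes it easy to see that every victim was a fio worker and that the anonymous RSS of each victim kept growing.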

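First, OOM is short for out of memory. Put simply, a running process takes up too much memory and never releases it, so the system runs out of memory. To protect itself, the system kills the processes that use the most memory so that it can keep running.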
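To see which processes hold the memory, the process table that the kernel prints in each OOM report can be summed by process name. The sketch below is only an illustration: the function name is mine, the column order is taken from the table header in the log above, and the 4 KiB page size is an assumption (it matches this log, where active_anon:1718018 pages is reported as 6872072kB).

```python
import re

PAGE_KIB = 4  # assumption: 4 KiB pages, consistent with the Mem-Info lines above

# Columns in the OOM process table, as printed in the log above:
# [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
ROW_RE = re.compile(
    r"\[\s*(\d+)\]\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(-?\d+)\s+(\S+)"
)


def rss_kib_by_name(dmesg_text):
    """Sum the rss column (in pages) per process name and convert to KiB."""
    totals = {}
    for line in dmesg_text.splitlines():
        match = ROW_RE.search(line)
        if match is None:
            continue  # not a process table row
        rss_pages = int(match.group(5))
        name = match.group(9)
        totals[name] = totals.get(name, 0) + rss_pages * PAGE_KIB
    return totals
```

In the first report above, each of the four fio workers holds about 467,000 pages of RSS, roughly 1.8 GiB apiece at 4 KiB per page, while every other process holds a few hundred pages at most.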

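So when this happens, it generally means memory usage is too high. These are the points to try, in order:

1. Add system memory: if you run out of memory often, consider adding physical memory. More memory gives more headroom for the workload.

2. Tune the system configuration: while the task is running, shut down unnecessary background processes and services to free up memory.

3. Monitor system resources: use a system monitoring tool to watch resource usage so that you can spot memory leaks or other resource bottlenecks.

4. Tune the fio job: check the fio job settings and parameters and make sure they fit the system's resources and capacity. Reducing the number of concurrent jobs, lowering the I/O queue depth, or using a smaller block size can all reduce memory consumption.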
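For point 3, one simple way to watch a single process is to sample its resident set size from /proc. The following is a minimal sketch assuming a Linux system; the function names are my own, and VmRSS is the field that /proc/<pid>/status reports in kB.

```python
import re


def vm_rss_kib(status_text):
    """Extract the VmRSS value (kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kib(pid):
    """Read the current RSS of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return vm_rss_kib(status_file.read())


def growth_kib(samples):
    """Difference between the newest and the oldest RSS sample."""
    if len(samples) < 2:
        return 0
    return samples[-1] - samples[0]
```

Calling read_rss_kib on a fio worker every few seconds and collecting the values in a list is enough: if growth_kib keeps increasing for the whole run, something in the process is accumulating data.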

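I tried the first point first and increased memory from 8 GB to 32 GB. That only made the test run somewhat longer before failing: an improvement, but not a fix.
At that point I suspected the fio version or a compatibility problem with the system, so I tried other fio versions on the original system. The result was the same problem.
Then I changed the operating system from the original Ubuntu 18.04.6 to CentOS 8 and Red Hat 7.6. Neither made any difference.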

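With the environment ruled out, I went back to look at my own setup. Normally that is where you should start, that is, with the fio configuration, but at first it did not occur to me that anything could be wrong with it.
Only after checking it several times did I find the problem in the fio configuration.
My original configuration: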

[JEDEC-219]
ioengine=libaio
direct=1
rw=randrw
norandommap
randrepeat=0
rwmixread=40
iodepth=128
numjobs=4
bssplit=512/4:1024/1:1536/1:2048/1:2560/1:3072/1:3584/1:4k/67:8k/10:16k/7:32k/3:64k/3
blockalign=4k
random_distribution=zoned:50/5:30/15:20/80
loops=10000

filename=/dev/nvme0n1
group_reporting
write_iops_log=iops.log
write_bw_log=bw.log
write_lat_log=lat.log

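As the last few lines show, I asked fio to save three logs (iops, bw, and lat), but I did not give it a logging interval, that is, I left out the log_avg_msec parameter. I had assumed without checking that the default was one record per second, i.e. log_avg_msec=1000.
Then, while fio was running, I noticed memory usage climbing steadily and wondered whether something was wrong. The test does no data verification, so nothing else should be holding memory; only the result logging could.
So I ran the test for a short while and found far more log entries than I expected, as shown below: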

974, 1, 1, 4096
974, 1, 1, 4096
974, 1, 0, 4096
974, 1, 1, 4096
974, 1, 1, 4096
974, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 32768
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 16384
975, 1, 1, 4096
975, 1, 1, 512
975, 1, 1, 8192
975, 1, 1, 8192
975, 1, 1, 4096
975, 1, 0, 32768
975, 1, 1, 3072
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 1, 4096
975, 1, 0, 4096
975, 1, 1, 4096

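The first column is the time in ms, so fio was recording one or more I/Os every millisecond. In fact log_avg_msec defaults to 0, which means fio logs an entry for every completed I/O. The log data exploded and exhausted memory.
That was not what I expected, so I added log_avg_msec=1000 to the fio configuration. With fio running again, memory usage stayed flat with no visible upward trend, and after the test stopped all the results were saved normally.
The results look like this: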

18000, 42756, 1, 0
19000, 29475, 0, 0
19000, 44417, 1, 0
20000, 30457, 0, 0
20000, 45424, 1, 0
21000, 29571, 0, 0
21000, 44551, 1, 0
22000, 24774, 0, 0
22000, 37544, 1, 0
23000, 26196, 0, 0
23000, 39118, 1, 0
24000, 29357, 0, 0
24000, 43485, 1, 0
25000, 22756, 0, 0
25000, 34414, 1, 0
26000, 23096, 0, 0
26000, 34759, 1, 0
27000, 27125, 0, 0
27000, 40415, 1, 0

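This time it looks normal: a mixed read/write workload with one record per second for each direction (the third column separates reads from writes, 1 for write and 0 for read; the second column is the value).
After a long test run the problem did not reappear, so it is solved.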
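The difference between the two log excerpts above can be checked by counting how many entries share a timestamp. The sketch below is only an illustration: the function names are my own, and it reads just the first four comma-separated columns (time in ms, value, direction, block size) as they appear in the samples, ignoring any further columns that other fio versions may add.

```python
from collections import Counter


def parse_fio_log(text):
    """Parse fio log lines into (time_ms, value, direction, block_size) tuples."""
    records = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        time_ms, value, direction, block_size = (int(f) for f in fields[:4])
        records.append((time_ms, value, direction, block_size))
    return records


def entries_per_timestamp(records):
    """Count how many log entries were written for each timestamp."""
    return Counter(time_ms for time_ms, _, _, _ in records)
```

For the per-I/O log, a single millisecond such as 975 carries dozens of entries, while the averaged log carries two entries per second, one for reads and one for writes.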

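Conclusion: when you do not know a tool well, set the fio configuration parameters as explicitly as you can. The job file gets bloated, but the parameters are clear when you troubleshoot. Do not rely entirely on defaults: you may misunderstand them, and the problem is then hard to spot.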
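A check like this can also be automated before a long run. The following sketch uses Python's configparser to flag jobs that write iops/bw/lat logs without a nonzero log_avg_msec. The function name is my own, and it assumes a simple job file like the one above: no includes, and shared options only in a [global] section.

```python
import configparser

# fio options from the job file above that turn on per-job logging
LOG_OPTIONS = ("write_iops_log", "write_bw_log", "write_lat_log")


def jobs_logging_every_io(job_text):
    """Return the names of jobs that write logs with log_avg_msec unset or 0."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # flags such as norandommap have no value
        delimiters=("=",),     # values such as bssplit contain ':'
        interpolation=None,    # do not treat '%' in values specially
    )
    parser.read_string(job_text)

    shared = dict(parser["global"]) if parser.has_section("global") else {}
    flagged = []
    for name in parser.sections():
        if name == "global":
            continue
        options = {**shared, **dict(parser[name])}
        writes_log = any(option in options for option in LOG_OPTIONS)
        interval_ms = int(options.get("log_avg_msec") or 0)
        if writes_log and interval_ms == 0:
            flagged.append(name)
    return flagged
```

Run against the original job file, it would report the JEDEC-219 job; once log_avg_msec=1000 is added, the list comes back empty.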
