Analysis of page allocation failure messages and OOM messages

Analyzing page allocation failure messages

page allocation failure messages

A typical page allocation failure message from Linux on a MIPS CPU

rmm: page allocation failure: order:4, mode:0x104020
CPU: 0 PID: 784 Comm: rmm Tainted: G           O 3.10.27 #6
Stack : 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
          00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
          00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
          00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
          00000000 00000000 00000000 00000000 00000000 00000000 00000000 82fd1560
          ...
Call Trace:[<80018a1c>] 0x80018a1c
[<80018a1c>] 0x80018a1c
[<8006ec9c>] 0x8006ec9c
[<800713b4>] 0x800713b4
[<80071664>] 0x80071664
[<801f5108>] 0x801f5108
[<801e0fd4>] 0x801e0fd4
[<8016a89c>] 0x8016a89c
[<c0261ec4>] 0xc0261ec4
[<80045cc4>] 0x80045cc4
[<8016a89c>] 0x8016a89c
[<c022115c>] 0xc022115c
[<8002998c>] 0x8002998c
[<c0261388>] 0xc0261388
[<801f4f80>] 0x801f4f80
[<801dfacc>] 0x801dfacc
[<801edca4>] 0x801edca4
[<801eeb1c>] 0x801eeb1c
[<801f3bac>] 0x801f3bac
[<8004312c>] 0x8004312c
[<80050f48>] 0x80050f48
[<80045d98>] 0x80045d98
[<801f4b20>] 0x801f4b20
[<80050f48>] 0x80050f48
[<80057744>] 0x80057744
[<801deec8>] 0x801deec8
[<8016a89c>] 0x8016a89c
[<8005fce4>] 0x8005fce4
[<80066914>] 0x80066914
[<80065d38>] 0x80065d38
[<8016a89c>] 0x8016a89c
[<8006318c>] 0x8006318c
[<8005f5a8>] 0x8005f5a8
[<800295ac>] 0x800295ac
[<80016668>] 0x80016668
[<80015460>] 0x80015460
[<80179888>] 0x80179888
[<80179d80>] 0x80179d80
[<8017a510>] 0x8017a510
[<80178a78>] 0x80178a78
[<801012d0>] 0x801012d0
[<800fd710>] 0x800fd710
[<80043118>] 0x80043118
[<800fdb04>] 0x800fdb04
[<800ff254>] 0x800ff254
[<800748e0>] 0x800748e0
[<c0456424>] 0xc0456424
[<801a59e0>] 0x801a59e0
[<80074aac>] 0x80074aac
[<8009d068>] 0x8009d068
[<8006c6d4>] 0x8006c6d4
[<8009cefc>] 0x8009cefc
[<800966f0>] 0x800966f0
[<80083e04>] 0x80083e04
[<80086774>] 0x80086774
[<80089b54>] 0x80089b54
[<8008bb58>] 0x8008bb58
[<800873bc>] 0x800873bc
[<801642ac>] 0x801642ac
[<8001cd98>] 0x8001cd98
[<8008c078>] 0x8008c078
[<80164360>] 0x80164360
[<8035194c>] 0x8035194c
[<8007f418>] 0x8007f418
[<80094b60>] 0x80094b60
[<800b40a8>] 0x800b40a8
[<800963cc>] 0x800963cc
[<8008a690>] 0x8008a690
[<80040000>] 0x80040000
[<80015464>] 0x80015464
[<80080008>] 0x80080008
[<80097c14>] 0x80097c14

Mem-Info:
Normal per-cpu:
CPU    0: hi:    6, btch:   1 usd:   5
active_anon:1654 inactive_anon:453 isolated_anon:0
 active_file:30 inactive_file:27 isolated_file:8
 unevictable:255 dirty:0 writeback:0 unstable:0
 free:1887 slab_reclaimable:290 slab_unreclaimable:1492
 mapped:485 shmem:456 pagetables:84 bounce:0
 free_cma:0
Normal free:7548kB min:736kB low:920kB high:1104kB active_anon:6616kB inactive_anon:1812kB active_file:120kB inactive_file:108kB unevictable:1020kB isolated(anon):0kB isolated(file):32kB present:65536kB managed:34088kB mlocked:0kB dirty:0kB writeback:0kB mapped:1940kB shmem:1824kB slab_reclaimable:1160kB slab_unreclaimable:5968kB kernel_stack:848kB pagetables:336kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0
Normal: 609*4kB (UMR) 331*8kB (UMR) 142*16kB (EMR) 4*32kB (MR) 1*64kB (U) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7548kB
776 total pagecache pages

Process address space layout

user space: 0x0000 0000 - 0x7FFF FFFF (shared libraries are mapped at the high end of user space)
kernel space:0x8000 0000 - 0xFFFF FFFF
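
To relate this split to the addresses in the call trace, here is a minimal sketch that classifies a 32-bit virtual address using the classic MIPS32 segment layout (kuseg, kseg0, kseg1, kseg2). The function name `classify_mips32` is hypothetical. Reading the 0xc0... entries in the trace as kernel module code is an assumption about a typical Linux/MIPS setup, not something the log itself states.

```python
def classify_mips32(addr: int) -> str:
    """Return the MIPS32 segment that a 32-bit virtual address falls in."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    if addr < 0x80000000:
        return "kuseg (user space)"
    if addr < 0xA0000000:
        return "kseg0 (kernel, unmapped, cached)"
    if addr < 0xC0000000:
        return "kseg1 (kernel, unmapped, uncached)"
    return "kseg2 (kernel, mapped)"


# Two addresses taken from the call trace above, plus one user-space address.
for a in (0x80018A1C, 0xC0261EC4, 0x7FFF0000):
    print(f"{a:#010x}: {classify_mips32(a)}")
```

Every frame in the trace lies at or above 0x8000 0000, which is consistent with the failure happening in kernel context.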

Explanation

rmm: page allocation failure: order:4, mode:0x104020
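This error is reported when the kernel fails to allocate pages. rmm is the name of the current process. order:4 means 2^4 contiguous pages, i.e. 16 pages = 64 kB with 4 kB pages. Memory fragmentation can make such a contiguous allocation fail even while plenty of free pages remain. When an order:0 allocation fails, by contrast, the system has essentially run out of memory available to that request.
mode is the set of flags (the GFP mask) that the caller passed to the kernel's page allocator. The individual flags are defined in include/linux/gfp.h.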
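
The header line can be decoded mechanically. The sketch below parses it, converts the order into pages and kilobytes, and names the bits set in the mode. The bit table is an assumption based on include/linux/gfp.h as found in the 3.10 series; the values differ between kernel versions, so check them against the gfp.h of the kernel that produced the log. The 4 kB page size is also assumed (the free-list line starting at 4kB suggests it), and `parse_failure` is a hypothetical helper name.

```python
import re

PAGE_SIZE_KB = 4  # assumption: 4 kB pages

# Assumption: bit values as in include/linux/gfp.h of the 3.10 series.
GFP_BITS_3_10 = {
    0x01: "__GFP_DMA", 0x02: "__GFP_HIGHMEM", 0x04: "__GFP_DMA32",
    0x08: "__GFP_MOVABLE", 0x10: "__GFP_WAIT", 0x20: "__GFP_HIGH",
    0x40: "__GFP_IO", 0x80: "__GFP_FS", 0x100: "__GFP_COLD",
    0x200: "__GFP_NOWARN", 0x400: "__GFP_REPEAT", 0x800: "__GFP_NOFAIL",
    0x1000: "__GFP_NORETRY", 0x2000: "__GFP_MEMALLOC", 0x4000: "__GFP_COMP",
    0x8000: "__GFP_ZERO", 0x10000: "__GFP_NOMEMALLOC",
    0x20000: "__GFP_HARDWALL", 0x40000: "__GFP_THISNODE",
    0x80000: "__GFP_RECLAIMABLE", 0x100000: "__GFP_KMEMCG",
}

HEADER = re.compile(
    r"^(?P<comm>\S+): page allocation failure: "
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)"
)


def parse_failure(line):
    """Parse a 'page allocation failure' header into its parts."""
    m = HEADER.match(line.strip())
    if m is None:
        raise ValueError("not a page allocation failure header")
    order = int(m.group("order"))
    mode = int(m.group("mode"), 16)
    flags = [name for bit, name in sorted(GFP_BITS_3_10.items()) if mode & bit]
    # The keys are distinct single bits, so their sum equals their bitwise OR.
    unknown = mode & ~sum(GFP_BITS_3_10)
    return {
        "comm": m.group("comm"),
        "order": order,
        "pages": 1 << order,
        "size_kb": (1 << order) * PAGE_SIZE_KB,
        "flags": flags,
        "unknown_bits": unknown,
    }


info = parse_failure("rmm: page allocation failure: order:4, mode:0x104020")
print(info)
```

Under these assumptions the mode contains __GFP_HIGH but not __GFP_WAIT, which would mean the caller was not allowed to sleep, so the allocator could not reclaim or compact memory on its behalf before giving up.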

CPU: 0 PID: 784 Comm: rmm Tainted: G           O 3.10.27 #6

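The PID is 784 and the kernel version is 3.10.27. In the Tainted field, G means that all loaded modules have GPL-compatible licenses, and O means that an out-of-tree module has been loaded.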

Mem-Info:
Normal per-cpu:
CPU    0: hi:    6, btch:   1 usd:   5
active_anon:1654 inactive_anon:453 isolated_anon:0
 active_file:30 inactive_file:27 isolated_file:8
 unevictable:255 dirty:0 writeback:0 unstable:0
 free:1887 slab_reclaimable:290 slab_unreclaimable:1492
 mapped:485 shmem:456 pagetables:84 bounce:0
 free_cma:0
Normal free:7548kB min:736kB low:920kB high:1104kB active_anon:6616kB inactive_anon:1812kB active_file:120kB inactive_file:108kB unevictable:1020kB isolated(anon):0kB isolated(file):32kB present:65536kB managed:34088kB mlocked:0kB dirty:0kB writeback:0kB mapped:1940kB shmem:1824kB slab_reclaimable:1160kB slab_unreclaimable:5968kB kernel_stack:848kB pagetables:336kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0
Normal: 609*4kB (UMR) 331*8kB (UMR) 142*16kB (EMR) 4*32kB (MR) 1*64kB (U) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7548kB
776 total pagecache pages

The Mem-Info block is printed by show_mem() in lib/show_mem.c, which calls show_free_areas() (in mm/page_alloc.c) to print these statistics.
This system has a single memory zone, Normal. Its watermark information is:

  Normal free:7548kB min:736kB low:920kB high:1104kB
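
A small sketch for reading this line: it extracts every `name:<n>kB` field and reports where `free` sits relative to the three watermarks. The helper names `parse_kb_fields` and `watermark_state` are hypothetical.

```python
import re


def parse_kb_fields(line):
    """Collect every 'name:<n>kB' pair from a zone line into a dict of ints."""
    return {name: int(kb) for name, kb in re.findall(r"(\w+):(\d+)kB", line)}


def watermark_state(zone):
    """Describe where the zone's free memory sits relative to its watermarks."""
    free = zone["free"]
    if free < zone["min"]:
        return "below min"
    if free < zone["low"]:
        return "between min and low"
    if free < zone["high"]:
        return "between low and high"
    return "at or above high"


zone = parse_kb_fields("Normal free:7548kB min:736kB low:920kB high:1104kB")
print(zone, "->", watermark_state(zone))
```

With 7548 kB free against a high watermark of 1104 kB, the zone as a whole is not short of memory, which points toward fragmentation rather than exhaustion as the cause of the failure.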

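Finally, the distribution of free pages by block size. The letters in parentheses are the migrate types that have free blocks at that order (U unmovable, E reclaimable, M movable, R reserve):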

Normal: 609*4kB (UMR) 331*8kB (UMR) 142*16kB (EMR) 4*32kB (MR) 1*64kB (U) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7548kB

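Here one free 64 kB block still exists, yet the allocation failed. A likely reason is that the allocator's watermark check discounts the lower-order free blocks step by step, so a single remaining 64 kB block is not enough to pass it.
Failing a high-order allocation does not trigger the OOM killer; only failed allocations of order <= 3 can trigger it.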
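
Why an order-4 request can fail with a 64 kB block still on the free list can be worked through numerically. The sketch below is a simplified Python model of the watermark check as it appears in 3.10-era mm/page_alloc.c (`__zone_watermark_ok`), not the kernel code itself. It ignores lowmem_reserve and CMA (both are zero in this log). Treating the request as one that qualifies for the "high" and "harder" watermark reductions is an assumption, and the result turns out to be the same without them.

```python
import re


def parse_free_list(line):
    """Return the free block count per order from a 'Normal: 609*4kB ...' line."""
    return [int(n) for n, _size in re.findall(r"(\d+)\*(\d+)kB", line)]


def watermark_ok(nr_free, order, mark, alloc_high=False, alloc_harder=False):
    """Simplified model of the high-order watermark check (all units in pages)."""
    free_pages = sum(n << o for o, n in enumerate(nr_free))
    free_pages -= (1 << order) - 1
    minimum = mark
    if alloc_high:
        minimum -= minimum // 2
    if alloc_harder:
        minimum -= minimum // 4
    if free_pages <= minimum:
        return False
    for o in range(order):
        # Blocks smaller than the request cannot satisfy it, so discount them.
        free_pages -= nr_free[o] << o
        # Require fewer free pages at each higher order.
        minimum >>= 1
        if free_pages <= minimum:
            return False
    return True


line = ("Normal: 609*4kB (UMR) 331*8kB (UMR) 142*16kB (EMR) 4*32kB (MR) "
        "1*64kB (U) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7548kB")
nr_free = parse_free_list(line)
min_pages = 736 // 4  # min watermark of 736 kB expressed in 4 kB pages

for order in (0, 3, 4):
    ok = watermark_ok(nr_free, order, min_pages, alloc_high=True, alloc_harder=True)
    print(f"order {order}: watermark check {'passes' if ok else 'fails'}")
```

In this model, once the blocks of order 0 to 3 are discounted only a single page of headroom remains, which is below the scaled-down minimum. The order-4 check therefore fails even though one 64 kB block is listed as free. Note also that the Mem-Info snapshot is printed after the failure, so the counts may differ slightly from the state at the moment of the request.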

Analyzing OOM messages

<4>[12345.342532] systemd-journal invoked oom-killer: gfp_mask=0x800d0, order=0, oom_score_adj=0  
<4>[12345.351216] CPU: 1 PID: 1371 Comm: systemd-journal Tainted: G           O 3.14.31-00017-g40fab71 #1  
<4>[12345.360695] Backtrace:  
<4>[12345.363263] [<c0012fcc>] (dump_backtrace) from [<c00131a4>] (show_stack+0x20/0x24)  
<4>[12345.371192]  r6:00000000 r5:ffffffff r4:00000000 r3:bd943631  
<4>[12345.377136] [<c0013184>] (show_stack) from [<c07bbe78>] (dump_stack+0x7c/0xc8)  
<4>[12345.384710] [<c07bbdfc>] (dump_stack) from [<c07ba7e4>] (dump_header.isra.14+0x74/0x188)  
<4>[12345.393184]  r6:000800d0 r5:00000000 r4:e8088000 r3:00000002  
<4>[12345.399126] [<c07ba770>] (dump_header.isra.14) from [<c00f8a28>] (oom_kill_process+0x230/0x3e0)  
<4>[12345.408234]  r10:00000000 r8:000800d0 r7:00000000 r6:c0b89aa8 r5:000800d0 r4:e9bb79c0  
<4>[12345.416462] [<c00f87f8>] (oom_kill_process) from [<c00f90c8>] (out_of_memory+0x2f4/0x354)  
<4>[12345.425024]  r10:00000000 r9:00000000 r8:000800d0 r7:00000000 r6:c0b89aa8 r5:c0b89d08  
<4>[12345.433249]  r4:c0b89aa8  
<4>[12345.435903] [<c00f8dd4>] (out_of_memory) from [<c00fd6c8>] (__alloc_pages_nodemask+0x93c/0x988)  
<4>[12345.445011]  r10:00000000 r9:c0c38fc0 r8:c0b871d8 r7:e8088000 r6:c0c39bc0 r5:00000000  
<4>[12345.453234]  r4:000800d0  
<4>[12345.455887] [<c00fcd8c>] (__alloc_pages_nodemask) from [<c00fd734>] (__get_free_pages+0x20/0x3c)  
<4>[12345.465087]  r10:e97d36a8 r9:00000063 r8:e8089f6c r7:00000063 r6:b6f79f68 r5:e97d36a8  
<4>[12345.473311]  r4:00000000  
<4>[12345.475965] [<c00fd714>] (__get_free_pages) from [<c0196878>] (proc_pid_readlink+0x68/0x110)  
<4>[12345.484808] [<c0196810>] (proc_pid_readlink) from [<c013dcb8>] (SyS_readlinkat+0xf0/0x104)  
<4>[12345.493461]  r7:bea40520 r6:ffffff9c r5:00004000 r4:00000000  
<4>[12345.499402] [<c013dbc8>] (SyS_readlinkat) from [<c000eee0>] (ret_fast_syscall+0x0/0x34)  
<4>[12345.507785]  r10:00000000 r9:e8088000 r8:c000f148 r7:0000014c r6:00000063 r5:b6f79f68  
<4>[12345.516011]  r4:00000064  
<4>[12345.518663] Mem-info:  
<4>[12345.521049] Normal per-cpu:  
<4>[12345.523969] CPU    0: hi:   42, btch:   7 usd:  23  
<4>[12345.528979] CPU    1: hi:   42, btch:   7 usd:  25  
<4>[12345.534004] HighMem per-cpu:  
<4>[12345.537013] CPU    0: hi:  186, btch:  31 usd:  27  
<4>[12345.542199] CPU    1: hi:  186, btch:  31 usd:  29  
<4>[12345.547247] active_anon:21860 inactive_anon:14790 isolated_anon:0  
<4>[12345.547247]  active_file:41585 inactive_file:10422 isolated_file:0  
<4>[12345.547247]  unevictable:0 dirty:9 writeback:205 unstable:0  
<4>[12345.547247]  free:285748 slab_reclaimable:2100 slab_unreclaimable:26286  
<4>[12345.547247]  mapped:26079 shmem:14857 pagetables:687 bounce:0  
<4>[12345.547247]  free_cma:57779  
<4>[12345.581839] Normal free:233460kB min:2488kB low:3108kB high:3732kB active_anon:17312kB   
inactive_anon:10824kB active_file:128kB inactive_file:4kB unevictable:0kB isolated(anon):0kB   
isolated(file):0kB present:774144kB managed:387568kB mlocked:0kB dirty:16kB writeback:76kB   
mapped:3296kB shmem:10840kB slab_reclaimable:8400kB slab_unreclaimable:105144kB kernel_stack:1168kB   
pagetables:2748kB unstable:0kB bounce:0kB free_cma:231116kB writeback_tmp:0kB pages_scanned:1648   
all_unreclaimable? yes  
<4>[12345.627014] lowmem_reserve[]: 0 10168 10168  
<4>[12345.631565] HighMem free:909036kB min:512kB low:2604kB high:4696kB active_anon:70632kB   
inactive_anon:48336kB active_file:166212kB inactive_file:41684kB unevictable:0kB isolated(anon):0kB   
isolated(file):0kB present:1301504kB managed:1301504kB mlocked:0kB dirty:20kB writeback:744kB   
mapped:101020kB shmem:48588kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB   
pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0   
all_unreclaimable? no  
<4>[12345.675614] lowmem_reserve[]: 0 0 0  
<4>[12345.679437] Normal: 1165*4kB (MRC) 1122*8kB (RC) 1119*16kB (RC) 1118*32kB (C) 1068*64kB (RC)   
748*128kB (C) 0*256kB 0*512kB 0*1024kB 1*2048kB (R) 0*4096kB 0*8192kB = 233460kB  
<4>[12345.695797] HighMem: 99*4kB (M) 1148*8kB (UM) 1314*16kB (UM) 880*32kB (UM) 327*64kB (M)   
87*128kB (M) 34*256kB (M) 38*512kB (M) 12*1024kB (M) 10*2048kB (M) 3*4096kB (M) 91*8192kB (UMR) = 909516kB  
<4>[12345.714293] 66770 total pagecache pages  
<4>[12345.718309] 0 pages in swap cache  
<4>[12345.724832] Swap cache stats: add 0, delete 0, find 0/0  
<4>[12345.730308] Free swap  = 0kB  
<4>[12345.733412] Total swap = 0kB  
<4>[12345.747245] 520192 pages of RAM  
<4>[12345.750577] 286253 free pages  
<4>[12345.753778] 97924 reserved pages  
<4>[12345.757258] 28061 slab pages  
<4>[12345.760574] 115601 pages shared  
<4>[12345.764283] 0 pages swap cached  
<6>[12345.767572] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name  
<6>[12345.775906] [ 1366]     0  1366      459      125       3        0             0 sh  
<6>[12345.785861] [ 1367]     0  1367      665      235       4        0             0 propertyd  
<6>[12345.794802] [ 1368]     0  1368    26553     8835      58        0             0 seed  
<6>[12345.803296] [ 1371]     0  1371     1648      772       5        0             0 systemd-journal  
<6>[12345.812792] [ 1375]     0  1375      750      300       4        0         -1000 systemd-udevd  
<6>[12345.822449] [ 2416]  1040  2416     3852      510       7        0             0 secd  
<6>[12345.831341] [ 2419]     0  2419     6678      923       9        0             0 storagemanagerd  
<6>[12345.840944] [ 2420]     0  2420     1267      497       5        0             0 connmand  
<6>[12345.849566] [ 2422]     0  2422     4484      687       8        0             0 uuid  
<6>[12345.857843] [ 2424]     0  2424     1161      358       5        0             0 connman-vpnd  
<6>[12345.867271] [ 2427]  1000  2427     1593      461       6        0             0 logboxd  
<6>[12345.875846] [ 2432]     0  2432     9483     1718      15        0             0 cmns  
<6>[12345.884104] [ 2451]    81  2451     1355      474       4        0          -900 dbus-daemon  
<6>[12345.893018] [ 2532]     0  2532    11794      246      10        0             0 adbd  
<6>[12345.901304] [ 2535]     0  2535     1502      347       5        0             0 wpa_supplicant  
<6>[12345.910473] [ 2536]     0  2536    12820      866      12        0             0 udisksd  
<6>[12345.919119] [ 2537]     0  2537     1898      527       6        0             0 tyid  
<6>[12345.927361] [ 2540]     0  2540    10076     2157      16        0             0 datamanagerd  
<6>[12345.936349] [ 2554]     0  2554     5983      574       7        0             0 connectivityser  
<6>[12345.945635] [ 2558]     0  2558    10604     5388      21        0             0 weston  
<6>[12345.964101] [ 2589]     0  2589    14597     1917      17        0             0 pagemanagerd  
<6>[12345.973272] [ 2590]     0  2590     3832      515       7        0             0 amt  
<6>[12345.981730] [ 2593]     0  2593     6176     1343      12        0             0 weston-desktop-  
<6>[12345.991046] [ 2599]     0  2599     7185      761      12        0             0 scim-launcher  
<6>[12346.098925] [ 5580]     0  5580      458      116       3        0             0 sh  
<6>[12346.107065] [ 5581]     0  5581      492      175       3        0             0 gzip  
<3>[12346.115335] Out of memory: Kill process 5575 thread_x score 481 or sacrifice child  
<3>[12346.124212] Killed process 5575 thread_x total-vm:106212kB, anon-rss:18036kB, file-rss:2704kB  
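
The log above shows an OOM kill while the Normal zone reports 233460 kB free. One plausible reading is that almost all of that free memory sits in CMA areas, which this request could not use. The sketch below does the arithmetic with numbers copied from the log. The flag values 0x02 (__GFP_HIGHMEM) and 0x08 (__GFP_MOVABLE) are assumed from 3.x-era gfp.h, and the rule "only movable allocations may take CMA pages" is a simplification of the kernel's behaviour.

```python
# Values copied from the "Normal" zone line and the first line of the OOM log.
normal = {"free": 233460, "min": 2488, "low": 3108, "high": 3732,
          "free_cma": 231116}  # all in kB
gfp_mask = 0x800D0

# Assumed bit values (3.x-era include/linux/gfp.h).
GFP_HIGHMEM = 0x02
GFP_MOVABLE = 0x08

can_use_highmem = bool(gfp_mask & GFP_HIGHMEM)
can_use_cma = bool(gfp_mask & GFP_MOVABLE)


def usable_free_kb(zone, cma_allowed):
    """Free memory the request may actually draw from, in kB."""
    return zone["free"] - (0 if cma_allowed else zone["free_cma"])


usable = usable_free_kb(normal, can_use_cma)
print(f"HighMem usable: {can_use_highmem}, CMA usable: {can_use_cma}")
print(f"usable Normal free: {usable} kB, min watermark: {normal['min']} kB")
print("below min watermark" if usable < normal["min"] else "above min watermark")
```

Under these assumptions the request can neither use the HighMem zone (909036 kB free) nor the CMA part of Normal, which leaves 2344 kB, just below the 2488 kB min watermark. Together with `all_unreclaimable? yes` and `Total swap = 0kB` in the log, that would explain why an order-0 allocation ended in the OOM killer despite the large free figures.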

OOM caused by high load

/proc/sys/vm/swappiness

Changing this value directly affects how the Linux system behaves under memory pressure (the kernel default is 60). Commonly cited values:

  • 0: the kernel avoids swapping for as long as possible; swap itself is not turned off
  • 1: minimum amount of swapping without disabling it entirely
  • 10: often recommended to improve performance when the system has sufficient memory
  • 100: aggressive swapping
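
A minimal sketch for checking the current setting from a script. It only reads the value; `read_swappiness` is a hypothetical helper, and the path parameter exists so the function can be pointed at a test file.

```python
from pathlib import Path

SWAPPINESS_PATH = "/proc/sys/vm/swappiness"


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current vm.swappiness value as an integer."""
    return int(Path(path).read_text().strip())


# Only attempt the read where the procfs entry exists (i.e. on Linux).
if Path(SWAPPINESS_PATH).exists():
    print("vm.swappiness =", read_swappiness())
```

Changing the value needs root privileges, for example with `sysctl -w vm.swappiness=10`. A change made this way does not survive a reboot unless it is also added to the sysctl configuration.
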
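### Causes of memory allocation failures

A memory allocation failure means that the operating system or an application could not obtain enough memory for a new request (in the kernel case, often enough contiguous physical memory). Several factors can cause it. The common ones and their remedies follow.

#### Cause 1: insufficient virtual memory

When the available virtual memory cannot cover the needs of the running applications, a `memory allocation failure` error may be raised. This typically happens on long-running servers, or under high load when resources are not released promptly[^1].

#### Solutions:

- **Check and adjust the virtual memory settings**
  Use the operating system's performance monitoring tools to inspect virtual memory usage, and increase the size of the swap file where appropriate.
- **Reduce application memory consumption**
  For applications that hold a lot of memory over long periods, consider restarting the service periodically to clear unneeded caches and temporary data.

---

#### Cause 2: kernel-level memory allocation failure

If the system frequently logs `page allocation failure` or similar messages, there is probably a low-level memory management problem. It is usually caused by exhausted or fragmented physical memory, especially when large contiguous (high-order or huge-page) allocations are requested[^2].

#### Solutions:

- **Monitor memory utilization**
  Use commands such as `free -m`, `vmstat`, and `dmesg | grep -i page` to locate the memory bottleneck.
- **Reduce memory fragmentation**
  If memory is heavily fragmented, try rebooting the machine or migrating some processes to other nodes to tidy up the memory layout. On kernels built with compaction support, writing 1 to `/proc/sys/vm/compact_memory` asks the kernel to compact memory without a reboot.

---

#### Cause 3: Java young generation out of space

For JVM-based applications, "Allocation Failure" in the GC log usually means that the young generation had no room for a newly created object. The JVM then runs a minor GC to free space, which is routine. Only if the heap still cannot satisfy the request after collection is an OutOfMemoryError thrown[^3].

#### Solutions:

- **Increase the heap size**
  Set `-Xms` and `-Xmx` to adjust the initial and maximum Java heap size; `-Xmn` sizes the young generation specifically.
- **Enable the G1 collector**
  Consider switching to a more modern collector such as G1 (`-XX:+UseG1GC`), which offers better control over the trade-off between pause time and throughput.

---

#### Cause 4: old generation overflow in Node.js

In a Node.js application, the message "allocation failure scavenge might not succeed" indicates that the process is approaching the maximum old-generation size that the V8 engine allows[^4].

#### Solutions:

- **Raise the old-generation limit**
  Add the following option to the scripts section of package.json:

```json
{
  "scripts": {
    "dev": "node --max-old-space-size=8192 ./index.js",
    "start": "node --max-old-space-size=8192 ./dist/main.js"
  }
}
```

The snippet above raises the old-generation limit to 8 GB (8192 MB). Adjust the value to the hardware actually available until the error no longer appears.

---

### Summary

Each type of memory allocation failure calls for its own countermeasure. At both the system level and the level of a specific programming environment, good practice is what prevents these failures from occurring.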