SLUB: Unable to allocate memory on node -1

kernel: redirfs: loading out-of-tree module taints kernel.
kernel: redirfs: module verification failed: signature and/or required key missing - tainting kernel
kernel: Redirecting File System Framework Version 0.10 <www.redirfs.org> with TrendMicro Patch 9.6.2.8793
kernel: register_chrdev() done: 244
kernel: gsch: loading vfs-filter 9.6.2.8793: OK
kernel: gsch_dev_open() doing
kernel: gsch_dev_open() done: pid:100128(ds_am)
kernel: cannot uninstall hooks if location of sys_call_table is unknown
kernel: gsch_remove_hooks(&gsch_hooks, &orig_hooks) done: -22
kernel: lookup sys_call_table yields ffffffff816c6ee0
kernel: lookup sys_execve yields ffffffff8120b550
kernel: lookup do_execve yields ffffffff8120b2e0
kernel: lookup ia32_sys_call_table yields ffffffff816cdf80
kernel: running: awk '/[0-9a-f]+ [RTrt] compat_do_execve$/ { print "0x" $1 >"/proc/driver/gsch/syshook/addr_compat_do_execve" ; exit }' /boot/System.map-3.10.0-693.11.6.el7.x86_64 doing
kernel: running: awk '/[0-9a-f]+ [RTrt] compat_do_execve$/ { print "0x" $1 >"/proc/driver/gsch/syshook/addr_compat_do_execve" ; exit }' /boot/System.map-3.10.0-693.11.6.el7.x86_64 done(0
kernel: lookup compat_do_execve yields 0
kernel: lookup int_ret_from_sys_call yields ffffffff816b8c91
kernel: lookup getname yields ffffffff81213d40
kernel: lookup putname yields ffffffff81213b40
kernel: hooking open NR=2 ... ffffffff81201ac0 -> ffffffffc05f0a80
kernel: hooking close NR=3 ... ffffffff81201b20 -> ffffffffc05ef8a0
kernel: hooking exit NR=60 ... ffffffff8108fc10 -> ffffffffc05ee9a0
kernel: hooking getpgid NR=121 ... ffffffff810a50d0 -> ffffffffc05ee940
kernel: hooking unlink NR=87 ... ffffffff812153e0 -> ffffffffc05eefa0
kernel: hooking unlinkat NR=263 ... ffffffff812153a0 -> ffffffffc05ef240
kernel: hooking write NR=1 ... ffffffff81203880 -> ffffffffc05eecf0
kernel: hooking pwrite64 NR=18 ... ffffffff81203a20 -> ffffffffc05eedd0
kernel: hooking writev NR=20 ... ffffffff81204030 -> ffffffffc05eeec0
kernel: hooking dup2 NR=33 ... ffffffff812227b0 -> ffffffffc05efa80
kernel: hooking mount NR=165 ... ffffffff81227030 -> ffffffffc05f0cf0
kernel: hooking umount NR=166 ... ffffffff81225720 -> ffffffffc05f01e0
kernel: hooking exit_group NR=231 ... ffffffff8108fcd0 -> ffffffffc05eea20
kernel: hooking stub NR=59 @ ffffffff816b8f30 ... ffffffff8120b550 (-4905512) -> ffffffffc05f08f0 (1056143736)
kernel: gsch_install_hooks(&gsch_hooks, &orig_hooks) done: 0
kernel: gsch_flt: loaded
kernel: gsch_flt_add_mnt(/dev @ Unknown[1021994(devtmpfs)]) done: 0
kernel: gsch_flt_add_mnt(/dev/shm @ Unknown[1021994(tmpfs)]) done: 0
kernel: gsch_flt_add_mnt(/run @ Unknown[1021994(tmpfs)]) done: 0
kernel: gsch_flt_add_mnt(/sys/fs/cgroup @ Unknown[1021994(tmpfs)]) done: 0
kernel: gsch_flt_add_mnt(/sys/fs/pstore @ Unknown[6165676c(pstore)]) done: 0
kernel: gsch_flt_add_mnt(/sys/kernel/config @ Unknown[62656570(configfs)]) done: 0
kernel: ds_am: page allocation failure: order:0, mode:0x200020
kernel: CPU: 3 PID: 100156 Comm: ds_am Tainted: G           OE  ------------   3.10.0-693.11.6.el7.x86_64 #1
kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016
kernel: Call Trace:
kernel: [<ffffffff816a5ea1>] dump_stack+0x19/0x1b
kernel: [<ffffffff8118a510>] warn_alloc_failed+0x110/0x180
kernel: [<ffffffff816a1e7a>] __alloc_pages_slowpath+0x6b6/0x724
kernel: [<ffffffff8118eaa5>] __alloc_pages_nodemask+0x405/0x420
kernel: [<ffffffff811d3038>] alloc_pages_current+0x98/0x110
kernel: [<ffffffff811dddcc>] new_slab+0x2fc/0x310
kernel: [<ffffffff811df65c>] ___slab_alloc+0x3ac/0x4f0
kernel: [<ffffffffc05d7881>] ? rfs_dcache_get_subs+0xd1/0x2b0 [redirfs]
kernel: [<ffffffffc05d7881>] ? rfs_dcache_get_subs+0xd1/0x2b0 [redirfs]
kernel: [<ffffffff816a31de>] __slab_alloc+0x40/0x5c
kernel: [<ffffffff811e17e7>] kmem_cache_alloc_trace+0x1a7/0x200
kernel: [<ffffffffc05d7fb0>] ? rfs_dcache_rdentry_del+0x160/0x160 [redirfs]
kernel: [<ffffffffc05d7881>] rfs_dcache_get_subs+0xd1/0x2b0 [redirfs]
kernel: [<ffffffffc05d7fb0>] ? rfs_dcache_rdentry_del+0x160/0x160 [redirfs]
kernel: [<ffffffffc05d7b55>] rfs_dcache_walk+0xb5/0x270 [redirfs]
kernel: [<ffffffffc05d00c6>] redirfs_add_path+0x416/0x460 [redirfs]
kernel: [<ffffffffc05f34a9>] gsch_flt_add_mnt+0x169/0x290 [gsch]
kernel: [<ffffffffc05f373c>] gsch_flt_add_mntpoint+0x16c/0x180 [gsch]
kernel: [<ffffffff812029b5>] ? vfs_read+0xf5/0x170
kernel: [<ffffffffc05f38e9>] gsch_flt_add_mnts+0x199/0x2e0 [gsch]
kernel: [<ffffffffc05f4645>] gsch_redirfs_enable_write+0x165/0x190 [gsch]
kernel: [<ffffffffc05ed038>] write_wrapper+0x38/0x60 [gsch]
kernel: [<ffffffff8127230d>] proc_reg_write+0x3d/0x80
kernel: [<ffffffff81202aed>] vfs_write+0xbd/0x1e0
kernel: [<ffffffff812038ff>] SyS_write+0x7f/0xe0
kernel: [<ffffffffc05eed51>] gsch_write_hook_fn+0x61/0xe0 [gsch]
kernel: [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
kernel: Mem-Info:
kernel: active_anon:142341 inactive_anon:95395 isolated_anon:0#012 active_file:163239 inactive_file:106774 isolated_file:0#012 unevictable:0 dirty:60 writeback:0 unstable:0#012 slab_recl
kernel: Node 0 DMA free:15872kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB pres
kernel: lowmem_reserve[]: 0 2975 16013 16013
kernel: Node 0 DMA32 free:56596kB min:12544kB low:15680kB high:18816kB active_anon:244552kB inactive_anon:77844kB active_file:161308kB inactive_file:91552kB unevictable:0kB isolated(anon
kernel: lowmem_reserve[]: 0 0 13037 13037
kernel: Node 0 Normal free:20496kB min:54968kB low:68708kB high:82452kB active_anon:324812kB inactive_anon:303736kB active_file:491648kB inactive_file:335544kB unevictable:0kB isolated(a
kernel: lowmem_reserve[]: 0 0 0 0
kernel: Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15872kB
kernel: Node 0 DMA32: 318*4kB (UE) 0*8kB 2*16kB (UE) 2*32kB (UE) 2*64kB (UE) 0*128kB 103*256kB (UEM) 56*512kB (UM) 0*1024kB 0*2048kB 0*4096kB = 56536kB
kernel: Node 0 Normal: 1326*4kB (EM) 3*8kB (UEM) 2*16kB (UE) 1*32kB (M) 2*64kB (UE) 1*128kB (U) 28*256kB (UEM) 15*512kB (EM) 0*1024kB 0*2048kB 0*4096kB = 20496kB
kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
kernel: 371106 total pagecache pages
kernel: 518 pages in swap cache
kernel: Swap cache stats: add 658, delete 140, find 2093/2095
kernel: Free swap  = 16774648kB
kernel: Total swap = 16777212kB
kernel: SLUB: Unable to allocate memory on node -1 (gfp=0x8020)
kernel:  cache: kmalloc-32, object size: 32, buffer size: 32, default order: 0, min order: 0
kernel:  node 0: slabs: 53196, objs: 6809088, free: 0
kernel: sched: RT throttling activated
kernel: gsch_flt_add_mnt(/ @ Unknown[58465342(xfs)]) done: 0
kernel: gsch_flt_add_mnt(/dev/mqueue @ Unknown[19800202(mqueue)]) done: 0
kernel: gsch_flt_add_mnt(/dev/hugepages @ Unknown[958458f6(hugetlbfs)]) done: 0
kernel: gsch_flt_add_mnt(/proc/fs/nfsd @ Unknown[6e667364(nfsd)]) done: 0
 

 

 

对于SLUB: Unable to allocate memory on node -1 

解决方法可以参考下文

https://pingcap.com/blog/try-to-fix-two-linux-kernel-bugs-while-testing-tidb-operator-in-k8s/

 

 

Try to Fix Two Linux Kernel Bugs While Testing TiDB Operator in K8s

Author: Wenbo Zhang (Linux Kernel Engineer of the EE team at PingCAP)

Kubernetes (K8s) is an open-source container orchestration system that automates application deployment, scaling, and management. It’s the operating system of the cloud-native world. Any defects in K8s or the operation system might put the application in the upper layer at risk.

As the EE (Efficiency Engineering) team at PingCAP, we optimize our office infrastructure and improve office automation and teamwork effectiveness. When we tested TiDB Operator (which creates and manages TiDB clusters) in K8s, we found two Linux kernel bugs. These bugs have plagued our PingCAP K8s environment for quite a while and aren’t thoroughly fixed in the K8s community as a whole.

After extensive investigation and diagnosis, we’ve identified ways to handle these bugs. In this post, I’ll share with you how we tackled these problems. However, as useful as these solutions are, we consider them workarounds, and we believe that more elegant solutions are possible. The goal of this post is to inspire the K8s community, RHEL, and CentOS so that they can help fix these bugs thoroughly in the near future.

Bug #1: Unstable kmem accounting

Keyword: SLUB: Unable to allocate memory on node -1

Related issues in the community:

Issue description

When we tested TiKV’s online transaction processing (OLTP) performance in K8s, I/O performance occasionally jittered. However, following items appeared normal:

  • TiKV and RocksDB log files; RocksDB is TiKV’s underlying storage engine
  • CPU (no bottleneck)
  • Memory and disk statistics from the load information

The only negative indicator was the message, “SLUB: Unable to allocate memory on node -1” in the output of the dmesg command.

Issue analysis

We used funcslower in perf-tools to trace kernel functions that were executed slowly and adjusted the threshold value of the hung_task_timeout_secs kernel parameter. When we performed the write operations in TiKV, we found the following kernel path information:

Based on the information above, we could see that the I/O jitter was related to the file system executing writepage. At the same time, before and after the performance jitter was captured, the dmesg output contained a large amount of information about “SLUB: Unable to allocate memory on node -1”. This message occurred even when the node had sufficient memory.

We analyzed the call stack information output from hung_task and the kernel code. We found that the kernel tried to allocate the bio_vec object via kmem_cache_alloc when executing the bvec_alloc function to allocate the bio_vec object.

If this operation failed, the kernel fell back to allocate the bio_vec object from the mempool. However, inside the mempool, the system first tried to execute the pool->alloc callback for allocation. If this allocation failed, the kernel set the process to an uninterruptible state and put the process in a wait queue. After other processes returned memory resources to the mempool or the timer timed out (after 5 seconds), the process scheduler invoked the process for retry. The wait time of the wait queue was consistent with the jitter delay of our application monitoring.

When we created the Docker container, we didn’t set memory.kmem.limit_in_bytes. But why was kmem insufficient? To determine whether memory.kmem.limit_in_bytes was set, we went to the cgroup memory controller to check the container’s kmem information. The kmem statistics were enabled, and the kemem limit was set to a very large value.

Because we knew that the kmem accounting was unstable in the RHEL 3.10 kernel, we suspected that a kernel bug caused the SLUB allocation failure. We searched for kernel patch information, and found that it was a kernel bug, and that it had been fixed in Linux kernel version 4.x: slub: make dead caches discard free slabs immediately. There was also a namespace leak issue associated with kmem accounting: mm: memcontrol: fix cgroup creation failure after many small jobs.

So how was kmem accounting enabled? We used the opensnoop tool in bcc to monitor the kmem configuration file and captured runc as the file modifier. From the K8s code, we found that kmem accounting was enabled by default in the K8s-dependent runc project.

Solution

Based on our issue analysis, we can fix the bug in either of the following ways:

  • Upgrade the existing kernel to a later version.
  • Disable the kmem accounting feature when starting the container.

Now runc includes the conditional compilation option, and you can disable kmem accounting via Build Tags. After we disabled this feature, our test result showed that the jitter disappeared, and so did the namespace leak and SLUB allocation failure problems.

Operation steps

We need to disable the kmem accounting feature on both kubelet and Docker.

  1. Recompile kubelet according to the corresponding kubelet version:

    • For kubelet v1.14 or later, add Build Tags when you compile kubelet to disable kmem accounting:
    $ make BUILDTAGS="nokmem"
    • For kubelet v1.13 or earlier, we cannot add Build Tags when compiling kubelet. Instead, manually replace the two functions that enable kmem accounting with the following code:
    func EnableKernelMemoryAccounting(path string) error {
        return nil
    }
    func setKernelMemory(path string, kernelMemoryLimit int64) error {
        return nil
    }

    Then recompile kubelet.

  2. Upgrade docker-ce to v18.09.1 or later. In these versions, kmem accounting of runc is disabled in Docker.

  3. Reboot the system.

  4. To verify whether the kmem accounting feature of all the containers is disabled in the newly created pod, run the following command:

    $ cat /sys/fs/cgroup/memory/kubepods/burstable/pod<pod-uid>/<container-id>/memory.kmem.slabinfo

    If the returned result is as follows, it means that the kmem accounting feature is disabled successfully:

    > cat: memory.kmem.slabinfo: Input/output error

Bug #2: Network device reference count leak

Keyword: kernel:unregister_netdevice: waiting for eth0 to become free. Usage count = 1

Related issues in the community:

根据引用\[1\]中的信息,问题中的错误信息"unable to allocate memory"可能是由于内存分配失败引起的。根据引用\[2\]中的资料,这种错误可能有两个原因:一是内存中存在大量碎片,导致没有连续的内存可供分配;二是内存容量不足。因此,解决这个问题的方法可以从两个方面入手。首先,可以尝试通过增加绑定变量、减少应解析等开发角度的方法来改善和避免内存碎片问题。其次,如果内存容量不足,可以考虑扩大内存。根据引用\[3\]中的信息,分析了调用堆栈信息和内核代码,发现在执行bvec_alloc函数时,内核尝试通过kmem_cache_alloc来分配bio_vec对象。因此,可能需要进一步分析具体的调用堆栈和内核代码,以确定如何解决内存分配问题。 #### 引用[.reference_title] - *1* *3* [SLUB: Unable to allocate memory on node -1](https://blog.csdn.net/vic_qxz/article/details/92838823)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v91^insert_down1,239^v3^insert_chatgpt"}} ] [.reference_item] - *2* [ORA-04031: unable to allocate 4096 bytes of shared memory (解决方案)](https://blog.csdn.net/qq_35624642/article/details/53169792)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v91^insert_down1,239^v3^insert_chatgpt"}} ] [.reference_item] [ .reference_list ]
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值