Application OOM and Child Process Creation

Memory allocation failures in Linux applications

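Large embedded applications often consume huge amounts of memory. Applications that are "cobbled together" tend to pack several functional modules into a single program, for example an audio/video processing module and a system control module. This design brings a series of memory problems. The main one is that the audio/video part occupies a large amount of memory, which degrades the performance of the whole application. Drawing on past experience, I describe here one kind of memory allocation failure that is related to child process creation. When an application requests memory that the Linux kernel cannot supply, the kernel may, depending on its configuration, select a process and kill it. This is the job of the OOM killer (not to be confused with the lowmemorykiller in Android kernels):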

[  113.092762] Tasks state (memory values in pages):
[  113.097532] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[  113.106311] [    136]    81   136      651       40    32768        0             0 ubusd
[  113.114629] [    137]     0   137      962       45    32768        0             0 ash
[  113.122769] [    138]     0   138      599       18    32768        0             0 askfirst
[  113.131360] [    284]   514   284      756       72    32768        0             0 logd
[  113.139589] [    419]   453   419      698       63    36864        0             0 dnsmasq
[  113.148071] [    481]     0   481      666       17    32768        0             0 dropbear
[  113.156652] [    579]     0   579     1455       50    36864        0             0 hostapd
[  113.165144] [    580]     0   580     1455       51    49152        0             0 wpa_supplicant
[  113.174253] [    641]     0   641      766       90    32768        0             0 netifd
[  113.182655] [    698]     0   698      693       65    36864        0             0 odhcpd
[  113.191060] [    912]     0   912      928       20    36864        0             0 udhcpc
[  113.199463] [    927]     0   927      984       41    36864        0             0 ntpd
[  113.207679] [   1099]     0  1099   209490   208986  1703936        0             0 fork-vfork
[  113.216431] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=fork-vfork,pid=1099,uid=0
[  113.229448] Out of memory: Killed process 1099 (fork-vfork) total-vm:837960kB, anon-rss:835944kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:1664kB oom_score_adj:0
[  113.460009] oom_reaper: reaped process 1099 (fork-vfork), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Killed

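As shown above, my test program fork-vfork was killed because the kernel could not supply the memory it had asked for. The system configuration can be changed so that the allocation fails instead of the OOM killer being triggered; the application then gets a null pointer back from malloc. The setting that matters here is vm.overcommit_memory=2, which turns on strict commit accounting. vm.oom_kill_allocating_task=0 is the default value and only tells the OOM killer to pick its victim by score rather than kill the task that triggered the OOM: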

sysctl vm.oom_kill_allocating_task=0
sysctl vm.overcommit_memory=2

Child process creation and copy-on-write (COW)

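When fork is called, the kernel duplicates the parent's address space for the child. Linux does this with copy-on-write, so page contents are not copied up front, but in my tests COW did not make fork cheap or safe for a large process: the kernel still has to duplicate the page tables and, with vm.overcommit_memory=2, account for a second copy of the parent's writable private memory. COW is meant to be transparent to the application, and for practical purposes it is safer to ignore it and assume that fork (actually implemented with clone) needs room for a copy of the address space. When the system is short of memory, creating a child process therefore fails: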

Memory allocated: 0x15/0x80, 168.00 MB / 1024.00 MB
	`forked parent pid: 1423, child pid: 1445
	Parent: parent pid: 1423, child pid: 0
		time spent in `fork: 10336 micro seconds
Memory allocated: 0x16/0x80, 176.00 MB / 1024.00 MB
	`forked parent pid: 1423, child pid: 1446
	Parent: parent pid: 1423, child pid: 0
		time spent in `fork: 10743 micro seconds
Memory allocated: 0x17/0x80, 184.00 MB / 1024.00 MB
	`forked parent pid: 1423, child pid: 1447
	Parent: parent pid: 1423, child pid: 0
		time spent in `fork: 11156 micro seconds
Memory allocated: 0x18/0x80, 192.00 MB / 1024.00 MB
	`forked parent pid: 1423, child pid: 1448
	Parent: parent pid: 1423, child pid: 0
		time spent in `fork: 11720 micro seconds
Memory allocated: 0x19/0x80, 200.00 MB / 1024.00 MB
	`forked parent pid: 1423, child pid: 1449
	Parent: parent pid: 1423, child pid: 0
		time spent in `fork: 12192 micro seconds
Memory allocated: 0x1a/0x80, 208.00 MB / 1024.00 MB
	`forked parent pid: 1423, child pid: 1450
	Parent: parent pid: 1423, child pid: 0
		time spent in `fork: 12589 micro seconds
Memory allocated: 0x1b/0x80, 216.00 MB / 1024.00 MB
	`forked parent pid: 1423, child pid: 1451
	Parent: parent pid: 1423, child pid: 0
		time spent in `fork: 12950 micro seconds
Memory allocated: 0x1c/0x80, 224.00 MB / 1024.00 MB
	Error, failed to `fork process: Cannot allocate memory
Memory allocated: 0x1d/0x80, 232.00 MB / 1024.00 MB
	Error, failed to `fork process: Cannot allocate memory
Memory allocated: 0x1e/0x80, 240.00 MB / 1024.00 MB
	Error, failed to `fork process: Cannot allocate memory

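Note that the test program fork-vfork can no longer create a child process once it has allocated too much memory, even though it can still allocate new memory for itself after fork has failed: a case of "one mountain cannot hold two tigers". In addition, the time spent in fork grows with the amount of allocated memory (from 10336 to 12950 microseconds in the run above). COW does not remove this cost, since the parent's page tables still have to be duplicated on every fork. More importantly, a large-memory application should avoid creating child processes frequently, otherwise its performance suffers.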

Creating processes with vfork to avoid memory copying

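When a child process is created with the vfork system call, the parent does not continue until the child has exited or has called execve. The parent's address space is not duplicated for the child, which saves time. One consequence is that the child may "accidentally" modify data in the parent's memory: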

Memory allocated: 0x1f/0x80, 248.00 MB / 1024.00 MB
	vforked parent pid: 1452, child pid: 1483
	Parent: parent pid: 1452, child pid: 2021
		time spent in vfork: 199 micro seconds
Memory allocated: 0x20/0x80, 256.00 MB / 1024.00 MB
	vforked parent pid: 1452, child pid: 1484
	Parent: parent pid: 1452, child pid: 2021
		time spent in vfork: 189 micro seconds
Memory allocated: 0x21/0x80, 264.00 MB / 1024.00 MB
	vforked parent pid: 1452, child pid: 1485
	Parent: parent pid: 1452, child pid: 2021
		time spent in vfork: 179 micro seconds
Memory allocated: 0x22/0x80, 272.00 MB / 1024.00 MB
	vforked parent pid: 1452, child pid: 1486
	Parent: parent pid: 1452, child pid: 2021
		time spent in vfork: 204 micro seconds
Memory allocated: 0x23/0x80, 280.00 MB / 1024.00 MB
	vforked parent pid: 1452, child pid: 1487
	Parent: parent pid: 1452, child pid: 2021
		time spent in vfork: 178 micro seconds
Memory allocated: 0x24/0x80, 288.00 MB / 1024.00 MB
	vforked parent pid: 1452, child pid: 1488
	Parent: parent pid: 1452, child pid: 2021
		time spent in vfork: 201 micro seconds

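As shown above, the time spent in vfork is roughly constant and does not grow with the amount of allocated memory. Nothing is copied: the child runs directly in the parent's address space. The price is that the child overwrites the location that the parent's pidvp pointer refers to with 2021, and this change is visible in the parent: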

    if (pid == 0) {
        *pidvp = getpid();
        fprintf(stdout, "\t%sed parent pid: %ld, child pid: %ld\n",
            method, (long) getppid(), (long) *pidvp);
        fflush(stdout);
        *pidvp = 2021;
        _exit(0);
    }

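Creating a child process stalls the parent, either while fork duplicates the address space or until the child created by vfork exits or loads a new process image. (On Linux, vfork suspends only the calling thread, but other threads still contend with the child for the shared address space.) A vfork child may also modify the parent's memory or system resources while responding to something asynchronous, for example when a signal is handled in the child. For these reasons, vfork cannot be recommended as a replacement for fork in large-memory applications: such applications should avoid creating child processes as far as possible. A common approach on desktop systems is to implement each module of the system as its own process and to connect the modules into a coherent whole through inter-process communication, using components such as D-Bus. This raises cohesion, reduces coupling, and makes each module easier to maintain and extend, which in turn improves development efficiency for embedded application software.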

Source code of the test program

The source code of my fork-vfork test program is as follows:

/*
 * Created by yeholmes@outlook.com
 *
 * fork/vfork test
 *
 * 2021/09/25
 */

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdint.h>
#include <time.h>

#define MEMBLOCK_MAX_BLOCKS        512
#define MEMBLOCK_DEFAULT_BLOCKS    64
#define MEMBLOCK_MAX_SIZE          0x1000000    /* 16MB */
#define MEMBLOCK_DEFAULT_SIZE      65536        /* 64K */

typedef pid_t (* fork_func_t)(void);
static fork_func_t g_fork = fork;

struct memblock {
    unsigned int num_blocks;
    unsigned int size_block;
    uint64_t size_total;
    unsigned char * * blocks;
    int ranfd;
};

static void nano_sleep(unsigned int msec)
{
    struct timespec tspec;

    tspec.tv_sec = (time_t) (msec / 1000);
    tspec.tv_nsec = (long) ((msec % 1000) * 1000000);

    nanosleep(&tspec, NULL);
}

static unsigned int memblock_count(const struct memblock * mb, int verbose)
{
    unsigned int idx;
    unsigned int rval;

    rval = 0;
    for (idx = 0; idx < mb->num_blocks; ++idx) {
        if (mb->blocks[idx] != NULL)
            rval++;
    }

    if (verbose) {
        double dval0, dval1;
        uint64_t size;
        size = (uint64_t) rval;
        size *= (uint64_t) mb->size_block;

        dval0 = (double) size;
        dval1 = (double) mb->size_total;
        fprintf(stdout, "Memory allocated: %#x/%#x, %.02f MB / %.02f MB\n",
            rval, mb->num_blocks, dval0 / 1048576.0, dval1 / 1048576.0);
        fflush(stdout);
    }

    return rval;
}

static void memblock_destory(struct memblock * mb)
{
    unsigned int idx;

    close(mb->ranfd);
    mb->ranfd = -1;
    for (idx = 0; idx < mb->num_blocks; ++idx) {
        if (mb->blocks[idx] != NULL) {
            free(mb->blocks[idx]);
            mb->blocks[idx] = NULL;
        }
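    }
    /* release the pointer table allocated in memblock_init() */
    free((void *) mb->blocks);
    mb->blocks = NULL;
}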

static int memblock_free(struct memblock * mb)
{
    ssize_t rl1;
    unsigned int avail;
    unsigned int freeidx;
    unsigned int idx, count;

    avail = memblock_count(mb, 0);
    if (avail == 0)
        goto err0;

    freeidx = 0;
    rl1 = read(mb->ranfd, &freeidx, sizeof(freeidx));
    if (rl1 != (ssize_t) sizeof(freeidx)) {
        fprintf(stderr, "Error, failed to read random device: %s\n",
            strerror(errno));
        fflush(stderr);
        return -1;
    }
    freeidx = freeidx % avail;

    count = 0;
    for (idx = 0; idx < mb->num_blocks; ++idx) {
        if (mb->blocks[idx] == NULL)
            continue;
        if (count == freeidx) {
            free(mb->blocks[idx]);
            mb->blocks[idx] = NULL;
            return 0;
        }
        count++;
    }

err0:
    fputs("Warning, no available memory block to free!\n", stderr);
    fflush(stderr);
    return -1;
}

static int memblock_alloc(struct memblock * mb)
{
    size_t sizeb, sizer;
    unsigned char * newblock;
    unsigned int idx, empty;

    empty = ~0u;
    newblock = NULL;
    for (idx = 0; idx < mb->num_blocks; ++idx) {
        if (mb->blocks[idx] == NULL) {
            empty = idx;
            break;
        }
    }

    if (empty == ~0u) {
        fputs("Warning, no memory block available\n", stderr);
        fflush(stderr);
        return -1;
    }

    sizeb = (size_t) mb->size_block;
    /* check again */
    if (sizeb == 0 || sizeb > MEMBLOCK_MAX_SIZE) {
        fprintf(stderr, "Error, invalid memory block size: %#x\n",
            (unsigned int) sizeb);
        fflush(stderr);
        return -1;
    }

    newblock = (unsigned char *) malloc(sizeb);
    if (newblock == NULL) {
        fprintf(stderr, "Error, system out of memory: %#x\n",
            (unsigned int) sizeb);
        fflush(stderr);
        return -1;
    }

    sizer = 0;
    while (sizer < sizeb) {
        ssize_t rval;
        size_t sizec = sizeb - sizer;
        if (sizec > 65536) {
            /* read small size each time,
             * because read of /dev/urandom might
             * fail due to large read size, such as 16MB
             */
            sizec = 65536;
        }
        rval = read(mb->ranfd, newblock + sizer, sizec);
        if (rval != (ssize_t) sizec) {
            fprintf(stderr, "Error, failed to read random device: %s\n",
                strerror(errno));
            fflush(stderr);
            break;
        }
        sizer += sizec;
    }

    if (sizer < sizeb) {
        free(newblock);
        return -1;
    }
    mb->blocks[empty] = newblock;
    return 0;
}

static int memblock_init(struct memblock * mb,
    int argc, char *argv[])
{
    unsigned int num_blocks;
    unsigned int size_block;
    uint64_t size_total = 0;

    num_blocks = MEMBLOCK_DEFAULT_BLOCKS;
    if (argc > 0x1) {
        num_blocks = (unsigned int) strtoul(argv[1], NULL, 0x0);
        if (num_blocks == 0 || num_blocks > MEMBLOCK_MAX_BLOCKS) {
            num_blocks = MEMBLOCK_DEFAULT_BLOCKS;
            fprintf(stderr, "Error, invalid number of blocks: %s\n", argv[1]);
            fflush(stderr);
        }
    }

    size_block = MEMBLOCK_DEFAULT_SIZE;
    if (argc > 0x2) {
        size_block = (unsigned int) strtoul(argv[2], NULL, 0x0);
        if (size_block == 0 || size_block > MEMBLOCK_MAX_SIZE) {
            size_block = MEMBLOCK_DEFAULT_SIZE;
            fprintf(stderr, "Error, invalid block size: %s\n", argv[2]);
            fflush(stderr);
        }
    }

    size_total = (uint64_t) num_blocks;
    size_total *= (uint64_t) size_block;
    fprintf(stdout, "INFO: memory blocks: %u, block size: %#x, total: 0x%x%08x\n",
        num_blocks, size_block, (unsigned int) (size_total >> 32), (unsigned int) size_total);
    fflush(stdout);

    mb->num_blocks = num_blocks;
    mb->size_block = size_block;
    mb->size_total = size_total;
    mb->blocks = (unsigned char * *) calloc(
        (size_t) (num_blocks + 0x1), sizeof(unsigned char *));
    if (mb->blocks == NULL) {
        fputs("Error, system out of memory!\n", stderr);
        fflush(stderr);
        return -1;
    }

    mb->ranfd = open("/dev/urandom", O_RDONLY);
    if (mb->ranfd == -1) {
        fprintf(stderr, "Error, failed to open random device: %s\n", strerror(errno));
        fflush(stderr);
        free((void *) mb->blocks);
        mb->blocks = NULL;
        return -1;
    }
    return 0;
}

static int fork_process(void)
{
    int stats;
    long interval;
    pid_t pid, pidr;
    const char * method;
    struct timespec now, then;
    volatile pid_t * pidvp = NULL;

    now.tv_sec = 0;
    now.tv_nsec = 0;
    then.tv_sec = 0;
    then.tv_nsec = 0;

    pidvp = (volatile pid_t *) malloc(4096); /* only the first pid_t is used */
    if (pidvp == NULL) {
        fputs("Error, system out of memory for pidvp!\n", stderr);
        fflush(stderr);
        return -1;
    }
    *pidvp = 0;

    method = (g_fork == fork) ? "`fork" : "vfork";
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &now);
    pid = g_fork();
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &then);
    if (pid == (pid_t) -1) {
        fprintf(stderr, "\tError, failed to %s process: %s\n",
            method, strerror(errno));
        fflush(stderr);
        free((void *) pidvp);
        pidvp = NULL;
        return -1;
    }

    if (pid == 0) {
        *pidvp = getpid();
        fprintf(stdout, "\t%sed parent pid: %ld, child pid: %ld\n",
            method, (long) getppid(), (long) *pidvp);
        fflush(stdout);
        *pidvp = 2021;
        _exit(0);
    }

again:
    stats = 0;
    pidr = waitpid(pid, &stats, 0);
    if (pidr == (pid_t) -1) {
        int err_n;
        err_n = errno;
        if (err_n == EINTR)
            goto again;
        fprintf(stderr, "\tError, failed to waitpid(%ld): %s\n",
            (long) pid, strerror(err_n));
        fflush(stderr);
        free((void *) pidvp);
        pidvp = NULL;
        return -1;
    }

    /* elapsed time in microseconds: 1 s = 1000000 us, 1 us = 1000 ns */
    interval = (long) (then.tv_sec - now.tv_sec) * 1000000L;
    interval += (then.tv_nsec - now.tv_nsec) / 1000;

    fprintf(stdout, "\tParent: parent pid: %ld, child pid: %ld\n\t\ttime spent in %s: %ld micro seconds\n",
        (long) getpid(), (long) *pidvp, method, interval);
    fflush(stdout);
    free((void *) pidvp);
    pidvp = NULL;
    return 0;
}

int main(int argc, char *argv[])
{
    int ret, rval;
    unsigned int count;
    unsigned int max_blocks;
    struct memblock mblock;

    rval = 0;
    count = 0;
    if (argc > 0x3) {
        g_fork = vfork;
        fputs("Using vfork to create child process.\n", stdout);
        fflush(stdout);
    } else {
        fputs("Using `fork to create child process.\n", stdout);
        fflush(stdout);
    }

    ret = memblock_init(&mblock, argc, argv);
    if (ret < 0)
        return 1;

    max_blocks = (mblock.num_blocks * 0x3) >> 0x2;
    while (count < max_blocks) {
        ret = memblock_alloc(&mblock);
        if (ret < 0) {
            rval = 2;
            goto err0;
        }
        count = memblock_count(&mblock, 0x1);
        fork_process();
        nano_sleep(250);
    }

again:
    while (count < mblock.num_blocks) {
        memblock_alloc(&mblock);
        count++;
        memblock_count(&mblock, 0x1);
        fork_process();
        nano_sleep(750);
    }

    while (count >= max_blocks) {
        ret = memblock_free(&mblock);
        if (ret < 0) {
            rval = 3;
            goto err0;
        }
        count = memblock_count(&mblock, 0x1);
        fork_process();
        nano_sleep(750);
    }
    goto again;

err0:
    memblock_destory(&mblock);
    return rval;
}

Build and run it as follows:

aarch64-linux-gnu-gcc -Wall -O1 -ggdb -fPIC -o fork-vfork fork-vfork.c
./fork-vfork 128 0x800000 # use fork
./fork-vfork 128 0x800000 1 # use vfork