Kernel parameter summary
Kernel parameter | Default | Purpose | Values |
---|---|---|---|
vm.panic_on_oom | 0 | Whether the kernel panics on OOM | 0: never panic on OOM; invoke the OOM killer instead. 1: panic on a system-wide OOM, but if the OOM is constrained by a cpuset, memory policy, or memcg, invoke the OOM killer instead of panicking. 2: always panic on OOM. |
vm.oom_kill_allocating_task | 0 | Decides which task is killed first when an OOM occurs | 0: kill the task with the highest badness score (roughly, the largest memory consumer). 1: kill the task whose allocation triggered the OOM. |
vm.oom_dump_tasks | 1 | Controls the task dump logged on OOM, covering each task's total virtual memory, RSS (physical memory), page-table usage, and so on | 0: disable the dump. Non-zero: the dump is printed in three cases: 1. an OOM-induced kernel panic; 2. no eligible task was found to kill; 3. an eligible task was found and killed. |
/proc/PID/oom_score | - | The badness score of the process, computed by the kernel from its memory consumption; the higher the score, the more likely the process is to be killed. Read-only. | - |
/proc/PID/oom_score_adj | 0 | User-defined adjustment to the badness score; -1000 exempts the process from OOM killing entirely. The higher the value, the more likely the process is to be killed. | [-1000, 1000] |
/proc/PID/oom_adj | 0 | Same purpose as oom_score_adj, but deprecated in the kernel. Writing either oom_score_adj or oom_adj makes the kernel convert and update the other automatically. | [-17, 15] |
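As a quick illustration of these knobs (a sketch: sshd is only an example of a process worth protecting, and plain sysctl writes do not survive a reboot):

```bash
# Read the current OOM-related sysctls
sysctl vm.panic_on_oom vm.oom_kill_allocating_task vm.oom_dump_tasks

# The equivalent procfs view
cat /proc/sys/vm/panic_on_oom

# Exempt a critical process (sshd here, as an example) from the OOM killer;
# -1000 is OOM_SCORE_ADJ_MIN, the "never kill" value from the table above
echo -1000 > /proc/$(pidof -s sshd)/oom_score_adj
```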
Listing the processes with the highest oom_score in the system
oomscore.sh
#!/bin/bash
for proc in $(find /proc -maxdepth 1 -regex '/proc/[0-9]+'); do
    printf "%4d %6d %s\n" \
        "$(cat $proc/oom_score)" \
        "$(basename $proc)" \
        "$(cat $proc/cmdline | tr '\0' ' ')"
done 2>/dev/null | sort -nr | head -n 10
Kernel source analysis
Kernel version: v3.10-rc7
The OOM killer code lives in mm/oom_kill.c.
The entry point is out_of_memory(), which is called whenever an OOM is triggered.
/**
* out_of_memory - kill the "best" process when we run out of memory
* @zonelist: zonelist pointer
* @gfp_mask: memory allocation flags
* @order: amount of memory being requested as a power of 2
* @nodemask: nodemask passed to page allocator
* @force_kill: true if a task must be killed, even if others are exiting
*
* If we run out of memory, we have the choice between either
* killing a random task (bad), letting the system crash (worse)
* OR try to be smart about which process to kill. Note that we
* don't have to be perfect here, we just have to be good.
*/
void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
int order, nodemask_t *nodemask, bool force_kill)
{
const nodemask_t *mpol_mask;
struct task_struct *p;
unsigned long totalpages;
unsigned long freed = 0;
unsigned int uninitialized_var(points);
// constraint defaults to CONSTRAINT_NONE
enum oom_constraint constraint = CONSTRAINT_NONE;
int killed = 0;
// Ask the modules registered on oom_notify_list to release some memory; if anything was freed (freed > 0), return without killing
blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
if (freed > 0)
/* Got some memory back in the last second. */
return;
/*
 * If current has a pending SIGKILL or is exiting, then automatically
 * select it. The goal is to allow it to allocate so that it may
 * quickly exit and free its memory.
 *
 * The task is flagged TIF_MEMDIE below, granting it access to memory
 * reserves so it can exit and release its memory quickly.
 */
if (fatal_signal_pending(current) || current->flags & PF_EXITING) {
set_thread_flag(TIF_MEMDIE);
return;
}
/*
 * Check if there were limitations on the allocation (only relevant for
 * NUMA) that may require different handling.
 * On NUMA systems this detects cpuset/mempolicy constraints;
 * on UMA systems it always returns CONSTRAINT_NONE.
 * It also computes totalpages, used later to normalize the score.
 */
constraint = constrained_alloc(zonelist, gfp_mask, nodemask,
&totalpages);
// nodemask only matters for CONSTRAINT_MEMORY_POLICY; for any other constraint it is set to NULL
mpol_mask = (constraint == CONSTRAINT_MEMORY_POLICY) ? nodemask : NULL;
// Honor /proc/sys/vm/panic_on_oom:
// 0: never panic; run the OOM killer instead
// 1: panic on a system-wide OOM, but run the OOM killer instead when the
//    OOM is constrained by a cpuset, memory policy, or memcg
// 2: always panic on OOM
check_panic_on_oom(constraint, gfp_mask, order, mpol_mask);
// If /proc/sys/vm/oom_kill_allocating_task is non-zero,
// and the current task has an address space (current->mm),
// and the current task is not exempt from the OOM killer,
// and its /proc/PID/oom_score_adj is not the minimum (-1000),
// then skip the scoring pass entirely and kill the allocating task itself.
if (sysctl_oom_kill_allocating_task && current->mm &&
!oom_unkillable_task(current, NULL, nodemask) &&
current->signal->oom_score_adj != OOM_SCORE_ADJ_MIN) {
get_task_struct(current);
// A task killed because its own allocation triggered the OOM is logged as "Out of memory (oom_kill_allocating_task)"
oom_kill_process(current, gfp_mask, order, 0, totalpages, NULL,
nodemask,
"Out of memory (oom_kill_allocating_task)");
goto out;
}
// Otherwise (oom_kill_allocating_task is 0, or the current task cannot be killed), scan for the best victim
p = select_bad_process(&points, totalpages, mpol_mask, force_kill);
/* Found nothing?!?! Either we hang forever, or we panic.
 * If no killable process exists, dump the memory state of all tasks and
 * panic with "no killable processes".
 */
if (!p) {
dump_header(NULL, gfp_mask, order, NULL, mpol_mask);
panic("Out of memory and no killable processes...\n");
}
// select_bad_process() returns ERR_PTR(-1UL) when a previously chosen victim
// is still exiting (or some task is already on its way out); in that case this
// OOM pass is skipped. Otherwise kill the chosen victim via oom_kill_process().
if (PTR_ERR(p) != -1UL) {
oom_kill_process(p, gfp_mask, order, points, totalpages, NULL,
nodemask, "Out of memory");
killed = 1;
}
out:
/*
* Give the killed threads a good chance of exiting before trying to
* allocate memory again.
*/
if (killed)
schedule_timeout_killable(1); // after the kill, yield the CPU so the victim has a chance to exit
}
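To watch this code path in action without destabilizing the whole host, one option is a memory cgroup: exceeding the limit triggers the memcg OOM path (mem_cgroup_out_of_memory()), which reuses the victim-selection and kill logic analyzed below. A sketch assuming a cgroup-v1 memory controller mounted at the usual location:

```bash
# Create a 50 MiB memory cgroup and join it
mkdir /sys/fs/cgroup/memory/oomtest
echo $((50*1024*1024)) > /sys/fs/cgroup/memory/oomtest/memory.limit_in_bytes
echo $$ > /sys/fs/cgroup/memory/oomtest/cgroup.procs

# tail buffers its entire input in memory, so it exceeds the limit and is killed
head -c 100M /dev/zero | tail

# The kernel log now contains the dump_header()/oom_kill_process() output
dmesg | tail -n 25
```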
check_panic_on_oom() decides, based on the /proc/sys/vm/panic_on_oom sysctl, whether the kernel must panic instead of running the OOM killer.
/*
* Determines whether the kernel must panic because of the panic_on_oom sysctl.
*/
void check_panic_on_oom(enum oom_constraint constraint, gfp_t gfp_mask,
int order, const nodemask_t *nodemask)
{
// panic_on_oom == 0: return immediately; the kernel never panics
if (likely(!sysctl_panic_on_oom))
return;
if (sysctl_panic_on_oom != 2) {
/*
 * panic_on_oom == 1 only affects CONSTRAINT_NONE, the kernel
 * does not panic for cpuset, mempolicy, or memcg allocation
 * failures.
 *
 * In other words, with panic_on_oom == 1 the kernel panics only when
 * constraint == CONSTRAINT_NONE, i.e. a genuine system-wide shortage.
 * On UMA (uniform memory access) systems the constraint is always
 * CONSTRAINT_NONE: an OOM always means the machine truly ran out of
 * memory, never a cpuset/memcg/mempolicy restriction. On NUMA
 * (non-uniform memory access) systems such constraints can trigger an
 * OOM even though the system as a whole still has plenty of memory.
 */
if (constraint != CONSTRAINT_NONE)
return;
}
// Reaching this point means we must panic. With panic_on_oom == 2 the message is tagged "compulsory"; with panic_on_oom == 1 it is tagged "system-wide".
dump_header(NULL, gfp_mask, order, NULL, nodemask);
panic("Out of memory: %s panic_on_oom is enabled\n",
sysctl_panic_on_oom == 2 ? "compulsory" : "system-wide");
}
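A common pairing for hosts that should reboot rather than limp along after an OOM (illustrative values, not persisted):

```bash
# Panic on any OOM regardless of constraints, then auto-reboot 10 seconds later
sysctl vm.panic_on_oom=2
sysctl kernel.panic=10
```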
select_bad_process() picks the victim to kill.
/*
* Simple selection loop. We chose the process with the highest
* number of 'points'.
*
* (not docbooked, we don't want this one cluttering up the manual)
*/
static struct task_struct *select_bad_process(unsigned int *ppoints,
unsigned long totalpages, const nodemask_t *nodemask,
bool force_kill)
{
struct task_struct *g, *p;
struct task_struct *chosen = NULL;
unsigned long chosen_points = 0;
rcu_read_lock(); // RCU read-side lock protects the walk over all tasks
do_each_thread(g, p) {
unsigned int points;
// oom_scan_process_thread() decides whether to skip this task, score it, select it outright, or abort the whole scan
switch (oom_scan_process_thread(p, totalpages, nodemask,
force_kill)) {
// a pre-selected task gets chosen_points = ULONG_MAX so nothing can outscore it
case OOM_SCAN_SELECT:
chosen = p;
chosen_points = ULONG_MAX;
/* fall through */
case OOM_SCAN_CONTINUE:
continue;
case OOM_SCAN_ABORT: // a task is already exiting or being OOM-killed
rcu_read_unlock();
return ERR_PTR(-1UL);
case OOM_SCAN_OK:
break;
};
// score the task; if it beats the current best, it becomes the new choice
points = oom_badness(p, NULL, nodemask, totalpages);
if (points > chosen_points) {
chosen = p;
chosen_points = points;
}
} while_each_thread(g, p);
if (chosen)
get_task_struct(chosen);
rcu_read_unlock();
*ppoints = chosen_points * 1000 / totalpages;
return chosen;
}
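Note the final normalization: chosen_points * 1000 / totalpages is the same 0..1000 scale that /proc/PID/oom_score reports, so the kernel's verdict can be eyeballed directly:

```bash
# oom_score is the normalized badness; oom_score_adj shifts it on the same scale
cat /proc/self/oom_score /proc/self/oom_score_adj
```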
oom_scan_process_thread() classifies each candidate task.
enum oom_scan_t oom_scan_process_thread(struct task_struct *task,
unsigned long totalpages, const nodemask_t *nodemask,
bool force_kill)
{
// the task has already exited; move on to the next one
if (task->exit_state)
return OOM_SCAN_CONTINUE;
// init, kernel threads, and other unkillable tasks are skipped
if (oom_unkillable_task(task, NULL, nodemask))
return OOM_SCAN_CONTINUE;
/*
 * This task already has access to memory reserves and is being killed.
 * Don't allow any other task to have access to the reserves.
 *
 * A task is already dying from a previous OOM kill, so abort this pass
 * (unless force_kill is set).
 */
if (test_tsk_thread_flag(task, TIF_MEMDIE)) {
if (unlikely(frozen(task)))
__thaw_task(task);
if (!force_kill)
return OOM_SCAN_ABORT;
}
// the task has no address space (mm is NULL); skip it
if (!task->mm)
return OOM_SCAN_CONTINUE;
/*
 * If task is allocating a lot of memory and has been marked to be
 * killed first if it triggers an oom, then select it.
 * oom_task_origin() is simply:
 *   !!(p->signal->oom_flags & OOM_FLAG_ORIGIN)
 */
if (oom_task_origin(task))
return OOM_SCAN_SELECT;
// the task is exiting and we are not forcing a kill
if (task->flags & PF_EXITING && !force_kill) {
/*
* If this task is not being ptraced on exit, then wait for it
* to finish before killing some other task unnecessarily.
*/
if (!(task->group_leader->ptrace & PT_TRACE_EXIT))
return OOM_SCAN_ABORT;
}
return OOM_SCAN_OK;
}
oom_badness() computes the badness score.
/**
* oom_badness - heuristic function to determine which candidate task to kill
* @p: task struct of which task we should calculate
* @totalpages: total present RAM allowed for page allocation
*
* The heuristic for determining which task to kill is made to be as simple and
* predictable as possible. The goal is to return the highest value for the
* task consuming the most memory to avoid subsequent oom failures.
*/
unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
const nodemask_t *nodemask, unsigned long totalpages)
{
long points;
long adj;
// unkillable tasks always score 0
if (oom_unkillable_task(p, memcg, nodemask))
return 0;
// The task may have detached its mm while one of its threads still holds
// a reference to it. find_lock_task_mm() walks the threads and returns one
// that still has the mm (with task_lock() held), or NULL if none does.
p = find_lock_task_mm(p);
if (!p)
return 0;
// Read the task's /proc/PID/oom_score_adj; OOM_SCORE_ADJ_MIN (-1000)
// means the task is exempt, so it scores 0.
adj = (long)p->signal->oom_score_adj;
if (adj == OOM_SCORE_ADJ_MIN) {
task_unlock(p);
return 0;
}
/*
 * The baseline for the badness score is the proportion of RAM that each
 * task's rss, pagetable and swap space use.
 *
 * get_mm_rss(): MM_FILEPAGES + MM_ANONPAGES counters, i.e. the file-backed
 *   and anonymous pages the task holds in physical memory
 * p->mm->nr_ptes: pages consumed by the task's page tables
 * get_mm_counter(p->mm, MM_SWAPENTS): pages the task has in swap
 */
points = get_mm_rss(p->mm) + p->mm->nr_ptes +
get_mm_counter(p->mm, MM_SWAPENTS);
task_unlock(p);
/*
 * Root processes get 3% bonus, just like the __vm_enough_memory()
 * implementation used by LSMs.
 *
 * A task with CAP_SYS_ADMIN gets a 3% discount: 30 points on the
 * 1000-point oom_score_adj scale.
 */
if (has_capability_noaudit(p, CAP_SYS_ADMIN))
adj -= 30;
/* Normalize to oom_score_adj units */
adj *= totalpages / 1000;
points += adj;
/*
* Never return 0 for an eligible task regardless of the root bonus and
* oom_score_adj (oom_score_adj can't be OOM_SCORE_ADJ_MIN here).
*/
return points > 0 ? points : 1;
}
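The formula can be approximated from userspace. The sketch below re-derives the normalized score for one PID from /proc/PID/status, using VmRSS, VmSwap and VmPTE as stand-ins for the rss, swapents and nr_ptes counters; it assumes 4 KiB pages and ignores the CAP_SYS_ADMIN bonus and the unkillable-task checks:

```bash
#!/bin/bash
# Usage: badness.sh PID -- rough userspace estimate of the v3.10 oom_badness()
pid=$1
rss_kb=$(awk '/^VmRSS:/ {print $2}' "/proc/$pid/status")
swap_kb=$(awk '/^VmSwap:/ {print $2}' "/proc/$pid/status")
pte_kb=$(awk '/^VmPTE:/ {print $2}' "/proc/$pid/status")
adj=$(cat "/proc/$pid/oom_score_adj")
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)

# Fields can be absent (e.g. VmSwap without swap accounting); default to 0
rss_kb=${rss_kb:-0}; swap_kb=${swap_kb:-0}; pte_kb=${pte_kb:-0}

points=$(( (rss_kb + swap_kb + pte_kb) / 4 ))        # kB -> 4 KiB pages
totalpages=$(( total_kb / 4 ))
points=$(( points + adj * (totalpages / 1000) ))     # apply oom_score_adj
(( points < 1 )) && points=1                         # eligible tasks never score 0
echo "estimated oom_score: $(( points * 1000 / totalpages ))"
```

Comparing the output with cat /proc/PID/oom_score shows how closely the estimate tracks the kernel's own calculation.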
oom_unkillable_task() decides which tasks must never be chosen as victims.
/* return true if the task is not adequate as candidate victim task. */
static bool oom_unkillable_task(struct task_struct *p,
const struct mem_cgroup *memcg, const nodemask_t *nodemask)
{
// PID 1 (init) must never be killed
if (is_global_init(p))
return true;
// kernel threads must never be killed
if (p->flags & PF_KTHREAD)
return true;
/* When mem_cgroup_out_of_memory() and p is not member of the group */
// during a memcg OOM, tasks outside the offending memcg are not candidates
if (memcg && !task_in_mem_cgroup(p, memcg))
return true;
/* p may not have freeable memory in nodemask */
// on NUMA, a task whose allowed nodes do not intersect the OOM'd nodemask cannot free useful memory, so skip it
if (!has_intersects_mems_allowed(p, nodemask))
return true;
return false;
}
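A quick sanity check of the first rule: because oom_badness() returns 0 for unkillable tasks, procfs reports a zero score for init:

```bash
# PID 1 is exempt via is_global_init(), so its reported badness is 0
cat /proc/1/oom_score
```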
oom_kill_process() actually performs the kill.
/*
* Must be called while holding a reference to p, which will be released upon
* returning.
*/
void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
unsigned int points, unsigned long totalpages,
struct mem_cgroup *memcg, nodemask_t *nodemask,
const char *message)
{
struct task_struct *victim = p;
struct task_struct *child;
struct task_struct *t = p;
struct mm_struct *mm;
unsigned int victim_points = 0;
static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
DEFAULT_RATELIMIT_BURST);
/*
 * If the task is already exiting, don't alarm the sysadmin or kill
 * its children or threads, just set TIF_MEMDIE so it can die quickly
 * (TIF_MEMDIE grants access to memory reserves, letting it exit faster).
 */
if (p->flags & PF_EXITING) {
set_tsk_thread_flag(p, TIF_MEMDIE);
put_task_struct(p);
return;
}
// Rate-limit the header; unless printing is suppressed, dump the task's memory state
if (__ratelimit(&oom_rs))
dump_header(p, gfp_mask, order, memcg, nodemask);
task_lock(p);
// log the message, PID, comm, and badness score
pr_err("%s: Kill process %d (%s) score %d or sacrifice child\n",
message, task_pid_nr(p), p->comm, points);
task_unlock(p);
/*
 * If any of p's children has a different mm and is eligible for kill,
 * the one with the highest oom_badness() score is sacrificed for its
 * parent. This attempts to lose the minimal amount of work done while
 * still freeing memory.
 */
read_lock(&tasklist_lock);
do {
list_for_each_entry(child, &t->children, sibling) {
unsigned int child_points;
if (child->mm == p->mm)
continue;
/*
* oom_badness() returns 0 if the thread is unkillable
*/
child_points = oom_badness(child, memcg, nodemask,
totalpages);
if (child_points > victim_points) {
put_task_struct(victim);
victim = child;
victim_points = child_points;
get_task_struct(victim);
}
}
} while_each_thread(p, t);
read_unlock(&tasklist_lock);
rcu_read_lock();
p = find_lock_task_mm(victim);
if (!p) {
rcu_read_unlock();
put_task_struct(victim);
return;
} else if (victim != p) {
get_task_struct(p);
put_task_struct(victim);
victim = p;
}
/* mm cannot safely be dereferenced after task_unlock(victim) */
mm = victim->mm;
pr_err("Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB\n",
task_pid_nr(victim), victim->comm, K(victim->mm->total_vm),
K(get_mm_counter(victim->mm, MM_ANONPAGES)),
K(get_mm_counter(victim->mm, MM_FILEPAGES)));
task_unlock(victim);
/*
* Kill all user processes sharing victim->mm in other thread groups, if
* any. They don't get access to memory reserves, though, to avoid
* depletion of all memory. This prevents mm->mmap_sem livelock when an
* oom killed thread cannot exit because it requires the semaphore and
* its contended by another thread trying to allocate memory itself.
* That thread will now get access to memory reserves since it has a
* pending fatal signal.
*
* Every task that shares the victim's mm, is not a kernel thread, and
* does not have oom_score_adj == -1000 is killed as well; these
* collateral kills are logged with "sharing same memory".
*/
for_each_process(p)
if (p->mm == mm && !same_thread_group(p, victim) &&
!(p->flags & PF_KTHREAD)) {
if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)
continue;
task_lock(p); /* Protect ->comm from prctl() */
pr_err("Kill process %d (%s) sharing same memory\n",
task_pid_nr(p), p->comm);
task_unlock(p);
do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true); // deliver SIGKILL
}
rcu_read_unlock();
set_tsk_thread_flag(victim, TIF_MEMDIE); // mark the victim as selected by the OOM killer and on its way out
do_send_sig_info(SIGKILL, SEND_SIG_FORCED, victim, true);
put_task_struct(victim);
}
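The pr_err() strings above make OOM kills easy to spot in the kernel log. For live monitoring (dmesg -w needs a reasonably recent util-linux):

```bash
# Match the messages emitted above: the "sacrifice child" header, the
# collateral kills, and the final "Killed process" line
dmesg -w | grep -Ei 'sacrifice child|sharing same memory|killed process'
```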
dump_header() and dump_tasks() log the memory state of the system and its tasks.
static void dump_header(struct task_struct *p, gfp_t gfp_mask, int order,
struct mem_cgroup *memcg, const nodemask_t *nodemask)
{
task_lock(current);
// log which task invoked the OOM killer
pr_warning("%s invoked oom-killer: gfp_mask=0x%x, order=%d, "
"oom_score_adj=%hd\n",
current->comm, gfp_mask, order,
current->signal->oom_score_adj);
// log the current task's cpuset: its name and which memory nodes it may allocate from
cpuset_print_task_mems_allowed(current);
task_unlock(current);
// dump the kernel stack
dump_stack();
// for a memcg-constrained OOM, print the memcg-specific info;
// otherwise show system-wide memory usage
if (memcg)
mem_cgroup_print_oom_info(memcg, p);
else
show_mem(SHOW_MEM_FILTER_NODES);
// with vm.oom_dump_tasks enabled, dump the memory state of every userspace task
if (sysctl_oom_dump_tasks)
dump_tasks(memcg, nodemask);
}
static void dump_tasks(const struct mem_cgroup *memcg, const nodemask_t *nodemask)
{
struct task_struct *p;
struct task_struct *task;
pr_info("[ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name\n");
rcu_read_lock();
// walk every process in the system
for_each_process(p) {
// skip unkillable tasks
if (oom_unkillable_task(p, memcg, nodemask))
continue;
// find a thread of p that still holds the mm; the returned task is task_lock()ed
task = find_lock_task_mm(p);
if (!task) {
/*
* This is a kthread or all of p's threads have already
* detached their mm's. There's no need to report
* them; they can't be oom killed anyway.
*/
continue;
}
// for each eligible task, print pid, uid, tgid, total_vm, rss, nr_ptes, swapents, oom_score_adj, and name
pr_info("[%5d] %5d %5d %8lu %8lu %7lu %8lu %5hd %s\n",
task->pid, from_kuid(&init_user_ns, task_uid(task)),
task->tgid, task->mm->total_vm, get_mm_rss(task->mm),
task->mm->nr_ptes,
get_mm_counter(task->mm, MM_SWAPENTS),
task->signal->oom_score_adj, task->comm);
// drop the task_lock() acquired in find_lock_task_mm()
task_unlock(task);
}
rcu_read_unlock();
}
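The task dump lands in the kernel ring buffer; to pull the table from a past OOM event on a systemd host:

```bash
# Print the 40 log lines following the OOM trigger, which include the task dump
journalctl -k | grep -A 40 'invoked oom-killer'
```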
Appendix
Is the current system NUMA?
Check with lscpu:
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8 # 8 logical CPUs in total
On-line CPU(s) list: 0-7
Thread(s) per core: 2 # 2 threads per core
Core(s) per socket: 4 # 4 cores per socket
Socket(s): 1 # 1 socket in total
NUMA node(s): 1 # 1 NUMA node in total, i.e. one node spanning the socket
Vendor ID: GenuineIntel
CPU family: 6
Model: 106
Model name: Intel(R) Xeon(R) Platinum 8338C CPU @ 2.60GHz
Stepping: 6
CPU MHz: 2593.906
BogoMIPS: 5187.81
Virtualization: VT-x
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 4096K
L3 cache: 16384K
NUMA node0 CPU(s): 0-7
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat avx512vbmi umip pku ospke avx512_vpopcntdq la57 rdpid arch_capabilities
Total threads = 1 socket x 4 cores x 2 threads = 8 logical CPUs
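numactl gives another view of the same topology (assuming the numactl package is installed):

```bash
# One block per NUMA node listing its CPUs and memory; a single node 0 means
# the machine is effectively UMA
numactl --hardware
```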
Commands related to Linux capabilities
yum install libcap-ng-utils   # provides pscap and filecap; getpcaps/getcap ship with libcap
# list all processes that hold capabilities
pscap -a
# show the capabilities of a given process
getpcaps PID
# list files on the filesystem that carry file capabilities
filecap
# show the file capabilities of an executable
getcap FILE