Linux Process Priority

I. How Linux represents process priority

struct task_struct {
        ...
    int prio, static_prio, normal_prio;
    unsigned int rt_priority;
        ...
}
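Field descriptions:
static_prio: the static priority, assigned when the process starts. It can be changed with the nice system call. Range: 100 to 139; a smaller value means a higher static priority.
rt_priority: the real-time priority. Range: 0 to 99; a larger value means a higher priority.
normal_prio: depends on the static priority and the scheduling policy.
prio: the dynamic priority, set by the kernel, and the value the scheduler ultimately uses. In some situations the kernel temporarily raises a process's dynamic priority. Range: 0 to 139; a smaller value means a higher priority.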

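1. static_prio: static priority

  The static priority does not change over time, and the kernel never modifies it on its own initiative. It can be changed with the nice system call.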

/*
 * Convert user-nice values [ -20 ... 0 ... 19 ]
 * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
 * and back.
 */
#define NICE_TO_PRIO(nice)    (MAX_RT_PRIO + (nice) + 20)
#define PRIO_TO_NICE(prio)    ((prio) - MAX_RT_PRIO - 20)
#define TASK_NICE(p)        PRIO_TO_NICE((p)->static_prio)

/*
 * 'User priority' is the nice value converted to something we
 * can work with better when scaling various scheduler parameters,
 * it's a [ 0 ... 39 ] range.
 */
#define USER_PRIO(p)        ((p)-MAX_RT_PRIO)
#define TASK_USER_PRIO(p)    USER_PRIO((p)->static_prio)
#define MAX_USER_PRIO        (USER_PRIO(MAX_PRIO))

/********************* function set_user_nice *****************************/
p->static_prio = NICE_TO_PRIO(nice);        // the nice value is converted with NICE_TO_PRIO() and stored in static_prio

As the code above shows, the kernel updates static_prio by converting the nice value with NICE_TO_PRIO(nice). static_prio is computed as follows:

static_prio = MAX_RT_PRIO + nice +20

  MAX_RT_PRIO is 100 and nice ranges from -20 to +19, so static_prio ranges from 100 to 139. The smaller the static_prio value, the higher the static priority of the process.

2. rt_priority: real-time priority

  rt_priority ranges from 0 to 99 and is only meaningful for real-time processes. From the expression:

prio = MAX_RT_PRIO-1 - p->rt_priority; 

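  it follows that a larger rt_priority gives a smaller prio, so a larger real-time priority (rt_priority) means a higher process priority.

  The value of rt_priority also depends on the scheduling policy; the kernel assigns it in the __setscheduler function.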

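3. normal_prio: normal priority

  normal_prio depends on the static priority and the scheduling policy. The kernel sets it in the __setscheduler function.

Process type                  Scheduler   normal_prio
EDF real-time process         EDF         MAX_DL_PRIO-1 = -1
Ordinary real-time process    RT          MAX_RT_PRIO-1 - p->rt_priority = 99 - rt_priority
Ordinary process              CFS         __normal_prio(p) = static_prio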

core.c - kernel/sched/core.c - Linux source code v4.9.337 - Bootlin

/*
 * __normal_prio - return the priority that is based on the static prio
 * For an ordinary (non-real-time) process, normal_prio is simply static_prio.
 */
static inline int __normal_prio(struct task_struct *p)
{
    return p->static_prio;
}

/*
 * Calculate the expected normal priority: i.e. priority
 * without taking RT-inheritance into account. Might be
 * boosted by interactivity modifiers. Changes upon fork,
 * setprio syscalls, and whenever the interactivity
 * estimator recalculates.
 */
static inline int normal_prio(struct task_struct *p)
{
    int prio;

    if (task_has_dl_policy(p))              /* deadline (EDF) real-time task */
            prio = MAX_DL_PRIO-1;
    else if (task_has_rt_policy(p))         /* ordinary real-time task */
            prio = MAX_RT_PRIO-1 - p->rt_priority;
    else                                    /* ordinary task */
            prio = __normal_prio(p);
    return prio;
}

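4. prio: dynamic priority

  prio is the priority value the scheduler ultimately uses, i.e. the value it actually looks at when picking a process. The smaller the prio value, the higher the priority. prio ranges from 0 to MAX_PRIO - 1, that is, 0 to 139 inclusive. Depending on the scheduling policy, this range splits into two intervals: 0 to 99 for real-time processes and 100 to 139 for non-real-time processes. That is:

  • For a real-time process, prio is computed from the real-time priority (rt_priority): prio = MAX_RT_PRIO - 1 - rt_priority
  • For a non-real-time process, prio comes from the static priority (static_prio): prio = normal_prio = static_prio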
p->prio = effective_prio(p);

core.c - kernel/sched/core.c - Linux source code v4.9.337 - Bootlin

/*
 * Calculate the current priority, i.e. the priority
 * taken into account by the scheduler. This value might
 * be boosted by RT tasks, or might be boosted by
 * interactivity modifiers. Will be RT if the task got
 * RT-boosted. If not then it returns p->normal_prio.
 */
static int effective_prio(struct task_struct *p)
{
    p->normal_prio = normal_prio(p);
    /*
     * If we are RT tasks or we were boosted to RT priority,
     * keep the priority unchanged. Otherwise, update priority
     * to the normal priority:
     */
    if (!rt_prio(p->prio))
            return p->normal_prio;
    return p->prio;
}

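Note that effective_prio first sets the normal priority, so a call such as p->prio = effective_prio(p) updates two priorities at once: the normal priority normal_prio and the dynamic priority prio.

The dynamic priority is therefore computed as follows:

  • Set the normal priority of the process (99 - rt_priority for a real-time process, static_prio for an ordinary process)
  • Compute the dynamic priority of the process (a real-time or RT-boosted process keeps its current prio; for an ordinary process, prio is its normal priority)

What is the difference between prio and normal_prio?

The priority the scheduler actually considers is stored in prio. Because the kernel sometimes needs to raise a process's priority temporarily, a separate prio field is needed to hold that value. These changes are not permanent, so the static priority static_prio and the normal priority normal_prio are unaffected.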

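II. How to view process priority

1. cat /proc/PID/stat

The meaning of each field is documented in the proc manual page (man proc). The fields related to priority are:

Field 18: priority (the priority of the process), numerically equal to the dynamic priority minus 100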

task_struct->prio - MAX_RT_PRIO;//MAX_RT_PRIO == 100


Field 19: nice (the nice value of the process), numerically equal to the static priority minus 120

task_struct->static_prio-DEFAULT_PRIO //DEFAULT_PRIO==120

The priority and nice values printed by ps and top are derived from these two values under /proc.

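Source implementation:

 The kernel header linux/sched/prio.h defines the macros related to process priority. The two used in these conversions are MAX_RT_PRIO and DEFAULT_PRIO:

MAX_RT_PRIO     100
DEFAULT_PRIO    120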

array.c - fs/proc/array.c - Linux source code v4.9.337 - Bootlin

static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
			struct pid *pid, struct task_struct *task, int whole)
{
	int priority, nice;
......
	/* scale priority and nice values from timeslices to -20..20 */
	/* to make it look like a "normal" Unix priority/nice value  */
	priority = task_prio(task);
	nice = task_nice(task);

......
	seq_put_decimal_ll(m, " ", priority);
	seq_put_decimal_ll(m, " ", nice);

......
	return 0;
}
#define PRIO_TO_NICE(prio)	((prio) - DEFAULT_PRIO) //DEFAULT_PRIO==120

/**
 * task_prio - return the priority value of a given task.
 * @p: the task in question.
 *
 * Return: The priority value as seen by users in /proc.
 * RT tasks are offset by -200. Normal tasks are centered
 * around 0, value goes from -16 to +15.
 */
int task_prio(const struct task_struct *p)
{
	return p->prio - MAX_RT_PRIO;//MAX_RT_PRIO == 100
}


/**
 * task_nice - return the nice value of a given task.
 * @p: the task in question.
 *
 * Return: The nice value [ -20 ... 0 ... 19 ].
 */
static inline int task_nice(const struct task_struct *p)
{
	return PRIO_TO_NICE((p)->static_prio);
}

2. cat /proc/PID/sched

policy: the scheduling policy

prio: the dynamic priority

ref:

Linux 进程管理(五)进程调度的调试 | Matrix

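 3. The ps command

ps -l prints process information in long format. It gives more detail than the default output, including the priority, state, and parent process ID of each process.

The fields have the following meanings:

  • F: process flags; 4 means the process used superuser privileges
  • S: process state (STAT)
  • UID: user ID of the process owner
  • PID: process ID
  • PPID: parent process ID
  • C: CPU utilization
  • PRI: priority of the process
  • NI: nice value of the process
  • SZ: virtual memory size of the process (in pages)
  • RSS: resident set size of the process (physical memory actually in use, in KB)
  • WCHAN: kernel function in which the process is waiting
  • TTY: terminal type
  • TIME: CPU time used by the process
  • COMMAND: command that started the process

ps_PRI = static_prio - 40

Here PRI ranges over [-40, 99]. In other words, a PRI of 80 in ps corresponds to a nice value of 0 and to a static priority of 120.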

Source analysis:

procps-ng / procps · GitLab

library\readproc.c

It reads the priority and nice fields from /proc/PID/stat and stores them in proc_t->priority and proc_t->nice.

///

// Reads /proc/*/stat files, being careful not to trip over processes with
// names like ":-) 1 2 3 4 5 6".
static int stat2proc (const char *S, proc_t *restrict P) {
......
    sscanf(S,
       "%c "                      // state
       "%d %d %d %d %d "          // ppid, pgrp, sid, tty_nr, tty_pgrp
       "%lu %lu %lu %lu %lu "     // flags, min_flt, cmin_flt, maj_flt, cmaj_flt
       "%llu %llu %llu %llu "     // utime, stime, cutime, cstime
       "%d %d "                   // priority, nice
       "%d "                      // num_threads
       "%lu "                     // 'alarm' == it_real_value (obsolete, always 0)
       "%llu "                    // start_time
       "%lu "                     // vsize
       "%lu "                     // rss
       "%lu %lu %lu %lu %lu %lu " // rsslim, start_code, end_code, start_stack, esp, eip
       "%*s %*s %*s %*s "         // pending, blocked, sigign, sigcatch                      <=== DISCARDED
       "%lu %*u %*u "             // 0 (former wchan), 0, 0                                  <=== Placeholders only
       "%d %d "                   // exit_signal, task_cpu
       "%d %d "                   // rt_priority, policy (sched)
       "%llu %llu %llu",          // blkio_ticks, gtime, cgtime
       &P->state,
       &P->ppid, &P->pgrp, &P->session, &P->tty, &P->tpgid,
       &P->flags, &P->min_flt, &P->cmin_flt, &P->maj_flt, &P->cmaj_flt,
       &P->utime, &P->stime, &P->cutime, &P->cstime,
       &P->priority, &P->nice,
       &P->nlwp,
       &P->alarm,
       &P->start_time,
       &P->vsize,
       &P->rss,
       &P->rss_rlim, &P->start_code, &P->end_code, &P->start_stack, &P->kstk_esp, &P->kstk_eip,
/*     P->signal, P->blocked, P->sigignore, P->sigcatch,   */ /* can't use */
       &P->wchan, /* &P->nswap, &P->cnswap, */  /* nswap and cnswap dead for 2.4.xx and up */
/* -- Linux 2.0.35 ends here -- */
       &P->exit_signal, &P->processor,  /* 2.2.1 ends with "exit_signal" */
/* -- Linux 2.2.8 to 2.5.17 end here -- */
       &P->rtprio, &P->sched,  /* both added to 2.5.18 */
       &P->blkio_tics, &P->gtime, &P->cgtime
    );
......
}

 src\ps\output.c

The value of proc_t->priority is then converted in several ways; rSv(PRIORITY, s_int, pp) == proc_t->priority

//

// "PRI" is created by "opri", or by "pri" when -c is used.
//
// Unix98 only specifies that a high "PRI" is low priority.
// Sun and SCO add the -c behavior. Sun defines "pri" and "opri".
// Linux may use "priority" for historical purposes.
//
// According to the kernel's fs/proc/array.c and kernel/sched.c source,
// the kernel reports it in /proc via this:
//        p->prio - MAX_RT_PRIO
// such that "RT tasks are offset by -200. Normal tasks are centered
// around 0, value goes from -16 to +15" but who knows if that is
// before or after the conversion...
//
// <linux/sched.h> says:
// MAX_RT_PRIO is currently 100.       (so we see 0 in /proc)
// RT tasks have a p->prio of 0 to 99. (so we see -100 to -1)
// non-RT tasks are from 100 to 139.   (so we see 0 to 39)
// Lower values have higher priority, as in the UNIX standard.
//
// In any case, pp->priority+100 should get us back to what the kernel
// has for p->prio.
//
// Test results with the "yes" program on a 2.6.x kernel:
//
// # ps -C19,_20 -o pri,opri,intpri,priority,ni,pcpu,pid,comm
// PRI PRI PRI PRI  NI %CPU  PID COMMAND
//   0  99  99  39  19 10.6 8686 19
//  34  65  65   5 -20 94.7 8687 _20
//
// Grrr. So the UNIX standard "PRI" must NOT be from "pri".
// Either of the others will do. We use "opri" for this.
// (and use "pri" when the "-c" option is used)
// Probably we should have Linux-specific "pri_for_l" and "pri_for_lc"
//
// sched_get_priority_min.2 says the Linux static priority is
// 1..99 for RT and 0 for other... maybe 100 is kernel-only?
//
// A nice range would be -99..0 for RT and 1..40 for normal,
// which is pp->priority+1. (3-digit max, positive is normal,
// negative or 0 is RT, and meets the standard for PRI)
//

// legal as UNIX "PRI"
// "priority"         (was -20..20, now -100..39)
static int pr_priority(char *restrict const outbuf, const proc_t *restrict const pp){    /* -20..20 */
setREL1(PRIORITY)
    return snprintf(outbuf, COLWID, "%d", rSv(PRIORITY, s_int, pp));
}

// legal as UNIX "PRI"
// "intpri" and "opri" (was 39..79, now  -40..99)
static int pr_opri(char *restrict const outbuf, const proc_t *restrict const pp){        /* 39..79 */
setREL1(PRIORITY)
    return snprintf(outbuf, COLWID, "%d", 60 + rSv(PRIORITY, s_int, pp));
}

// legal as UNIX "PRI"
// "pri_foo"   --  match up w/ nice values of sleeping processes (-120..19)
static int pr_pri_foo(char *restrict const outbuf, const proc_t *restrict const pp){
setREL1(PRIORITY)
    return snprintf(outbuf, COLWID, "%d", rSv(PRIORITY, s_int, pp) - 20);
}

// legal as UNIX "PRI"
// "pri_bar"   --  makes RT pri show as negative       (-99..40)
static int pr_pri_bar(char *restrict const outbuf, const proc_t *restrict const pp){
setREL1(PRIORITY)
    return snprintf(outbuf, COLWID, "%d", rSv(PRIORITY, s_int, pp) + 1);
}

// legal as UNIX "PRI"
// "pri_baz"   --  the kernel's ->prio value, as of Linux 2.6.8     (1..140)
static int pr_pri_baz(char *restrict const outbuf, const proc_t *restrict const pp){
setREL1(PRIORITY)
    return snprintf(outbuf, COLWID, "%d", rSv(PRIORITY, s_int, pp) + 100);
}

// not legal as UNIX "PRI"
// "pri"               (was 20..60, now    0..139)
static int pr_pri(char *restrict const outbuf, const proc_t *restrict const pp){         /* 20..60 */
setREL1(PRIORITY)
    return snprintf(outbuf, COLWID, "%d", 39 - rSv(PRIORITY, s_int, pp));
}

// not legal as UNIX "PRI"
// "pri_api"   --  match up w/ RT API    (-40..99)
static int pr_pri_api(char *restrict const outbuf, const proc_t *restrict const pp){
setREL1(PRIORITY)
    return snprintf(outbuf, COLWID, "%d", -1 - rSv(PRIORITY, s_int, pp));
}

// Linux applies nice value in the scheduling policies (classes)
// SCHED_OTHER(0) and SCHED_BATCH(3).  Ref: sched_setscheduler(2).
// Also print nice value for old kernels which didn't use scheduling
// policies (-1).
static int pr_nice(char *restrict const outbuf, const proc_t *restrict const pp){
setREL2(NICE,SCHED_CLASS)
  if(rSv(SCHED_CLASS, s_int, pp)!=0 && rSv(SCHED_CLASS, s_int, pp)!=3 && rSv(SCHED_CLASS, s_int, pp)!=-1) return snprintf(outbuf, COLWID, "-");
  return snprintf(outbuf, COLWID, "%d", rSv(NICE, s_int, pp));
}

/* Note: upon conversion to the <pids> API the numerous former sort provisions
         for otherwise non-printable fields (pr_nop) have been retained. And,
         since the new library can sort on any item, many previously printable
         but unsortable fields have now been made sortable. */
/* there are about 211 listed */
/* Many of these are placeholders for unsupported options. */
static const format_struct format_array[] = { /*
 .spec        .head      .pr               .sr                   .width .vendor .flags  */
......
{"pri",       "PRI",     pr_pri,           PIDS_PRIORITY,            3,    XXX,  TO|RIGHT},
{"pri_api",   "API",     pr_pri_api,       PIDS_PRIORITY,            3,    LNX,  TO|RIGHT},
{"pri_bar",   "BAR",     pr_pri_bar,       PIDS_PRIORITY,            3,    LNX,  TO|RIGHT},
{"pri_baz",   "BAZ",     pr_pri_baz,       PIDS_PRIORITY,            3,    LNX,  TO|RIGHT},
{"pri_foo",   "FOO",     pr_pri_foo,       PIDS_PRIORITY,            3,    LNX,  TO|RIGHT},
{"priority",  "PRI",     pr_priority,      PIDS_PRIORITY,            3,    LNX,  TO|RIGHT},
.......
};
// "priority"         (was -20..20, now -100..39)
// "intpri" and "opri" (was 39..79, now  -40..99)
// "pri_foo"   --  match up w/ nice values of sleeping processes (-120..19)
// "pri_bar"   --  makes RT pri show as negative       (-99..40)
// "pri_baz"   --  the kernel's ->prio value, as of Linux 2.6.8     (1..140)
// "pri"               (was 20..60, now    0..139)
// "pri_api"   --  match up w/ RT API    (-40..99)
ps -o pid,priority,opri,pri_foo,pri_bar,pri_baz,pri,pri_api,comm

  PID PRI PRI FOO BAR BAZ PRI API COMMAND
 2201  20  80   0  21 120  19 -21 zsh
 2762  30  90  10  31 130   9 -31 cat
 2826  20  80   0  21 120  19 -21 ps

ref:

linux - Unix ps -l priority - Super User

https://www.zhihu.com/question/38975681

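4. The top command

For real-time processes, PR is displayed either as "rt" or as a negative number, and the actual priority is computed differently (even when "rt" is displayed, the priority implicitly lies in the range 0 to 99).

For non-real-time processes, top_PR = static_prio - 100.
In other words, PR in top ranges over [0, 39].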

Source analysis:

As with the ps command, rSv(EU_PRI, s_int) == proc_t->priority == task_struct->prio - 100

How should the nice value be understood?

linux 进程调度策略和优先级

https://www.cnblogs.com/lcword/p/8267342.html

https://zhuanlan.zhihu.com/p/602290353

https://blog.csdn.net/u010317005/article/details/80531985

Summary:

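III. Changing process priority

The nice value can be set by the user and ultimately determines the static priority.

nice command: starts a new process and sets its priority at the same time

renice command: changes the priority of an existing process

The kernel implementation of the nice system call is sys_nice, defined with SYSCALL_DEFINE1 (see "SYSCALL_DEFINE1 identifier - Linux source code v4.9.337 - Bootlin").

After a series of checks, it calls set_user_nice, which stores the converted value in static_prio: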

void set_user_nice(struct task_struct *p, long nice)
{
......
	p->static_prio = NICE_TO_PRIO(nice);
	set_load_weight(p);
	old_prio = p->prio;
	p->prio = effective_prio(p);
	delta = p->prio - old_prio;
......
}

Linux:线程优先级设置_linux线程优先级-CSDN博客

https://zhuanlan.zhihu.com/p/665100294

ref:

https://www.cnblogs.com/linhaostudy/p/9930245.html#autoid-0-0-0

https://www.cnblogs.com/tongye/p/9615625.html
