Linux Kernel: sched (Part 1)

Latest recommended article published 2025-02-20 10:35:18

tim514

Category: Linux Kernel. Tags: linux

Article link: https://blog.csdn.net/tim514/article/details/122035958

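This article introduces the scheduling system in the Linux kernel: its six scheduling policies, how process priorities are managed, and how users can change a process's scheduling policy and parameters.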

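【Preface】

One article cannot summarize the scheduler, the very core of the Linux kernel, but it can collect the more important points as a starting place for further study.

The kernel allocates hardware resources. A process is a program in execution, and it is the unit to which those resources are allocated: memory, CPU, and I/O.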

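【1. Scheduling policies】

Scheduling policies are the part of the scheduler that programmers deal with most directly. The kernel has six of them:

Priority	Policy	Scheduler	Detail
IDLE	SCHED_IDLE	CFS-IDLE	Lowest priority; runs only when the system load is very low. Not the same thing as the per-CPU idle task (PID 0), which belongs to the separate idle scheduling class
Normal	SCHED_NORMAL (SCHED_OTHER)	CFS	Ordinary time-sharing processes
Normal	SCHED_BATCH	CFS	Batch processes
Realtime	SCHED_FIFO	RT	First-in, first-out real-time processes
Realtime	SCHED_RR	RT	Round-robin (time-sliced) real-time processes
Deadline	SCHED_DEADLINE	DL	For bursty computation that is highly sensitive to latency and completion time. Based on Earliest Deadline First (EDF)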

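PS:

Threads scheduled under SCHED_BATCH are treated as non-interactive, CPU-bound work, and are optimized for throughput, which makes this policy friendlier to the cache. Some older descriptions give a default SCHED_BATCH time slice of 1.5 seconds; treat that figure with caution, since it appears to predate CFS, which does not hand out fixed per-policy time slices. In addition, on SMP systems, SCHED_BATCH tasks are migrated to cores that have a lot of idle time (relative to non-batch threads).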

【2. Priority】

Show you the fucking code！

struct task_struct {
	int				prio;               /* dynamic priority */
	int				static_prio;        /* static priority */
	int				normal_prio;        /* normalized priority */
	unsigned int	rt_priority;        /* real-time priority */
	/* ... */
};

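prio: the dynamic priority, which is the value the scheduler actually uses when scheduling. It can change at run time; for example, the system may temporarily boost the priority of a normal process. If the policy is SCHED_FIFO or SCHED_RR, or the task has been boosted into the RT range, the dynamic priority is left unchanged, as effective_prio shows: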

static int effective_prio(struct task_struct *p)
{
	p->normal_prio = normal_prio(p);
	/*
	 * If we are RT tasks or we were boosted to RT priority,
	 * keep the priority unchanged. Otherwise, update priority
	 * to the normal priority:
	 */
	if (!rt_prio(p->prio))
		return p->normal_prio;
	return p->prio;
}

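【3. User modification flow】

1. Setting the nice value:

If the task uses a DL or RT policy, only static_prio is updated; nice does not affect how such tasks are scheduled.

If the task belongs to the CFS class: dequeue --> update parameters --> enqueue (with respect to the rq). When delta (the difference in dynamic priority) shows that the task's priority rose, or that it dropped while the task is running, a scheduling point is created by calling resched_curr.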

void set_user_nice(struct task_struct *p, long nice)
{
    int old_prio, delta, queued;
    unsigned long flags;
    struct rq *rq;

    rq = task_rq_lock(p, &flags);
    /* (1) RT and DL tasks: just record the new static priority. */
    if (task_has_dl_policy(p) || task_has_rt_policy(p)) {
        p->static_prio = NICE_TO_PRIO(nice);
        goto out_unlock;
    }
    /* (2) Take the task off the runqueue while its parameters change. */
    queued = task_on_rq_queued(p);
    if (queued)
        dequeue_task(rq, p, DEQUEUE_SAVE);

    /* (3) Update static priority, load weight and dynamic priority. */
    p->static_prio = NICE_TO_PRIO(nice);
    set_load_weight(p);
    old_prio = p->prio;
    p->prio = effective_prio(p);
    delta = p->prio - old_prio;

    if (queued) {
        /* (2) Put the task back with its new parameters. */
        enqueue_task(rq, p, ENQUEUE_RESTORE);
        /* (4) Reschedule if priority rose, or dropped while running. */
        if (delta < 0 || (delta > 0 && task_running(rq, p)))
            resched_curr(rq);
    }

out_unlock:
    task_rq_unlock(rq, p, &flags);
}

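2. A process's default policy and parameters:

Everything is born from fork, haha! See sched_fork.

One very important flag: sched_reset_on_fork. (Linux applies resource limits to every process, and RLIMIT_RTTIME was introduced as one such per-process limit. Resetting the policy on fork keeps a real-time process from evading that limit by spawning children.)

1. The default scheduling policy is SCHED_NORMAL with a static priority of 120. When the flag is set, no matter what the parent is, even a deadline process, the child it forks is restored to these defaults.

2. Once the child has been restored to the default policy and priority, the dynamic priority and normalized priority must be updated to match the new policy and static priority. The load weight must be updated as well.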

int sched_fork(unsigned long clone_flags, struct task_struct *p)
{
    /* ... dup_task_struct ... */
    /* (1) Start from the parent's normalized priority, not a boosted one. */
    p->prio = current->normal_prio;
    if (unlikely(p->sched_reset_on_fork)) {
        /* (2) RT/DL children fall back to SCHED_NORMAL at nice 0;
         *     negative nice values are reset to 0 as well. */
        if (task_has_dl_policy(p) || task_has_rt_policy(p)) {
            p->policy = SCHED_NORMAL;
            p->static_prio = NICE_TO_PRIO(0);
            p->rt_priority = 0;
        } else if (PRIO_TO_NICE(p->static_prio) < 0)
            p->static_prio = NICE_TO_PRIO(0);

        /* (3) Recompute the derived priorities and the load weight. */
        p->prio = p->normal_prio = __normal_prio(p);
        set_load_weight(p);
        p->sched_reset_on_fork = 0;
    }
    /* ... */
}

3. Setting the scheduling policy and parameters from user space:

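#include <sched.h>

struct sched_param thread_param;
int thread_policy = sched_getscheduler(0);    /* pid 0 = the calling process */

int rr_min_priority = sched_get_priority_min(thread_policy);
int rr_max_priority = sched_get_priority_max(thread_policy);

thread_param.sched_priority = rr_max_priority;
int ret = sched_setscheduler(0, thread_policy, &thread_param);    /* -1 on failure, errno set */

/* Read the values back to confirm what the kernel accepted. */
thread_policy = sched_getscheduler(0);
sched_getparam(0, &thread_param);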

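PS:

The kernel developers kindly reserved some CPU time for normal threads, so that RT threads cannot keep the system from scheduling any other process.

kernel.sched_rt_runtime_us / kernel.sched_rt_period_us: 950000 / 1000000 by default, i.e. RT tasks may use at most 0.95 s of every 1 s.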