[cpufreq governor] How CPU util and CPU margin are calculated

悟空明镜

Columns: EAS-调度器学习 (EAS scheduler study), linux kernel cfs scheduler. Tags: cpu util margin

Please contact me before reposting; reposting without permission is infringement. Thank you.

Article link: https://blog.csdn.net/wukongmingjing/article/details/81739394

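While computing a CPU's util (in sugov_get_util), a margin is added to compensate the util, giving the final value util += margin. (The schedutil governor uses not only this CPU util margin but also a freq margin.)
So how is this margin calculated?
The call path is sugov_update_shared -> sugov_get_util -> boosted_cpu_util.
Let's look at how this function computes the margin.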

unsigned long  
boosted_cpu_util(int cpu)  
{  
    unsigned long util = cpu_util_freq(cpu); /* current util of this CPU */
    long margin = schedtune_cpu_margin(util, cpu);  
  
    trace_sched_boost_cpu(cpu, util, margin);  
  
    return util + margin;  
}

cpu_util_freq computes the CPU util, using WALT when it is enabled:

static inline unsigned long cpu_util_freq(int cpu)  
{  
    unsigned long util = cpu_rq(cpu)->cfs.avg.util_avg;  
    unsigned long capacity = capacity_orig_of(cpu);  
  
#ifdef CONFIG_SCHED_WALT  
    if (!walt_disabled && sysctl_sched_use_walt_cpu_util)  
        util = div64_u64(cpu_rq(cpu)->cumulative_runnable_avg,  
                 walt_ravg_window >> SCHED_LOAD_SHIFT);  
#endif  
    return (util >= capacity) ? capacity : util;  
}

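From this, util = cumulative_runnable_avg / (walt_ravg_window >> 10), since SCHED_LOAD_SHIFT is 10. walt_ravg_window is effectively a constant, defined in walt.c; the article on how WALT load is calculated explains it in detail: https://blog.csdn.net/wuming_422103632/article/details/81633225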
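The WALT branch above can be tried out in user space. The following is only a sketch: walt_cpu_util and its parameters are hypothetical names, and the caller supplies the window length and the runnable average directly instead of reading them from the runqueue.

```c
#include <assert.h>
#include <stdint.h>

#define SCHED_LOAD_SHIFT 10  /* same shift as in the kernel snippet above */

/*
 * User-space sketch of the WALT branch of cpu_util_freq().
 * cumulative_runnable_avg and window_ns are passed in explicitly here;
 * in the kernel they come from the runqueue and from walt.c.
 */
static unsigned long walt_cpu_util(uint64_t cumulative_runnable_avg,
                                   uint64_t window_ns,
                                   unsigned long capacity_orig)
{
    uint64_t divisor = window_ns >> SCHED_LOAD_SHIFT;
    uint64_t util;

    assert(divisor != 0);  /* window must be at least 1024 ns */
    util = cumulative_runnable_avg / divisor;

    /* Never report more than the CPU's original capacity. */
    return util >= capacity_orig ? capacity_orig : (unsigned long)util;
}
```

With an assumed window of 20000000 ns, a runnable average of 10000000 ns (half the window) gives a util of 512 on a CPU whose capacity is 1024.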

The focus here is how schedtune_cpu_margin computes the margin:

static inline int  
schedtune_cpu_margin(unsigned long util, int cpu)  
{  
    int boost = schedtune_cpu_boost(cpu);  
  
    if (boost == 0)  
        return 0;  
  
    return schedtune_margin(util, boost);  
}

In detail:
1. The schedtune_cpu_boost function

int schedtune_cpu_boost(int cpu)  
{  
    struct boost_groups *bg;  
  
    bg = &per_cpu(cpu_boost_groups, cpu);  
    return bg->boost_max;  
}

This function returns the boost_max member of struct boost_groups. The structure tracks the RUNNABLE tasks on a CPU by group, and each group may have its own boost setting. The structure is documented as follows:

/* SchedTune boost groups 
 * Keep track of all the boost groups which impact on CPU, for example when a 
 * CPU has two RUNNABLE tasks belonging to two different boost groups and thus 
 * likely with different boost values. 
 * Since on each system we expect only a limited number of boost groups, here 
 * we use a simple array to keep track of the metrics required to compute the 
 * maximum per-CPU boosting value. 
 */  
struct boost_groups {  
    /* Maximum boost value for all RUNNABLE tasks on a CPU */  
    bool idle;  
    int boost_max;  
    struct {  
        /* The boost for tasks on that boost group */  
        int boost;  
        /* Count of RUNNABLE tasks on that boost group */  
        unsigned tasks;  
    } group[BOOSTGROUPS_COUNT];  
    /* CPU's boost group locking */  
    raw_spinlock_t lock;  
};  
/* Boost groups affecting each CPU in the system */  
DEFINE_PER_CPU(struct boost_groups, cpu_boost_groups);
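The article does not show how boost_max is maintained, so the following is only a sketch of what the structure's comments suggest: take the maximum boost over the groups that have RUNNABLE tasks. The names boost_groups_sketch and update_boost_max are hypothetical, the array size is illustrative, and two behaviors are assumptions: group 0 (the root group) always counts, and a negative maximum is raised to 0.

```c
#include <assert.h>
#include <stddef.h>

#define BOOSTGROUPS_COUNT 4  /* illustrative size, not taken from the article */

struct boost_groups_sketch {
    int boost_max;
    struct {
        int boost;       /* boost value of this group */
        unsigned tasks;  /* RUNNABLE tasks currently in this group */
    } group[BOOSTGROUPS_COUNT];
};

/*
 * Recompute boost_max as the largest boost among groups that currently
 * have RUNNABLE tasks. Assumption: group 0 (the root group) always counts,
 * and a negative maximum is raised to 0.
 */
static int update_boost_max(struct boost_groups_sketch *bg)
{
    int boost_max;
    int idx;

    assert(bg != NULL);
    boost_max = bg->group[0].boost;
    for (idx = 1; idx < BOOSTGROUPS_COUNT; ++idx) {
        if (bg->group[idx].tasks == 0)
            continue;  /* empty groups do not influence this CPU */
        if (bg->group[idx].boost > boost_max)
            boost_max = bg->group[idx].boost;
    }
    if (boost_max < 0)
        boost_max = 0;
    bg->boost_max = boost_max;
    return boost_max;
}
```

A simple array scan is enough here because, as the structure's comment notes, only a limited number of boost groups is expected on each system.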

2. The schedtune_margin function:

static long  
schedtune_margin(unsigned long signal, long boost)  
{  
    long long margin = 0;  
  
    /* 
     * Signal proportional compensation (SPC) 
     * 
     * The Boost (B) value is used to compute a Margin (M) which is 
     * proportional to the complement of the original Signal (S): 
     *   M = B * (SCHED_CAPACITY_SCALE - S) 
     * The obtained M could be used by the caller to "boost" S. 
     */  
    if (boost >= 0)  
        margin = signal * boost;  
    else  
        margin = -signal * boost;  
  
    margin  = reciprocal_divide(margin, schedtune_spc_rdiv);  
  
    if (boost >= 0)  
        margin = clamp_t(long long, margin, 0,  
                        SCHED_CAPACITY_SCALE - signal);  
  
    if (boost < 0)  
        margin *= -1;  
    return margin;  
}
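To see what the function above yields without the reciprocal machinery, here is a user-space sketch that uses plain division. margin_sketch is a hypothetical name; the logic follows the listing above (signal * boost / 100, with the clamp applied only to positive boost), assuming boost is a percentage and signal lies in [0, 1024].

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE 1024L

/*
 * Plain-division sketch of schedtune_margin() as listed above:
 * margin = signal * boost / 100, clamped for positive boost.
 * boost is a percentage; signal is a utilization in [0, 1024].
 */
static long margin_sketch(unsigned long signal, long boost)
{
    long long margin;
    long long room;

    assert(signal <= (unsigned long)SCHED_CAPACITY_SCALE);

    /* Work on the magnitude first, as the kernel code does. */
    margin = (long long)signal * (boost >= 0 ? boost : -boost);
    margin /= 100;

    if (boost >= 0) {
        /* A positive margin may not push signal past full capacity. */
        room = SCHED_CAPACITY_SCALE - (long long)signal;
        if (margin > room)
            margin = room;
    } else {
        margin = -margin;
    }
    return (long)margin;
}
```

For example, signal=700 with boost=10 gives a margin of 70, while signal=1000 with boost=10 is clamped to 24 so that util + margin stays within SCHED_CAPACITY_SCALE.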

The key piece is schedtune_spc_rdiv = reciprocal_value(100). The source of reciprocal_value is:

struct reciprocal_value reciprocal_value(u32 d)  
{  
    struct reciprocal_value R;  
    u64 m;  
    int l;  
  
    l = fls(d - 1);  /*d=100,fls(99)=7*/
    m = ((1ULL << 32) * ((1ULL << l) - d));  
    do_div(m, d);  
    ++m;  
    R.m = (u32)m;  //R.m = 1202590843
    R.sh1 = min(l, 1);  //R.sh1 = 1
    R.sh2 = max(l - 1, 0); //R.sh2 = 6
  
    return R;  
}  
 
/** 
 * fls - find last (most-significant) bit set 
 * @x: the word to search 
 * 
 * This is defined the same way as ffs. 
 * Note fls(0) = 0, fls(1) = 1, fls(0x80000000) = 32. 
 */  
  
static __always_inline int fls(int x)  
{  
    int r = 32;  
  
    if (!x)  
        return 0;  
    if (!(x & 0xffff0000u)) {  
        x <<= 16;  
        r -= 16;  
    }  
    if (!(x & 0xff000000u)) {  
        x <<= 8;  
        r -= 8;  
    }  
    if (!(x & 0xf0000000u)) {  
        x <<= 4;  
        r -= 4;  
    }  
    if (!(x & 0xc0000000u)) {  
        x <<= 2;  
        r -= 2;  
    }  
    if (!(x & 0x80000000u)) {  
        x <<= 1;  
        r -= 1;  
    }  
    return r;  
}

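So schedtune_spc_rdiv is a constant structure, and the call

margin  = reciprocal_divide(margin, schedtune_spc_rdiv);

is a division: reciprocal_divide is an optimized way of computing A/B. Because multiplication is much faster than division on most hardware, the kernel precomputes a scaled reciprocal of the divisor once and then needs only a multiplication and shifts per division (the technique of division by an invariant integer using multiplication, rather than Newton-Raphson iteration as is sometimes stated). We do not need the mathematical details here; what matters is that instead of computing C = A/B directly, the kernel can compute C = reciprocal_divide(A, reciprocal_value(B)). Both functions are library routines.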

static inline u32 reciprocal_divide(u32 a, struct reciprocal_value R)  
{  
    u32 t = (u32)(((u64)a * R.m) >> 32);  
    return (t + ((a - t) >> R.sh1)) >> R.sh2;  
}
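The two routines can be checked against ordinary division in user space. The sketch below re-implements them with standard C types; the names rvalue, fls_sketch, rvalue_of, rdivide and rdivide_matches are hypothetical stand-ins for the kernel identifiers, and fls is replaced by a simple loop.

```c
#include <assert.h>
#include <stdint.h>

struct rvalue {            /* mirrors struct reciprocal_value */
    uint32_t m;
    uint8_t sh1, sh2;
};

/* Position of the most significant set bit, 1-based; 0 for x == 0. */
static int fls_sketch(uint32_t x)
{
    int r = 0;

    while (x) {
        r++;
        x >>= 1;
    }
    return r;
}

/* Precompute the multiplier and shifts for divisor d (d must be nonzero). */
static struct rvalue rvalue_of(uint32_t d)
{
    struct rvalue R;
    uint64_t m;
    int l;

    assert(d != 0);
    l = fls_sketch(d - 1);
    m = ((1ULL << 32) * ((1ULL << l) - d)) / d + 1;
    R.m = (uint32_t)m;
    R.sh1 = (uint8_t)(l < 1 ? l : 1);
    R.sh2 = (uint8_t)(l - 1 > 0 ? l - 1 : 0);
    return R;
}

/* a / d using one multiplication and shifts, same steps as reciprocal_divide. */
static uint32_t rdivide(uint32_t a, struct rvalue R)
{
    uint32_t t = (uint32_t)(((uint64_t)a * R.m) >> 32);

    return (t + ((a - t) >> R.sh1)) >> R.sh2;
}

/* Returns 1 if rdivide() agrees with a / d for every a in [0, limit]. */
static int rdivide_matches(uint32_t d, uint32_t limit)
{
    struct rvalue R = rvalue_of(d);
    uint32_t a;

    assert(limit < UINT32_MAX);  /* keeps the loop counter from wrapping */
    for (a = 0; a <= limit; a++)
        if (rdivide(a, R) != a / d)
            return 0;
    return 1;
}
```

For d = 100 this reproduces the values annotated in reciprocal_value above (m = 1202590843, sh1 = 1, sh2 = 6), and rdivide(490, R) returns 4, matching the signal=49, boost=10 case.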

The idea behind the algorithm is outlined in: Algorithm introduction
The algorithm implements margin = signal*boost/100, and the printed output confirms it:

[ 52.191507] signal=49,margin=4 // printed after reciprocal_divide
[ 52.191519] boost=10,margin=4 // printed after clamp_t
[ 52.191532] signal=49,margin=4
[ 52.191543] boost=10,margin=4
[ 52.191931] signal=700,margin=70

clamp_t is defined as follows:

#define clamp_t(type, val, lo, hi) min_t(type, max_t(type, val, lo), hi)

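Clearly margin is just signal/100, where signal here has already been multiplied by boost (see the code above). Note that the SPC comment in schedtune_margin describes M = B * (SCHED_CAPACITY_SCALE - S), whereas the code listed here multiplies the signal itself by boost. The reason for such a complicated computation is that this division runs very frequently and division is relatively expensive on ARM, so multiplication and shifts replace it. In short, there is no need to read the code above in detail: after boosted_cpu_util(cpu) runs, the correction applied to this CPU's util is effectively: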

util += ±(util * boost)/100

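The most complicated part is the optimized division. Remaining action item:
The boost handled above lives in schedtune, which involves the cgroup (control group) subsystem. That subsystem is fairly complex and takes time to understand; I will study it later.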
