Thoughts on concurrency in percpu_ref state transitions

I recently read through the percpu_ref code and found its handling of concurrency quite clever, so I'm writing this post to record it. The code in this article is from Linux kernel 4.19.195.
I had previously taken a quick look at what percpu_ref is for. When we decide that the object it protects will no longer be used, we call percpu_ref_kill to end the refcount's life cycle; that function eventually reaches __percpu_ref_switch_to_atomic. Once the percpu_ref has been switched to atomic mode, dereferencing and releasing the protected object can be handled the same way as with an ordinary atomic refcount. During the switch to atomic mode, however, other percpu_ref_put and percpu_ref_get calls may be running concurrently. So how does the kernel manage to switch a percpu_ref smoothly from percpu mode to atomic mode?
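
Before looking at the switch itself, here is a minimal usage sketch for context. It is not code from the kernel tree: struct my_data, my_data_release() and the my_data_* helpers are made-up names for illustration; only the percpu_ref_* calls are the real API.

#include <linux/percpu-refcount.h>
#include <linux/slab.h>

struct my_data {
	struct percpu_ref ref;
	/* ... payload ... */
};

/* called exactly once, when the refcount really drops to zero */
static void my_data_release(struct percpu_ref *ref)
{
	struct my_data *d = container_of(ref, struct my_data, ref);

	/* percpu_ref_exit() is also needed somewhere to free the
	 * per-CPU counter backing the ref; omitted here for brevity */
	kfree(d);
}

static struct my_data *my_data_alloc(void)
{
	struct my_data *d = kzalloc(sizeof(*d), GFP_KERNEL);

	if (!d)
		return NULL;
	/* count starts at 1, and the ref starts out in percpu mode */
	if (percpu_ref_init(&d->ref, my_data_release, 0, GFP_KERNEL)) {
		kfree(d);
		return NULL;
	}
	return d;
}

static void my_data_use(struct my_data *d)
{
	percpu_ref_get(&d->ref);	/* cheap per-CPU increment */
	/* ... access d ... */
	percpu_ref_put(&d->ref);	/* cheap per-CPU decrement */
}

static void my_data_teardown(struct my_data *d)
{
	/*
	 * Mark the ref DEAD, switch it to atomic mode and drop the
	 * initial reference; my_data_release() runs once every other
	 * put has come in.
	 */
	percpu_ref_kill(&d->ref);
}
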
Let's look directly at __percpu_ref_switch_to_atomic.

static void __percpu_ref_switch_to_atomic(struct percpu_ref *ref,
					  percpu_ref_func_t *confirm_switch)
{
	if (ref->percpu_count_ptr & __PERCPU_REF_ATOMIC) {
		if (confirm_switch)
			confirm_switch(ref);
		return;
	}

	/* switching from percpu to atomic */
	ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC;

	/*
	 * Non-NULL ->confirm_switch is used to indicate that switching is
	 * in progress.  Use noop one if unspecified.
	 */
	ref->confirm_switch = confirm_switch ?: percpu_ref_noop_confirm_switch;

	percpu_ref_get(ref);	/* put after confirmation */
	call_rcu_sched(&ref->rcu, percpu_ref_switch_to_atomic_rcu);
}

First, the function sets the __PERCPU_REF_ATOMIC flag on the percpu_ref, then takes an extra reference with percpu_ref_get, and finally hangs percpu_ref_switch_to_atomic_rcu on the RCU callback list. By this point the percpu_ref carries both __PERCPU_REF_ATOMIC and __PERCPU_REF_DEAD (the latter is set right at the start of percpu_ref_kill). From now on, if some other path calls percpu_ref_put, it checks the ref's mode; with these two flags set the ref is clearly in atomic mode, so the refcount is incremented and decremented with the atomic helpers instead of the per-CPU counters.
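
As an aside, the two flags are encoded in the low bits of ref->percpu_count_ptr itself; from my reading of include/linux/percpu-refcount.h in 4.19 they are defined roughly as follows (excerpt, abridged):

enum {
	__PERCPU_REF_ATOMIC	= 1LU << 0,	/* operating in atomic mode */
	__PERCPU_REF_DEAD	= 1LU << 1,	/* (being) killed */
	__PERCPU_REF_ATOMIC_DEAD = __PERCPU_REF_ATOMIC | __PERCPU_REF_DEAD,

	__PERCPU_REF_FLAG_BITS	= 2,
};

With those flags in mind, here is how the mode test and the put path look.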

/*
 * Internal helper.  Don't use outside percpu-refcount proper.  The
 * function doesn't return the pointer and let the caller test it for NULL
 * because doing so forces the compiler to generate two conditional
 * branches as it can't assume that @ref->percpu_count is not NULL.
 */
static inline bool __ref_is_percpu(struct percpu_ref *ref,
					  unsigned long __percpu **percpu_countp)
{
	unsigned long percpu_ptr;

	/*
	 * The value of @ref->percpu_count_ptr is tested for
	 * !__PERCPU_REF_ATOMIC, which may be set asynchronously, and then
	 * used as a pointer.  If the compiler generates a separate fetch
	 * when using it as a pointer, __PERCPU_REF_ATOMIC may be set in
	 * between contaminating the pointer value, meaning that
	 * READ_ONCE() is required when fetching it.
	 *
	 * The smp_read_barrier_depends() implied by READ_ONCE() pairs
	 * with smp_store_release() in __percpu_ref_switch_to_percpu().
	 */
	percpu_ptr = READ_ONCE(ref->percpu_count_ptr);

	/*
	 * Theoretically, the following could test just ATOMIC; however,
	 * then we'd have to mask off DEAD separately as DEAD may be
	 * visible without ATOMIC if we race with percpu_ref_kill().  DEAD
	 * implies ATOMIC anyway.  Test them together.
	 */
	if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC_DEAD))
		return false;

	*percpu_countp = (unsigned long __percpu *)percpu_ptr;
	return true;
}
/**
 * percpu_ref_put - decrement a percpu refcount
 * @ref: percpu_ref to put
 *
 * Decrement the refcount, and if 0, call the release function (which was passed
 * to percpu_ref_init())
 *
 * This function is safe to call as long as @ref is between init and exit.
 */
static inline void percpu_ref_put(struct percpu_ref *ref)
{
	percpu_ref_put_many(ref, 1);
}

static inline void percpu_ref_put_many(struct percpu_ref *ref, unsigned long nr)
{
	unsigned long __percpu *percpu_count;

	rcu_read_lock_sched();

	if (__ref_is_percpu(ref, &percpu_count))
		this_cpu_sub(*percpu_count, nr);
	else if (unlikely(atomic_long_sub_and_test(nr, &ref->count)))
		ref->release(ref);

	rcu_read_unlock_sched();
}
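
For completeness, the get path is symmetric and relies on the same __ref_is_percpu test; this is percpu_ref_get_many as I read it in include/linux/percpu-refcount.h (4.19):

static inline void percpu_ref_get_many(struct percpu_ref *ref, unsigned long nr)
{
	unsigned long __percpu *percpu_count;

	rcu_read_lock_sched();

	if (__ref_is_percpu(ref, &percpu_count))
		this_cpu_add(*percpu_count, nr);
	else
		atomic_long_add(nr, &ref->count);

	rcu_read_unlock_sched();
}

Note that both the get and the put run inside rcu_read_lock_sched()/rcu_read_unlock_sched(); that is exactly what the call_rcu_sched() in __percpu_ref_switch_to_atomic is waiting out.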

From the code's point of view, percpu_ref_switch_to_atomic_rcu may not run right away, since it has to wait for an RCU grace period to pass. Yet the ref leaves percpu mode the instant the __PERCPU_REF_ATOMIC flag is set (could this be optimized so that the switch only happens right before percpu_ref_switch_to_atomic_rcu runs, i.e. could the statement that sets the flag be moved into percpu_ref_switch_to_atomic_rcu?). The work of folding the per-CPU counts into the atomic_long_t count field of struct percpu_ref, however, is clearly done inside percpu_ref_switch_to_atomic_rcu. And in percpu_ref_put_many, as soon as that atomic count hits 0, release is called and the object is freed; that obviously must not happen before the fold-in. So what does the kernel do?
Look carefully at percpu_ref_switch_to_atomic_rcu:

static void percpu_ref_switch_to_atomic_rcu(struct rcu_head *rcu)
{
	struct percpu_ref *ref = container_of(rcu, struct percpu_ref, rcu);
	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
	unsigned long count = 0;
	int cpu;

	for_each_possible_cpu(cpu)
		count += *per_cpu_ptr(percpu_count, cpu);

	pr_debug("global %ld percpu %ld",
		 atomic_long_read(&ref->count), (long)count);

	/*
	 * It's crucial that we sum the percpu counters _before_ adding the sum
	 * to &ref->count; since gets could be happening on one cpu while puts
	 * happen on another, adding a single cpu's count could cause
	 * @ref->count to hit 0 before we've got a consistent value - but the
	 * sum of all the counts will be consistent and correct.
	 *
	 * Subtracting the bias value then has to happen _after_ adding count to
	 * &ref->count; we need the bias value to prevent &ref->count from
	 * reaching 0 before we add the percpu counts. But doing it at the same
	 * time is equivalent and saves us atomic operations:
	 */
	atomic_long_add((long)count - PERCPU_COUNT_BIAS, &ref->count);

	WARN_ONCE(atomic_long_read(&ref->count) <= 0,
		  "percpu ref (%pf) <= 0 (%ld) after switching to atomic",
		  ref->release, atomic_long_read(&ref->count));

	/* @ref is viewed as dead on all CPUs, send out switch confirmation */
	percpu_ref_call_confirm_rcu(rcu);
}

The key statement is:

atomic_long_add((long)count - PERCPU_COUNT_BIAS, &ref->count);

When the summed per-CPU counts are folded in, PERCPU_COUNT_BIAS is subtracted as well.
The reason is that this bias was added to the count at initialization time. It is a very large value, which essentially guarantees that percpu_ref_put cannot drive the refcount down to 0 before percpu_ref_switch_to_atomic_rcu has finished. Under normal operation, only after

atomic_long_add((long)count - PERCPU_COUNT_BIAS, &ref->count);

has executed does the atomic count get a chance to drop to 0 and trigger the release callback.
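
For reference, from my reading of lib/percpu-refcount.c in 4.19, the bias is half the range of an unsigned long, and it is added to the initial value of ref->count by percpu_ref_init when the ref starts out in percpu mode (excerpt, abridged):

#define PERCPU_COUNT_BIAS	(1LU << (BITS_PER_LONG - 1))

int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
		    unsigned int flags, gfp_t gfp)
{
	unsigned long start_count = 0;
	...
	if (flags & (PERCPU_REF_INIT_ATOMIC | PERCPU_REF_INIT_DEAD))
		ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC;
	else
		start_count += PERCPU_COUNT_BIAS;	/* percpu mode: pre-bias the atomic count */

	if (flags & PERCPU_REF_INIT_DEAD)
		ref->percpu_count_ptr |= __PERCPU_REF_DEAD;
	else
		start_count++;				/* the initial reference */

	atomic_long_set(&ref->count, start_count);
	...
}

So while the ref is in percpu mode, ref->count sits at BIAS + 1; the atomic-mode puts that race with the switch can only chip away at that huge value, and the count can reach zero only after percpu_ref_switch_to_atomic_rcu has both folded in the per-CPU sum and subtracted the bias.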

As for the idea of moving the setting of __PERCPU_REF_ATOMIC into percpu_ref_switch_to_atomic_rcu, so the ref would stay in percpu mode a little longer: I'm not sure such an optimization would work. My guess is that it would not, because the scheme relies on setting the flag first and then waiting a grace period, so that by the time the callback sums the per-CPU counters no reader can be touching them any more; if the flag were only set inside the RCU callback, concurrent gets and puts could still be updating the per-CPU counters while (and after) they are being summed, and those updates would be lost.

Every time I read kernel code, I can't help marveling at how deep and carefully thought out it is.
