Thoughts on concurrency in percpu_ref state transitions

I recently read through the percpu_ref code and found its handling of concurrency quite clever, so I'm writing this post to record it. The code in this article is from Linux kernel 4.19.195.
I had previously taken a quick look at what percpu_ref is for. When we decide that the object it protects will no longer be used, we call percpu_ref_kill to end the refcount's life cycle; that function eventually reaches __percpu_ref_switch_to_atomic. Once the percpu_ref has been switched to atomic mode, dereferencing and releasing the protected object can be handled the same way as with an ordinary atomic refcount. During the switch to atomic mode, however, other percpu_ref_put and percpu_ref_get calls may be running concurrently. So how does the kernel manage to switch a percpu_ref smoothly from percpu mode to atomic mode?
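
Before looking at the switch itself, here is a minimal usage sketch for context. It is not code from the kernel tree: struct my_data, my_data_release() and the my_data_* helpers are made-up names for illustration; only the percpu_ref_* calls are the real API.

#include <linux/percpu-refcount.h>
#include <linux/slab.h>

struct my_data {
	struct percpu_ref ref;
	/* ... payload ... */
};

/* called exactly once, when the refcount really drops to zero */
static void my_data_release(struct percpu_ref *ref)
{
	struct my_data *d = container_of(ref, struct my_data, ref);

	/* percpu_ref_exit() is also needed somewhere to free the
	 * per-CPU counter backing the ref; omitted here for brevity */
	kfree(d);
}

static struct my_data *my_data_alloc(void)
{
	struct my_data *d = kzalloc(sizeof(*d), GFP_KERNEL);

	if (!d)
		return NULL;
	/* count starts at 1, and the ref starts out in percpu mode */
	if (percpu_ref_init(&d->ref, my_data_release, 0, GFP_KERNEL)) {
		kfree(d);
		return NULL;
	}
	return d;
}

static void my_data_use(struct my_data *d)
{
	percpu_ref_get(&d->ref);	/* cheap per-CPU increment */
	/* ... access d ... */
	percpu_ref_put(&d->ref);	/* cheap per-CPU decrement */
}

static void my_data_teardown(struct my_data *d)
{
	/*
	 * Mark the ref DEAD, switch it to atomic mode and drop the
	 * initial reference; my_data_release() runs once every other
	 * put has come in.
	 */
	percpu_ref_kill(&d->ref);
}
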
Let's look directly at __percpu_ref_switch_to_atomic.

static void __percpu_ref_switch_to_atomic(struct percpu_ref *ref,
					  percpu_ref_func_t *confirm_switch)
{
	if (ref->percpu_count_ptr & __PERCPU_REF_ATOMIC) {
		if (confirm_switch)
			confirm_switch(ref);
		return;
	}

	/* switching from percpu to atomic */
	ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC;

	/*
	 * Non-NULL ->confirm_switch is used to indicate that switching is
	 * in progress.  Use noop one if unspecified.
	 */
	ref->confirm_switch = confirm_switch ?: percpu_ref_noop_confirm_switch;

	percpu_ref_get(ref);	/* put after confirmation */
	call_rcu_sched(&ref->rcu, percpu_ref_switch_to_atomic_rcu);
}

First, the function sets the __PERCPU_REF_ATOMIC flag on the percpu_ref, then takes an extra reference with percpu_ref_get, and finally hangs percpu_ref_switch_to_atomic_rcu on the RCU callback list. By this point the percpu_ref carries both __PERCPU_REF_ATOMIC and __PERCPU_REF_DEAD (the latter is set right at the start of percpu_ref_kill). From now on, if some other path calls percpu_ref_put, it checks the ref's mode; with these two flags set the ref is clearly in atomic mode, so the refcount is incremented and decremented with the atomic helpers instead of the per-CPU counters.
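
As an aside, the two flags are encoded in the low bits of ref->percpu_count_ptr itself; from my reading of include/linux/percpu-refcount.h in 4.19 they are defined roughly as follows (excerpt, abridged):

enum {
	__PERCPU_REF_ATOMIC	= 1LU << 0,	/* operating in atomic mode */
	__PERCPU_REF_DEAD	= 1LU << 1,	/* (being) killed */
	__PERCPU_REF_ATOMIC_DEAD = __PERCPU_REF_ATOMIC | __PERCPU_REF_DEAD,

	__PERCPU_REF_FLAG_BITS	= 2,
};

With those flags in mind, here is how the mode test and the put path look.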

/*
 * Internal helper.  Don't use outside percpu-refcount proper.  The
 * function doesn't return the pointer and let the caller test it for NULL
 * because doing so forces the compiler to generate two conditional
 * branches as it can't assume that @ref->percpu_count is not NULL.
 */
static inline bool __ref_is_percpu(struct percpu_ref *ref,
					  unsigned long __percpu **percpu_countp)
{
	unsigned long percpu_ptr;

	/*
	 * The value of @ref->percpu_count_ptr is tested for
	 * !__PERCPU_REF_ATOMIC, which may be set asynchronously, and then
	 * used as a pointer.  If the compiler generates a separate fetch
	 * when using it as a pointer, __PERCPU_REF_ATOMIC may be set in
	 * between contaminating the pointer value, meaning that
	 * READ_ONCE() is required when fetching it.
	 *
	 * The smp_read_barrier_depends() implied by READ_ONCE() pairs
	 * with smp_store_release() in __percpu_ref_switch_to_percpu().
	 */
	percpu_ptr = READ_ONCE(ref->percpu_count_ptr);

	/*
	 * Theoretically, the following could test just ATOMIC; however,
	 * then we'd have to mask off DEAD separately as DEAD may be
	 * visible without ATOMIC if we race with percpu_ref_kill().  DEAD
	 * implies ATOMIC anyway.  Test them together.
	 */
	if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC_DEAD))
		return false;

	*percpu_countp = (unsigned long __percpu *)percpu_ptr;
	return true;
}
/**
 * percpu_ref_put - decrement a percpu refcount
 * @ref: percpu_ref to put
 *
 * Decrement the refcount, and if 0, call the release function (which was passed
 * to percpu_ref_init())
 *
 * This function is safe to call as long as @ref is between init and exit.
 */
static inline void percpu_ref_put(struct percpu_ref *ref)
{
	percpu_ref_put_many(ref, 1);
}

static inline void percpu_ref_put_many(struct percpu_ref *ref, unsigned long nr)
{
	unsigned long __percpu *percpu_count;

	rcu_read_lock_sched();

	if (__ref_is_percpu(ref, &percpu_count))
		this_cpu_sub(*percpu_count, nr);
	else if (unlikely(atomic_long_sub_and_test(nr, &ref->count)))
		ref->release(ref);

	rcu_read_unlock_sched();
}
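
For completeness, the get path is symmetric and relies on the same __ref_is_percpu test; this is percpu_ref_get_many as I read it in include/linux/percpu-refcount.h (4.19):

static inline void percpu_ref_get_many(struct percpu_ref *ref, unsigned long nr)
{
	unsigned long __percpu *percpu_count;

	rcu_read_lock_sched();

	if (__ref_is_percpu(ref, &percpu_count))
		this_cpu_add(*percpu_count, nr);
	else
		atomic_long_add(nr, &ref->count);

	rcu_read_unlock_sched();
}

Note that both the get and the put run inside rcu_read_lock_sched()/rcu_read_unlock_sched(); that is exactly what the call_rcu_sched() in __percpu_ref_switch_to_atomic is waiting out.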

From the code's point of view, percpu_ref_switch_to_atomic_rcu may not run right away, since it has to wait for an RCU grace period to pass. Yet the ref leaves percpu mode the instant the __PERCPU_REF_ATOMIC flag is set (could this be optimized so that the switch only happens right before percpu_ref_switch_to_atomic_rcu runs, i.e. could the statement that sets the flag be moved into percpu_ref_switch_to_atomic_rcu?). The work of folding the per-CPU counts into the atomic_long_t count field of struct percpu_ref, however, is clearly done inside percpu_ref_switch_to_atomic_rcu. And in percpu_ref_put_many, as soon as that atomic count hits 0, release is called and the object is freed; that obviously must not happen before the fold-in. So what does the kernel do?
Look carefully at percpu_ref_switch_to_atomic_rcu:

static void percpu_ref_switch_to_atomic_rcu(struct rcu_head *rcu)
{
	struct percpu_ref *ref = container_of(rcu, struct percpu_ref, rcu);
	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
	unsigned long count = 0;
	int cpu;

	for_each_possible_cpu(cpu)
		count += *per_cpu_ptr(percpu_count, cpu);

	pr_debug("global %ld percpu %ld",
		 atomic_long_read(&ref->count), (long)count);

	/*
	 * It's crucial that we sum the percpu counters _before_ adding the sum
	 * to &ref->count; since gets could be happening on one cpu while puts
	 * happen on another, adding a single cpu's count could cause
	 * @ref->count to hit 0 before we've got a consistent value - but the
	 * sum of all the counts will be consistent and correct.
	 *
	 * Subtracting the bias value then has to happen _after_ adding count to
	 * &ref->count; we need the bias value to prevent &ref->count from
	 * reaching 0 before we add the percpu counts. But doing it at the same
	 * time is equivalent and saves us atomic operations:
	 */
	atomic_long_add((long)count - PERCPU_COUNT_BIAS, &ref->count);

	WARN_ONCE(atomic_long_read(&ref->count) <= 0,
		  "percpu ref (%pf) <= 0 (%ld) after switching to atomic",
		  ref->release, atomic_long_read(&ref->count));

	/* @ref is viewed as dead on all CPUs, send out switch confirmation */
	percpu_ref_call_confirm_rcu(rcu);
}

The key statement is:

atomic_long_add((long)count - PERCPU_COUNT_BIAS, &ref->count);

When the summed per-CPU counts are folded in, PERCPU_COUNT_BIAS is subtracted as well.
The reason is that this bias was added to the count at initialization time. It is a very large value, which essentially guarantees that percpu_ref_put cannot drive the refcount down to 0 before percpu_ref_switch_to_atomic_rcu has finished. Under normal operation, only after

atomic_long_add((long)count - PERCPU_COUNT_BIAS, &ref->count);

has executed does the atomic count get a chance to drop to 0 and trigger the release callback.
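
For reference, from my reading of lib/percpu-refcount.c in 4.19, the bias is half the range of an unsigned long, and it is added to the initial value of ref->count by percpu_ref_init when the ref starts out in percpu mode (excerpt, abridged):

#define PERCPU_COUNT_BIAS	(1LU << (BITS_PER_LONG - 1))

int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release,
		    unsigned int flags, gfp_t gfp)
{
	unsigned long start_count = 0;
	...
	if (flags & (PERCPU_REF_INIT_ATOMIC | PERCPU_REF_INIT_DEAD))
		ref->percpu_count_ptr |= __PERCPU_REF_ATOMIC;
	else
		start_count += PERCPU_COUNT_BIAS;	/* percpu mode: pre-bias the atomic count */

	if (flags & PERCPU_REF_INIT_DEAD)
		ref->percpu_count_ptr |= __PERCPU_REF_DEAD;
	else
		start_count++;				/* the initial reference */

	atomic_long_set(&ref->count, start_count);
	...
}

So while the ref is in percpu mode, ref->count sits at BIAS + 1; the atomic-mode puts that race with the switch can only chip away at that huge value, and the count can reach zero only after percpu_ref_switch_to_atomic_rcu has both folded in the per-CPU sum and subtracted the bias.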

As for the idea of moving the setting of __PERCPU_REF_ATOMIC into percpu_ref_switch_to_atomic_rcu, so the ref would stay in percpu mode a little longer: I'm not sure such an optimization would work. My guess is that it would not, because the scheme relies on setting the flag first and then waiting a grace period, so that by the time the callback sums the per-CPU counters no reader can be touching them any more; if the flag were only set inside the RCU callback, concurrent gets and puts could still be updating the per-CPU counters while (and after) they are being summed, and those updates would be lost.

Every time I read kernel code, I can't help marveling at how deep and carefully thought out it is.
