Stealing Scheduler (work-stealing scheduler): high concurrency

Latest recommended article published 2024-03-24 11:55:39

蓝虎 - tanjp.com

Columns: 极品底层(C++), 极品架构. Tags: work stealing, high concurrency, actor, multithreading, lockfree

Article link: https://blog.csdn.net/tanjpeng/article/details/89922457

Reposted from the original at http://www.tanjp.com (corrections and updates appear there first)

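Stealing Scheduler

N business systems produce jobs and add them to M+1 queues (a job goes to the queue of the current thread when there is one). The queued jobs are consumed by M threads according to a fixed rule. Each of the M threads owns one queue, reached through thread-local storage, and all threads share one common (default) queue. A thread looks for work in the following order: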

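1. First, process jobs produced by this thread (its own queue).

2. Next, process jobs in the default queue.

3. Then steal a job from the next thread's queue.

4. Finally, steal a job from the previous thread's queue.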

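In other words, when the job volume is large, every thread is busy with its own queue. Only when a thread has drained its own queue does it turn to the default queue and then steal jobs from its neighboring threads. Thread contention can be abstracted as: M + (M*1 + M*1 + M*2)/3 = 7M/3, i.e. roughly 2M. This combines the advantages of the two classic schemes: the preemptive one (all threads compete for one shared queue) and the distributing one (each job is assigned to a specific thread).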
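The four-step order above can be written down as a tiny helper. This is only an illustrative sketch: `probe_order` and the `kDefaultQueue` marker are hypothetical names introduced here, not part of the original scheduler.

```cpp
#include <vector>

// Hypothetical marker standing for the shared default queue.
constexpr int kDefaultQueue = -1;

// Returns the queues that worker `index` tries, in order:
// its own queue, the default queue, the next worker's queue,
// and finally the previous worker's queue.
// Assumes 0 <= index < thread_count and thread_count >= 1.
std::vector<int> probe_order(int index, int thread_count) {
    return {
        index,
        kDefaultQueue,
        (index + 1) % thread_count,
        (index + thread_count - 1) % thread_count,
    };
}
```

With 4 threads, worker 0 probes queue 0, the default queue, queue 1, then queue 3; worker 3 probes queue 3, the default queue, queue 0, then queue 2.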

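Mutex and lock-free schemes

A stealing scheduler always has to try the sources in the order above, one after another, until it gets a job, so none of the queue operations may suspend the thread and wait. In other words, both push and pop return immediately with success or failure. To avoid losing data (push never suspends because a queue is full), the queues are usually unbounded, and the business layer is responsible for capping the number of queued jobs. The key implementation detail of the stealing scheme is the use of thread-local storage variables (each such variable is read and written by exactly one thread).

Part of the implementation: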
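As a rough illustration of the "return immediately" contract in its mutex flavor, here is a minimal sketch. `TryQueue`, `post_with_limit` and `max_pending` are hypothetical names made up for this example; the lock-free variant in the article uses `boost::lockfree::queue` instead.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical mutex-guarded queue: unbounded, and no call ever waits for data.
template <typename T>
class TryQueue {
public:
    // Unbounded push: never suspends because of a "full" queue.
    void push(T value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(std::move(value));
    }
    // Returns false at once when the queue is empty instead of waiting.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }
private:
    mutable std::mutex mutex_;
    std::deque<T> items_;
};

// Business-layer cap (an assumption of this sketch): refuse new jobs once
// `max_pending` are queued. The check and the push are not atomic together,
// so under concurrency this is a soft limit only.
template <typename T>
bool post_with_limit(TryQueue<T>& queue, T value, std::size_t max_pending) {
    if (queue.size() >= max_pending) return false;
    queue.push(std::move(value));
    return true;
}
```

The short critical sections mean a thread holding the lock is never parked waiting for a job to arrive, which is what lets a worker move on to the next source in its probe order.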

#include <atomic>
#include <functional>
#include <mutex>
#include <thread>
#include <boost/lockfree/queue.hpp>

// SchedulerBase, Task, uint16, uint32, kThreadMaxCount and
// THIS_SLEEP_MILLISECONDS are defined elsewhere in the project (not shown here).
class LockfreeStealingScheduler : public SchedulerBase
{
	typedef boost::lockfree::queue< Task *, boost::lockfree::fixed_sized<false> > LockfreeQueue;
public:
	explicit LockfreeStealingScheduler(uint32 pn_thread_count = 8U);
	~LockfreeStealingScheduler();
	bool start() override;
	bool post(Task * pp_optype) override;
	bool stop() override;
private:
	void loop_running(uint16 pn_index);
private:
	const uint32 kThreadCount;
	LockfreeQueue mc_default_queue; // shared default queue
	LockfreeQueue * mc_queues[kThreadMaxCount]; // one queue per worker thread
	std::thread * mc_threads[kThreadMaxCount]; // the worker threads
	std::atomic<bool> mb_started;
	std::atomic<bool> mb_available;
	std::atomic<bool> mb_destroy;
	std::mutex mo_mutex;
	static thread_local LockfreeQueue * mp_local_queue; // queue owned by the current thread
	static thread_local uint16 mn_local_index; // index of the current thread
};

bool LockfreeStealingScheduler::start()
{
	// This lock guards against several threads calling start() at the same time.
	std::lock_guard<std::mutex> lock(mo_mutex);
	if (mb_started.load() || mb_destroy.load())
	{
		return false;
	}
	mb_started.store(true);
	// Create every queue first, so that each worker can safely steal from its neighbors.
	for (uint32 i = 0; i < kThreadCount; ++i)
	{
		mc_queues[i] = new LockfreeQueue(1024);
	}
	for (uint32 i = 0; i < kThreadCount; ++i)
	{
		mc_threads[i] = new std::thread(std::bind(&LockfreeStealingScheduler::loop_running, this, i));
	}
	mb_available.store(true);
	return true;
}

void LockfreeStealingScheduler::loop_running(uint16 pn_index)
{
	mn_local_index = pn_index;
	mp_local_queue = mc_queues[mn_local_index];
	Task * zf_task = nullptr;
	bool zb_had_task = false;
	while (true)
	{
		// 1. Take from this thread's own queue (reached via thread-local storage).
		zb_had_task = mp_local_queue->pop(zf_task);
		if (!zb_had_task)
		{
			// 2. Take from the shared default queue.
			zb_had_task = mc_default_queue.pop(zf_task);
		}
		if (!zb_had_task)
		{
			// 3. Steal from the next thread's queue.
			const uint16 zn_index = (mn_local_index + 1) % kThreadCount;
			zb_had_task = mc_queues[zn_index]->pop(zf_task);
		}
		if (!zb_had_task)
		{
			// 4. Steal from the previous thread's queue.
			const uint16 zn_index = (mn_local_index + kThreadCount - 1) % kThreadCount;
			zb_had_task = mc_queues[zn_index]->pop(zf_task);
		}
		if (zb_had_task)
		{
			// Got a task: run it.
			zf_task->execute();
			zf_task->done();
		}
		else
		{
			// Nothing to do: back off briefly instead of spinning.
			THIS_SLEEP_MILLISECONDS(1);
		}
		// Exit only after the scheduler was stopped and both the default queue
		// and this thread's own queue have been drained.
		if (!mb_started.load() && mc_default_queue.empty() && mp_local_queue && mp_local_queue->empty())
		{
			break;
		}
	} //while
}
thread_local LockfreeStealingScheduler::LockfreeQueue * LockfreeStealingScheduler::mp_local_queue = nullptr;
thread_local uint16 LockfreeStealingScheduler::mn_local_index = 0;
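
The listing above does not show `post`, `stop` or the constructor. As a rough, self-contained illustration of how the pieces could fit together, the sketch below rebuilds the same probe order with standard-library parts only. Everything in it is an assumption made for illustration, not the author's implementation: the class name `MiniStealingScheduler`, the mutex-guarded `Queue` (standing in for `boost::lockfree::queue`), `std::function` jobs instead of `Task *`, and the rule that `post` picks the calling worker's queue when called from a worker thread and the default queue otherwise.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

class MiniStealingScheduler {
    using Job = std::function<void()>;
    struct Queue {  // hypothetical non-blocking queue guarded by a mutex
        std::mutex mutex;
        std::deque<Job> jobs;
        void push(Job job) {
            std::lock_guard<std::mutex> lock(mutex);
            jobs.push_back(std::move(job));
        }
        bool try_pop(Job& out) {
            std::lock_guard<std::mutex> lock(mutex);
            if (jobs.empty()) return false;
            out = std::move(jobs.front());
            jobs.pop_front();
            return true;
        }
    };
public:
    explicit MiniStealingScheduler(std::size_t thread_count) : queues_(thread_count) {
        // Create all queues before any thread starts, so stealing is always safe.
        for (auto& queue : queues_) queue.reset(new Queue);
        for (std::size_t i = 0; i < thread_count; ++i)
            threads_.emplace_back([this, i] { run(i); });
    }
    ~MiniStealingScheduler() { stop(); }

    // A job posted from one of this scheduler's workers goes to that worker's
    // own queue; a job posted from any other thread goes to the default queue.
    void post(Job job) {
        Queue* target = (local_owner_ == this) ? local_queue_ : &default_queue_;
        target->push(std::move(job));
    }
    // Workers drain the queues they can reach before exiting. Call this from
    // outside the worker threads, after the last external post().
    void stop() {
        running_.store(false);
        for (auto& thread : threads_)
            if (thread.joinable()) thread.join();
    }
private:
    void run(std::size_t index) {
        local_owner_ = this;
        local_queue_ = queues_[index].get();
        const std::size_t n = queues_.size();
        Job job;
        for (;;) {
            // Read the flag before probing: a job posted before stop() is then
            // guaranteed to be seen by the probe that follows.
            const bool stopping = !running_.load();
            const bool got = local_queue_->try_pop(job)            // 1. own queue
                || default_queue_.try_pop(job)                     // 2. default queue
                || queues_[(index + 1) % n]->try_pop(job)          // 3. next worker
                || queues_[(index + n - 1) % n]->try_pop(job);     // 4. previous worker
            if (got) { job(); continue; }
            if (stopping) break;
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }
    Queue default_queue_;
    std::vector<std::unique_ptr<Queue>> queues_;
    std::vector<std::thread> threads_;
    std::atomic<bool> running_{true};
    static thread_local MiniStealingScheduler* local_owner_;
    static thread_local Queue* local_queue_;
};
thread_local MiniStealingScheduler* MiniStealingScheduler::local_owner_ = nullptr;
thread_local MiniStealingScheduler::Queue* MiniStealingScheduler::local_queue_ = nullptr;
```

Jobs posted from outside land in the default queue and are picked up by whichever worker runs dry first. A job that posts follow-up work from inside a worker keeps that work on the same thread's queue, where idle neighbors can still steal it.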