C++ Concurrency in Action, Notes 19: Timed Waits and FP-Style Parallel Programming

时空-大海水

Article link: https://blog.csdn.net/t114211200/article/details/78086090

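Sometimes, when you block waiting for an event, you want to limit how long you wait. The waiting functions that accept a timeout exist for exactly this purpose.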

Timed waits

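A timed wait lets you tell an interactive user or another process that you are "still alive", and it lets you abandon the wait if the user clicks Cancel.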

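You can specify a timeout in two ways: as a duration or as a time point. Most waiting functions provide both forms. The variants that take a duration have the suffix _for, and the variants that take a time point have the suffix _until.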
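As a rough illustration of the two forms, the sketch below waits on a std::future first with a duration and then with a time point. The helper names ready_within and ready_by are made up for this example and do not come from the book.

```cpp
#include <chrono>
#include <future>

// Hypothetical helper: relative timeout, uses the _for variant.
bool ready_within(std::future<int>& f, std::chrono::milliseconds d)
{
    return f.wait_for(d) == std::future_status::ready;
}

// Hypothetical helper: absolute deadline, uses the _until variant.
bool ready_by(std::future<int>& f, std::chrono::steady_clock::time_point deadline)
{
    return f.wait_until(deadline) == std::future_status::ready;
}
```

A caller would pass std::chrono::milliseconds(10) to the first helper, and std::chrono::steady_clock::now() + std::chrono::milliseconds(10) to the second.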

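For example, std::condition_variable has two overloads of wait_for() and two overloads of wait_until(), matching the two overloads of wait(). The first overload returns when the condition variable is notified, when the timeout expires, or on a spurious wake-up. The second overload takes a predicate and returns only when the timeout expires or when the thread has been woken and the predicate returns true.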
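A minimal sketch of the predicate overload follows. The type flag_waiter and its member names are assumptions made for this example. The standard specifies the predicate form of wait_for() in terms of wait_until() with a deadline computed once, so spurious wake-ups do not restart the timer.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical type, for illustration only.
struct flag_waiter
{
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    void set_done()
    {
        {
            std::lock_guard<std::mutex> lk(m);
            done = true;
        }
        cv.notify_all();
    }

    // Returns the final value of the predicate:
    // true if done was set before the limit expired, false otherwise.
    bool wait_done_for(std::chrono::milliseconds limit)
    {
        std::unique_lock<std::mutex> lk(m);
        return cv.wait_for(lk, limit, [this] { return done; });
    }
};
```

With this overload no explicit loop is needed, because the library re-checks the predicate after every wake-up.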

If you want to wait on a condition_variable for at most 500 milliseconds, the recommended usage is as follows:

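#include <chrono>
#include <condition_variable>
#include <mutex>

std::condition_variable cv;
bool done;      // set to true by another thread while holding m
std::mutex m;

bool wait_loop()
{
	// Compute the deadline once, so every wake-up is measured against the same time point.
	auto const timeout = std::chrono::steady_clock::now() + std::chrono::milliseconds(500);
	std::unique_lock<std::mutex> lk(m);
	while (!done)
	{
		// A spurious wake-up just loops again; the deadline does not move.
		if (cv.wait_until(lk, timeout) == std::cv_status::timeout)
			break;
	}
	return done;
}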

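If you wait on a condition_variable without supplying a predicate, this is the recommended way to do a timed wait, because the total duration of the loop is bounded.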

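As section 4.1.1 explained, if you do not pass a predicate you need a loop to deal with spurious wake-ups. If you call wait_for() inside that loop, each spurious wake-up restarts the timer, so the total wait can end up longer than intended. The following program illustrates this: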

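#include <chrono>
#include <cstdlib>
#include <iostream>

// Replaces wait_loop() from the previous listing; cv, done and m are the same globals.
bool wait_loop()
{
	std::unique_lock<std::mutex> lk(m);
	while (!done)
	{
		// Every wake-up that is not a timeout restarts the full 500 ms wait.
		if (cv.wait_for(lk, std::chrono::milliseconds(500)) == std::cv_status::timeout)
			break;
	}
	return done;
}

// Times one call to wait_loop() and prints the elapsed seconds.
void f()
{
    auto start = std::chrono::steady_clock::now();
    wait_loop();
    auto stop = std::chrono::steady_clock::now();
    std::cout << std::chrono::duration<double>(stop - start).count() << std::endl;
}

int main()
{
    for (int i = 0; i < 5; ++i)
    {
        f();
    }
    std::system("pause");   // Windows-only: keeps the console window open
    return 0;
}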

The output is:

1.00009
1.00016
0.50088
1.00021
0.500858
Press any key to continue . . .

The output shows that the wait did not always last 500 milliseconds: in 3 of the 5 runs it took twice as long.

Functions that accept timeouts

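std::this_thread::sleep_for() and std::this_thread::sleep_until() are the simplest timed waits: the thread just sleeps without waiting for any event. sleep_until() is useful for doing something at a specific time, for example making a thread wait until the next frame refresh during video playback.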
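The frame-refresh idea can be sketched as below. The helper run_frames and its parameters are assumptions made for this example. It advances an absolute deadline by one period per iteration, so the time spent inside the callback does not accumulate as drift.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: calls on_frame(i) for i = 0 .. frames-1, one call per period.
template <typename Callback>
void run_frames(int frames, std::chrono::milliseconds period, Callback on_frame)
{
    auto next = std::chrono::steady_clock::now();
    for (int i = 0; i < frames; ++i)
    {
        on_frame(i);
        next += period;                        // absolute time of the next frame
        std::this_thread::sleep_until(next);   // sleep until that time point
    }
}
```

Using sleep_for(period) instead would add the callback's own running time to every iteration.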

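The plain std::mutex and std::recursive_mutex do not support timed locking, but std::timed_mutex and std::recursive_timed_mutex provide the member functions try_lock_for() and try_lock_until(). Condition variables and futures also have timed waiting member functions; for a std::promise or std::packaged_task, the timed wait is done on the associated future.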

Using synchronization of operations to simplify code

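One way to simplify code is to take a more functional approach to synchronization. Instead of sharing data directly between threads, give each task the data it needs and pass its result to other threads through a future.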
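A minimal sketch of this idea: the task below receives its own copy of the data and hands back its result through a future, so no mutex is needed. The functions sum_of and start_sum are made-up names for this example.

```cpp
#include <future>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical task: works only on its own argument, touches no shared state.
long long sum_of(std::vector<int> values)
{
    return std::accumulate(values.begin(), values.end(), 0LL);
}

// Moves the data into the task and returns a future for the result.
std::future<long long> start_sum(std::vector<int> values)
{
    return std::async(std::launch::async, sum_of, std::move(values));
}
```

The caller keeps the returned future and calls get() when it needs the value, for example auto f = start_sum({1, 2, 3, 4}); followed later by f.get().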

Functional programming with futures

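The term functional programming (FP) refers to a style of programming in which the result of a function call depends only on its arguments and not on any external state. This matches the mathematical notion of a function: calling the same function twice with the same arguments must give the same result. Many mathematical functions in the C++ standard library have this property, for example sin, cos and sqrt, as do simple operations on basic types, such as 3+3, 6*9 and 1.3/4.7. A pure function also does not modify any external data; its effects are limited to its return value.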

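C++ is a multiparadigm language, and it has been perfectly possible to write FP-style programs in it since C++98. C++11 made this easier with lambda expressions, std::bind (incorporated from Boost and TR1), and automatic type deduction. Futures are the final piece that makes FP-style concurrency viable in C++: a future can be passed between threads so that one computation can depend on the result of another without explicit access to shared data.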

FP-style quicksort

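To illustrate FP-style concurrency with futures, let's look at a simple quicksort. The basic idea is simple: given a list of values, take one element as the pivot and split the list into two parts, the elements less than the pivot and the elements not less than it. Then build a sorted copy of the list: first the sorted elements less than the pivot, then the pivot, then the sorted elements not less than the pivot.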

The FP-style implementation of this algorithm is listed below. It returns a sorted copy of the list rather than sorting the container in place as std::sort() does:

#include <algorithm>
#include <list>
#include <utility>

template<typename T>
std::list<T> sequential_quick_sort(std::list<T> input)
{
	if (input.empty())
	{
		return input;
	}
	std::list<T> result;
	// Move the first element into result; it serves as the pivot.
	result.splice(result.begin(), input, input.begin());
	T const& pivot = *result.begin();
	// Reorder input so that the elements less than the pivot come first.
	auto divide_point = std::partition(input.begin(), input.end(), [&](T const& t) {return t<pivot; });
	std::list<T> lower_part;
	lower_part.splice(lower_part.end(), input, input.begin(), divide_point);
	// Sort both parts recursively, moving the lists to avoid copies.
	auto new_lower(sequential_quick_sort(std::move(lower_part)));
	auto new_higher(sequential_quick_sort(std::move(input)));
	// Assemble: sorted lower part, pivot, sorted higher part.
	result.splice(result.end(), new_higher);
	result.splice(result.begin(), new_lower);
	return result;
}

The only point that needs explaining is that moving each list with std::move avoids the cost of copying it.

FP-style parallel quicksort

Because the code above is already written in an FP style, it is easy to convert into a parallel version that uses futures:

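#include <algorithm>
#include <future>
#include <list>
#include <utility>

template<typename T>
std::list<T> parallel_quick_sort(std::list<T> input)
{
	if (input.empty())
		return input;
	std::list<T> result;
	// Move the first element into result; it serves as the pivot.
	result.splice(result.begin(), input, input.begin());
	T const& pivot = *result.begin();
	auto divide_point = std::partition(input.begin(), input.end(), [&](T const& t) {return t<pivot; });
	std::list<T> lower_part;
	lower_part.splice(lower_part.end(), input, input.begin(), divide_point);
	// Changed: the lower part is sorted through std::async, possibly on another thread.
	std::future<std::list<T> > new_lower(std::async(parallel_quick_sort<T>, std::move(lower_part)));
	// The higher part is still sorted by direct recursion on the current thread.
	auto new_higher(parallel_quick_sort(std::move(input)));
	result.splice(result.end(), new_higher);
	// get() waits for the asynchronous sort and returns its result.
	result.splice(result.begin(), new_lower.get());
	return result;
}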

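The biggest change is that the lower part is now sorted on another thread. If std::async starts a new thread every time, then recursing three levels deep gives eight threads running at once, and recursing ten levels deep gives 1024 threads, provided the hardware can handle it. If the library decides that too many tasks have been spawned, it may switch to running new tasks synchronously, that is, in the thread that calls get(). If you do not specify a launch policy for std::async, check your implementation's documentation to see what the default behavior is.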
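The effect of an explicit launch policy can be checked with a small sketch like the one below. The helper runs_on_other_thread is a made-up name for this example. It relies on the standard behavior that std::launch::async runs the task as if on a new thread, while std::launch::deferred runs it on the thread that calls get().

```cpp
#include <future>
#include <thread>

// Hypothetical helper: reports whether a task launched with `policy`
// executes on a thread other than the caller's.
bool runs_on_other_thread(std::launch policy)
{
    std::thread::id caller = std::this_thread::get_id();
    std::future<std::thread::id> f =
        std::async(policy, [] { return std::this_thread::get_id(); });
    return f.get() != caller;
}
```

Passing std::launch::async | std::launch::deferred, which is also the default, leaves the choice to the implementation.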

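You can also use std::packaged_task and std::thread instead of std::async. This has no clear advantage by itself and may even cost more, but it makes it easy to hand complex tasks over to a pool of worker threads through a queue. Thread pools are covered in chapter 9.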

It is probably worth going this way in preference to std::async only if you really know what you are doing and want complete control over how the thread pool is built and how it executes tasks.
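As a rough sketch of the packaged_task and thread alternative, the helper below wraps a callable in a std::packaged_task, runs it on a detached thread, and returns the associated future. The name spawn_task and the single-argument signature are assumptions made for this example; a thread pool would push the task onto a queue instead of starting a thread.

```cpp
#include <future>
#include <thread>
#include <utility>

// Hypothetical helper: runs f(a) on a new thread and returns a future for the result.
template <typename F, typename A>
auto spawn_task(F f, A a) -> std::future<decltype(f(std::move(a)))>
{
    using result_type = decltype(f(std::move(a)));
    std::packaged_task<result_type(A)> task(std::move(f));
    std::future<result_type> result = task.get_future();
    // The task and its argument are moved into the thread.
    std::thread worker(std::move(task), std::move(a));
    worker.detach();   // the future, not the thread handle, is used to wait
    return result;
}
```

A call such as spawn_task([](int x) { return x * 2; }, 21) returns a std::future<int>, and get() on it blocks until the worker has produced the value.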

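Even assuming that std::async makes optimal use of the available hardware concurrency, the algorithm above is still not an ideal parallel implementation of quicksort. For one thing, std::partition does a lot of the work, and it is still a sequential call. It is good enough for now; if you are interested in faster parallel algorithms, consult the academic literature.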