author:
- luixiao1223
title: chapter04
Using synchronization of operations to simplify code
Functional programming with futures
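The idea behind functional programming is that a function does not depend on external state. Its result depends only on its arguments, so the same arguments always produce the same result. With futures, threads exchange data only through arguments and future results, which makes a functional style practical for concurrent code.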
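A minimal sketch of this idea, assuming a hypothetical pure function `sum_of_squares` (not from the book): the task touches no shared state, its input is passed by value, and its result comes back only through the future.

```cpp
#include <cassert>
#include <future>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical pure function: the result depends only on the argument.
long sum_of_squares(std::vector<int> values)
{
    return std::accumulate(values.begin(), values.end(), 0L,
        [](long acc, int v) { return acc + static_cast<long>(v) * v; });
}

// Runs two independent computations. Data flows only through
// function arguments and the future's result, never through shared variables.
long combined_sum_of_squares(std::vector<int> a, std::vector<int> b)
{
    std::future<long> first =
        std::async(std::launch::async, sum_of_squares, std::move(a));
    long const second = sum_of_squares(std::move(b));
    return first.get() + second;
}
```

Calling `combined_sum_of_squares({1,2,3},{4})` from `main` should give the same value as summing the squares sequentially, since neither task can observe the other.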
FP-STYLE QUICKSORT
template<typename T>
std::list<T> sequential_quick_sort(std::list<T> input)
{
if(input.empty())
{
return input;
}
std::list<T> result;
result.splice(result.begin(),input,input.begin()); //<--1
T const& pivot=*result.begin(); //<--2
auto divide_point=std::partition(input.begin(),input.end(),
[&](T const& t){return t<pivot;}); //<--3
std::list<T> lower_part;
lower_part.splice(lower_part.end(),input,input.begin(),
divide_point); //<--4
auto new_lower(
sequential_quick_sort(std::move(lower_part))); //<--5
auto new_higher(
sequential_quick_sort(std::move(input))); //<--6
result.splice(result.end(),new_higher); //<--7
result.splice(result.begin(),new_lower); //<--8
return result;
} // This algorithm crashes on inputs of around 100,000 elements because the recursion gets too deep.
FP-STYLE PARALLEL QUICKSORT
template<typename T>
std::list<T> parallel_quick_sort(std::list<T> input)
{
if(input.empty())
{
return input;
}
std::list<T> result;
result.splice(result.begin(),input,input.begin());
T const& pivot=*result.begin();
auto divide_point=std::partition(input.begin(),input.end(),
[&](T const& t){return t<pivot;});
std::list<T> lower_part;
lower_part.splice(lower_part.end(),input,input.begin(),
divide_point);
std::future<std::list<T> > new_lower( //<--1
std::async(&parallel_quick_sort<T>,std::move(lower_part)));
auto new_higher(
parallel_quick_sort(std::move(input))); //<--2
result.splice(result.end(),new_higher); //<--3
result.splice(result.begin(),new_lower.get()); //<--4
return result;
}
spawn_task
template<typename F,typename A>
std::future<typename std::result_of<F(A&&)>::type>
spawn_task(F&& f,A&& a)
{
typedef typename std::result_of<F(A&&)>::type result_type;
std::packaged_task<result_type(A&&)> task(std::move(f));
std::future<result_type> res(task.get_future());
std::thread t(std::move(task),std::move(a));
t.detach();
return res;
}
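Notes:
- For small inputs the single-threaded version is faster. The multithreaded version only pulls ahead at around 200,000 elements.
- Spawning too many threads exhausts system resources.
- Functional programming is not the only concurrent programming paradigm. Another is CSP (Communicating
Sequential Processes), the model adopted by Erlang and by the MPI (Message
Passing Interface) environment.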
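The first two notes suggest limiting when a new thread is actually spawned. The following is a sketch of one way to do that, not the book's code: `bounded_parallel_quick_sort`, `parallel_cutoff` and `max_spawn_depth` are hypothetical names, and the cutoff values are placeholders that would need measuring on real hardware.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <list>
#include <utility>

// Hypothetical tuning knobs; measure before trusting these values.
constexpr std::size_t parallel_cutoff = 10000; // smaller partitions stay on this thread
constexpr unsigned max_spawn_depth = 4;        // fewer than 2^4 spawned tasks in total

template<typename T>
std::list<T> bounded_parallel_quick_sort(std::list<T> input, unsigned depth)
{
    if(input.size() < 2)
    {
        return input;
    }
    std::list<T> result;
    result.splice(result.begin(), input, input.begin());
    T const& pivot = *result.begin();
    auto divide_point = std::partition(input.begin(), input.end(),
        [&](T const& t){ return t < pivot; });
    std::list<T> lower_part;
    lower_part.splice(lower_part.end(), input, input.begin(), divide_point);

    // Spawn a thread only for large partitions near the top of the recursion.
    // Otherwise defer, so that get() sorts the lower part on the calling thread.
    bool const spawn =
        depth < max_spawn_depth && lower_part.size() >= parallel_cutoff;
    std::future<std::list<T>> new_lower(std::async(
        spawn ? std::launch::async : std::launch::deferred,
        &bounded_parallel_quick_sort<T>, std::move(lower_part), depth + 1));
    auto new_higher(bounded_parallel_quick_sort(std::move(input), depth + 1));
    result.splice(result.end(), new_higher);
    result.splice(result.begin(), new_lower.get());
    return result;
}
```

Call it as `bounded_parallel_quick_sort(std::move(my_list), 0)`. This only limits thread creation. The recursion depth problem noted for the sequential version is unchanged.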
Synchronizing operations with message passing
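The principle behind CSP is simple (no shared data):
- Threads communicate only through messages.
- Each thread runs independently and forms a finite state machine.
- Strictly speaking there is still shared data: the message queue itself. The queue can be encapsulated, though, so that it is transparent to the threads.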
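The last point can be sketched as follows. This is a deliberately small stand-in, not the book's `messaging` library: `message_queue` and `count_digits_until_done` are hypothetical names, and messages are plain strings to keep the example short. The only shared data is the queue, and it is hidden behind `send` and `wait_and_pop`.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical minimal queue: the mutex and condition variable are private,
// so client threads only ever see send() and wait_and_pop().
template<typename Message>
class message_queue
{
    std::mutex m;
    std::condition_variable cv;
    std::queue<Message> q;
public:
    void send(Message msg)
    {
        {
            std::lock_guard<std::mutex> lk(m);
            q.push(std::move(msg));
        }
        cv.notify_one();
    }
    Message wait_and_pop()
    {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this]{ return !q.empty(); });
        Message msg = std::move(q.front());
        q.pop();
        return msg;
    }
};

// The receiver thread is a tiny state machine: it counts "digit" messages
// until a "done" message tells it to stop.
int count_digits_until_done()
{
    message_queue<std::string> incoming;
    int digits = 0;
    std::thread receiver([&]{
        for(;;)
        {
            std::string const msg = incoming.wait_and_pop();
            if(msg == "done") break;
            if(msg == "digit") ++digits;
        }
    });
    for(int i = 0; i < 4; ++i) incoming.send("digit");
    incoming.send("done");
    receiver.join(); // join() makes the receiver's writes visible here
    return digits;
}
```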
ATM example
struct card_inserted
{
std::string account;
};
class atm
{
messaging::receiver incoming;
messaging::sender bank;
messaging::sender interface_hardware;
void (atm::*state)();
std::string account;
std::string pin;
void waiting_for_card()
{
interface_hardware.send(display_enter_card());
incoming.wait()
.handle<card_inserted>(
[&](card_inserted const& msg)
{
account=msg.account;
pin="";
interface_hardware.send(display_enter_pin());
state=&atm::getting_pin;
}
);
}
void getting_pin();
public:
void run()
{
state=&atm::waiting_for_card;
try
{
for(;;)
{
(this->*state)();
}
}
catch(messaging::close_queue const&)
{
}
}
};
The state machine for this ATM logic runs on a single thread, with other
parts of the system such as the interface to the bank and the terminal
interface running on separate threads. This style of program design is
called the Actor model.
void atm::getting_pin()
{
incoming.wait()
.handle<digit_pressed>(
[&](digit_pressed const& msg)
{
unsigned const pin_length=4;
pin+=msg.digit;
if(pin.length()==pin_length)
{
bank.send(verify_pin(account,pin,incoming));
state=&atm::verifying_pin;
}
}
)
.handle<clear_last_pressed>(
[&](clear_last_pressed const& msg)
{
if(!pin.empty())
{
pin.resize(pin.length()-1);
}
}
)
.handle<cancel_pressed>(
[&](cancel_pressed const& msg)
{
state=&atm::done_processing;
}
);
}
Continuation-style concurrency with the Concurrency TS
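The Concurrency Technical Specification (TS) provides experimental facilities that can greatly simplify code.
std::experimental::future<int> find_the_answer();
auto fut=find_the_answer();
auto fut2=fut.then(find_the_question); // Once then is called, the original fut is invalid (it has been moved from); fut2 is now the valid future.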
assert(!fut.valid());
assert(fut2.valid());
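Note: you cannot pass extra arguments to the continuation given to then. It always receives the ready future as its only parameter. For example: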
std::string find_the_question(std::experimental::future<int> the_answer);
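Since `std::experimental::future` is not available in every standard library, the calling convention can be illustrated with plain `std::future`. The free function `then` below is a hypothetical helper written for this note, not the TS implementation: it simply starts a new task that waits for the input and then hands the ready future itself to the continuation.

```cpp
#include <cassert>
#include <future>
#include <string>
#include <utility>

// Hypothetical emulation of .then() on top of std::future (requires C++14).
// The continuation receives the ready future, not the value inside it.
template<typename T, typename Func>
auto then(std::future<T> input, Func func)
    -> std::future<decltype(func(std::move(input)))>
{
    return std::async(std::launch::async,
        [](std::future<T> ready, Func f) {
            ready.wait();
            return f(std::move(ready));
        },
        std::move(input), std::move(func));
}

int find_the_answer() { return 42; }

// Same shape as the declaration above: the single parameter is a future.
std::string find_the_question(std::future<int> the_answer)
{
    return "answer=" + std::to_string(the_answer.get());
}

std::string run_chain()
{
    std::future<int> fut = std::async(std::launch::async, find_the_answer);
    std::future<std::string> fut2 = then(std::move(fut), find_the_question);
    assert(!fut.valid());  // fut was moved into the continuation
    assert(fut2.valid());
    return fut2.get();
}
```

Unlike the TS version, this emulation occupies a thread while waiting for the input future, so it shows only the interface, not the efficiency benefit.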
A spawn function equivalent to std::async for Concurrency TS futures
template<typename Func>
std::experimental::future<decltype(std::declval<Func>()())>
spawn_async(Func&& func){
std::experimental::promise<
decltype(std::declval<Func>()())> p;
auto res=p.get_future();
std::thread t(
[p=std::move(p),f=std::decay_t<Func>(func)]()
mutable{
try{
p.set_value_at_thread_exit(f());
} catch(...){
p.set_exception_at_thread_exit(std::current_exception());
}
});
t.detach();
return res;
}
- In C++11, decltype deduces the type of a variable or expression, and that type can be used directly to declare another variable:
int tempA = 2; decltype(tempA) dclTempA;
- std::declval (C++11) is a function template with the following prototype:
template<class T> typename std::add_rvalue_reference<T>::type declval() noexcept;
It returns std::add_rvalue_reference<T>::type, that is, T&& unless T is an lvalue reference.
Converts any type T to a reference type, making it possible to use
member functions in decltype expressions without the need to go
through constructors. declval is commonly used in templates where acceptable template
parameters may have no constructor in common, but have the same
member function whose return type is needed. Note that declval can only be used in unevaluated contexts and is
not required to be defined; it is an error to evaluate an expression
that contains this function. Formally, the program is ill-formed if
this function is odr-used.
- std::decay_t (C++14) is an alias template:
template< class T > using decay_t = typename decay<T>::type;
where decay<T> is declared as:
template< class T > struct decay;
Applies lvalue-to-rvalue, array-to-pointer, and function-to-pointer
implicit conversions to the type T, removes cv-qualifiers, and
defines the resulting type as the member typedef type.
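The three facilities above can be checked at compile time. In this sketch, `no_default` and `call_value` are hypothetical names introduced only for illustration. The `static_assert` lines compile only if the stated type relationships hold.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical type with no default constructor.
struct no_default
{
    explicit no_default(int) {}
    double value() const { return 1.5; }
};

// decltype yields the declared type of a variable.
int temp_a = 2;
static_assert(std::is_same<decltype(temp_a), int>::value,
              "decltype of an int variable is int");

// declval lets decltype name a member function's return type
// without constructing a no_default object.
static_assert(
    std::is_same<decltype(std::declval<no_default>().value()), double>::value,
    "value() returns double");

// decay_t strips references and cv-qualifiers and turns arrays and
// functions into pointers, like passing an argument by value.
static_assert(std::is_same<std::decay_t<int const&>, int>::value, "");
static_assert(std::is_same<std::decay_t<int[3]>, int*>::value, "");
static_assert(std::is_same<std::decay_t<void(int)>, void(*)(int)>::value, "");

// Uses the deduced return type to declare a local variable.
double call_value()
{
    decltype(std::declval<no_default>().value()) v = no_default(0).value();
    return v;
}
```

This is why `spawn_async` uses `decltype(std::declval<Func>()())` for the result type and `std::decay_t<Func>` for the stored copy of the callable.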
Chaining continuations
void process_login(std::string const& username,std::string const& password)
{
try {
user_id const id=backend.authenticate_user(username,password);
user_data const info_to_display=backend.request_current_info(id);
update_display(info_to_display);
} catch(std::exception& e){
display_error(e);
}
}
Without multithreading, the code above blocks UI updates, so you might use the following code to handle the login instead:
std::future<void> process_login(
std::string const& username,std::string const& password)
{
return std::async(std::launch::async,[=]()
{
try {
user_id const id=backend.authenticate_user(username,password);
user_data const info_to_display=
backend.request_current_info(id);
update_display(info_to_display);
} catch(std::exception& e){
display_error(e);
}
});
}
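How does the code above differ from the code below? In the version above, every call to process_login starts a thread that then blocks on each backend call in turn, so many logins produce many threads that do nothing but wait. The version below chains the steps as continuations instead, so each step runs only once the previous result is ready.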
std::experimental::future<void> process_login(
std::string const& username,std::string const& password)
{
return spawn_async([=](){
return backend.authenticate_user(username,password);
}).then([](std::experimental::future<user_id> id){
return backend.request_current_info(id.get());
}).then([](std::experimental::future<user_data> info_to_display){
try{
update_display(info_to_display.get());
} catch(std::exception& e){
display_error(e);
}
});
}
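The following revised version is simpler still: the backend calls themselves return futures, so no thread has to block waiting on them.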
std::experimental::future<void> process_login(
std::string const& username,std::string const& password)
{
return backend.async_authenticate_user(username,password).then(
[](std::experimental::future<user_id> id){
return backend.async_request_current_info(id.get());
}).then([](std::experimental::future<user_data> info_to_display){
try{
update_display(info_to_display.get());
} catch(std::exception& e){
display_error(e);
}
});
}
With C++14 you can use a generic lambda and write it like this:
return backend.async_authenticate_user(username,password).then(
[](auto id){
return backend.async_request_current_info(id.get());
});
Attaching multiple continuations to the same future:
auto fut=spawn_async(some_function).share();
auto fut2=fut.then([](std::experimental::shared_future<some_data> data){
do_stuff(data);
});
auto fut3=fut.then([](std::experimental::shared_future<some_data> data){
return do_other_stuff(data);
});
Note that even when then is called on a shared_future, the returned object is not a shared_future
but a std::experimental::future.
Waiting for more than one future
Spawn multiple futures, then create a waiting task that waits for each of them in order.
std::future<FinalResult> process_data(std::vector<MyData>& vec)
{
size_t const chunk_size=whatever;
std::vector<std::future<ChunkResult>> results;
for(auto begin=vec.begin(),end=vec.end();begin!=end;){
size_t const remaining_size=end-begin;
size_t const this_chunk_size=std::min(remaining_size,chunk_size);
results.push_back(std::async(process_chunk,begin,begin+this_chunk_size));
begin+=this_chunk_size;
}
return std::async([all_results=std::move(results)]() mutable { // mutable: get() is non-const
std::vector<ChunkResult> v;
v.reserve(all_results.size());
for(auto& f: all_results)
{
v.push_back(f.get());
}
return gather_results(v);
});
}
Using std::experimental::when_all. There is also a when_any interface.
std::experimental::future<FinalResult> process_data(
std::vector<MyData>& vec)
{
size_t const chunk_size=whatever;
std::vector<std::experimental::future<ChunkResult>> results;
for(auto begin=vec.begin(),end=vec.end();begin!=end;){
size_t const remaining_size=end-begin;
size_t const this_chunk_size=std::min(remaining_size,chunk_size);
results.push_back(
spawn_async(
process_chunk,begin,begin+this_chunk_size));
begin+=this_chunk_size;
}
return std::experimental::when_all(
results.begin(),results.end()).then(
[](std::experimental::future<std::vector<
std::experimental::future<ChunkResult>>> ready_results)
{
std::vector<std::experimental::future<ChunkResult>>
all_results=ready_results.get();
std::vector<ChunkResult> v;
v.reserve(all_results.size());
for(auto& f: all_results)
{
v.push_back(f.get());
}
return gather_results(v);
});
}
Waiting for the first future in a set with when_any
std::experimental::future<FinalResult>
find_and_process_value(std::vector<MyData> &data)
{
unsigned const concurrency = std::thread::hardware_concurrency();
unsigned const num_tasks = (concurrency > 0) ? concurrency : 2;
std::vector<std::experimental::future<MyData *>> results;
auto const chunk_size = (data.size() + num_tasks - 1) / num_tasks;
auto chunk_begin = data.begin();
std::shared_ptr<std::atomic<bool>> done_flag = std::make_shared<std::atomic<bool>>(false);
for (unsigned i = 0; i < num_tasks; ++i) {
auto chunk_end = (i < (num_tasks - 1)) ? chunk_begin + chunk_size : data.end();
results.push_back(
spawn_async([=] {
for (auto entry = chunk_begin;
!*done_flag && (entry != chunk_end);
++entry) {
if (matches_find_criteria(*entry)) {
*done_flag = true;
return &*entry;
}
}
return (MyData *)nullptr;
}));
chunk_begin = chunk_end;
}
std::shared_ptr<std::experimental::promise<FinalResult>> final_result =
std::make_shared<std::experimental::promise<FinalResult>>();
struct DoneCheck {
std::shared_ptr<std::experimental::promise<FinalResult>> final_result;
DoneCheck(
std::shared_ptr<std::experimental::promise<FinalResult>> final_result_)
: final_result(std::move(final_result_)) {}
void operator()(
std::experimental::future<std::experimental::when_any_result<
std::vector<std::experimental::future<MyData *>>>>
results_param) {
auto results = results_param.get();
MyData *const ready_result = results.futures[results.index].get();
if (ready_result)
final_result->set_value(
process_found_value(*ready_result));
else {
results.futures.erase(results.futures.begin() + results.index);
if (!results.futures.empty()) {
std::experimental::when_any(
results.futures.begin(), results.futures.end())
.then(std::move(*this));
} else {
final_result->set_exception(
std::make_exception_ptr(
std::runtime_error("Not found")));
}
}
}
};
std::experimental::when_any(results.begin(), results.end())
.then(DoneCheck(final_result));
return final_result->get_future();
}
when_all
takes its futures by value, so you must move them in explicitly.
std::experimental::future<int> f1=spawn_async(func1);
std::experimental::future<std::string> f2=spawn_async(func2);
std::experimental::future<double> f3=spawn_async(func3);
std::experimental::future<
std::tuple<
std::experimental::future<int>,
std::experimental::future<std::string>,
std::experimental::future<double>>> result=
std::experimental::when_all(std::move(f1),std::move(f2),std::move(f3));
Latches and barriers in the Concurrency TS
latch
- Holds a counter.
- When the counter reaches 0, all waiting code can continue.
- Cannot be reset. Once it becomes ready, it stays ready forever.
barrier
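- Each thread stops when it reaches the barrier.
- Once all n threads have reached the barrier, they all resume execution.
- The barrier then resets and can be used again.
- A barrier also has a count.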
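The latch behaviour listed above (a counter, waiters released at zero, no reset) can be sketched with a mutex and a condition variable. `simple_latch` and `sum_after_latch` are hypothetical names for illustration only, not the TS type, and this sketch omits `count_down_and_wait`.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a latch: single-use, counts down to zero.
class simple_latch
{
    std::mutex m;
    std::condition_variable cv;
    std::size_t count;
public:
    explicit simple_latch(std::size_t initial) : count(initial) {}
    void count_down()
    {
        std::lock_guard<std::mutex> lk(m);
        if(count > 0 && --count == 0) cv.notify_all();
    }
    void wait()
    {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this]{ return count == 0; });
    }
    bool is_ready()
    {
        std::lock_guard<std::mutex> lk(m);
        return count == 0; // once true, it stays true: there is no reset
    }
};

// Each worker fills its own slot, then counts down. The caller reads the
// slots only after wait() returns, so every write is visible to it.
int sum_after_latch(unsigned thread_count)
{
    simple_latch done(thread_count);
    std::vector<int> data(thread_count, 0);
    std::vector<std::thread> threads;
    for(unsigned i = 0; i < thread_count; ++i)
        threads.emplace_back([&, i]{
            data[i] = static_cast<int>(i) + 1;
            done.count_down();
        });
    done.wait();
    int sum = 0;
    for(int v : data) sum += v;
    for(auto& t : threads) t.join();
    return sum;
}
```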
A basic latch type: std::experimental::latch
void foo(){
unsigned const thread_count=...;
std::experimental::latch done(thread_count);
my_data data[thread_count];
std::vector<std::future<void> > threads;
for(unsigned i=0;i<thread_count;++i)
threads.push_back(
std::async(std::launch::async,
[&,i]{
data[i]=make_data(i);
done.count_down();
do_more_stuff();
}));
done.wait();
process_data(data,thread_count);
}
latch also provides is_ready and count_down_and_wait.
std::experimental::barrier
a basic barrier
There are two kinds of barrier:
- std::experimental::barrier
- std::experimental::flex_barrier
result_chunk process(data_chunk);
std::vector<data_chunk>
divide_into_chunks(data_block data, unsigned num_threads);
void process_data(data_source &source, data_sink &sink) {
unsigned const concurrency = std::thread::hardware_concurrency();
unsigned const num_threads = (concurrency > 0) ? concurrency : 2;
std::experimental::barrier sync(num_threads);
std::vector<joining_thread> threads(num_threads);
std::vector<data_chunk> chunks;
result_block result;
for (unsigned i = 0; i < num_threads; ++i) {
threads[i] = joining_thread(
[&, i]{
while (!source.done()) {
if (!i) {
data_block current_block =source.get_next_data_block();
chunks = divide_into_chunks(current_block, num_threads);
}
sync.arrive_and_wait();
result.set_chunk(i, num_threads, process(chunks[i]));
sync.arrive_and_wait();
if (!i) {
sink.write_data(std::move(result));
}
}
});
}
}
Useful member functions of barrier:
arrive_and_wait
arrive_and_drop
With arrive_and_drop the calling thread drops out of the barrier, and the barrier's count is decremented by 1.
std::experimental::flex_barrier
flexible friend
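This more flexible barrier can carry a completion function, which can greatly simplify code. The completion function's return value controls how many threads the next cycle waits for:
- -1: the count is unchanged.
- 0: no threads wait in the next cycle.
- n>0: the count for the next cycle is reset to n.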
void process_data(data_source &source, data_sink &sink) {
unsigned const concurrency = std::thread::hardware_concurrency();
unsigned const num_threads = (concurrency > 0) ? concurrency : 2;
std::vector<data_chunk> chunks;
auto split_source = [&] {
if (!source.done()) {
data_block current_block = source.get_next_data_block();
chunks = divide_into_chunks(current_block, num_threads);
}
};
split_source();
result_block result;
std::experimental::flex_barrier sync(num_threads,
[&] {
sink.write_data(std::move(result));
split_source();
return -1;
});
std::vector<joining_thread> threads(num_threads);
for (unsigned i = 0; i < num_threads; ++i) {
threads[i] = joining_thread([&, i] {
while (!source.done()) {
result.set_chunk(i, num_threads, process(chunks[i]));
sync.arrive_and_wait();
}
});
}
}
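To make the completion-function mechanism concrete, here is a sketch of a reusable barrier whose completion function runs on exactly one thread per cycle and whose return value sets the next cycle's count as described above. `simple_flex_barrier` and `run_rounds` are hypothetical names. This is an illustration under those assumptions, not the TS implementation, and it omits `arrive_and_drop`.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a flex barrier with a completion function.
class simple_flex_barrier
{
    std::mutex m;
    std::condition_variable cv;
    std::ptrdiff_t expected;      // threads per cycle
    std::ptrdiff_t remaining;     // threads still to arrive in this cycle
    unsigned long generation = 0; // incremented each time a cycle completes
    std::function<std::ptrdiff_t()> completion;
public:
    simple_flex_barrier(std::ptrdiff_t n, std::function<std::ptrdiff_t()> f)
        : expected(n), remaining(n), completion(std::move(f)) {}

    void arrive_and_wait()
    {
        std::unique_lock<std::mutex> lk(m);
        if(--remaining == 0)
        {
            // Last arriver: run the completion function while the others wait.
            std::ptrdiff_t const next = completion();
            if(next >= 0) expected = next; // -1 keeps the current count
            remaining = expected;
            ++generation;
            cv.notify_all();
        }
        else
        {
            unsigned long const my_generation = generation;
            cv.wait(lk, [&]{ return generation != my_generation; });
        }
    }
};

// Each thread sets its own slot every round. The completion function
// gathers and clears the slots once per round, then returns -1.
int run_rounds(unsigned num_threads, int rounds)
{
    std::vector<int> slots(num_threads, 0);
    int total = 0;
    simple_flex_barrier sync(num_threads, [&]() -> std::ptrdiff_t {
        for(int& s : slots) { total += s; s = 0; }
        return -1;
    });
    std::vector<std::thread> threads;
    for(unsigned i = 0; i < num_threads; ++i)
        threads.emplace_back([&, i]{
            for(int r = 0; r < rounds; ++r)
            {
                slots[i] = 1;
                sync.arrive_and_wait();
            }
        });
    for(auto& t : threads) t.join();
    return total;
}
```

Because the completion function runs while every other thread is blocked in `arrive_and_wait`, it can touch `slots` and `total` without extra locking, which is the same property the `process_data` listing relies on for `result` and `chunks`.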