Data Exchange Between Threads in C++

碎步の流年

Article link: https://blog.csdn.net/qq_24649627/article/details/112557135

This post is based on the blog post https://blog.csdn.net/hai008007/article/details/80246437, reorganized and revised.

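Threads within the same process inevitably need to exchange data. Queues and shared data are the two common ways to do this. A well-encapsulated queue is comparatively hard to misuse, whereas shared data is the most basic approach and also the most error-prone, because it can lead to data races: more than one thread tries to access the same resource at the same time, for example by reading and writing the same block of memory, as in the following example: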

#include <iostream>
#include <thread>
 
using namespace std;
 
#define COUNT 10000
 
void inc(int *p){
    for(int i = 0; i < COUNT; i++){
        (*p)++;
    }
}
 
int main()
{
    int a = 0;
    
    thread ta(inc, &a);
    thread tb(inc, &a);
    
    ta.join();
    tb.join();
    
    cout << " a = " << a << endl;
    return 0;
}

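The example above is a simple data exchange in which two threads read and write the memory at &a at the same time. On the surface, a should equal COUNT * 2 once both threads have finished, but in practice it often does not. (*p)++ is not a single indivisible step: it reads the value, adds one, and writes the result back, and when the two threads interleave these steps some increments are lost. Formally, unsynchronized concurrent writes to the same non-atomic object are a data race, which is undefined behavior in C++. To solve this, C++ provides the class template atomic for simple fundamental types such as characters, integers, and pointers, and for more complex objects it provides the usual locking facilities: the mutex class mutex, the scoped lock lock_guard, the unique lock unique_lock, the condition variable condition_variable, and so on.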

std::atomic

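For threads, an atomic type is a "resource-like" piece of data: multiple threads are normally expected to access one single instance of it rather than copies.
For this reason, in C++11 an atomic object can be constructed from a value of its template argument type, but not from another atomic object.
The standard does not allow copy construction, move construction, or copy assignment between atomic objects; in the atomic class template these operations are declared as deleted. Assigning a plain value of type T to an atomic<T> is still allowed.
Constructing a variable of the template argument type T from an atomic variable, however, is possible, because the atomic class template always defines a conversion function from atomic<T> to T; the compiler performs this conversion implicitly when needed.
The C++11 header <atomic> defines aliases for the atomic types that correspond to the built-in types: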

Atomic type	Corresponding type
atomic_bool	bool
atomic_char	char
atomic_schar	signed char
atomic_uchar	unsigned char
atomic_int	int
atomic_uint	unsigned int
atomic_short	short
atomic_ushort	unsigned short
atomic_long	long
atomic_ulong	unsigned long
atomic_llong	long long
atomic_ullong	unsigned long long
atomic_char16_t	char16_t
atomic_char32_t	char32_t
atomic_wchar_t	wchar_t

Again, an example:

#include <iostream>
#include <thread>
#include <atomic>
 
using namespace std;
 
#define COUNT 10000
 
void inc(atomic<int> *p){
    for(int i = 0; i < COUNT; i++){
        (*p)++;
    }
}
 
int main()
{
    atomic<int> a{0};
    
    thread ta(inc, &a);
    thread tb(inc, &a);
    
    ta.join();
    tb.join();
    
    cout << " a = " << a << endl;
    return 0;
}

std::lock_guard

Let's start with a small example:

mutex m;
m.lock();
sharedVariable= getVar();
m.unlock();

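In this snippet, the mutex m ensures that access to the critical section sharedVariable = getVar(); is serialized.
Serialized means that, in this particular case, the threads enter the critical section one at a time.
The code is simple but prone to deadlock: if the critical section throws an exception, or the programmer simply forgets to unlock the mutex, the mutex stays locked and every other thread that tries to lock it blocks forever.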

With std::lock_guard, we can do this more elegantly:

std::mutex m;  // shared by all threads, so it is declared outside the critical section
{
  std::lock_guard<std::mutex> lockGuard(m);
  sharedVariable = getVar();
}

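That was easy. But what are the opening brace { and the closing brace } for?
They limit the lifetime of the std::lock_guard to the scope between them.
In other words, its lifetime ends when execution leaves the critical section.
More precisely, at that point the destructor of std::lock_guard is called and, yes, the mutex is released. This happens automatically, and the mutex is also released if getVar() throws an exception in sharedVariable = getVar(). A function body or a loop body limits the lifetime of the object in the same way.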

#include <iostream>
#include <thread>
#include <mutex>
 
using namespace std;
 
#define COUNT 10000

static mutex g_mutex;
 
void inc(int *p){
    for(int i = 0; i < COUNT; i++){
    	lock_guard<mutex> lck(g_mutex);
        (*p)++;
    }
}
 
int main()
{
    int a{0};
    
    thread ta(inc, &a);
    thread tb(inc, &a);
    
    ta.join();
    tb.join();
    
    cout << " a = " << a << endl;
    return 0;
}

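unique_lock can be used as well.
unique_lock is a class template. In everyday work lock_guard is usually the recommended choice: it replaces the manual lock() and unlock() calls on the mutex. unique_lock is much more flexible than lock_guard, at the cost of slightly lower efficiency and slightly more memory. Its detailed usage is covered separately and is not repeated here.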

std::mutex

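std::mutex is the most basic mutex in C++11. A std::mutex object provides exclusive, non-recursive ownership: a thread must not lock a std::mutex it already owns, whereas std::recursive_mutex can be locked recursively by the same thread.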

#include <iostream>
#include <thread>
#include <mutex>
 
using namespace std;
 
#define COUNT 10000

static mutex g_mutex;
 
void inc(int *p){
    for(int i = 0; i < COUNT; i++){
    	g_mutex.lock();
        (*p)++;
        g_mutex.unlock();
    }
}
 
int main()
{
    int a{0};
    
    thread ta(inc, &a);
    thread tb(inc, &a);
    
    ta.join();
    tb.join();
    
    cout << " a = " << a << endl;
    return 0;
}

std::condition_variable

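For event notification between threads, C++11 provides the condition variable class condition_variable (which can be regarded as a wrapper around pthread_cond_t). With a condition variable, a thread can wait for a notification from other threads (wait, wait_for, wait_until) and can send notifications to other threads (notify_one, notify_all). A condition variable must be used together with a lock. Because waiting unlocks the mutex and locks it again afterwards, the wait functions require a lock that can be unlocked and relocked manually, such as unique_lock, and cannot be used with lock_guard. An example: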

#include <chrono>
#include <thread>
#include <iostream>
#include <mutex>
#include <condition_variable>

# define THREAD_COUNT 10

using namespace std;
mutex m;
condition_variable cv;

int main(void){
    thread** t = new thread*[THREAD_COUNT];
    int i;
    for(i = 0; i < THREAD_COUNT; i++){
	    t[i] = new thread( [](int index){
	        unique_lock<mutex> lck(m);
	        cv.wait_for(lck, chrono::hours(1000));
	        cout << index << endl;}, i );
            
 	    this_thread::sleep_for( chrono::milliseconds(50) );
    }
    
    for(i = 0; i < THREAD_COUNT; i++){
	    lock_guard<mutex> _(m);
 	    cv.notify_one();
    }
    
    for(i = 0; i < THREAD_COUNT; i++){
 	    t[i]->join();
	    delete t[i];
    }
    delete[] t;  // t was allocated with new[], so it must be released with delete[]
    
    return 0;
}

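Compile and run the program and look at the output: a condition variable makes no ordering guarantee, that is, the thread that called wait first is not necessarily the first one to be woken up.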

std::promise/future

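promise/future can be used for simple data exchange between threads without having to deal with locks. Thread A stores data in a promise variable, and another thread B obtains the value through the future returned by that promise's get_future(). If thread A has not yet set a value in the promise, thread B can wait until it does: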

#include <chrono>
#include <thread>
#include <iostream>
#include <string>
#include <future>

using namespace std;

promise<string> val;

int main(void){
    thread ta([](){
	    future<string> fu = val.get_future();
	    cout << "waiting promise->future" << endl;
	    cout << fu.get() << endl;
    });
    
    thread tb([](){
	    this_thread::sleep_for( chrono::milliseconds(5000) );
	    val.set_value("promise is set");
    });
    
    ta.join();
    tb.join();
    
    return 0;
}

get() can be called only once on a future; if get() needs to be called several times, use shared_future. promise/future can also be used to pass exceptions between threads.

std::packaged_task

Combining a callable object with a promise gives packaged_task, which simplifies things further:

#include <chrono>
#include <thread>
#include <iostream>
#include <mutex>
#include <future>

using namespace std;

static mutex g_mutex;

int main(void){
    auto run = [=](int index){ 
		{
	    	lock_guard<mutex> lck(g_mutex);
	    	cout << "tasklet " << index << endl;
		}
		this_thread::sleep_for( chrono::seconds(5) );
		return index * 1000;
    };
    
    packaged_task<int(int)> pt1(run);
    packaged_task<int(int)> pt2(run);
    thread t1( [&](){pt1(2);} );
    thread t2( [&](){pt2(3);} );

    int f1 = pt1.get_future().get();
    int f2 = pt2.get_future().get();
    cout << "task result=" << f1 << endl;
    cout << "task result=" << f2 << endl;

    t1.join();
    t2.join();
    
    return 0;
}

std::async

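One can go a step further and combine a packaged_task with a thread: that is the async() function. async() starts executing the given code and returns a future object that holds its return value. There is no need to create and destroy threads explicitly; with the default launch policy, the C++11 library implementation decides when to create and destroy threads, how many to create, and so on. An example: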

#include <cstdlib>
#include <thread>
#include <iostream>
#include <mutex>
#include <future>
#include <vector>

# define COUNT 1000000

using namespace std;

static long do_sum(vector<long> *arr, size_t start, size_t count){
    static mutex m;
    long sum = 0;
    
    for(size_t i = 0; i < count; i++){
	    sum += (*arr)[start + i];
    }
    
    {
	    lock_guard<mutex> lck(m);
	    cout << "thread " << this_thread::get_id() << ", count=" << count
	        << ", start="<< start << ", sum=" << sum << endl;
    }
    return sum;
}

int main(void){
    vector<long> data(COUNT);
    for(size_t i = 0; i < COUNT; i++){
        data[i] = random() & 0xff;
    }
    
    vector< future<long> > result;
    
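    // hardware_concurrency() may return 0 when the value is unknown, so fall back to 1.
    size_t hc = thread::hardware_concurrency();
    size_t ptc = (hc == 0 ? 1 : hc) * 2;
    for(size_t batch = 0; batch < ptc; batch++) {
	    size_t start = COUNT / ptc * batch;
	    size_t batch_each = COUNT / ptc;
	    if (batch == ptc - 1) {
	        // the last batch also takes the remainder
	        batch_each = COUNT - start;
	    }
	    result.push_back(async(do_sum, &data, start, batch_each));
    }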

    long total = 0;
    for(size_t batch = 0; batch < ptc; batch++) {
	    total += result[batch].get();
    }
    cout << "total=" << total << endl;
    
    return 0;
}