Lock-Based Concurrent Data Structures (Part 2)

bigtree712

Last modified 2023-11-09 14:41:04


First published 2023-11-09 14:38:20

Article link: https://blog.csdn.net/Big_Tree712/article/details/134311257


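As discussed in the previous article, raising the level of concurrency further requires implementing the container with finer-grained locks.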

Implementing a single-threaded queue

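A singly linked list is the simplest data structure that can serve as a queue.
It holds a head pointer to the first node; popping moves the head pointer to that node's successor and returns the data of the old first node.
It holds a tail pointer to the last node; adding a node sets the last node's successor to the new node and then points tail at the new node.
When the queue is empty, both head and tail are null.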


#include <memory>
#include <utility>

template <typename T>
class queue
{
private:
    struct node
    {
        T data;
        std::unique_ptr<node> next;
        node(T data_) : data(std::move(data_))
        {
        }
    };
    std::unique_ptr<node> head;
    node *tail;

public:
    queue() : tail(nullptr) {}
    queue(const queue &other) = delete;
    queue &operator=(const queue &other) = delete;

    std::shared_ptr<T> try_pop()
    {
        if (!head)
        {
            return std::shared_ptr<T>();
        }
        std::shared_ptr<T> const res(std::make_shared<T>(std::move(head->data)));
        std::unique_ptr<node> const old_head = std::move(head);
        head = std::move(old_head->next);
        if (!head)
            tail = nullptr;
        return res;
    }

    void push(T new_value)
    {
        std::unique_ptr<node> p(new node(std::move(new_value)));
        node *const new_tail = p.get();
        if (tail)
        {
            tail->next = std::move(p);
        }
        else
        {
            head = std::move(p);
        }
        tail = new_tail;
    }
};

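This is a singly linked list. Nodes are managed through std::unique_ptr, so a node is deleted as soon as it is no longer needed. From head to tail, each node owns its successor. The last node is owned by its predecessor, but we still need to operate on it directly, so a raw pointer, tail, points to it.
This structure works well in a single thread. For multithreaded use with fine-grained locking, the queue has two data members, head and tail, and in principle each could be protected by its own mutex. That raises two problems: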

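push() can modify both head and tail, so it would have to lock both mutexes.
push() and try_pop() may access the next pointer of the same node: when the queue holds a single node, head and tail refer to that node, so push() writes tail->next while try_pop() reads head->next.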
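
The second problem can be made concrete with a small single-threaded sketch. The names inspectable_queue, head_and_tail_alias and demo_aliasing are hypothetical and exist only for this illustration; the struct repeats the push() logic of the queue above with public members so that the pointers can be inspected.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Simplified copy of the single-threaded queue with public members,
// so the relationship between head and tail can be inspected directly.
template <typename T>
struct inspectable_queue
{
    struct node
    {
        T data;
        std::unique_ptr<node> next;
        explicit node(T data_) : data(std::move(data_)) {}
    };
    std::unique_ptr<node> head;
    node *tail = nullptr;

    void push(T new_value)
    {
        std::unique_ptr<node> p(new node(std::move(new_value)));
        node *const new_tail = p.get();
        if (tail)
            tail->next = std::move(p); // non-empty queue: writes tail->next
        else
            head = std::move(p);       // empty queue: writes head as well
        tail = new_tail;
    }
};

// True when head and tail refer to the same node, i.e. when push()
// (writing tail->next) and try_pop() (reading head->next) would touch
// the very same pointer object.
template <typename T>
bool head_and_tail_alias(const inspectable_queue<T> &q)
{
    return q.head && q.head.get() == q.tail &&
           &q.head->next == &q.tail->next;
}

inline void demo_aliasing()
{
    inspectable_queue<int> q;
    assert(!head_and_tail_alias(q)); // empty: nothing to alias
    q.push(1);
    assert(head_and_tail_alias(q));  // one node: head->next is tail->next
    q.push(2);
    assert(!head_and_tail_alias(q)); // two nodes: distinct next pointers
}
```

With exactly one node in the queue, two separate mutexes would therefore not keep push() and try_pop() apart.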

Enabling concurrency by separating data

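To solve these problems we add a dummy node that holds no data, so the queue always contains at least one node and accesses through head are separated from accesses through tail. When the queue is empty, head and tail both point to the dummy node, so try_pop() on an empty queue no longer touches head->next. Once an item has been added, head and tail point to different nodes.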


#include <memory>
#include <utility>

template <typename T>
class queue
{
private:
    struct node
    {
        // Data is stored through a smart pointer; a default-constructed node is a dummy
        std::shared_ptr<T> data;
        std::unique_ptr<node> next;
    };
    std::unique_ptr<node> head;
    node *tail;

public:
    // The constructor creates the dummy node
    queue() : head(new node), tail(head.get())
    {
    }

    queue(const queue &other) = delete;
    queue &operator=(const queue &other) = delete;

    std::shared_ptr<T> try_pop()
    {
        // Empty queue: head and tail both point to the dummy node
        if (head.get() == tail)
        {
            return std::shared_ptr<T>();
        }

        std::shared_ptr<T> const res(head->data);
        std::unique_ptr<node> const old_head = std::move(head);

        // The dummy node always remains, so head never becomes null
        // and tail does not need to be reset
        head = std::move(old_head->next);
        return res;
    }

    void push(T new_value)
    {
        std::shared_ptr<T> new_data(std::make_shared<T>(std::move(new_value)));
        // New empty node that becomes the dummy node
        std::unique_ptr<node> p(new node);

        // Store the data in the current dummy node
        tail->data = new_data;

        node *const new_tail = p.get();
        tail->next = std::move(p);
        tail = new_tail;
    }
};

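In try_pop(), head is never null because the constructor creates a dummy node. Since head is a unique_ptr, head.get() is compared with tail to detect an empty queue. The node already stores its data in a smart pointer, so popping just copies that pointer and no new T is constructed.
In push(), we first create the new T instance on the heap, managed by a shared_ptr. We then create a new node to serve as the new dummy node, which is why it is not given new_value. Adding the item means storing the shared pointer in the old dummy node and then pointing tail at the new dummy node.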
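
A minimal usage sketch of the dummy-node design follows. It restates the queue under the hypothetical name dummy_node_queue, with the node default-constructible so that new node compiles, and adds an empty() helper that is not part of the original listing. It is an illustration under those assumptions rather than a reference implementation.

```cpp
#include <cassert>
#include <memory>
#include <utility>

template <typename T>
class dummy_node_queue
{
    struct node
    {
        std::shared_ptr<T> data;    // null in the dummy node
        std::unique_ptr<node> next;
    };
    std::unique_ptr<node> head;
    node *tail;

public:
    dummy_node_queue() : head(new node), tail(head.get()) {}
    dummy_node_queue(const dummy_node_queue &) = delete;
    dummy_node_queue &operator=(const dummy_node_queue &) = delete;

    std::shared_ptr<T> try_pop()
    {
        if (head.get() == tail)     // only the dummy node is left
            return std::shared_ptr<T>();
        std::shared_ptr<T> const res(head->data);
        std::unique_ptr<node> old_head = std::move(head);
        head = std::move(old_head->next);
        return res;
    }

    void push(T new_value)
    {
        std::shared_ptr<T> new_data(std::make_shared<T>(std::move(new_value)));
        std::unique_ptr<node> p(new node);  // the next dummy node
        tail->data = new_data;              // old dummy becomes a real node
        node *const new_tail = p.get();
        tail->next = std::move(p);
        tail = new_tail;
    }

    // Helper added for this sketch only.
    bool empty() const { return head.get() == tail; }
};

// Pushes 1, 2, 3 and checks that they come back in FIFO order.
inline bool fifo_round_trip()
{
    dummy_node_queue<int> q;
    if (!q.empty() || q.try_pop())
        return false;
    q.push(1);
    q.push(2);
    q.push(3);
    for (int expected = 1; expected <= 3; ++expected)
    {
        std::shared_ptr<int> v = q.try_pop();
        if (!v || *v != expected)
            return false;
    }
    return q.empty();
}

inline void demo_dummy_node_queue()
{
    assert(fifo_round_trip());
}
```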

Advantages

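push() now touches only tail and never head. try_pop() still reads tail, but only for the empty-queue comparison, so it needs that lock only briefly.
Thanks to the dummy node, try_pop() and push() never operate on the same node, so a single global mutex is no longer needed. One mutex for head and one for tail are enough.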

A queue with fine-grained locks

The question now is where to take the locks.

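In push(), every access to tail must be made under the lock.
In try_pop(), head_mutex must be locked first and held until we are done with head. Several threads contend for this mutex and it decides which thread gets to pop, so it is locked at the start and released only after head has been updated; returning the data needs no lock. The access to tail must also be locked, but tail is read only once, so it is enough to lock right around that read, ideally wrapped in a function.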

#include <memory>
#include <mutex>
#include <utility>

template <typename T>
class threadsafe_queue
{
private:
    struct node
    {
        // Data is stored through a smart pointer
        std::shared_ptr<T> data;
        std::unique_ptr<node> next;
    };
    std::mutex head_mutex;
    std::unique_ptr<node> head;

    std::mutex tail_mutex;
    node *tail;

    // Reads tail under tail_mutex; the lock is held only for this read
    node *get_tail()
    {
        std::lock_guard<std::mutex> tail_lock(tail_mutex);
        return tail;
    }

    std::unique_ptr<node> pop_head()
    {
        // head_mutex is held for the whole operation, including the read of tail
        std::lock_guard<std::mutex> head_lock(head_mutex);
        if (head.get() == get_tail())
        {
            return nullptr;
        }
        std::unique_ptr<node> old_head = std::move(head);
        head = std::move(old_head->next);
        return old_head;
    }

public:
    // The constructor creates the dummy node
    threadsafe_queue() : head(new node), tail(head.get())
    {
    }

    threadsafe_queue(const threadsafe_queue &other) = delete;
    threadsafe_queue &operator=(const threadsafe_queue &other) = delete;

    std::shared_ptr<T> try_pop()
    {
        std::unique_ptr<node> old_head = pop_head();
        return old_head ? old_head->data : std::shared_ptr<T>();
    }

    void push(T new_value)
    {
        // Both allocations happen before the lock is taken
        std::shared_ptr<T> new_data(std::make_shared<T>(std::move(new_value)));
        std::unique_ptr<node> p(new node);
        node *const new_tail = p.get();

        std::lock_guard<std::mutex> tail_lock(tail_mutex);
        tail->data = new_data;
        tail->next = std::move(p);
        tail = new_tail;
    }
};
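
To see the two-mutex queue used from several threads, here is a hedged sketch. It restates the queue under the hypothetical name fine_grained_queue so that the block compiles on its own; drain_concurrently and its parameters are likewise invented for this example. Producers push 1..items_each, consumers poll with try_pop(), and the function returns the sum of everything popped.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

template <typename T>
class fine_grained_queue
{
    struct node
    {
        std::shared_ptr<T> data;
        std::unique_ptr<node> next;
    };
    std::mutex head_mutex;
    std::unique_ptr<node> head;
    std::mutex tail_mutex;
    node *tail;

    node *get_tail()
    {
        std::lock_guard<std::mutex> tail_lock(tail_mutex);
        return tail;
    }
    std::unique_ptr<node> pop_head()
    {
        std::lock_guard<std::mutex> head_lock(head_mutex);
        if (head.get() == get_tail())
            return nullptr;
        std::unique_ptr<node> old_head = std::move(head);
        head = std::move(old_head->next);
        return old_head;
    }

public:
    fine_grained_queue() : head(new node), tail(head.get()) {}
    fine_grained_queue(const fine_grained_queue &) = delete;
    fine_grained_queue &operator=(const fine_grained_queue &) = delete;

    std::shared_ptr<T> try_pop()
    {
        std::unique_ptr<node> old_head = pop_head();
        return old_head ? old_head->data : std::shared_ptr<T>();
    }
    void push(T new_value)
    {
        auto new_data = std::make_shared<T>(std::move(new_value));
        std::unique_ptr<node> p(new node);
        node *const new_tail = p.get();
        std::lock_guard<std::mutex> tail_lock(tail_mutex);
        tail->data = new_data;
        tail->next = std::move(p);
        tail = new_tail;
    }
};

// Runs producers and consumers concurrently and returns the sum of all
// popped values. Consumers stop once every pushed item has been popped.
inline long long drain_concurrently(int producers, int consumers, int items_each)
{
    assert(producers > 0 && consumers > 0 && items_each >= 0);
    fine_grained_queue<int> q;
    std::atomic<long long> sum{0};
    std::atomic<int> popped{0};
    const int total = producers * items_each;
    std::vector<std::thread> threads;
    for (int p = 0; p < producers; ++p)
        threads.emplace_back([&q, items_each] {
            for (int i = 1; i <= items_each; ++i)
                q.push(i);
        });
    for (int c = 0; c < consumers; ++c)
        threads.emplace_back([&] {
            while (popped.load() < total)
            {
                if (std::shared_ptr<int> v = q.try_pop())
                {
                    sum += *v;
                    ++popped;
                }
                else
                    std::this_thread::yield(); // polling; nothing to pop yet
            }
        });
    for (auto &t : threads)
        t.join();
    return sum.load();
}
```

Each producer contributes items_each * (items_each + 1) / 2 to the total, so drain_concurrently(2, 2, 1000) should return 2 * 500500 if no item is lost or popped twice.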

Checking the locking for correctness

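In push(), every modification of the data structure happens under tail_mutex. The data and next members of the old tail node are set correctly, and tail ends up pointing at the new dummy node.
try_pop() is more involved.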

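A race on tail can arise only when push() and try_pop() run at the same time. Locking tail_mutex orders get_tail() against push(): get_tail() sees tail either before or after the push. It gets either the old value of tail or the new one, and in the latter case the data written to the previous tail node is visible as well.
In addition, get_tail() must be called while head_mutex is held. Otherwise, while a thread is blocked waiting for head_mutex in pop_head(), other threads may push and pop several nodes, so the tail value it read earlier may refer to a node that is no longer in the queue. The empty-queue check then fails, and head can move past the end of the queue, corrupting the whole data structure.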

    std::unique_ptr<node> pop_head()
    {
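        node *const old_tail = get_tail();
        // Broken on purpose: tail is read before head_mutex is locked. If other threads
        // pop in the meantime, old_tail is no longer the real tail by the time it is compared.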
        std::lock_guard<std::mutex> head_lock(head_mutex);
        if (head.get() == old_tail)
        {
            return nullptr;
        }
        std::unique_ptr<node> old_head = std::move(head);
        head = std::move(old_head->next);
        return old_head;
    }

pop_head() unlinks the head node from the queue and releases the mutex. If a real node was removed, try_pop() extracts its data and the node is destroyed.
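
The failure caused by reading tail outside head_mutex can be replayed deterministically in one thread. The sketch below is an illustration only: replay_queue, pop_with_tail and the retired list are hypothetical names, locks are left out because the steps run strictly one after another, and popped nodes are parked in retired so that the stale pointer stays valid for the comparison.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct replay_queue
{
    struct node
    {
        std::shared_ptr<int> data;
        std::unique_ptr<node> next;
    };
    std::unique_ptr<node> head{new node}; // dummy node
    node *tail = head.get();
    std::vector<std::unique_ptr<node>> retired; // keeps popped nodes alive

    void push(int v)
    {
        std::unique_ptr<node> p(new node);
        node *const new_tail = p.get();
        tail->data = std::make_shared<int>(v);
        tail->next = std::move(p);
        tail = new_tail;
    }

    // Pops using a tail value that the caller read at some earlier point.
    bool pop_with_tail(node *observed_tail)
    {
        if (head.get() == observed_tail)
            return false; // queue looks empty
        std::unique_ptr<node> old_head = std::move(head);
        head = std::move(old_head->next);
        retired.push_back(std::move(old_head));
        return true;
    }
};

// Thread A reads tail early and stalls; thread B pushes and pops past it.
// Returns true if A's late pop removes the dummy node and leaves head null.
inline bool stale_tail_corrupts_queue()
{
    replay_queue q;
    q.push(1);
    replay_queue::node *stale = q.tail;   // A: reads tail, then is delayed
    q.push(2);                            // B: pushes
    q.pop_with_tail(q.tail);              // B: pops 1
    q.pop_with_tail(q.tail);              // B: pops 2, queue is now empty
    bool popped = q.pop_with_tail(stale); // A: resumes with the stale value
    return popped && q.head == nullptr;
}

// Same steps, but A reads tail at the moment it pops.
inline bool fresh_tail_reports_empty()
{
    replay_queue q;
    q.push(1);
    q.push(2);
    q.pop_with_tail(q.tail);
    q.pop_with_tail(q.tail);
    bool popped = q.pop_with_tail(q.tail);
    return !popped && q.head != nullptr;
}

inline void demo_stale_tail()
{
    assert(stale_tail_corrupts_queue());
    assert(fresh_tail_reports_empty());
}
```

In the stale case the empty check passes although the queue is empty, the dummy node is unlinked, and head ends up null while tail still refers to the removed node.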

Exception safety

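In try_pop(), only locking the mutex can throw, and no data is modified until the lock has been acquired, so try_pop() is exception-safe.
push() makes two heap allocations, one for the node and one for the stored T, but both are managed by smart pointers, so everything is cleaned up correctly if an exception is thrown. After the lock is acquired, none of the remaining operations can throw.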

Deadlock

Only one operation acquires both locks: pop_head() always locks head_mutex first and then tail_mutex, so deadlock cannot occur.

Concurrency

The key question is whether this actually improves concurrency.

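Compared with the earlier implementation, push() performs its memory allocations without holding any lock. Several threads can allocate at the same time and then add to the queue one after another; adding to the queue is only a few pointer operations, so the lock is held for a much shorter time.
The same holds for try_pop(): the lock is held only while head is updated, and the expensive copying and destruction happen outside the lock. Only one thread at a time can run pop_head(), but the rest of the work can proceed concurrently.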

Waiting for an item to pop

As mentioned in the previous article, adding a wait_and_pop() operation avoids the cost of polling.
The public interface is as follows.

#include <condition_variable>
#include <memory>
#include <mutex>

template <typename T>
class threadsafe_queue
{
private:
    struct node
    {
        // Data is stored through a smart pointer
        std::shared_ptr<T> data;
        std::unique_ptr<node> next;
    };
    std::mutex head_mutex;
    std::unique_ptr<node> head;

    std::mutex tail_mutex;
    node *tail;
    std::condition_variable data_cond;

public:
    threadsafe_queue() : head(new node), tail(head.get())
    {
    }
    threadsafe_queue(const threadsafe_queue &other) = delete;
    threadsafe_queue &operator=(const threadsafe_queue &other) = delete;

    std::shared_ptr<T> try_pop();
    bool try_pop(T &value);
    std::shared_ptr<T> wait_and_pop();
    void wait_and_pop(T &value);
    void push(T new_value);
    bool empty();
};

Changes to push()

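Start with push(): it only needs to call data_cond.notify_one() after adding the item. The timing matters: if the notification is sent while tail_mutex is still held, the woken thread still has to wait for that lock before it can proceed.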

template <typename T>
void threadsafe_queue<T>::push(T new_value)
{
    std::shared_ptr<T> new_data(std::make_shared<T>(std::move(new_value)));
    std::unique_ptr<node> p(new node);
    node *const new_tail = p.get();
    {
        std::lock_guard<std::mutex>
            tail_lock(tail_mutex);
        tail->data = new_data;
        tail->next = std::move(p);
        tail = new_tail;
    }
    // Notify only after the lock is released, so the woken thread can take tail_mutex right away
    data_cond.notify_one();
}

wait_and_pop()

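wait_and_pop() is more complex: we must decide where to wait, which predicate to wake on, and which mutex to hold. The predicate is head.get() != get_tail(). It involves both mutexes, but from the analysis above tail_mutex is needed only while reading tail, not for the comparison itself. So only head_mutex needs to be held when calling data_cond.wait().
Note the two overloads of wait_and_pop(). One returns a shared pointer taken from old_head. The other takes a reference and assigns the value into it, and that assignment may throw. If we unlinked the head node first and assigned afterwards, as before, an exception would leave the container already modified and the item lost. So we assign from the head node's data first and only then unlink the node.
Finally, wait_for_data() returns the lock object. Both overloads of wait_pop_head() modify the queue, so the lock is returned to guarantee that the same lock stays held from the wait through the unlinking of the head node.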

template <typename T>
class threadsafe_queue
{
private:
    // Data members (head, tail, both mutexes, data_cond) are as declared in the interface above
    node *get_tail()
    {
        std::lock_guard<std::mutex> tail_lock(tail_mutex);
        return tail;
    }

    std::unique_ptr<node> pop_head()
    {
        std::unique_ptr<node> old_head = std::move(head);
        head = std::move(old_head->next);
        return old_head;
    }

    std::unique_lock<std::mutex> wait_for_data()
    {
        std::unique_lock<std::mutex> head_lock(head_mutex);
        data_cond.wait(head_lock, [&]
                       { return head.get() != get_tail(); });
        return std::move(head_lock);
    }

    std::unique_ptr<node> wait_pop_head()
    {
        std::unique_lock<std::mutex> head_lock(wait_for_data());
        return pop_head();
    }

    std::unique_ptr<node> wait_pop_head(T &value)
    {
        std::unique_lock<std::mutex> head_lock(wait_for_data());
        value = std::move(*head->data);
        return pop_head();
    }

public:
    std::shared_ptr<T> wait_and_pop()
    {
        std::unique_ptr<node> const old_head = wait_pop_head();
        return old_head->data;
    }
    void wait_and_pop(T &value)
    {
        std::unique_ptr<node> const old_head = wait_pop_head(value);
    }
};
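
Putting the pieces together, the following is a minimal sketch of a complete waiting queue, not the author's final code. The class name waiting_queue and the helper sum_via_wait_and_pop are hypothetical, the bodies of try_pop(T &) and empty() are assumptions since the article only declares them, and the shared_ptr-returning overloads are left out for brevity. One more deliberate addition: the predicate depends on tail, which changes under tail_mutex rather than under the mutex passed to wait(), so a notification could in principle arrive between a waiter's predicate check and the moment it blocks. This sketch locks and immediately releases head_mutex before notifying to close that window.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

template <typename T>
class waiting_queue
{
    struct node
    {
        std::shared_ptr<T> data;
        std::unique_ptr<node> next;
    };
    std::mutex head_mutex;
    std::unique_ptr<node> head;
    std::mutex tail_mutex;
    node *tail;
    std::condition_variable data_cond;

    node *get_tail()
    {
        std::lock_guard<std::mutex> tail_lock(tail_mutex);
        return tail;
    }
    // Caller holds head_mutex and has checked that the queue is not empty.
    std::unique_ptr<node> pop_head()
    {
        std::unique_ptr<node> old_head = std::move(head);
        head = std::move(old_head->next);
        return old_head;
    }
    std::unique_lock<std::mutex> wait_for_data()
    {
        std::unique_lock<std::mutex> head_lock(head_mutex);
        data_cond.wait(head_lock, [&] { return head.get() != get_tail(); });
        return head_lock; // ownership of the lock moves to the caller
    }

public:
    waiting_queue() : head(new node), tail(head.get()) {}
    waiting_queue(const waiting_queue &) = delete;
    waiting_queue &operator=(const waiting_queue &) = delete;

    void push(T new_value)
    {
        auto new_data = std::make_shared<T>(std::move(new_value));
        std::unique_ptr<node> p(new node);
        {
            std::lock_guard<std::mutex> tail_lock(tail_mutex);
            tail->data = new_data;
            node *const new_tail = p.get();
            tail->next = std::move(p);
            tail = new_tail;
        }
        // Addition of this sketch: pass through head_mutex so that a waiter is
        // either already blocked in wait() or has not yet checked the predicate.
        { std::lock_guard<std::mutex> head_lock(head_mutex); }
        data_cond.notify_one();
    }
    void wait_and_pop(T &value)
    {
        std::unique_lock<std::mutex> head_lock(wait_for_data());
        value = std::move(*head->data); // may throw; nothing is unlinked yet
        pop_head();
    }
    bool try_pop(T &value)
    {
        std::lock_guard<std::mutex> head_lock(head_mutex);
        if (head.get() == get_tail())
            return false;
        value = std::move(*head->data);
        pop_head();
        return true;
    }
    bool empty()
    {
        std::lock_guard<std::mutex> head_lock(head_mutex);
        return head.get() == get_tail();
    }
};

// A consumer thread blocks in wait_and_pop() while the calling thread pushes
// 1..count; returns the sum the consumer collected.
inline long long sum_via_wait_and_pop(int count)
{
    assert(count >= 0);
    waiting_queue<int> q;
    long long sum = 0;
    std::thread consumer([&] {
        for (int i = 0; i < count; ++i)
        {
            int v = 0;
            q.wait_and_pop(v);
            sum += v;
        }
    });
    for (int i = 1; i <= count; ++i)
        q.push(i);
    consumer.join();
    return sum;
}
```

The extra lock in push() costs a little concurrency between producers and consumers; whether that trade is acceptable depends on the workload.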