基于上一篇文章,dropHead取出节点后,删除节点,会出现内存访问的问题。按照这个逻辑,如果将移出的节点保存到一个无锁队列中,然后在需要节点的时候,从这个备用的无锁队列中取出节点,那么应该就可以避开之前的问题,现在重要的是,判断在程序运行
过程中,备用的琐碎队列的大致长度,会不会需要耗费很多的资源。
下面为修改后的folly代码:
/* * Copyright 2014-present Facebook, Inc. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. * You may obtain a copy of the License at * * http://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. */ #pragma once #include <atomic> #include <cassert> #include <utility> namespace folly { /** * A very simple atomic single-linked list primitive. * * Usage: * * class MyClass { * AtomicIntrusiveLinkedListHook<MyClass> hook_; * } * * AtomicIntrusiveLinkedList<MyClass, &MyClass::hook_> list; * list.insert(&a); * list.sweep([] (MyClass* c) { doSomething(c); } */ template <class T> struct AtomicIntrusiveLinkedListHook { T* next{ nullptr }; }; template <class T, AtomicIntrusiveLinkedListHook<T> T::*HookMember> class AtomicIntrusiveLinkedList { public: AtomicIntrusiveLinkedList() {} AtomicIntrusiveLinkedList(const AtomicIntrusiveLinkedList&) = delete; AtomicIntrusiveLinkedList& operator=(const AtomicIntrusiveLinkedList&) = delete; AtomicIntrusiveLinkedList(AtomicIntrusiveLinkedList&& other) noexcept { auto tmp = other.head_.load(); other.head_ = head_.load(); head_ = tmp; } AtomicIntrusiveLinkedList& operator=( AtomicIntrusiveLinkedList&& other) noexcept { auto tmp = other.head_.load(); other.head_ = head_.load(); head_ = tmp; return *this; } /** * Note: list must be empty on destruction. */ ~AtomicIntrusiveLinkedList() { assert(empty()); } bool empty() const { return head_.load() == nullptr; } /** * Atomically insert t at the head of the list. * @return True if the inserted element is the only one in the list * after the call. */ bool insertHead(T* t) { assert(next(t) == nullptr); auto oldHead = head_.load(std::memory_order_relaxed); do { next(t) = oldHead; /* oldHead is updated by the call below. NOTE: we don't use next(t) instead of oldHead directly due to compiler bugs (GCC prior to 4.8.3 (bug 60272), clang (bug 18899), MSVC (bug 819819); source: http://en.cppreference.com/w/cpp/atomic/atomic/compare_exchange */ } while (!head_.compare_exchange_weak(oldHead, t, std::memory_order_release, std::memory_order_relaxed)); return oldHead == nullptr; } /** * Replaces the head with nullptr, * and calls func() on the removed elements in the order from tail to head. * Returns false if the list was empty. */ template <typename F> bool sweepOnce(F&& func) { if (auto head = head_.exchange(nullptr)) { auto rhead = reverse(head); unlinkAll(rhead, std::forward<F>(func)); return true; } return false; } // new function // if std::memory_order_acquire applies to next(oldHead)(the first one, the argument of compare_exchange_weak) // and I don't know if following bugs affect the code // GCC prior to 4.8.3 (bug 60272), clang prior to 2014-05-05 (bug 18899) // MSVC prior to 2014-03-17 (bug 819819). // template <typename F> T* sweepHead() { // handle if the list is not empty auto oldHead = head_.load(std::memory_order_relaxed); while (oldHead != nullptr && !head_.compare_exchange_weak(oldHead, next(oldHead), std::memory_order_acquire, std::memory_order_relaxed)) ; // if drop out head successfully if (oldHead) { next(oldHead) = nullptr; return oldHead; } return nullptr; } // new function // if std::memory_order_acquire does not apply to next(oldHead) // and I don't know if following bugs affect the code // GCC prior to 4.8.3 (bug 60272), clang prior to 2014-05-05 (bug 18899) // MSVC prior to 2014-03-17 (bug 819819). //template <typename F> T* dropHead() { T* oldHead = nullptr; // handle if the list is not empty while ((oldHead = head_.load(std::memory_order_acquire))) { assert(oldHead != nullptr); T* nextHead = next(oldHead); // because insert and drop out will be involving with head_, they // will change head_ first, then others bool res = head_.compare_exchange_weak(oldHead, nextHead, std::memory_order_relaxed, std::memory_order_relaxed); if (res && oldHead != nullptr) { assert(next(oldHead) == nextHead); next(oldHead) = nullptr; return oldHead; } } return nullptr; } /** * Repeatedly replaces the head with nullptr, * and calls func() on the removed elements in the order from tail to head. * Stops when the list is empty. */ template <typename F> void sweep(F&& func) { while (sweepOnce(func)) { } } /** * Similar to sweep() but calls func() on elements in LIFO order. * * func() is called for all elements in the list at the moment * reverseSweep() is called. Unlike sweep() it does not loop to ensure the * list is empty at some point after the last invocation. This way callers * can reason about the ordering: elements inserted since the last call to * reverseSweep() will be provided in LIFO order. * * Example: if elements are inserted in the order 1-2-3, the callback is * invoked 3-2-1. If the callback moves elements onto a stack, popping off * the stack will produce the original insertion order 1-2-3. */ template <typename F> void reverseSweep(F&& func) { // We don't loop like sweep() does because the overall order of callbacks // would be strand-wise LIFO which is meaningless to callers. auto head = head_.exchange(nullptr); unlinkAll(head, std::forward<F>(func)); } private: std::atomic<T*> head_{ nullptr }; static T*& next(T* t) { return (t->*HookMember).next; } /* Reverses a linked list, returning the pointer to the new head (old tail) */ static T* reverse(T* head) { T* rhead = nullptr; while (head != nullptr) { auto t = head; head = next(t); next(t) = rhead; rhead = t; } return rhead; } /* Unlinks all elements in the linked list fragment pointed to by `head', * calling func() on every element */ template <typename F> void unlinkAll(T* head, F&& func) { while (head != nullptr) { auto t = head; head = next(t); next(t) = nullptr; func(t); } } }; } // namespace folly
下面是测试使用的代码:
#include <memory> #include <cassert> #include <iostream> #include <vector> #include <thread> #include <future> #include <random> #include <cmath> #include "folly.h" using namespace folly; struct student_name { student_name(int age = 0) : age(age) { } int age; AtomicIntrusiveLinkedListHook<student_name> node; }; using ATOMIC_STUDENT_LIST = AtomicIntrusiveLinkedList<student_name, &student_name::node>; ATOMIC_STUDENT_LIST g_students; ATOMIC_STUDENT_LIST g_backStudents; // 统计backStudents的大小 int g_backSize = 0; std::atomic<int> g_inserts; // insert num (successful) std::atomic<int> g_drops; // drop num (successful) std::atomic<int> g_printNum; // as same as g_drops std::atomic<long> g_ageInSum; // age sum when producing student_name std::atomic<long> g_ageOutSum; // age sum when consuming student_name constexpr int HANDLE_NUM = 2000000; // when testing, no more than this number, you know 20,000,000 * 100 ~= MAX_INT constexpr int PRODUCE_THREAD_NUM = 3; // producing thread number constexpr int CONSUME_THREAD_NUM = 3; // consuming thread number inline void printOne(student_name* t) { g_printNum.fetch_add(1, std::memory_order_relaxed); g_ageOutSum.fetch_add(t->age, std::memory_order_relaxed); // clean node // delete t; g_backStudents.insertHead(t); } void eraseOne(student_name* t) { ++g_backSize; delete t; } void insert_students(int idNo) { std::default_random_engine dre(time(nullptr)); std::uniform_int_distribution<int> ageDi(1, 99); while (true) { int newAge = ageDi(dre); g_ageInSum.fetch_add(newAge, std::memory_order_relaxed); auto ns = g_backStudents.dropHead(); if (ns == nullptr) { ns = new student_name(newAge); } g_students.insertHead(ns); // use memory_order_relaxed avoiding affect folly memory order g_inserts.fetch_add(1, std::memory_order_relaxed); // use memory_order_relaxed avoiding affect folly memory order if (g_inserts.load(std::memory_order_relaxed) >= HANDLE_NUM) { return; } } } void drop_students(int idNo) { while (true) { auto st = g_students.dropHead(); if (st) { printOne(st); // use memory_order_relaxed avoiding affect folly memory order g_drops.fetch_add(1, std::memory_order_relaxed); } // use memory_order_relaxed avoiding affect folly memory order if (g_drops.load(std::memory_order_relaxed) >= HANDLE_NUM) { return; } } } int main() { std::vector<std::future<void>> insert_threads; for (int i = 0; i != PRODUCE_THREAD_NUM; ++i) { insert_threads.push_back(std::async(std::launch::async, insert_students, i)); } std::vector<std::future<void>> drop_threads; for (int i = 0; i != CONSUME_THREAD_NUM; ++i) { drop_threads.push_back(std::async(std::launch::async, drop_students, i)); } for (auto& item : insert_threads) { item.get(); } for (auto& item : drop_threads) { item.get(); } std::cout << "insert count1: " << g_inserts.load() << std::endl; std::cout << "drop count1: " << g_drops.load() << std::endl; std::cout << "print num1: " << g_printNum.load() << std::endl; std::cout << "age in1: " << g_ageInSum.load() << std::endl; std::cout << "age out1: " << g_ageOutSum.load() << std::endl; std::cout << std::endl; while (true) { auto st = g_students.dropHead(); if (st) { printOne(st); // use memory_order_relaxed avoiding affect folly memory order g_drops.fetch_add(1, std::memory_order_relaxed); } if (g_students.empty()) { break; } } std::cout << "insert count2: " << g_inserts.load() << std::endl; std::cout << "drop count2: " << g_drops.load() << std::endl; std::cout << "print num2: " << g_printNum.load() << std::endl; std::cout << "age in2: " << g_ageInSum.load() << std::endl; std::cout << "age out2: " << g_ageOutSum.load() << std::endl; g_backStudents.sweepOnce(eraseOne); std::cout << "back Students size: " << g_backSize << std::endl; }
测试结果显示:
在folly.h文件中,dropHead函数的断言 assert(next(oldHead) == nextHead); 会触发,这个问题让我感到很意外,经过我认真思考,我发现了其中可能出现的问题。
说明如下:
现在假设有两个获取g_students节点的线程(调用drop_students函数),两者同时运行到获取nextHead(参考dropHead函数),然后其中一个线程(线程A)中断,另外一个线程(线程B)获取了节点(节点a,节点a的next指向节点b),这个节点被插入到g_backStudents中,这时线程B从g_students中再取出一个节点(节点b,节点b的next指向节点c),然后向g_students中插入节点的线程(调用insert_students函数)(线程C)将节点a插入到g_students中,这时,线程A继续运行,运行head_.compare_exchange_weak函数后,则head_指向节点b,而实际上此时的head_应该指向节点c,当前情况下,有两个节点指向了节点b,程序会出现问题。
当然,我所描述的只是出现问题的一种情况,实际上可能会有很多类似的情况,在这里就不一一举例,但是对于更多线程的情况,显然上面描述的情况是合理的,因为只要假设新增加的线程在上述过程中都处于中断状态就可以了。另外,在更多线程的时候,可能会有更多种出现问题的情况,在这里,我只是为了说明上述实现的不合理性。在上一篇,第一条评论中描述的问题,也可以做类似分析,只是将插入到g_backStudents改为delete,将从g_backStudents中获取节点,改为又在delete的地址创建了一个新的节点(虽然可能性很小,但是这种可能性是存在的)。
在这里,我只是展示一种错误的情况,上述的问题,如果将next节点改为shared_ptr,那么在C++20的编译环境下,或许能够解决,不过,这种修改带来的性能损耗,内存占用增加,与使用无锁队列的本意相违背,这种情况下,将原子操作改为自旋锁,说不定更好。
所以我暂时没有继续尝试下去,有兴趣的人可以考虑,如果有什么好的发现,希望能够分享一下。