# Boost Lock-Free Queue Official Documentation (Repost)

Boost 1.53.0 finally ships the long-awaited Boost.Lockfree module. In the spirit of learning, I translated its documentation, which follows. (Original: http://www.boost.org/doc/libs/1_53_0/doc/html/lockfree.html)

## Chapter 17. Boost.Lockfree

## Introduction and Motivation

### Introduction and Terminology

The term non-blocking denotes concurrent data structures which do not use traditional synchronization primitives like guards to ensure thread-safety. Maurice Herlihy and Nir Shavit (compare "The Art of Multiprocessor Programming") distinguish between 3 types of non-blocking data structures, each having different properties:

• Data structures are wait-free if every concurrent operation is guaranteed to finish in a finite number of steps. It is therefore possible to give worst-case guarantees for the number of operations.
• Data structures are lock-free if some concurrent operations are guaranteed to finish in a finite number of steps. While it is in theory possible that some operations never make any progress, this is very unlikely to happen in practical applications.
• Data structures are obstruction-free if a concurrent operation is guaranteed to finish in a finite number of steps unless another concurrent operation interferes.
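
The difference between the first two guarantees can be sketched with std::atomic. This is standard C++ rather than boost.lockfree, and the function names are hypothetical. A single fetch-and-add is usually wait-free where the hardware provides it natively, whereas a compare-exchange retry loop is only lock-free:

```cpp
#include <atomic>
#include <cassert>

// Usually wait-free where the CPU offers a native atomic add: each call
// completes in a bounded number of steps no matter what other threads do.
inline int next_ticket(std::atomic<int>& counter) {
    return counter.fetch_add(1);
}

// Lock-free but not wait-free: one caller may retry indefinitely, yet a retry
// (spurious failures aside) only happens because another thread's write won.
inline void store_max(std::atomic<int>& target, int candidate) {
    int seen = target.load();
    while (seen < candidate &&
           !target.compare_exchange_weak(seen, candidate)) {
        // `seen` now holds the latest value; the loop condition re-checks it.
    }
}

// Single-threaded usage check.
inline void demo_progress() {
    std::atomic<int> tickets{0};
    assert(next_ticket(tickets) == 0);
    assert(next_ticket(tickets) == 1);

    std::atomic<int> high_water{5};
    store_max(high_water, 3);   // smaller value: nothing changes
    assert(high_water.load() == 5);
    store_max(high_water, 9);   // larger value: replaces the maximum
    assert(high_water.load() == 9);
}
```

Calling demo_progress() from main exercises both functions on a single thread; the guarantees themselves only matter once several threads call them concurrently.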

Some data structures can only be implemented in a lock-free manner if they are used under certain restrictions. The relevant aspects for the implementation of boost.lockfree are the number of producer and consumer threads. Single-producer (sp) or multiple-producer (mp) means that only a single thread or multiple concurrent threads are allowed to add data to a data structure. Single-consumer (sc) or multiple-consumer (mc) denotes the equivalent for the removal of data from the data structure.

### Properties of Non-Blocking Data Structures

Non-blocking data structures do not rely on locks and mutexes to ensure thread-safety. The synchronization is done completely in user-space without any direct interaction with the operating system [4]. This implies that they are not prone to issues like priority inversion (a high-priority thread has to wait for a low-priority thread that holds a lock).

Instead of relying on guards, non-blocking data structures require atomic operations (specific CPU instructions executed without interruption). This means that any thread sees the state either before or after the operation, but no intermediate state can be observed. Not all hardware supports the same set of atomic instructions. If an instruction is not available in hardware, it can be emulated in software using guards. However, this has the obvious drawback of losing the lock-free property.
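
Whether a type gets native atomic instructions or a guarded emulation can be queried at run time. The sketch below uses std::atomic from standard C++ as a stand-in; the helper name has_native_atomics is hypothetical, and the check in demo_probe assumes a mainstream CPU:

```cpp
#include <atomic>
#include <cassert>

// True if std::atomic<T> uses native atomic instructions on this platform;
// false means the standard library guards the value with an internal lock.
template <typename T>
bool has_native_atomics() {
    std::atomic<T> probe{T()};
    return probe.is_lock_free();
}

inline void demo_probe() {
    // Assumption: `int` is natively atomic, as on mainstream desktop CPUs.
    assert(has_native_atomics<int>());
}
```

The same question is what the is_lock_free() calls in the examples further down ask of the boost.lockfree containers.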

### Performance of Non-Blocking Data Structures

When discussing the performance of non-blocking data structures, one has to distinguish between amortized and worst-case costs. The definitions of 'lock-free' and 'wait-free' only mention the upper bound of an operation. Therefore lock-free data structures are not necessarily the best choice for every use case. In order to maximise the throughput of an application, one should consider high-performance concurrent data structures [5].

Lock-free data structures are the better choice for optimizing the latency of a system or for avoiding priority inversion, which may be necessary in real-time applications. In general, we advise considering whether lock-free data structures are necessary or whether concurrent data structures are sufficient. In any case, we advise benchmarking different data structures against the specific workload.

### Sources of Blocking Behavior

Apart from locks and mutexes (which we are not using in boost.lockfree anyway), there are three other aspects that could violate lock-freedom:

Atomic Operations

Some architectures do not provide the necessary atomic operations natively in hardware. In that case they are emulated in software using spinlocks, which is itself blocking.

Memory Allocations

Allocating memory from the operating system is not lock-free. This makes it impossible to implement true dynamically-sized non-blocking data structures. The node-based data structures of boost.lockfree use a memory pool to allocate the internal nodes. If this memory pool is exhausted, memory for new nodes has to be allocated from the operating system. However, all data structures of boost.lockfree can be configured to avoid memory allocations (instead, the specific calls will fail). This is especially useful for real-time systems that require lock-free memory allocations.

Exception Handling

C++ exception handling does not give any guarantees about its real-time behavior. We therefore do not encourage the use of exceptions and exception handling in lock-free code.
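
To make the Atomic Operations point concrete, here is a minimal sketch of how a compare-and-swap might be emulated with a spinlock on hardware that lacks the instruction. The class EmulatedAtomicInt is hypothetical and is not how any particular library implements its fallback:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for an atomic int on hardware without a native
// compare-and-swap: every operation is guarded by a spinlock.
class EmulatedAtomicInt {
public:
    explicit EmulatedAtomicInt(int initial) : value_(initial) {}

    // Same contract as compare_exchange_strong: on failure `expected` is
    // updated to the value actually found.
    bool compare_exchange(int& expected, int desired) {
        while (lock_.test_and_set(std::memory_order_acquire)) {
            // Busy-wait. If the thread holding the lock is preempted, every
            // other caller spins here: the emulation is therefore blocking.
        }
        const bool matched = (value_ == expected);
        if (matched) value_ = desired; else expected = value_;
        lock_.clear(std::memory_order_release);
        return matched;
    }

    int load() {
        while (lock_.test_and_set(std::memory_order_acquire)) {}
        const int snapshot = value_;
        lock_.clear(std::memory_order_release);
        return snapshot;
    }

private:
    std::atomic_flag lock_ = ATOMIC_FLAG_INIT;
    int value_;
};

inline void demo_emulated() {
    EmulatedAtomicInt cell(1);
    int expected = 1;
    assert(cell.compare_exchange(expected, 2));   // 1 -> 2 succeeds
    expected = 1;
    assert(!cell.compare_exchange(expected, 3));  // stale expectation fails
    assert(expected == 2);                        // and reports the real value
}
```

The interface looks atomic to callers, but progress now depends on the lock holder being scheduled, which is exactly the property lock-free code is meant to avoid.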

### Data Structures

boost.lockfree implements three lock-free data structures:

boost::lockfree::queue

a lock-free multi-producer/multi-consumer queue

boost::lockfree::stack

a lock-free multi-producer/multi-consumer stack

boost::lockfree::spsc_queue

a wait-free single-producer/single-consumer queue (commonly known as a ringbuffer)

### Data Structure Configuration

The data structures can be configured with Boost.Parameter-style templates:

boost::lockfree::fixed_sized

Configures the data structure as fixed-sized. The internal nodes are stored inside an array and are addressed by array indexing. This limits the possible size of the queue to the number of elements that can be addressed by the index type (usually 2**16-2), but on platforms that lack double-width compare-and-exchange instructions, this is the best way to achieve lock-freedom.

boost::lockfree::capacity

Sets the capacity of a data structure at compile time. This implies that the data structure is fixed-sized.

boost::lockfree::allocator

Defines the allocator. boost.lockfree supports stateful allocators and is compatible with Boost.Interprocess allocators.

## Examples

### Queue

The boost::lockfree::queue class implements a multi-writer/multi-reader queue. The following example shows how integer values are produced and consumed by 4 threads each:

    #include <boost/thread/thread.hpp>
    #include <boost/lockfree/queue.hpp>
    #include <iostream>

    #include <boost/atomic.hpp>

    boost::atomic_int producer_count(0);
    boost::atomic_int consumer_count(0);

    // The constructor argument preallocates 128 nodes for the internal free-list.
    boost::lockfree::queue<int> queue(128);

    const int iterations = 10000000;
    const int producer_thread_count = 4;
    const int consumer_thread_count = 4;

    void producer(void)
    {
        for (int i = 0; i != iterations; ++i) {
            int value = ++producer_count;
            // Retry until the push succeeds.
            while (!queue.push(value))
                ;
        }
    }

    boost::atomic<bool> done (false);

    void consumer(void)
    {
        int value;
        // Keep draining while the producers are still running.
        while (!done) {
            while (queue.pop(value))
                ++consumer_count;
        }

        // The producers have finished: collect whatever is left.
        while (queue.pop(value))
            ++consumer_count;
    }

    int main(int argc, char* argv[])
    {
        using namespace std;
        cout << "boost::lockfree::queue is ";
        if (!queue.is_lock_free())
            cout << "not ";
        cout << "lockfree" << endl;

        boost::thread_group producer_threads, consumer_threads;

        for (int i = 0; i != producer_thread_count; ++i)
            producer_threads.create_thread(producer);

        for (int i = 0; i != consumer_thread_count; ++i)
            consumer_threads.create_thread(consumer);

        // Signal the consumers only after every producer has finished.
        producer_threads.join_all();
        done = true;

        consumer_threads.join_all();

        cout << "produced " << producer_count << " objects." << endl;
        cout << "consumed " << consumer_count << " objects." << endl;
    }

The program output is:

    produced 40000000 objects.
    consumed 40000000 objects.

### Stack

The boost::lockfree::stack class implements a multi-writer/multi-reader stack. The following example shows how integer values are produced and consumed by 4 threads each:

    #include <boost/thread/thread.hpp>
    #include <boost/lockfree/stack.hpp>
    #include <iostream>

    #include <boost/atomic.hpp>

    boost::atomic_int producer_count(0);
    boost::atomic_int consumer_count(0);

    // The constructor argument preallocates 128 nodes for the internal free-list.
    boost::lockfree::stack<int> stack(128);

    const int iterations = 1000000;
    const int producer_thread_count = 4;
    const int consumer_thread_count = 4;

    void producer(void)
    {
        for (int i = 0; i != iterations; ++i) {
            int value = ++producer_count;
            // Retry until the push succeeds.
            while (!stack.push(value))
                ;
        }
    }

    boost::atomic<bool> done (false);

    void consumer(void)
    {
        int value;
        // Keep draining while the producers are still running.
        while (!done) {
            while (stack.pop(value))
                ++consumer_count;
        }

        // The producers have finished: collect whatever is left.
        while (stack.pop(value))
            ++consumer_count;
    }

    int main(int argc, char* argv[])
    {
        using namespace std;
        cout << "boost::lockfree::stack is ";
        if (!stack.is_lock_free())
            cout << "not ";
        cout << "lockfree" << endl;

        boost::thread_group producer_threads, consumer_threads;

        for (int i = 0; i != producer_thread_count; ++i)
            producer_threads.create_thread(producer);

        for (int i = 0; i != consumer_thread_count; ++i)
            consumer_threads.create_thread(consumer);

        // Signal the consumers only after every producer has finished.
        producer_threads.join_all();
        done = true;

        consumer_threads.join_all();

        cout << "produced " << producer_count << " objects." << endl;
        cout << "consumed " << consumer_count << " objects." << endl;
    }

The program output is:

    produced 4000000 objects.
    consumed 4000000 objects.

### Wait-Free Single-Producer/Single-Consumer Queue

The boost::lockfree::spsc_queue class implements a wait-free single-producer/single-consumer queue. The following example shows how integer values are produced and consumed by 2 separate threads:

    #include <boost/thread/thread.hpp>
    #include <boost/lockfree/spsc_queue.hpp>
    #include <iostream>

    #include <boost/atomic.hpp>

    // Only the single producer thread writes this counter, so a plain int is enough.
    int producer_count = 0;
    boost::atomic_int consumer_count (0);

    // The capacity is fixed at compile time, so no constructor argument is needed.
    boost::lockfree::spsc_queue<int, boost::lockfree::capacity<1024> > spsc_queue;

    const int iterations = 10000000;

    void producer(void)
    {
        for (int i = 0; i != iterations; ++i) {
            int value = ++producer_count;
            // Retry while the ring buffer is full.
            while (!spsc_queue.push(value))
                ;
        }
    }

    boost::atomic<bool> done (false);

    void consumer(void)
    {
        int value;
        // Keep draining while the producer is still running.
        while (!done) {
            while (spsc_queue.pop(value))
                ++consumer_count;
        }

        // The producer has finished: collect whatever is left.
        while (spsc_queue.pop(value))
            ++consumer_count;
    }

    int main(int argc, char* argv[])
    {
        using namespace std;
        cout << "boost::lockfree::spsc_queue is ";
        if (!spsc_queue.is_lock_free())
            cout << "not ";
        cout << "lockfree" << endl;

        boost::thread producer_thread(producer);
        boost::thread consumer_thread(consumer);

        // Signal the consumer only after the producer has finished.
        producer_thread.join();
        done = true;
        consumer_thread.join();

        cout << "produced " << producer_count << " objects." << endl;
        cout << "consumed " << consumer_count << " objects." << endl;
    }

The program output is:

    produced 10000000 objects.
    consumed 10000000 objects.
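
The ring-buffer idea behind such a queue can be sketched in a few lines of standard C++. This is an illustration only, not the boost.lockfree implementation; the class name SpscRing and its layout are assumptions. Neither operation contains a retry loop, which is what makes the structure wait-free as long as exactly one thread pushes and exactly one thread pops:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity ring buffer for exactly one producer thread and
// one consumer thread. One slot is kept empty to tell "full" from "empty".
template <typename T, std::size_t Slots>
class SpscRing {
    static_assert(Slots >= 2, "need at least one usable slot");
public:
    // Producer thread only. Returns false when the ring is full.
    bool push(const T& value) {
        const std::size_t w = write_.load(std::memory_order_relaxed);
        const std::size_t next = (w + 1) % Slots;
        if (next == read_.load(std::memory_order_acquire)) return false;
        buffer_[w] = value;
        write_.store(next, std::memory_order_release);  // publish the element
        return true;
    }

    // Consumer thread only. Returns false when the ring is empty.
    bool pop(T& out) {
        const std::size_t r = read_.load(std::memory_order_relaxed);
        if (r == write_.load(std::memory_order_acquire)) return false;
        out = buffer_[r];
        read_.store((r + 1) % Slots, std::memory_order_release);  // free the slot
        return true;
    }

private:
    T buffer_[Slots];
    std::atomic<std::size_t> write_{0};  // written by the producer only
    std::atomic<std::size_t> read_{0};   // written by the consumer only
};

// Single-threaded usage check: 4 slots give 3 usable elements.
inline void demo_ring() {
    SpscRing<int, 4> ring;
    int value = 0;
    assert(!ring.pop(value));                              // empty
    assert(ring.push(1) && ring.push(2) && ring.push(3));
    assert(!ring.push(4));                                 // full
    assert(ring.pop(value) && value == 1);                 // FIFO order
}
```

Each index is written by one side only, so a release store paired with an acquire load on the other side is enough to hand an element over.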

## Rationale

### Data Structures

The data structures are implementations of well-known algorithms. The queue is based on "Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms" by Michael Scott and Maged Michael, the stack is based on "Systems programming: coping with parallelism" by R. K. Treiber, and the spsc_queue is considered 'folklore' and is implemented in several open-source projects, including the Linux kernel. All data structures are discussed in detail in "The Art of Multiprocessor Programming" by Herlihy & Shavit.
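
As a rough illustration of the Treiber approach, the sketch below shows the compare-exchange push loop at the heart of such a stack. It is written against std::atomic, the class IntStack is hypothetical, and it deliberately offers only pop_all() instead of a per-element pop, because the latter needs the memory-management and ABA measures discussed next. It also allocates with plain new for brevity, so the allocation caveat from the introduction applies:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical minimal Treiber-style stack of ints. pop_all() detaches the
// whole list in one exchange, which sidesteps the ABA and reclamation issues
// a per-element pop() would have to solve.
class IntStack {
public:
    struct Node { int value; Node* next; };

    ~IntStack() { release(pop_all()); }

    void push(int value) {
        Node* node = new Node{value, head_.load(std::memory_order_relaxed)};
        // Retry until head_ still equals node->next at the moment of the swap;
        // on failure compare_exchange_weak refreshes node->next.
        while (!head_.compare_exchange_weak(node->next, node,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {}
    }

    // Returns the detached list (most recently pushed first); caller owns it.
    Node* pop_all() { return head_.exchange(nullptr, std::memory_order_acquire); }

    static void release(Node* list) {
        while (list) { Node* next = list->next; delete list; list = next; }
    }

private:
    std::atomic<Node*> head_{nullptr};
};

inline void demo_stack() {
    IntStack stack;
    stack.push(1);
    stack.push(2);
    IntStack::Node* list = stack.pop_all();
    assert(list && list->value == 2);              // last pushed comes first
    assert(list->next && list->next->value == 1);
    IntStack::release(list);
}
```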

### Memory Management

The lock-free boost::lockfree::queue and boost::lockfree::stack classes are node-based data structures, based on a linked list. Memory management of lock-free data structures is a non-trivial problem, because we need to prevent one thread from freeing an internal node while another thread is still using it. boost.lockfree uses a simple approach: it does not return any memory to the operating system. Instead, the data structures maintain a free-list in order to reuse nodes later. This is done for two reasons: first, depending on the implementation of the memory allocator, freeing the memory may block (so the implementation would not be lock-free anymore), and second, most memory reclamation algorithms are patented.

### ABA Prevention

The ABA problem is a common problem when implementing lock-free data structures. It occurs when an atomic variable is updated with a compare_exchange operation: thread 1 reads the value A, decides to change it to, say, C, and uses compare_exchange to write C only if the current value is still A. This is a problem if, in the meantime, thread 2 changes the value from A to B and back to A, because thread 1 does not observe the change of state (for details, see http://hustpawpaw.blog.163.com/blog/static/184228324201210811243127/). The common way to avoid the ABA problem is to associate a version counter with the value and change both atomically.
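
The scenario can be replayed on a single thread. In the sketch below (standard C++; all names are hypothetical) a plain compare-exchange accepts the stale read, while a cell that packs a 32-bit value together with a 32-bit version counter rejects it:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Pack a 32-bit value and a 32-bit version counter into one 64-bit word so a
// single compare-exchange covers both.
inline std::uint64_t pack(std::uint32_t value, std::uint32_t version) {
    return (static_cast<std::uint64_t>(version) << 32) | value;
}
inline std::uint32_t value_of(std::uint64_t word) {
    return static_cast<std::uint32_t>(word);
}
inline std::uint32_t version_of(std::uint64_t word) {
    return static_cast<std::uint32_t>(word >> 32);
}

// Every successful write bumps the version, so A -> B -> A leaves a trace.
inline bool versioned_cas(std::atomic<std::uint64_t>& cell,
                          std::uint64_t expected_word, std::uint32_t new_value) {
    const std::uint64_t desired = pack(new_value, version_of(expected_word) + 1);
    return cell.compare_exchange_strong(expected_word, desired);
}

inline void demo_aba() {
    constexpr std::uint32_t A = 1, B = 2, C = 3;

    // Plain value: the interleaved A -> B -> A goes unnoticed.
    std::atomic<std::uint32_t> plain{A};
    std::uint32_t seen = plain.load();   // "thread 1" reads A
    plain.store(B);                      // "thread 2" writes B ...
    plain.store(A);                      // ... and then A again
    assert(plain.compare_exchange_strong(seen, C));  // succeeds anyway

    // Versioned value: the same interleaving is detected.
    std::atomic<std::uint64_t> cell{pack(A, 0)};
    const std::uint64_t snapshot = cell.load();      // "thread 1" reads {A, v0}
    assert(versioned_cas(cell, cell.load(), B));     // "thread 2": {B, v1}
    assert(versioned_cas(cell, cell.load(), A));     // "thread 2": {A, v2}
    assert(value_of(cell.load()) == A);              // value looks unchanged
    assert(!versioned_cas(cell, snapshot, C));       // stale version is rejected
}
```

A counter of finite width can wrap around, so this narrows the window for ABA rather than ruling it out in principle.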

boost.lockfree uses a tagged_ptr helper class which associates a pointer with an integer tag. This usually requires a double-width compare_exchange, which is not available on all platforms. IA32 did not provide the cmpxchg8b opcode before the Pentium processor, and it is also lacking on many RISC architectures like PPC. Early X86-64 processors also did not provide a cmpxchg16b instruction. On 64bit platforms one can work around this issue, because often not the full 64bit address space is used. On X86_64, for example, only 48bit are used for the address, so we can use the remaining 16bit for the ABA prevention tag. For details please consult the implementation of the boost::lockfree::detail::tagged_ptr class.
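
The bit layout described here can be sketched as follows. This is not the code of boost::lockfree::detail::tagged_ptr; the function names are hypothetical, and the sketch assumes a 64-bit platform on which the upper 16 bits of a user-space pointer are zero:

```cpp
#include <cassert>
#include <cstdint>

static_assert(sizeof(void*) == 8, "sketch assumes 64-bit pointers");

constexpr int kTagShift = 48;
constexpr std::uint64_t kPointerMask = (std::uint64_t{1} << kTagShift) - 1;

// Assumption: the upper 16 bits of the pointer are zero (typical for
// user-space addresses on x86-64 with 48-bit virtual addressing).
inline std::uint64_t pack_tagged(void* pointer, std::uint16_t tag) {
    const std::uint64_t bits = reinterpret_cast<std::uintptr_t>(pointer);
    assert((bits & ~kPointerMask) == 0);
    return bits | (static_cast<std::uint64_t>(tag) << kTagShift);
}

// Lower 48 bits: the original address.
inline void* pointer_of(std::uint64_t word) {
    return reinterpret_cast<void*>(
        static_cast<std::uintptr_t>(word & kPointerMask));
}

// Upper 16 bits: the ABA prevention tag.
inline std::uint16_t tag_of(std::uint64_t word) {
    return static_cast<std::uint16_t>(word >> kTagShift);
}
```

Because pointer and tag share one 64-bit word, an ordinary single-width compare_exchange on that word updates both at once.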

For lock-free operations on 32bit platforms without double-width compare_exchange, we support a third approach: by using a fixed-sized array to store the internal nodes, we can avoid the use of 32bit pointers, because 16bit indices into the array are sufficient. However, this is only possible for fixed-sized data structures, which have an upper bound on the number of internal nodes.
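
A minimal sketch of this third approach, assuming a hypothetical IndexPool class rather than the actual boost.lockfree internals: free slots form a stack linked by 16-bit indices, and the head packs a 16-bit tag next to the index so that one 32-bit compare_exchange updates both. When the pool is exhausted the call fails instead of allocating:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical fixed-size pool: free slots form a stack linked by 16-bit
// indices; the head packs {tag, index} into one 32-bit word for a plain CAS.
template <std::uint16_t Capacity>
class IndexPool {
    static_assert(Capacity > 0 && Capacity < 0xFFFF, "0xFFFF is reserved");
public:
    static constexpr std::uint16_t kNull = 0xFFFF;  // "no slot" marker

    IndexPool() {
        for (std::uint16_t i = 0; i < Capacity; ++i)
            next_[i].store(i + 1 < Capacity ? static_cast<std::uint16_t>(i + 1)
                                            : kNull);
        head_.store(pack(0, 0));  // slot 0 is the first free slot
    }

    // Returns a free slot index, or kNull when the pool is exhausted
    // (the call fails instead of asking the operating system for memory).
    std::uint16_t allocate() {
        std::uint32_t old_head = head_.load();
        for (;;) {
            const std::uint16_t index = index_of(old_head);
            if (index == kNull) return kNull;
            const std::uint16_t next = next_[index].load();
            // The bumped tag makes the CAS fail if this slot was handed out
            // and released again in between, even though the index matches.
            if (head_.compare_exchange_weak(old_head, bump(old_head, next)))
                return index;
        }
    }

    void release(std::uint16_t index) {
        assert(index < Capacity);
        std::uint32_t old_head = head_.load();
        do {
            next_[index].store(index_of(old_head));
        } while (!head_.compare_exchange_weak(old_head, bump(old_head, index)));
    }

private:
    static std::uint32_t pack(std::uint16_t tag, std::uint16_t index) {
        return (static_cast<std::uint32_t>(tag) << 16) | index;
    }
    static std::uint16_t index_of(std::uint32_t word) {
        return static_cast<std::uint16_t>(word & 0xFFFF);
    }
    // New head word: same tag plus one (wrapping), pointing at `index`.
    static std::uint32_t bump(std::uint32_t old_head, std::uint16_t index) {
        return pack(static_cast<std::uint16_t>((old_head >> 16) + 1), index);
    }

    std::atomic<std::uint16_t> next_[Capacity];
    std::atomic<std::uint32_t> head_;
};
```

A node type would live in a parallel array addressed by the same indices. The 16-bit tag can wrap, so the protection is probabilistic rather than absolute.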

### Interprocess Support

The boost.lockfree data structures have basic support for Boost.Interprocess. The only problem is the blocking emulation of lock-free atomics, which in the current implementation is not guaranteed to be interprocess-safe.

### Future Developments

• More data structures (set, hash table, deque)
• Backoff schemes (exponential backoff or elimination)