Double Checked Locking Optimization

hhf

Category: Design Patterns. Tags: optimization, locking, compiler, caching, alignment, pointers

Article link: https://blog.csdn.net/qwidget/article/details/6635641

This article is a reading note extracted from POSA2 (Pattern-Oriented Software Architecture, Volume 2).

I recall once reading an article that argued about whether DCL is truly safe. The author claimed that this pattern was dangerous and outlined the reasons, but I have forgotten the details (Google may help you find it). I was surprised that POSA2 includes this pattern.

The Double-Checked Locking Optimization, also known as Lock Hint, reduces contention and synchronization overhead whenever critical sections of code must acquire locks in a thread-safe manner just once during program execution.
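To make the idea concrete, here is a minimal sketch of the pattern applied to a singleton. It is not the POSA2 code: it assumes a C++11 compiler and uses `std::atomic` and `std::mutex`, and the names `Singleton`, `value()` and `lock_` are hypothetical, chosen only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical singleton, used only to illustrate the two checks.
class Singleton {
public:
    static Singleton* instance() {
        // First check: no lock is taken once the instance exists.
        Singleton* p = instance_.load(std::memory_order_acquire);
        if (p == nullptr) {
            std::lock_guard<std::mutex> guard(lock_);
            // Second check: another thread may have created the
            // instance while this one was waiting for the lock.
            p = instance_.load(std::memory_order_relaxed);
            if (p == nullptr) {
                p = new Singleton();  // intentionally never deleted in this sketch
                instance_.store(p, std::memory_order_release);
            }
        }
        assert(p != nullptr);
        return p;
    }

    int value() const { return 42; }  // placeholder member for the demo

private:
    Singleton() = default;

    static std::atomic<Singleton*> instance_;
    static std::mutex lock_;
};

std::atomic<Singleton*> Singleton::instance_{nullptr};
std::mutex Singleton::lock_;
```

Every call after the first returns on the first check without touching the mutex, which is where the reduced contention comes from.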

The DCLO pattern implementation may require modifications if the compiler optimizes the first-time-in flag by caching it in some way, such as keeping it in a CPU register. In that case, cache coherency may become a problem: copies of the first-time-in flag held simultaneously in registers by multiple threads may become inconsistent if one thread's update of the value is not reflected in the other threads' copies.

A related problem is that a highly optimizing compiler may consider the second check of `flag == 0` to be superfluous and optimize it away. The solution POSA2 gives for both problems is to declare the flag as volatile data, which ensures the compiler will not perform aggressive optimizations that change the program's semantics.

The downside of using volatile is that every access to the flag goes through memory rather than through registers, which may degrade performance.
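The book's advice predates C++11. In standard C++11 and later, `volatile` provides neither atomicity nor ordering between threads, and `std::atomic` is the portable tool for a flag like this. Below is a sketch of the first-time-in flag written that way. The names `first_time_in`, `init_lock`, `init_count` and `ensure_initialized` are assumptions made for this example, not names from POSA2.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

std::atomic<bool> first_time_in{true};  // the "first-time-in" flag
std::mutex init_lock;
int init_count = 0;                     // how many times the one-time work ran

// Runs the one-time critical section at most once, however often it is called.
void ensure_initialized() {
    // First check: the atomic load cannot be cached in a register across calls.
    if (first_time_in.load(std::memory_order_acquire)) {
        std::lock_guard<std::mutex> guard(init_lock);
        // Second check: the compiler may not drop it, because the
        // flag is atomic and may have been changed by another thread.
        if (first_time_in.load(std::memory_order_relaxed)) {
            ++init_count;  // stands in for the real one-time work
            first_time_in.store(false, std::memory_order_release);
        }
    }
    assert(!first_time_in.load(std::memory_order_relaxed));
}
```

The acquire load on the fast path is typically cheap (an ordinary load on x86), so the performance concern about forcing every access through memory is smaller than it may sound.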

POSA2 lists three liabilities of this pattern:

Non-atomic pointer or integral assignment semantics. If an instance_ pointer is used as the flag in a singleton implementation, all bits of the instance_ pointer must be read and written atomically in a single operation. If the write to memory after the call to new is not atomic, other threads may read a partially written, invalid pointer, which can result in sporadic illegal memory accesses. These scenarios are possible on systems where memory addresses straddle word alignment boundaries, such as 32-bit pointers used on a computer with a 16-bit word bus, which requires two fetches from memory for each pointer access. In this case it may be necessary to use a separate, word-aligned integral flag (assuming that the hardware supports atomic word-based reads and writes) rather than the instance_ pointer itself.
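The separate-flag variant described in this liability might look like the sketch below. It assumes C++11; `Widget`, `storage`, `initialized`, `widget_lock`, `get_widget` and `is_lock_free_atomic` are hypothetical names. The pointer itself is a plain variable that is only read after the integral flag has been observed as set, and the helper reports whether the platform can handle a given type atomically without a hidden lock.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

struct Widget { int id = 7; };     // hypothetical payload type

Widget* storage = nullptr;         // plain pointer, published via the flag below
std::atomic<int> initialized{0};   // separate word-sized integral flag
std::mutex widget_lock;

// Reports whether std::atomic<T> is lock-free on this platform.
template <typename T>
bool is_lock_free_atomic() {
    std::atomic<T> probe{T{}};
    return probe.is_lock_free();
}

Widget* get_widget() {
    // First check uses the flag, never the pointer.
    if (initialized.load(std::memory_order_acquire) == 0) {
        std::lock_guard<std::mutex> guard(widget_lock);
        if (initialized.load(std::memory_order_relaxed) == 0) {
            storage = new Widget();  // intentionally never deleted in this sketch
            // The release store makes the write to storage visible
            // to any thread that later sees the flag as 1.
            initialized.store(1, std::memory_order_release);
        }
    }
    return storage;
}
```

On mainstream desktop and server targets `std::atomic<Widget*>` is normally lock-free as well, so the separate flag mainly matters on unusual hardware of the kind the paragraph describes.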

Multi-processor cache coherency. Certain multi-processor platforms perform aggressive memory caching optimizations in which read and write operations can execute out of order across multiple CPU caches. On these platforms, it may not be possible to use the DCLO pattern without further modifications, because CPU cache lines will not be flushed properly if shared data is accessed without locks held. According to POSA2, the need for CPU-specific code in implementations of the DCLO pattern makes the pattern inapplicable to Java applications: Java's bytecodes are designed to be cross-platform, and therefore its JVMs lack a memory barrier instruction that could resolve the problem outlined in this liability. That statement reflects the Java memory model of the time; since Java 5 (JSR-133), declaring the instance field volatile makes the idiom work.
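In C++11 and later, the "further modifications" no longer need CPU-specific code, because the language offers portable fences. The following is a sketch of the pattern with explicit barriers, assuming a C++11 compiler. `Config`, `config_ptr`, `config_lock` and `get_config` are hypothetical names introduced for this example.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

struct Config { int port = 8080; };  // hypothetical shared object

std::atomic<Config*> config_ptr{nullptr};
std::mutex config_lock;

Config* get_config() {
    Config* p = config_ptr.load(std::memory_order_relaxed);
    // Acquire fence: reads of *p below cannot be reordered before the load above.
    std::atomic_thread_fence(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> guard(config_lock);
        p = config_ptr.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = new Config();  // intentionally never deleted in this sketch
            // Release fence: the constructor's writes are ordered
            // before the pointer becomes visible to other threads.
            std::atomic_thread_fence(std::memory_order_release);
            config_ptr.store(p, std::memory_order_relaxed);
        }
    }
    assert(p != nullptr);
    return p;
}
```

Using acquire and release orderings directly on the load and store is usually the simpler choice; the standalone fences are shown here because they correspond most closely to the memory barrier instructions the paragraph refers to.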

Additional mutex usage. Regardless of whether a singleton is allocated on demand, some type of lock is allocated and retained for the lifetime of the program. One technique for minimizing this overhead is to preallocate a singleton lock within an object manager and use this lock to serialize all singleton initialization. Although this may increase lock contention, it may not affect program performance because each singleton will most likely acquire and release the lock only once when it's initialized.
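The shared-lock technique can be sketched as follows. This is an illustration under assumptions, not the POSA2 object manager: `ObjectManager`, `SingletonHolder`, `Logger` and `Cache` are hypothetical names, and the code relies on `std::mutex` having a `constexpr` constructor in C++11, so the single lock exists before any thread can ask for it.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical object manager owning one lock shared by all singletons.
class ObjectManager {
public:
    static std::mutex& singleton_lock() { return singleton_lock_; }
private:
    static std::mutex singleton_lock_;  // preallocated once for the whole program
};

std::mutex ObjectManager::singleton_lock_;

// Generic holder: each T gets its own instance pointer but no lock of its own.
template <typename T>
class SingletonHolder {
public:
    static T* instance() {
        T* p = instance_.load(std::memory_order_acquire);
        if (p == nullptr) {
            // All singleton types serialize their initialization on the same lock.
            std::lock_guard<std::mutex> guard(ObjectManager::singleton_lock());
            p = instance_.load(std::memory_order_relaxed);
            if (p == nullptr) {
                p = new T();  // intentionally never deleted in this sketch
                instance_.store(p, std::memory_order_release);
            }
        }
        return p;
    }
private:
    static std::atomic<T*> instance_;
};

template <typename T>
std::atomic<T*> SingletonHolder<T>::instance_{nullptr};

struct Logger { int level = 1; };     // hypothetical singleton types
struct Cache  { int capacity = 16; };
```

One caveat of this sketch: because the shared mutex is not recursive, a constructor of `T` that itself calls `SingletonHolder<U>::instance()` for a not-yet-created `U` would deadlock.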