Double-Checked Locking Optimization

This is a reading note; the material below is excerpted from POSA2.

I recall once reading an article that questioned whether DCL is truly safe. The author said the pattern was dangerous and explained why, but I have forgotten the details (a Google search may help you find it). So I was surprised to see POSA2 cover this pattern.

The Double-Checked Locking Optimization, also known as Lock Hint, reduces contention and synchronization overhead whenever critical sections of code must acquire locks in a thread-safe manner just once during program execution.
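The shape of the pattern can be sketched as follows. This is a minimal illustration, not POSA2's own code: the names first_time_in, init_lock, init_count and ensure_initialized are hypothetical, and the flag is a C++11 std::atomic, which postdates the book.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical names, chosen for illustration only.
std::atomic<bool> first_time_in{true};  // the "first-time-in" flag
std::mutex init_lock;                   // guards the once-only critical section
int init_count = 0;                     // counts how often the critical section ran

void ensure_initialized() {
    // First check, without the lock. The acquire load pairs with the
    // release store below, so a thread that sees "false" also sees
    // everything the initializing thread wrote.
    if (first_time_in.load(std::memory_order_acquire)) {
        std::lock_guard<std::mutex> guard(init_lock);
        // Second check, with the lock held: another thread may have
        // finished the work while this one was waiting for the lock.
        if (first_time_in.load(std::memory_order_relaxed)) {
            ++init_count;  // the critical section that must run once
            first_time_in.store(false, std::memory_order_release);
        }
    }
}
```

After the first call completes, later calls take only the unlocked first check, which is where the pattern saves synchronization overhead.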

The DCLO pattern implementation may require modifications if a compiler optimizes the first-time-in flag by caching it in some way, such as storing it in a CPU register. In this case, cache coherency may become a problem. For example, copies of the first-time-in flag held simultaneously in registers by multiple threads may become inconsistent if one thread's setting of the value is not reflected in other threads' copies.

A related problem is that a highly optimizing compiler may consider the second check of flag==0 to be superfluous and optimize it away. A solution to both these problems is to declare the flag as volatile data, which ensures the compiler will not perform aggressive optimizations that change the program's semantics.

The downside of using volatile is that all accesses to the flag will go through memory rather than through registers, which may degrade performance.

POSA2 lists three liabilities of this pattern:

Non-atomic pointer or integral assignment semantics. If an instance_ pointer is used as the flag in a singleton implementation, all bits of the singleton instance_ pointer must be read and written atomically in a single operation. If the write to memory after the call to new is not atomic, other threads may try to read an invalid pointer. This can result in sporadic illegal memory accesses. These scenarios are possible on systems where memory addresses straddle word alignment boundaries, such as 32-bit pointers used on a computer with a 16-bit word bus, which requires two fetches from memory for each pointer access. In this case it may be necessary to use a separate, word-aligned integral flag (assuming that the hardware supports atomic word-based reads and writes) rather than using an instance_ pointer.

Multi-processor cache coherency. Certain multi-processor platforms perform aggressive memory caching optimizations in which read and write operations can execute out of order across multiple CPU caches. On these platforms, it may not be possible to use the DCLO pattern without further modifications because CPU cache lines will not be flushed properly if shared data is accessed without locks held. Unfortunately, the need for CPU-specific code in implementations of the DCLO pattern makes this pattern inapplicable for Java applications. Java's bytecodes are designed to be cross-platform and therefore its JVMs lack a memory barrier instruction that can resolve the problem outlined in this liability.
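Since C++11 the language itself offers portable atomics and memory ordering, so the first two liabilities can be addressed without CPU-specific code. The sketch below is one way to do that and is not taken from POSA2; the Singleton class, its value member and lock_ are illustrative assumptions, while instance_ follows the name used in the text.

```cpp
#include <atomic>
#include <mutex>

class Singleton {
public:
    static Singleton* instance() {
        // The atomic load reads the whole pointer in one indivisible step;
        // acquire ordering makes the constructed object visible as well.
        Singleton* p = instance_.load(std::memory_order_acquire);
        if (p == nullptr) {
            std::lock_guard<std::mutex> guard(lock_);
            // Re-check under the lock in case another thread got here first.
            p = instance_.load(std::memory_order_relaxed);
            if (p == nullptr) {
                p = new Singleton;
                // Release ordering publishes the fully constructed object
                // before the non-null pointer can be observed.
                instance_.store(p, std::memory_order_release);
            }
        }
        return p;
    }

    int value = 42;  // illustrative payload

private:
    Singleton() = default;
    static std::atomic<Singleton*> instance_;
    static std::mutex lock_;
};

std::atomic<Singleton*> Singleton::instance_{nullptr};
std::mutex Singleton::lock_;
```

The acquire/release pair plays the role of the memory barrier the text describes: a thread that sees a non-null instance_ is guaranteed to see the object it points to.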

Additional mutex usage. Regardless of whether a singleton is allocated on demand, some type of lock is allocated and retained for the lifetime of the program. One technique for minimizing this overhead is to preallocate a singleton lock within an object manager and use this lock to serialize all singleton initialization. Although this may increase lock contention, it may not affect program performance because each singleton will most likely acquire and release the lock only once when it's initialized.
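The shared-lock idea can be sketched like this. ObjectManager here is a hypothetical stand-in for the object manager the text mentions, and SharedLockSingleton, Logger and Config are made-up names for illustration; none of this is POSA2's code.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical object manager owning one lock that exists for the
// whole program and is shared by every singleton.
struct ObjectManager {
    static std::mutex singleton_lock;
};
std::mutex ObjectManager::singleton_lock;

template <typename T>
class SharedLockSingleton {
public:
    static T* instance() {
        T* p = instance_.load(std::memory_order_acquire);
        if (p == nullptr) {
            // All singleton types serialize on the same lock, but each
            // normally takes it only once, during its initialization.
            std::lock_guard<std::mutex> guard(ObjectManager::singleton_lock);
            p = instance_.load(std::memory_order_relaxed);
            if (p == nullptr) {
                p = new T;
                instance_.store(p, std::memory_order_release);
            }
        }
        return p;
    }

private:
    static std::atomic<T*> instance_;
};

template <typename T>
std::atomic<T*> SharedLockSingleton<T>::instance_{nullptr};

// Two unrelated illustrative types that share the single lock.
struct Logger { int id = 1; };
struct Config { int id = 2; };
```

One caveat with this sketch: if the constructor of one singleton asks for another singleton that is not yet created, the plain std::mutex is locked twice by the same thread, which is undefined behavior and typically deadlocks. A recursive mutex is one possible way around that.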


 
