On Double-Checked Locking

赵小刚

Double-checked locking is a widely used technique, and in a multithreaded environment it is commonly cited as an efficient way to implement lazy initialization.

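However, when implemented in Java without additional synchronization, it is not guaranteed to work correctly on every platform. In other languages, such as C++, whether it works depends on the processor's memory model, the instruction reorderings performed by the compiler, and the interaction between the compiler and the synchronization library. Since none of these three factors is specified by a language such as C++, little can be said about the situations in which it will work. In C++, explicit memory barriers (see http://en.wikipedia.org/wiki/Memory_barrier) can make double-checked locking work, but Java has no equivalent explicit barrier instruction in the language.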

To see the behavior we want, consider the following code:

// Single-threaded version
class Foo {
  private Helper helper = null;
  public Helper getHelper() {
    if (helper == null)
      helper = new Helper();
    return helper;
  }
  // other functions and members...
}

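If this code runs in a multithreaded context, many things can go wrong. Most obviously, two or more Helper objects may be allocated for one Foo (other problems are described below). The fix is simply to synchronize the getHelper() method: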

// Correct multithreaded version
class Foo {
  private Helper helper = null;
  public synchronized Helper getHelper() {
    if (helper == null)
      helper = new Helper();
    return helper;
  }
  // other functions and members...
}
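
To make the "more than one Helper" failure of the unsynchronized version concrete, here is a minimal sketch. It is an illustration under stated assumptions, not code from the original article: RaceDemo, constructed, bothInside and run() are hypothetical names, and the CyclicBarrier in the constructor exists only to force both threads into the constructor at the same time so that the outcome is repeatable.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo classes; none of these names come from the original article.
final class RaceDemo {
    static final AtomicInteger constructed = new AtomicInteger();
    // Both constructors must be running at once before either may finish.
    static final CyclicBarrier bothInside = new CyclicBarrier(2);

    static final class Helper {
        Helper() {
            constructed.incrementAndGet();
            try {
                bothInside.await(5, TimeUnit.SECONDS);
            } catch (Exception e) {
                throw new IllegalStateException("second thread never arrived", e);
            }
        }
    }

    static final class Foo {
        private Helper helper = null;

        Helper getHelper() {          // unsynchronized, as in the single-threaded version
            if (helper == null)
                helper = new Helper();
            return helper;
        }
    }

    /** Runs two threads against one Foo and returns how many Helpers were built. */
    static int run() {
        constructed.set(0);
        bothInside.reset();
        Foo foo = new Foo();
        Thread a = new Thread(foo::getHelper);
        Thread b = new Thread(foo::getHelper);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return constructed.get();
    }
}
```

Because neither thread can assign the field until both are inside the constructor, each of them observes helper == null and builds its own object. The same barrier would deadlock (until the timeout) with the synchronized version, which is exactly the point: there the second thread cannot enter the constructor.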

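The synchronized version performs synchronization on every call to getHelper(). The double-checked locking idiom tries to avoid synchronizing once the helper has been allocated (synchronization is needed only the first time, while the helper object is being constructed):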

// Broken multithreaded version
// "Double-Checked Locking" idiom
class Foo {
  private Helper helper = null;
  public Helper getHelper() {
    if (helper == null)
      synchronized(this) {
        if (helper == null)
          helper = new Helper();
      }
    return helper;
  }
  // other functions and members...
}

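Unfortunately, that code does not work in the presence of either optimizing compilers or shared-memory multiprocessors.

There are many reasons it fails. We start with a couple of the more obvious ones. Once you understand those, you may be tempted to devise a way to fix the double-checked locking idiom. Your fix will probably not work, for more subtle reasons. Understand those, come up with a better fix, and it will still not work, because there are even more subtle reasons.

Many very smart people have spent a great deal of time looking for a way to make this work without requiring every thread that accesses the helper object to synchronize.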

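The most obvious reason it fails is that the writes that initialize the Helper object and the write to the helper field can be performed, or perceived, out of order. A thread that calls getHelper() may therefore see a non-null helper reference but find the fields of the Helper object still holding their default values rather than the values set in the Helper() constructor.

If the compiler inlines the call to the constructor, then the writes that initialize the object and the write to the helper field can be freely reordered, provided the compiler can prove that the constructor cannot throw an exception or perform synchronization. Even if the compiler does not reorder those writes, on a multiprocessor the processor or the memory system may reorder them, as perceived by a thread running on another processor. (For details, see: more detailed description of compiler-based reorderings.)

A test case showing that it fails:

Paul Jakubik found an example of a use of double-checked locking that does not work correctly. When run on a system using the Symantec JIT, the code fails. In particular, the Symantec JIT compiles singletons[i].reference = new Singleton(); to the following machine code: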

0206106A   mov         eax,0F97E78h
0206106F   call        01F6B210                  ; allocate space for Singleton, return result in eax
02061074   mov         dword ptr [ebp],eax       ; EBP is &singletons[i].reference; store the unconstructed object here
02061077   mov         ecx,dword ptr [eax]       ; dereference the handle to get the raw pointer
02061079   mov         dword ptr [ecx],100h      ; the next 4 lines are Singleton's inlined constructor
0206107F   mov         dword ptr [ecx+4],200h
02061086   mov         dword ptr [ecx+8],400h
0206108D   mov         dword ptr [ecx+0Ch],0F84030h

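As shown, the assignment to singletons[i].reference is performed before the constructor for Singleton runs. This is completely legal under the existing Java memory model, and it is also legal in C and C++, since neither language specified a memory model at the time (C11 and C++11 later added one).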
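
The instruction order in that listing can be mimicked by hand in Java. The following is only a single-threaded simulation of the order "allocate, publish, then initialize", written out explicitly; it is not what a compiler produces from this source, and ReorderSketch and its members are hypothetical names. The constants 0x100, 0x200 and 0x400 are taken from the listing.

```java
// Hypothetical single-threaded simulation; class and field names are illustrative only.
final class ReorderSketch {
    static final class Singleton {
        int a, b, c;              // hold the default value 0 until the "constructor body" runs
    }

    static Singleton reference;   // stands in for singletons[i].reference
    static int seenBeforeInit;    // value of a visible through the reference before the stores
    static int seenAfterInit;     // value of a visible once the stores have run

    static void assignThenConstruct() {
        Singleton s = new Singleton();   // step 1: allocate; fields hold defaults
        reference = s;                   // step 2: publish the reference first
        seenBeforeInit = reference.a;    // a reader at this point gets 0
        s.a = 0x100;                     // step 3: the inlined constructor stores
        s.b = 0x200;
        s.c = 0x400;
        seenAfterInit = reference.a;     // now 0x100
    }
}
```

A second thread that reads the reference between step 2 and step 3 is in the position recorded by seenBeforeInit: it holds a non-null reference to an object whose fields are still zero.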

A fix that doesn't work:

Given the explanation above, some people have suggested the following code:

// (Still) Broken multithreaded version
// "Double-Checked Locking" idiom
class Foo {
  private Helper helper = null;
  public Helper getHelper() {
    if (helper == null) {
      Helper h;
      synchronized(this) {
        h = helper;
        if (h == null)
          synchronized (this) {
            h = new Helper();
          } // release inner synchronization lock
        helper = h;
      }
    }
    return helper;
  }
  // other functions and members...
}

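This code puts the construction of the Helper object inside an inner synchronized block. The intuition is that there should be a memory barrier at the point where the inner synchronization is released, and that this barrier should prevent the initialization of the Helper object from being reordered with the assignment to the helper field.

Unfortunately, that intuition is wrong. The rules for synchronization don't work that way. The rule for monitorexit (that is, releasing synchronization) is that actions before the monitorexit must be performed before the monitor is released. There is, however, no rule saying that actions after the monitorexit may not be performed before the monitor is released. It is perfectly legal for the compiler to move the assignment helper = h; inside the inner synchronized block, which puts us back where we started. Many processors offer instructions that perform this kind of one-way memory barrier. Changing the semantics so that releasing a lock must act as a full memory barrier would carry a performance penalty.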

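More fixes that don't work

You can force the writer to perform a full bidirectional memory barrier, but this is heavyweight, inefficient, and may stop working once the Java memory model is revised. Even with a full memory barrier in the thread that initializes the helper object, it still does not work. On some systems, the thread that sees a non-null value in the helper field must also perform a memory barrier. The reason is that processors keep their own locally cached copies of memory. On some processors, unless the processor executes a cache-coherence instruction such as a memory barrier, reads can be served from stale locally cached copies, even if other processors used memory barriers to force their writes into global memory. (This is part of the motivation for volatile variables.)

A web page discussing how this can happen on an Alpha processor: http://www.cs.umd.edu/~pugh/java/memoryModel/AlphaReordering.html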

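Is it worth the trouble?

For most applications, the cost of simply making the getHelper() method synchronized is not high. Consider the alternatives only when you know that this synchronization is causing substantial overhead in an application. Very often, higher-level improvements, such as using Java's built-in merge sort instead of a hand-written exchange sort, have far more impact.

Using a static singleton:

If the singleton you are creating is static, there is a better solution. Just define the singleton as a static field in a separate class. The semantics of Java guarantee that the field is not initialized until it is first actively used, and that any thread that accesses the field sees all of the writes resulting from initializing it.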

class HelperSingleton {
  static Helper singleton = new Helper();
}
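
One common way to arrange this, often called the initialization-on-demand holder idiom, is sketched below. It assumes a private nested class as the "separate class"; HolderDemo, Lazy and the constructed counter are hypothetical names added only so that the laziness can be observed.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical names throughout; the counter is instrumentation for the demo only.
final class HolderDemo {
    static final AtomicInteger constructed = new AtomicInteger();

    static final class Helper {
        Helper() {
            constructed.incrementAndGet();
        }
    }

    // Lazy is initialized by the JVM on the first use of Lazy.INSTANCE,
    // and class initialization is performed safely across threads.
    private static final class Lazy {
        static final Helper INSTANCE = new Helper();
    }

    static Helper getHelper() {
        return Lazy.INSTANCE;
    }

    private HolderDemo() {
    }
}
```

No Helper exists until the first call to HolderDemo.getHelper(), and every later call returns the same object without any explicit locking in user code.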

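It works for 32-bit primitive values

Although the double-checked locking idiom does not work for references to objects, it does work for 32-bit primitive values. Note that it does not work for long or double, because unsynchronized reads and writes of 64-bit primitives are not guaranteed to be atomic.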

// Correct Double-Checked Locking for 32-bit primitives
class Foo {
  private int cachedHashCode = 0;
  public int hashCode() {
    int h = cachedHashCode;
    if (h == 0)
      synchronized(this) {
        if (cachedHashCode != 0) return cachedHashCode;
        h = computeHashCode();
        cachedHashCode = h;
      }
    return h;
  }
  // other functions and members...
}

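The same applies to any int or float value, because reads and writes of these types are guaranteed to be atomic.

In fact, if computeHashCode() always returns the same result and has no side effects (that is, it is idempotent), you can drop the synchronization altogether. (h is a 32-bit int, so the assignment is atomic, and because the call is idempotent no inconsistent intermediate result can be observed.)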

// Lazy initialization 32-bit primitives
// Thread-safe if computeHashCode is idempotent
class Foo {
  private int cachedHashCode = 0;
  public int hashCode() {
    int h = cachedHashCode;
    if (h == 0) {
      h = computeHashCode();
      cachedHashCode = h;
    }
    return h;
  }
  // other functions and members...
}
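
For a 64-bit value the same lazy pattern needs one extra step, because plain long and double accesses are not guaranteed to be atomic. One option, sketched here as an assumption rather than something the original article shows, is to declare the cached field volatile: the Java Language Specification guarantees that reads and writes of volatile long and double values are atomic. LongCacheDemo, checksum() and the constant are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example; the class, field, and method names are not from the article.
final class LongCacheDemo {
    // Instrumentation for the demo only: counts how often the value is computed.
    final AtomicInteger computations = new AtomicInteger();

    // volatile makes reads and writes of this 64-bit field atomic.
    private volatile long cachedChecksum = 0L;   // 0 means "not computed yet"

    long checksum() {
        long c = cachedChecksum;          // one atomic read
        if (c == 0L) {
            c = computeChecksum();        // may run more than once under contention
            cachedChecksum = c;           // one atomic write
        }
        return c;
    }

    // Returns the same non-zero result on every call.
    private long computeChecksum() {
        computations.incrementAndGet();
        return 0x1234_5678_9ABC_DEF0L;
    }
}
```

As with the int version, this is only safe when recomputing the value is harmless, since two threads may both find the field empty and compute it.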

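Making it work with explicit memory barriers

Double-checked locking can be made to work if explicit memory barrier instructions are available. For example, if you program in C++, you can use the code from Doug Schmidt's book:
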
// C++ implementation with explicit memory barriers
// Should work on any platform, including DEC Alphas
// From "Patterns for Concurrent and Distributed Objects",
// by Doug Schmidt
template <class TYPE, class LOCK> TYPE *
Singleton<TYPE, LOCK>::instance (void) {
    // First check
    TYPE* tmp = instance_;
    // Insert the CPU-specific memory barrier instruction
    // to synchronize the cache lines on multi-processor.
    asm ("memoryBarrier");
    if (tmp == 0) {
        // Ensure serialization (guard
        // constructor acquires lock_).
        Guard<LOCK> guard (lock_);
        // Double check.
        tmp = instance_;
        if (tmp == 0) {
            tmp = new TYPE;
            // Insert the CPU-specific memory barrier instruction
            // to synchronize the cache lines on multi-processor.
            asm ("memoryBarrier");
            instance_ = tmp;
        }
    }
    return tmp;
}
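
For comparison, Java 9 and later expose ordering controls of a similar flavor through java.lang.invoke.VarHandle, which did not exist when the original discussion was written. The following is a rough Java analogue of the idea, not a translation of the code from the book and not something the article proposes: the unlocked read uses getAcquire and the publishing write uses setRelease. AcquireReleaseDemo and its members are hypothetical names, and the sketch assumes Java 9 or newer.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; requires Java 9+. Names are illustrative only.
final class AcquireReleaseDemo {
    static final AtomicInteger constructed = new AtomicInteger();

    static final class Helper {
        final int value = 42;

        Helper() {
            constructed.incrementAndGet();
        }
    }

    static final class Foo {
        private static final VarHandle HELPER;

        static {
            try {
                HELPER = MethodHandles.lookup()
                        .findVarHandle(Foo.class, "helper", Helper.class);
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        private Helper helper;   // read and written through HELPER outside the lock

        Helper getHelper() {
            // First check: acquire read, pairs with the release write below.
            Helper h = (Helper) HELPER.getAcquire(this);
            if (h == null) {
                synchronized (this) {
                    h = helper;                       // second check, under the lock
                    if (h == null) {
                        h = new Helper();
                        HELPER.setRelease(this, h);   // publish after construction
                    }
                }
            }
            return h;
        }
    }
}
```

In everyday Java code the volatile field described later in this article is the simpler choice; this variant is shown only to mirror the structure of the barrier-based C++ version.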

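Fixing double-checked locking using ThreadLocal

Alexander Terekhov (TEREKHOV@de.ibm.com) came up with a clever way to implement double-checked locking using thread-local storage. Each thread keeps a thread-local flag that records whether that thread has already done the required synchronization.
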
class Foo {
  /** If perThreadInstance.get() returns a non-null value, this thread
      has done synchronization needed to see initialization
      of helper */
  private final ThreadLocal perThreadInstance = new ThreadLocal();
  private Helper helper = null;
  public Helper getHelper() {
    if (perThreadInstance.get() == null) createHelper();
    return helper;
  }
  private final void createHelper() {
    synchronized(this) {
      if (helper == null)
        helper = new Helper();
    }
    // Any non-null value would do as the argument here
    perThreadInstance.set(perThreadInstance);
  }
}

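The performance of this technique depends quite a bit on the JDK implementation. In Sun's JDK 1.2, ThreadLocal was very slow; it is significantly faster in 1.3 and faster again in 1.4.
Doug Lea analyzed the performance of some techniques for implementing lazy initialization.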

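Under the new Java memory model:

As of JDK 5, there is a new Java Memory Model and Thread specification.

Fixing double-checked locking using volatile:

JDK 5 and later extend the semantics of volatile so that a write to a volatile field cannot be reordered with respect to any preceding read or write, and a read of a volatile field cannot be reordered with respect to any following read or write. For more details, see this entry in Jeremy Manson's blog.

With this change, double-checked locking works if the helper field is declared volatile. This does not work under JDK 1.4 and earlier:
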
// Works with the acquire/release semantics for volatile (JDK 5 and later)
// Broken under the earlier semantics for volatile (JDK 1.4 and before)
class Foo {
  private volatile Helper helper = null;
  public Helper getHelper() {
    if (helper == null) {
      synchronized(this) {
        if (helper == null)
          helper = new Helper();
      }
    }
    return helper;
  }
}
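
A frequently seen variant of the volatile version copies the field into a local variable so that the fast path performs a single volatile read. The sketch below shows that variant together with a small harness that calls it from several threads; it assumes JDK 5 or later semantics, and VolatileDclDemo, hammer() and the constructed counter are hypothetical names used only for the demonstration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names are illustrative and not from the original article.
final class VolatileDclDemo {
    static final AtomicInteger constructed = new AtomicInteger();

    static final class Helper {
        Helper() {
            constructed.incrementAndGet();
        }
    }

    static final class Foo {
        private volatile Helper helper;

        Helper getHelper() {
            Helper h = helper;               // one volatile read on the fast path
            if (h == null) {
                synchronized (this) {
                    h = helper;              // second check, under the lock
                    if (h == null) {
                        h = new Helper();
                        helper = h;          // volatile write publishes the object
                    }
                }
            }
            return h;
        }
    }

    /** Calls getHelper() from the given number of threads; returns how many Helpers were built. */
    static int hammer(int threads) {
        constructed.set(0);
        Foo foo = new Foo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(foo::getHelper);
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return constructed.get();
    }
}
```

However the threads interleave, the lock and the second check ensure that only one Helper is ever constructed per Foo.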

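Double-checked locking with immutable objects:

If Helper is an immutable object, such that all of its instance fields are final, then double-checked locking works without the volatile keyword. The idea is that a reference to an immutable object (such as a String or an Integer) should behave in much the same way as a 32-bit primitive: reads and writes of the reference are atomic.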
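
A minimal sketch of this case follows, assuming the JDK 5 memory model, whose final-field rules let a thread that sees the reference also see the values the constructor gave to the final fields, provided this does not escape during construction. The field is read once into a local variable so that the value that was checked is the value that is returned. ImmutableDclDemo and the width and height fields are hypothetical.

```java
// Hypothetical sketch; class and field names are illustrative only.
final class ImmutableDclDemo {
    static final class Helper {
        final int width;      // every instance field is final
        final int height;

        Helper(int width, int height) {
            this.width = width;
            this.height = height;
        }
    }

    static final class Foo {
        private Helper helper;   // deliberately not volatile

        Helper getHelper() {
            Helper h = helper;               // single unsynchronized read
            if (h == null) {
                synchronized (this) {
                    h = helper;              // second check, under the lock
                    if (h == null) {
                        h = new Helper(640, 480);
                        helper = h;
                    }
                }
            }
            return h;
        }
    }
}
```

If Helper had even one non-final field, this version would be exposed to the partially initialized object problem described earlier, and the volatile form would be the safer choice.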

Descriptions of double-check idiom

Reality Check, Douglas C. Schmidt, C++ Report, SIGS, Vol. 8, No. 3, March 1996.
Double-Checked Locking: An Optimization Pattern for Efficiently Initializing and Accessing Thread-safe Objects, Douglas Schmidt and Tim Harrison, 3rd annual Pattern Languages of Program Design conference, 1996.
Lazy instantiation, Philip Bishop and Nigel Warren, JavaWorld Magazine.
Programming Java threads in the real world, Part 7, Allen Holub, JavaWorld Magazine, April 1999.
Java 2 Performance and Idiom Guide, Craig Larman and Rhett Guthrie, p. 100.
Java in Practice: Design Styles and Idioms for Effective Java, Nigel Warren and Philip Bishop, p. 142.
Rule 99, The Elements of Java Style, Allan Vermeulen, Scott Ambler, Greg Bumgardner, Eldon Metz, Trevor Misfeldt, Jim Shur, Patrick Thompson, SIGS Reference library.
Global Variables in Java with the Singleton Pattern, Wiebe de Jong, Gamelan.
