JVM Optimizations for synchronized: Lock Inflation

Preface

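  • Passing an object to synchronized(…) is all it takes to acquire a lock, which makes it very easy to use. The usual complaint about locking is that it is slow. In early versions of Java, synchronized went straight to the operating system's heavyweight mutex (through the monitor) to acquire the lock, which was expensive. Lock inflation is the optimization for this problem: the JVM first manages the lock itself and falls back to the OS-level heavyweight lock only when that no longer works (the lock is upgraded from unlocked to biased, then to lightweight, and finally to heavyweight).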

synchronized at the bytecode level

  • synchronized code example
    public class Demo01 {
    
        public static void main(String[] args) {
            synchronized ("lockObj") {
                System.out.println("Demo01.main");
            }
        }
    
    }
    
  • Run javap -c .\Demo01.class to disassemble the class file into bytecode
  • Bytecode of the main method
    public static void main(java.lang.String[]);
        Code:
           0: ldc           #2                  // String lockObj
           2: dup
           3: astore_1
           4: monitorenter
           5: getstatic     #3                  // Field java/lang/System.out:Ljava/io/PrintStream;
           8: ldc           #4                  // String Demo01.main
          10: invokevirtual #5                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
          13: aload_1
          14: monitorexit
          15: goto          23
          18: astore_2
          19: aload_1
          20: monitorexit
          21: aload_2
          22: athrow
          23: return
    
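  • Lock instructions
    • monitorenter: acquires the lock
    • monitorexit: releases the lock (it appears twice above: offset 14 is the normal path, offset 20 is the compiler-generated exception handler, so the lock is released even if the block throws)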
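
The two monitorexit instructions in the listing can be observed from plain Java. The following is a minimal sketch; the class name MonitorExitDemo and its method are made up for illustration, and only Thread.holdsLock from the standard library is used. It throws from inside a synchronized block and then checks whether the thread still owns the monitor.

```java
class MonitorExitDemo {
    private static final Object LOCK = new Object();

    /** Throws inside a synchronized block, then reports whether the monitor is still owned. */
    static boolean stillHeldAfterException() {
        try {
            synchronized (LOCK) {
                // Inside the block the current thread owns the monitor.
                if (!Thread.holdsLock(LOCK)) {
                    throw new AssertionError("monitor should be held inside the block");
                }
                throw new IllegalStateException("simulated failure");
            }
        } catch (IllegalStateException expected) {
            // The exception handler emitted by javac has already executed monitorexit.
            return Thread.holdsLock(LOCK);
        }
    }

    public static void main(String[] args) {
        // prints "held after exception: false"
        System.out.println("held after exception: " + stillHeldAfterException());
    }
}
```

The check returns false because the handler at offsets 18 to 22 releases the monitor before rethrowing.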

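Lock flag bits and bias information

  • How lock inflation works: the JVM rewrites the lock flag bits, the bias bit, and related fields in the object header to mark which lock state the object is in
  • The lock state lives in the object header (the MarkWord) and is one of the following:
    • Unlocked: the default state of a newly created object
    • Biased lock: only a Thread ID comparison is needed; suited to a single thread re-entering the lock
    • Lightweight lock: acquired with CAS, which is fast, but spinning on CAS burns CPU cycles, a thread may go a long time without getting the lock, and the CPU caches have to be synchronized frequently
    • Heavyweight lock: relies on an OS-level mutex (through the monitor), which is slow
    • GC mark: used by markSweep to mark an object; this state is not valid at any other time
  • At this point it should be clear why synchronized(…) accepts only objects and not primitive types: a primitive is not an object, has no object header, and therefore has nowhere to store lock state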
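
As a rough illustration of how these states are told apart, the sketch below decodes the low bits of a mark word. The constants are the values from the markOop.hpp enum quoted in the appendix; the class MarkWordState and the method describe are hypothetical helpers, not HotSpot APIs, and the sample mark words in main are invented values.

```java
class MarkWordState {
    // Values taken from the markOop.hpp enum in the appendix.
    static final long LOCKED_VALUE = 0;        // low bits 00: lightweight (stack) lock
    static final long UNLOCKED_VALUE = 1;      // low bits 001: unlocked
    static final long MONITOR_VALUE = 2;       // low bits 10: heavyweight lock
    static final long MARKED_VALUE = 3;        // low bits 11: GC mark
    static final long BIASED_LOCK_PATTERN = 5; // low bits 101: biased lock

    /** Classifies a mark word by its two lock bits and, for 01, the bias bit. */
    static String describe(long markWord) {
        long lockBits = markWord & 0b11;
        if (lockBits == LOCKED_VALUE) return "lightweight-locked";
        if (lockBits == MONITOR_VALUE) return "heavyweight-locked";
        if (lockBits == MARKED_VALUE) return "gc-marked";
        // Lock bits are 01 here; the third bit separates biased from unlocked.
        return (markWord & 0b111) == BIASED_LOCK_PATTERN ? "biased" : "unlocked";
    }

    public static void main(String[] args) {
        System.out.println(describe(0x1L));            // unlocked
        System.out.println(describe(0x5L));            // biased
        System.out.println(describe(0x7f0000001000L)); // lightweight-locked
        System.out.println(describe(0x7f0000001002L)); // heavyweight-locked
    }
}
```

The biased and unlocked states share the lock bits 01, which is why the extra bias bit is needed.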

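The lock inflation process

Tip: it helps to keep a diagram of the MarkWord layout at hand

  • A synchronized block is executed and the JVM applies its optimizations.
  • -> A single thread A acquires the lock and uses CAS to write its thread id into the object's lock information; [unlocked] becomes [biased lock]
    • The Thread ID in the lock object's MarkWord is set to this thread
    • The bias bit in the lock object's MarkWord is set to 1
  • -> Thread A acquires the lock again; because the lock is biased toward A, this is very fast
  • -> After running for a while, thread A leaves the synchronized block
  • -> Thread B comes to acquire the lock. The bias thread id recorded in the lock is thread A, so B's CAS fails. The original bias owner A is then paused at a safepoint and its state is checked. Because A has already left the synchronized block, the bias is revoked (the lock object is reset to the unlocked state).
  • -> Shortly afterwards, thread B obtains the biased lock
  • -> -> Thread C arrives and competes for the lock. It checks the bias owner and its CAS fails; if thread B is still alive and has not left the synchronized block, the [biased lock] held by B is upgraded to a [lightweight lock]
    • A LockRecord is created on the Thread's stack
    • The lock object's MarkWord is copied into the LockRecord on the Thread's stack
    • CAS replaces the lock object's MarkWord with a LockRecord pointer (pointing to that Thread's LockRecord)
  • -> -> Thread C keeps spinning on CAS to get its turn (replacing the LockRecord pointer in the MarkWord); if it succeeds, it holds the lightweight lock (spinning on CAS burns CPU cycles)
  • -> -> -> If thread C spins more than n times (older versions used -XX:PreBlockSpin=10; newer ones use an adaptive spin count), the [lightweight lock] is upgraded to a [heavyweight lock]
    • The pointer in the lock object's MarkWord is changed to point to the heavyweight monitor
    • The Thread joins the monitor's EntryList queue and waits to be scheduled before it obtains the lock
  • For the inflation process, the dreamtobe blog has the following diagram
    [Figure: lock inflation flowchart]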
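
The three LockRecord steps in the lightweight-lock stage can be modelled in a few lines of Java. This is a toy model under stated assumptions, not how HotSpot is implemented: ThinLockModel, LockRecord, NEUTRAL, and all method names are invented for illustration, an AtomicReference stands in for the MarkWord, and the thread is passed in explicitly so the behaviour can be exercised deterministically. Where the real VM would go on to inflate the lock, the model simply returns false.

```java
import java.util.concurrent.atomic.AtomicReference;

class ThinLockModel {
    /** Stand-in for an unlocked ("neutral") mark word. */
    static final Object NEUTRAL = new Object();

    /** Stand-in for the LockRecord that lives in a thread's stack frame. */
    static final class LockRecord {
        Thread owner;
        Object displacedHeader; // copy of the mark word; null marks a recursive entry
    }

    private final AtomicReference<Object> mark = new AtomicReference<>(NEUTRAL);

    /** Returns true if self now holds the lock; false means the real VM would inflate. */
    boolean tryEnter(Thread self, LockRecord rec) {
        Object m = mark.get();
        if (m == NEUTRAL) {
            rec.owner = self;
            rec.displacedHeader = m;           // step 2: copy the mark word into the record
            return mark.compareAndSet(m, rec); // step 3: CAS the mark word to point at the record
        }
        if (m instanceof LockRecord && ((LockRecord) m).owner == self) {
            rec.owner = self;
            rec.displacedHeader = null;        // recursive entry by the current owner
            return true;
        }
        return false;                          // held by another thread
    }

    void exit(LockRecord rec) {
        if (rec.displacedHeader == null) return;      // recursive exit: nothing to restore
        mark.compareAndSet(rec, rec.displacedHeader); // put the saved mark word back
    }

    boolean isLockedBy(Thread t) {
        Object m = mark.get();
        return m instanceof LockRecord && ((LockRecord) m).owner == t;
    }

    public static void main(String[] args) {
        ThinLockModel lock = new ThinLockModel();
        Thread a = new Thread(); // never started; used only as an identity
        Thread b = new Thread();
        LockRecord outer = new LockRecord();
        System.out.println("A enters: " + lock.tryEnter(a, outer));
        System.out.println("B enters: " + lock.tryEnter(b, new LockRecord()));
        lock.exit(outer);
        System.out.println("B enters after A exits: " + lock.tryEnter(b, new LockRecord()));
    }
}
```

Saving the original mark word in the record is what lets the unlock path restore the header with a single CAS.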

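Why inflate locks? Is a lighter lock always better?

  • First, upgrading step by step from a lower-level lock to a higher-level one lets the JVM try the lighter lock first, which is generally more efficient.
  • However, the lighter locks have problems of their own:
    • Biased lock: the lock flag bits are 01 (the same as the unlocked state) and the bias bit tells the two apart. It only suits the same thread re-entering the lock, where it is extremely fast. When other threads come for the lock and contention is heavy, the bias has to be revoked and re-established over and over, which is slow. So once several threads compete at the same time, the lock needs to be upgraded to stay efficient.
    • Lightweight lock: the lock flag bits are 00 and the lock is implemented with CAS spinning, which is also reasonably fast. But spinning wastes CPU time, so it stops after a certain (adaptive) number of attempts. At that point the lock has to become a heavyweight lock.
  • So the lighter locks (biased, lightweight) should not be assumed to perform better in every case; the goal is to settle on the level that fits the situation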
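  • JVM parameters
    • Whether to use biased locking: -XX:+UseBiasedLocking, on by default in JDK 8 (JEP 374 deprecated biased locking and turned it off by default in JDK 15)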
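
The trade-off behind "spin a bounded number of times, then block" can be sketched with a small user-level lock. This is only an illustration of the idea, not HotSpot's algorithm: SpinThenBlockLock, its fixed spinLimit, and the helper countWithThreads are all assumptions made for this example, and the blocking path uses plain wait/notify.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class SpinThenBlockLock {
    private final AtomicBoolean held = new AtomicBoolean(false);
    private final Object waiters = new Object();
    private final int spinLimit;

    SpinThenBlockLock(int spinLimit) {
        this.spinLimit = spinLimit;
    }

    void lock() {
        // Fast path: a bounded number of CAS attempts (cheap if the lock frees up quickly).
        for (int i = 0; i < spinLimit; i++) {
            if (held.compareAndSet(false, true)) return;
        }
        // Slow path: stop burning CPU and block until an unlock wakes us up.
        boolean interrupted = false;
        synchronized (waiters) {
            while (!held.compareAndSet(false, true)) {
                try {
                    waiters.wait();
                } catch (InterruptedException e) {
                    interrupted = true; // remember the interrupt and keep waiting
                }
            }
        }
        if (interrupted) Thread.currentThread().interrupt();
    }

    void unlock() {
        held.set(false);
        synchronized (waiters) {
            waiters.notify(); // wake one blocked thread, if any
        }
    }

    /** Increments a shared counter from several threads under the lock and returns the total. */
    static int countWithThreads(int threads, int perThread) {
        SpinThenBlockLock lock = new SpinThenBlockLock(10);
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println("total = " + countWithThreads(4, 2000));
    }
}
```

A larger spinLimit favours short critical sections, while a smaller one hands waiting threads over to blocking sooner; an adaptive scheme adjusts that limit at run time.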

Appendix: OpenJDK 8 source

  • Object lock states: hotspot/src/share/vm/oops/markOop.hpp
     enum { locked_value             = 0, // lightweight lock
            unlocked_value           = 1, // unlocked
            monitor_value            = 2, // heavyweight lock
            marked_value             = 3, // GC mark
            biased_lock_pattern      = 5  // biased lock
     };
    
  • CAS operations: hotspot/src/os_cpu/linux_x86/vm/atomic_linux_x86.inline.hpp (Windows is analogous)
    // CAS for int
    inline jint     Atomic::cmpxchg    (jint     exchange_value, volatile jint*     dest, jint     compare_value) {
      int mp = os::is_MP();
      __asm__ volatile (LOCK_IF_MP(%4) "cmpxchgl %1,(%3)"
                        : "=a" (exchange_value)
                        : "r" (exchange_value), "a" (compare_value), "r" (dest), "r" (mp)
                        : "cc", "memory");
      return exchange_value;
    }
    
    // CAS for long
    inline jlong    Atomic::cmpxchg    (jlong    exchange_value, volatile jlong*    dest, jlong    compare_value) {
      bool mp = os::is_MP();
      __asm__ __volatile__ (LOCK_IF_MP(%4) "cmpxchgq %1,(%3)"
                            : "=a" (exchange_value)
                            : "r" (exchange_value), "a" (compare_value), "r" (dest), "r" (mp)
                            : "cc", "memory");
      return exchange_value;
    }
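
For readers more at home in Java, the contract of these cmpxchg functions (return the value found at dest; the store happened only if that value equals compare_value) can be mimicked on top of AtomicInteger. This is a sketch for illustration only: CasDemo and its cmpxchg method are hypothetical, and the parameter order simply mirrors the HotSpot signature above.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasDemo {
    /**
     * Mimics the HotSpot cmpxchg contract: returns the value observed at dest.
     * The exchange took place if and only if the result equals compareValue.
     */
    static int cmpxchg(int exchangeValue, AtomicInteger dest, int compareValue) {
        for (;;) {
            int current = dest.get();
            if (current != compareValue) {
                return current; // comparison failed, nothing was written
            }
            if (dest.compareAndSet(compareValue, exchangeValue)) {
                return compareValue; // swap succeeded
            }
            // The value changed between get() and compareAndSet(); read it again.
        }
    }

    public static void main(String[] args) {
        AtomicInteger value = new AtomicInteger(5);
        System.out.println(cmpxchg(9, value, 5)); // 5: matched, value is now 9
        System.out.println(cmpxchg(1, value, 5)); // 9: no match, value stays 9
    }
}
```

Callers such as slow_enter compare the returned value with the expected one to learn whether their CAS won.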
    
  • Biased locking: hotspot/src/share/vm/runtime/synchronizer.cpp
    // -----------------------------------------------------------------------------
    //  Fast Monitor Enter/Exit
    // This the fast monitor enter. The interpreter and compiler use
    // some assembly copies of this code. Make sure update those code
    // if the following function is changed. The implementation is
    // extremely sensitive to race condition. Be careful.
    
    void ObjectSynchronizer::fast_enter(Handle obj, BasicLock* lock, bool attempt_rebias, TRAPS) {
     if (UseBiasedLocking) { // JVM flag: -XX:+UseBiasedLocking (whether biased locking is enabled)
        if (!SafepointSynchronize::is_at_safepoint()) {
          // Try to acquire the biased lock
          BiasedLocking::Condition cond = BiasedLocking::revoke_and_rebias(obj, attempt_rebias, THREAD);
          if (cond == BiasedLocking::BIAS_REVOKED_AND_REBIASED) {
            return;
          }
        } else {
          assert(!attempt_rebias, "can not rebias toward VM thread");
          BiasedLocking::revoke_at_safepoint(obj);
        }
        assert(!obj->mark()->has_bias_pattern(), "biases should be revoked by now");
     }
    
     slow_enter (obj, lock, THREAD) ;
    }
    
  • Lightweight locking: hotspot/src/share/vm/runtime/synchronizer.cpp
    // -----------------------------------------------------------------------------
    // Interpreter/Compiler Slow Case
    // This routine is used to handle interpreter/compiler slow case
    // We don't need to use fast path here, because it must have been
    // failed in the interpreter/compiler code.
    void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, TRAPS) {
      markOop mark = obj->mark(); // read the object's mark word
      assert(!mark->has_bias_pattern(), "should not see bias pattern here");
    
      if (mark->is_neutral()) { // is the object unlocked (neutral)?
        // Anticipate successful CAS -- the ST of the displaced mark must
        // be visible <= the ST performed by the CAS.
        lock->set_displaced_header(mark); // save the mark word in this thread's lock record
        // CAS the object's mark word so that it points to the lock record
        if (mark == (markOop) Atomic::cmpxchg_ptr(lock, obj()->mark_addr(), mark)) {
          TEVENT (slow_enter: release stacklock) ;
          return ;
        }
        // Fall through to inflate() ...
      } else
      if (mark->has_locker() && THREAD->is_lock_owned((address)mark->locker())) {
        // Already stack-locked and the lock record is on the current thread's stack: recursive entry, keep running the synchronized code
        assert(lock != mark->locker(), "must not re-lock the same lock");
        assert(lock != (BasicLock*)obj->mark(), "don't relock with same BasicLock");
        lock->set_displaced_header(NULL);
        return;
      }
    
    #if 0
      // The following optimization isn't particularly useful.
      if (mark->has_monitor() && mark->monitor()->is_entered(THREAD)) {
        lock->set_displaced_header (NULL) ;
        return ;
      }
    #endif
    
      // The object header will never be displaced to this lock,
      // so it does not matter what the value is, except that it
      // must be non-zero to avoid looking like a re-entrant lock,
      // and must not look locked either.
      lock->set_displaced_header(markOopDesc::unused_mark());
      ObjectSynchronizer::inflate(THREAD, obj())->enter(THREAD);
    }
    
  • The lock inflation function
    ObjectMonitor * ATTR ObjectSynchronizer::inflate (Thread * Self, oop object) {
      // Inflate mutates the heap ...
      // Relaxing assertion for bug 6320749.
      assert (Universe::verify_in_progress() ||
              !SafepointSynchronize::is_at_safepoint(), "invariant") ;
    
      for (;;) { // retry loop
          const markOop mark = object->mark() ;
          assert (!mark->has_bias_pattern(), "invariant") ;
    
          // The mark can be in one of the following states:
          // *  Inflated     - just return
          // *  Stack-locked - coerce it to inflated
          // *  INFLATING    - busy wait for conversion to complete
          // *  Neutral      - aggressively inflate the object.
          // *  BIASED       - Illegal.  We should never see this
    
          // CASE: inflated
          if (mark->has_monitor()) { // already inflated to a heavyweight lock?
              ObjectMonitor * inf = mark->monitor() ; // if so, fetch the monitor and return it
              assert (inf->header()->is_neutral(), "invariant");
              assert (inf->object() == object, "invariant") ;
              assert (ObjectSynchronizer::verify_objmon_isinpool(inf), "monitor is invalid");
              return inf ;
          }
    
          // CASE: inflation in progress - inflating over a stack-lock.
          // Some other thread is converting from stack-locked to inflated.
          // Only that thread can complete inflation -- other threads must wait.
          // The INFLATING value is transient.
          // Currently, we spin/yield/park and poll the markword, waiting for inflation to finish.
          // We could always eliminate polling by parking the thread on some auxiliary list.
          // Inflation is in progress (another thread is doing it): wait for it to finish
          if (mark == markOopDesc::INFLATING()) {
             TEVENT (Inflate: spin while INFLATING) ;
             ReadStableMark(object) ;
             continue ; // retry
          }
    
          // CASE: stack-locked
          // Could be stack-locked either by this thread or by some other thread.
          //
          // Note that we allocate the objectmonitor speculatively, _before_ attempting
          // to install INFLATING into the mark word.  We originally installed INFLATING,
          // allocated the objectmonitor, and then finally STed the address of the
          // objectmonitor into the mark.  This was correct, but artificially lengthened
          // the interval in which INFLATED appeared in the mark, thus increasing
          // the odds of inflation contention.
          //
          // We now use per-thread private objectmonitor free lists.
          // These list are reprovisioned from the global free list outside the
          // critical INFLATING...ST interval.  A thread can transfer
          // multiple objectmonitors en-mass from the global free list to its local free list.
          // This reduces coherency traffic and lock contention on the global free list.
          // Using such local free lists, it doesn't matter if the omAlloc() call appears
          // before or after the CAS(INFLATING) operation.
          // See the comments in omAlloc().
    
          if (mark->has_locker()) { // stack-locked (lightweight lock)?
              ObjectMonitor * m = omAlloc (Self) ;
              // Optimistically prepare the objectmonitor - anticipate successful CAS
              // We do this before the CAS in order to minimize the length of time
              // in which INFLATING appears in the mark.
              m->Recycle();
              m->_Responsible  = NULL ;
              m->OwnerIsThread = 0 ;
              m->_recursions   = 0 ;
              m->_SpinDuration = ObjectMonitor::Knob_SpinLimit ;   // Consider: maintain by type/class
              
              // Use CAS to mark the object as INFLATING
              markOop cmp = (markOop) Atomic::cmpxchg_ptr (markOopDesc::INFLATING(), object->mark_addr(), mark) ;
              if (cmp != mark) {
                 omRelease (Self, m, true) ;
                 // CAS failed (the mark word changed, e.g. another thread started inflating): retry
                 continue ;       // Interference -- just retry
              }
    
              // We've successfully installed INFLATING (0) into the mark-word.
              // This is the only case where 0 will appear in a mark-work.
              // Only the singular thread that successfully swings the mark-word
              // to 0 can perform (or more precisely, complete) inflation.
              //
              // Why do we CAS a 0 into the mark-word instead of just CASing the
              // mark-word from the stack-locked value directly to the new inflated state?
              // Consider what happens when a thread unlocks a stack-locked object.
              // It attempts to use CAS to swing the displaced header value from the
              // on-stack basiclock back into the object header.  Recall also that the
              // header value (hashcode, etc) can reside in (a) the object header, or
              // (b) a displaced header associated with the stack-lock, or (c) a displaced
              // header in an objectMonitor.  The inflate() routine must copy the header
              // value from the basiclock on the owner's stack to the objectMonitor, all
              // the while preserving the hashCode stability invariants.  If the owner
              // decides to release the lock while the value is 0, the unlock will fail
              // and control will eventually pass from slow_exit() to inflate.  The owner
              // will then spin, waiting for the 0 value to disappear.   Put another way,
              // the 0 causes the owner to stall if the owner happens to try to
              // drop the lock (restoring the header from the basiclock to the object)
              // while inflation is in-progress.  This protocol avoids races that might
              // would otherwise permit hashCode values to change or "flicker" for an object.
              // Critically, while object->mark is 0 mark->displaced_mark_helper() is stable.
              // 0 serves as a "BUSY" inflate-in-progress indicator.
    
    
              // fetch the displaced mark from the owner's stack.
              // The owner can't die or unwind past the lock while our INFLATING
              // object is in the mark.  Furthermore the owner can't complete
              // an unlock on the object, either.
              markOop dmw = mark->displaced_mark_helper() ;
              assert (dmw->is_neutral(), "invariant") ;
    
              // Setup monitor fields to proper values -- prepare the monitor
              m->set_header(dmw) ;
    
              // Optimization: if the mark->locker stack address is associated
              // with this thread we could simply set m->_owner = Self and
              // m->OwnerIsThread = 1. Note that a thread can inflate an object
              // that it has stack-locked -- as might happen in wait() -- directly
              // with CAS.  That is, we can avoid the xchg-NULL .... ST idiom.
              m->set_owner(mark->locker());
              m->set_object(object);
              // TODO-FIXME: assert BasicLock->dhw != 0.
    
              // Must preserve store ordering. The monitor state must
              // be stable at the time of publishing the monitor address.
              guarantee (object->mark() == markOopDesc::INFLATING(), "invariant") ;
              object->release_set_mark(markOopDesc::encode(m));
    
              // Hopefully the performance counters are allocated on distinct cache lines
              // to avoid false sharing on MP systems ...
              if (ObjectMonitor::_sync_Inflations != NULL) ObjectMonitor::_sync_Inflations->inc() ;
              TEVENT(Inflate: overwrite stacklock) ;
              if (TraceMonitorInflation) {
                if (object->is_instance()) {
                  ResourceMark rm;
                  tty->print_cr("Inflating object " INTPTR_FORMAT " , mark " INTPTR_FORMAT " , type %s",
                    (void *) object, (intptr_t) object->mark(),
                    object->klass()->external_name());
                }
              }
              return m ;
          }
    
          // CASE: neutral
          // TODO-FIXME: for entry we currently inflate and then try to CAS _owner.
          // If we know we're inflating for entry it's better to inflate by swinging a
          // pre-locked objectMonitor pointer into the object header.   A successful
          // CAS inflates the object *and* confers ownership to the inflating thread.
          // In the current implementation we use a 2-step mechanism where we CAS()
          // to inflate and then CAS() again to try to swing _owner from NULL to Self.
          // An inflateTry() method that we could call from fast_enter() and slow_enter()
          // would be useful.
    		
          // Unlocked (neutral) state
          assert (mark->is_neutral(), "invariant");
          ObjectMonitor * m = omAlloc (Self) ;
          // prepare m for installation - set monitor to initial state
          m->Recycle();
          m->set_header(mark);
          m->set_owner(NULL);
          m->set_object(object);
          m->OwnerIsThread = 1 ;
          m->_recursions   = 0 ;
          m->_Responsible  = NULL ;
          m->_SpinDuration = ObjectMonitor::Knob_SpinLimit ;       // consider: keep metastats by type/class
    
          if (Atomic::cmpxchg_ptr (markOopDesc::encode(m), object->mark_addr(), mark) != mark) {
              m->set_object (NULL) ;
              m->set_owner  (NULL) ;
              m->OwnerIsThread = 0 ;
              m->Recycle() ;
              omRelease (Self, m, true) ;
              m = NULL ;
              continue ;
              // interference - the markword changed - just retry.
              // The state-transitions are one-way, so there's no chance of
              // live-lock -- "Inflated" is an absorbing state.
          }
    
          // Hopefully the performance counters are allocated on distinct
          // cache lines to avoid false sharing on MP systems ...
          if (ObjectMonitor::_sync_Inflations != NULL) ObjectMonitor::_sync_Inflations->inc() ;
          TEVENT(Inflate: overwrite neutral) ;
          if (TraceMonitorInflation) {
            if (object->is_instance()) {
              ResourceMark rm;
              tty->print_cr("Inflating object " INTPTR_FORMAT " , mark " INTPTR_FORMAT " , type %s",
                (void *) object, (intptr_t) object->mark(),
                object->klass()->external_name());
            }
          }
          return m ;
      }
    }
    
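  • On adaptive spinning
    • See _SpinDuration in hotspot/src/share/vm/runtime/objectMonitor.cpp (note that in this source the spinning is done by ObjectMonitor, that is, after inflation; slow_enter above goes straight to inflate() when its CAS fails)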