ReentrantReadWriteLock Deadlock Analysis

成明宁杰

Last modified 2022-05-31 17:06:12

First published 2022-05-31 16:52:52

Article link: https://blog.csdn.net/adaivskean/article/details/122516464

Background

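The service's core low-level logic uses a ReentrantReadWriteLock to guard a cache. After a deadlock occurred, every cache read and write blocked. The thread dump showed that the object the threads were waiting on did not appear to be held by any thread, either through the read lock or through the write lock.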

Possible causes

1. Lock upgrade

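Suppose the service first acquires the read lock and then tries to acquire the write lock while still holding it: that is a lock upgrade. The service runs on JDK 1.8, where ReentrantReadWriteLock does not support lock upgrading, so the write-lock request never succeeds.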

The thread dump is shown below. Both threads are parked waiting on object 0x000000076abe9300.

"Thread-readLock-1" #11 prio=5 os_prio=31 tid=0x00007ffd5b092000 nid=0xa803 waiting on condition [0x0000700002118000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x000000076abe9300> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
	at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
	at com.example.test.TestReentrant.lambda$main$1(TestReentrant.java:47)
	at com.example.test.TestReentrant$$Lambda$2/142257191.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
	- None


"Thread-writeLock" #13 prio=5 os_prio=31 tid=0x00007ffd5b091000 nid=0xa703 waiting on condition [0x000070000221b000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x000000076abe9300> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
	at com.example.test.TestReentrant.lambda$main$3(TestReentrant.java:86)
	at com.example.test.TestReentrant$$Lambda$4/1826771953.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
	- None

Heap dump analysis shows that object 0x000000076abe9300 has not been released.
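One way to reproduce this symptom is a read hold that is never released. The sketch below illustrates that assumption and is not the service's actual code; the class name LeakedReadLockDemo, its helper methods, and the "leaker" thread are hypothetical. Read locks are shared, so they are not listed under "Locked ownable synchronizers", which is why the dump can show "None" while the lock state still records a hold.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class, not taken from the original service.
class LeakedReadLockDemo {

    // Acquires the read lock on a helper thread and lets that thread exit
    // without unlocking, which simulates a leaked read hold.
    static ReentrantReadWriteLock leakReadLock() {
        ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
        Thread leaker = new Thread(() -> rwl.readLock().lock(), "leaker");
        leaker.start();
        try {
            leaker.join(); // the thread is gone, but its read hold remains
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return rwl;
    }

    // Timed write-lock attempt, so the caller does not park forever.
    static boolean tryWrite(ReentrantReadWriteLock rwl, long millis) {
        try {
            boolean acquired = rwl.writeLock().tryLock(millis, TimeUnit.MILLISECONDS);
            if (acquired) {
                rwl.writeLock().unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ReentrantReadWriteLock rwl = leakReadLock();
        System.out.println("read holds:   " + rwl.getReadLockCount());
        System.out.println("write locked: " + rwl.isWriteLocked());
        System.out.println("write lock acquired: " + tryWrite(rwl, 200));
    }
}
```

In a live process, getReadLockCount(), isWriteLocked() and getQueueLength() give the same information as the heap dump without having to decode the synchronizer's internal state.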

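Sample usage from the Javadoc

Lock downgrading (shortens the time the write lock is held, while keeping other threads' write-lock requests blocked so that the data cannot change)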

Sample usages. Here is a code sketch showing how to perform lock downgrading after updating a cache (exception handling is particularly tricky when handling multiple locks in a non-nested fashion):

class CachedData {
  Object data;
  volatile boolean cacheValid;
  final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();

  void processCachedData() {
    rwl.readLock().lock();
    if (!cacheValid) {
      // Must release read lock before acquiring write lock
      rwl.readLock().unlock();
      rwl.writeLock().lock();
      try {
        // Recheck state because another thread might have
        // acquired write lock and changed state before we did.
        if (!cacheValid) {
          data = ...
          cacheValid = true;
        }
        // Downgrade by acquiring read lock before releasing write lock
        rwl.readLock().lock();
      } finally {
        rwl.writeLock().unlock(); // Unlock write, still hold read
      }
    }

    try {
      use(data);
    } finally {
      rwl.readLock().unlock();
    }
  }
}

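The cacheValid field is declared volatile to guarantee visibility. The thread first acquires the read lock. If the cache is not valid, it releases the read lock and acquires the write lock, then checks cacheValid again before changing the data, because another writer may already have updated the cache. It then updates the data, sets cacheValid to true, and acquires the read lock before releasing the write lock. At this point the cached data is usable; the thread processes it and finally releases the read lock.

This sequence is a complete lock downgrade, and its purpose is to make sure the thread processes the data it just wrote. If the current thread C released the write lock after updating the cache without first acquiring the read lock, another thread T could acquire the write lock and modify the data. C would not notice the change and would work on wrong data. When C follows the downgrade steps and acquires the read lock before releasing the write lock, T blocks on the write lock until C has finished processing and released the read lock.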
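The downgrade sequence described above can be tried out with a runnable variant of the Javadoc sketch. This is only an illustration: the class name DowngradingCache, the Supplier-based loader, and the read() method that returns the value are assumptions added here, not part of the Javadoc.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Runnable variant of the Javadoc sketch; names are hypothetical.
class DowngradingCache<T> {
    private T data;
    private volatile boolean cacheValid;
    final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
    private final Supplier<T> loader; // assumed data source

    DowngradingCache(Supplier<T> loader) {
        this.loader = loader;
    }

    T read() {
        rwl.readLock().lock();
        if (!cacheValid) {
            // Upgrading is not supported: release the read lock first.
            rwl.readLock().unlock();
            rwl.writeLock().lock();
            try {
                // Recheck, another writer may have filled the cache meanwhile.
                if (!cacheValid) {
                    data = loader.get();
                    cacheValid = true;
                }
                // Downgrade: take the read lock while still holding the write lock.
                rwl.readLock().lock();
            } finally {
                rwl.writeLock().unlock(); // still holding the read lock
            }
        }
        try {
            return data;
        } finally {
            rwl.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        DowngradingCache<String> cache =
                new DowngradingCache<>(() -> "value-" + loads.incrementAndGet());
        System.out.println(cache.read()); // first call runs the loader
        System.out.println(cache.read()); // second call is served from the cache
        System.out.println("loads: " + loads.get());
    }
}
```

If the loader throws, the write lock is released by the finally block and the read lock was already released, so no hold leaks.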

When reads far outnumber writes and each operation costs more than the synchronization overhead

ReentrantReadWriteLocks can be used to improve concurrency in some uses of some kinds of Collections. This is typically worthwhile only when the collections are expected to be large, accessed by more reader threads than writer threads, and entail operations with overhead that outweighs synchronization overhead. For example, here is a class using a TreeMap that is expected to be large and concurrently accessed.


class RWDictionary {
  private final Map<String, Data> m = new TreeMap<String, Data>();
  private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
  private final Lock r = rwl.readLock();
  private final Lock w = rwl.writeLock();

  public Data get(String key) {
    r.lock();
    try { return m.get(key); }
    finally { r.unlock(); }
  }
  public String[] allKeys() {
    r.lock();
    try { return m.keySet().toArray(new String[0]); }
    finally { r.unlock(); }
  }
  public Data put(String key, Data value) {
    w.lock();
    try { return m.put(key, value); }
    finally { w.unlock(); }
  }
  public void clear() {
    w.lock();
    try { m.clear(); }
    finally { w.unlock(); }
  }
}

Deadlock

1. Deadlock caused by lock upgrade

final ReadWriteLock lock = new ReentrantReadWriteLock();
lock.readLock().lock();

// In real code we would go call other methods that end up calling back and
// thus locking again
lock.readLock().lock();

// Now we do some stuff and realise we need to write so try to escalate the
// lock as per the Javadocs and the above description
lock.readLock().unlock(); // Hold count drops from 2 to 1; the read lock is still held
lock.writeLock().lock();  // Blocks as some thread (this one!) holds read lock

System.out.println("Will never get here");
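The same scenario can be run without hanging the JVM by using a timed tryLock. The sketch below is hypothetical (the class ReadToWriteUpgradeDemo and its method names are made up for illustration) and also shows one possible workaround: release every read hold of the current thread, counted with getReadHoldCount(), before asking for the write lock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class for the reentrant read-hold problem.
class ReadToWriteUpgradeDemo {

    // Timed write-lock attempt; returns false instead of parking forever.
    static boolean tryWrite(ReentrantReadWriteLock rwl, long millis) {
        try {
            boolean acquired = rwl.writeLock().tryLock(millis, TimeUnit.MILLISECONDS);
            if (acquired) {
                rwl.writeLock().unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Mirrors the snippet above: one of two read holds is still in place.
    static boolean upgradeWhileHoldingRead() {
        ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
        rwl.readLock().lock();
        rwl.readLock().lock();   // reentrant acquisition, hold count is 2
        rwl.readLock().unlock(); // hold count is 1, read lock still held
        try {
            return tryWrite(rwl, 200);
        } finally {
            rwl.readLock().unlock();
        }
    }

    // Releases every read hold of this thread before requesting the write lock.
    static boolean upgradeAfterReleasingAllHolds() {
        ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
        rwl.readLock().lock();
        rwl.readLock().lock();
        int holds = rwl.getReadHoldCount();
        for (int i = 0; i < holds; i++) {
            rwl.readLock().unlock();
        }
        return tryWrite(rwl, 200);
    }

    public static void main(String[] args) {
        System.out.println("upgrade while holding read: " + upgradeWhileHoldingRead());
        System.out.println("upgrade after full release: " + upgradeAfterReleasingAllHolds());
    }
}
```

Releasing all holds opens a window in which another writer can run, so real code would need to recheck its state after acquiring the write lock, as the Javadoc sample does with cacheValid.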

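2. Deadlock caused by StackOverflowError

A finally block does not actually guarantee that the lock is released. If deepCall throws a StackOverflowError and the unlock call in the finally block triggers another StackOverflowError, the lock is never released.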

Although JEP 270 makes lock and unlock methods somewhat atomic, it does not guarantee the invocation of these methods will always succeed. Unfortunately, the simplest lock-unlock pattern remains fragile:

  lock.lock();
  try {
      deepCall();
  } finally {
      lock.unlock();
  }

If StackOverflowError happens here inside deepCall, JVM will attempt to execute the finally block, but a call to unlock method may result in another StackOverflowError, leaving the object locked forever. For some reason ReentrantLock.unlock() method itself is not annotated with @ReservedStackAccess 😞