HDFS: Analysis of an Infinite Loop Caused by await() and interrupt()

1. Background

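HDFS short-circuit reads use a UNIX domain socket to let a client talk to a DataNode running on the same machine. A socket path must be set in both the DataNode and the client configuration, and the short-circuit read feature must be enabled (see the official Hadoop documentation on short-circuit local reads).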

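On the short-circuit read path, the Hadoop 2.7 code has a bug in the DomainSocketWatcher class (which watches the domain sockets used for communication between processes on a Linux host) that needs to be fixed. The analysis follows: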

// Relevant code from DomainSocketWatcher#add
public void add(DomainSocket sock, Handler handler) {
  lock.lock();
  try {
    Entry entry = new Entry(sock, handler);
    toAdd.add(entry);
    while (true) {
      try {
        processedCond.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      if (!toAdd.contains(entry)) {
        break;
      }
    }
  } finally {
    lock.unlock();
  }
}
// Relevant code from DomainSocketWatcher#remove
public void remove(DomainSocket sock) {
  lock.lock();
  try {
    toRemove.put(sock.fd, sock);
    while (true) {
      try {
        processedCond.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      if (!toRemove.containsKey(sock.fd)) {
        break;
      }
    }
  } finally {
    lock.unlock();
  }
}
final Thread watcherThread = new Thread(new Runnable() {
  @Override
  public void run() {  
    final TreeMap<Integer, Entry> entries = new TreeMap<Integer, Entry>();
    try {
      while (true) {
        lock.lock();
        try {
            // Handle pending removals
            while (true) {
              Map.Entry<Integer, DomainSocket> entry = toRemove.firstEntry();
              if (entry == null) break;
              sendCallbackAndRemove("handlePendingRemovals", entries, fdSet, entry.getValue().fd);
            }
            processedCond.signalAll();
          }
   ...
}

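The add / remove / watcherThread code above is a typical inter-thread handshake: add and remove queue a request and wait on processedCond, and watcherThread processes the request and calls signalAll. In add / remove, note the following code: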

   while (true) {
      try {
        processedCond.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
  ...

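If the thread catches an interrupt, it sets its own interrupt flag again via Thread.currentThread().interrupt(). Because of while (true), it then enters the next iteration and calls processedCond.await() again. Here is the start of await: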

public final void await() throws InterruptedException {
    if (Thread.interrupted())
        throw new InterruptedException();
    Node node = addConditionWaiter();
...

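If the interrupt flag is already set, await throws InterruptedException right away, and the catch block catches it and sets the flag again. This check happens before await releases the lock, so the watcher thread can never acquire the lock to process toAdd / toRemove, and the exit condition never becomes true. The caller is stuck in an infinite loop.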
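The behaviour described above can be reproduced in isolation. The following is a minimal sketch, not Hadoop code: the class AwaitInterruptDemo and the method countImmediateThrows are hypothetical names, and the loop is capped at maxSpins so that it terminates. No thread ever signals the condition, so the loop can only finish if every await() call throws immediately instead of blocking.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class AwaitInterruptDemo {

    /**
     * Runs the await-and-re-interrupt pattern for at most maxSpins iterations
     * and returns how many times await() threw InterruptedException.
     * Nobody signals the condition, so a blocking await() would hang here.
     */
    public static int countImmediateThrows(int maxSpins) {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        int thrown = 0;

        // Simulate an interrupt that arrives before the wait starts.
        Thread.currentThread().interrupt();
        lock.lock();
        try {
            for (int i = 0; i < maxSpins; i++) {
                try {
                    // The flag is set, so await() throws before it waits.
                    cond.await();
                } catch (InterruptedException e) {
                    thrown++;
                    // The problematic step: setting the flag again makes
                    // the next await() throw as well.
                    Thread.currentThread().interrupt();
                }
            }
        } finally {
            lock.unlock();
            // Clear the flag so the calling code is not affected.
            Thread.interrupted();
        }
        return thrown;
    }

    public static void main(String[] args) {
        System.out.println("immediate throws: " + countImmediateThrows(3));
    }
}
```

With the cap removed and the exit condition owned by another thread that needs the same lock, this is the spin seen in DomainSocketWatcher.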

2. Test

First, define a thread-communication class Service with three methods that exercise await, awaitUninterruptibly, and signalAll respectively:

import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class Service {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition condition = lock.newCondition();

    // Mirrors the buggy pattern: interruptible await() plus re-interrupt in a loop.
    public void testAwaitMethod() {
        // Acquire once, outside the loop, so the hold count does not grow.
        lock.lock();
        try {
            while (true) {
                try {
                    System.out.println("await begin");
                    condition.await();
                    System.out.println("await end");
                } catch (InterruptedException e) {
                    e.printStackTrace();
                    System.out.println("catch await");
                    try {
                        // Slow the loop down so the output stays readable.
                        Thread.sleep(2000);
                    } catch (InterruptedException e1) {
                        e1.printStackTrace();
                    }
                    // Setting the flag again makes the next await() throw immediately.
                    Thread.currentThread().interrupt();
                }
            }
        } finally {
            lock.unlock();
        }
    }

    // Same loop with awaitUninterruptibly(), which does not throw on interrupt.
    public void testAwaitUninterMethod() {
        lock.lock();
        try {
            while (true) {
                try {
                    System.out.println("awaitUninterruptibly begin");
                    condition.awaitUninterruptibly();
                    System.out.println("awaitUninterruptibly end");
                } catch (Exception e) {
                    e.printStackTrace();
                    System.out.println("catch awaitUninterruptibly");
                    try {
                        Thread.sleep(2000);
                    } catch (InterruptedException e1) {
                        e1.printStackTrace();
                    }
                    Thread.currentThread().interrupt();
                }
            }
        } finally {
            lock.unlock();
        }
    }

    public void testSignalAll() {
        System.out.println("signalAll init");
        // Blocks for as long as another thread holds the lock.
        lock.lock();
        try {
            System.out.println("signalAll begin");
            condition.signalAll();
            System.out.println("signalAll end");
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("catch signalAll");
        } finally {
            lock.unlock();
        }
    }
}

Next, define two thread classes that call the await() test method and the awaitUninterruptibly() test method respectively:

public class MyThread0 extends Thread {
    private Service service;

    public MyThread0(Service service) {
        this.service = service;
    }
    
    @Override
    public void run() {
        service.testAwaitMethod();
    }
}
public class MyThread1 extends Thread {
    private Service service;

    public MyThread1(Service service) {
        this.service = service;
    }

    @Override
    public void run() {
        service.testAwaitUninterMethod();
    }
}

Test 1: both threads end up waiting

public class Run {
    public static void main(String[] args) {
        try {
            Service service = new Service();
            
            MyThread0 myThread0 = new MyThread0(service);
            myThread0.start();

            MyThread1 myThread1 = new MyThread1(service);
            myThread1.start();
            
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Result:

await begin
awaitUninterruptibly begin

Test 2: after signalAll, both threads wake up and wait again

public class Run {
    public static void main(String[] args) {
        try {
            Service service = new Service();
            
            MyThread0 myThread0 = new MyThread0(service);
            myThread0.start();

            MyThread1 myThread1 = new MyThread1(service);
            myThread1.start();
            
            Thread.sleep(100);
            service.testSignalAll();
            
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Result:

await begin
awaitUninterruptibly begin
signalAll init
signalAll begin
signalAll end
await end
await begin
awaitUninterruptibly end
awaitUninterruptibly begin

Test 3: calling interrupt() explicitly

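When interrupt() is called explicitly, thread0 enters its catch block and then loops forever. thread1 never enters its catch block, because awaitUninterruptibly() does not throw on interrupt; in the output below it does not even print its begin line, since thread0 never releases the lock. For the same reason testSignalAll prints "signalAll init" and then blocks on the lock, so it never reaches signalAll.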

public class Run {
    public static void main(String[] args) {
        try {
            Service service = new Service();
            
            MyThread0 myThread0 = new MyThread0(service);
            myThread0.start();
            myThread0.interrupt();

            MyThread1 myThread1 = new MyThread1(service);
            myThread1.start();
            myThread1.interrupt();
            
            Thread.sleep(100);
            service.testSignalAll();
            
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Result (standard output and the stack traces on standard error are interleaved):

await begin
java.lang.InterruptedException
catch await
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2034)
	at thread.lock.awaituninterruptibly.Service.testAwaitMethod(Service.java:22)
	at thread.lock.awaituninterruptibly.MyThread0.run(MyThread0.java:15)
signalAll init
await begin
catch await
java.lang.InterruptedException
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2034)
	at thread.lock.awaituninterruptibly.Service.testAwaitMethod(Service.java:22)
	at thread.lock.awaituninterruptibly.MyThread0.run(MyThread0.java:15)
java.lang.InterruptedException
await begin
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2034)
catch await
	at thread.lock.awaituninterruptibly.Service.testAwaitMethod(Service.java:22)
	at thread.lock.awaituninterruptibly.MyThread0.run(MyThread0.java:15)
await begin
java.lang.InterruptedException
catch await
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2034)
	at thread.lock.awaituninterruptibly.Service.testAwaitMethod(Service.java:22)
	at thread.lock.awaituninterruptibly.MyThread0.run(MyThread0.java:15)

Summary

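Test 3 shows that calling await() and Thread.currentThread().interrupt() together inside a while loop is wrong: once the thread has been interrupted, it spins forever while holding the lock. Replacing condition.await() with condition.awaitUninterruptibly() makes the wait ignore interrupts. In the community code, HADOOP-14214 switched to awaitUninterruptibly and removed the Thread.currentThread().interrupt() call.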
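The direction of that fix can be sketched on a reduced model of the add / watcher handshake. This is an illustration under stated assumptions, not the HADOOP-14214 patch: the class UninterruptibleHandshake, the String items, and the drain() method that stands in for one pass of the watcher thread are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class UninterruptibleHandshake {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition processedCond = lock.newCondition();
    private final Set<String> toAdd = new HashSet<>();

    /** Queues an item and waits until a worker pass has consumed it. */
    public void add(String item) {
        lock.lock();
        try {
            toAdd.add(item);
            while (toAdd.contains(item)) {
                // Releases the lock while waiting and never throws on
                // interrupt; a pending interrupt stays set for the caller.
                processedCond.awaitUninterruptibly();
            }
        } finally {
            lock.unlock();
        }
    }

    /** One worker pass: consume everything queued so far, wake the waiters. */
    public void drain() {
        lock.lock();
        try {
            toAdd.clear();
            processedCond.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Returns true if a caller that is already interrupted still leaves add(). */
    public static boolean interruptedAddCompletes() {
        UninterruptibleHandshake h = new UninterruptibleHandshake();
        Thread caller = new Thread(() -> {
            Thread.currentThread().interrupt(); // interrupt arrives before add()
            h.add("sock-1");
        });
        caller.start();
        try {
            // Stand-in for the watcher thread: drain until the caller exits,
            // giving up after five seconds.
            long deadline = System.currentTimeMillis() + 5000;
            while (caller.isAlive() && System.currentTimeMillis() < deadline) {
                h.drain();
                Thread.sleep(10);
            }
            caller.join(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !caller.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("add() returned: " + interruptedAddCompletes());
    }
}
```

Because awaitUninterruptibly() gives up the lock while it waits, drain() can acquire it, clear the pending set, and signal. With the original await() plus re-interrupt pattern, the interrupted caller would keep the lock and drain() would block in lock.lock().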
