java线程堆栈线上案例

线上案例

(1) mq发送时间过长导致dubbo线程池沾满

16:54:34,026 WARN [New I/O server worker #1-4] -  [DUBBO] Thread pool is EXHAUSTED! Thread Name: DubboServerHandler-10.8.64.57:20880, Pool Size: 200 (active: 200, core: 200, max: 200, largest: 200), Task: 5823 (completed: 5623), Executor status:(isShutdown:false, isTerminated:false, isTerminating:false), in dubbo://10.8.64.57:20880!, dubbo version: 2.5.3, current host: 10.8.64.57
16:54:34,027 WARN [New I/O server worker #1-4] -  [DUBBO] Thread pool is EXHAUSTED! Thread Name: DubboServerHandler-10.8.64.57:20880, Pool Size: 200 (active: 200, core: 200, max: 200, largest: 200), Task: 5823 (completed: 5623), Executor status:(isShutdown:false, isTerminated:false, isTerminating:false), in dubbo://10.8.64.57:20880!, dubbo version: 2.5.3, current host: 10.8.64.57
16:54:34,027 WARN [New I/O server worker #1-4] -  [DUBBO] An exception was thrown by a user handler while handling an exception event ([id: 0x7434600d, /10.8.64.57:59102 => /10.8.64.57:20880] EXCEPTION: com.alibaba.dubbo.remoting.ExecutionException: class com.alibaba.dubbo.remoting.transport.dispatcher.all.AllChannelHandler error when process received event .), dubbo version: 2.5.3, current host: 10.8.64.57
com.alibaba.dubbo.remoting.ExecutionException: class com.alibaba.dubbo.remoting.transport.dispatcher.all.AllChannelHandler error when process caught event .
	at com.alibaba.dubbo.remoting.transport.dispatcher.all.AllChannelHandler.caught(AllChannelHandler.java:67)
	at com.alibaba.dubbo.remoting.transport.AbstractChannelHandlerDelegate.caught(AbstractChannelHandlerDelegate.java:44)
	at com.alibaba.dubbo.remoting.transport.AbstractChannelHandlerDelegate.caught(AbstractChannelHandlerDelegate.java:44)
	at com.alibaba.dubbo.remoting.transport.AbstractPeer.caught(AbstractPeer.java:127)
	at com.alibaba.dubbo.remoting.transport.netty.NettyHandler.exceptionCaught(NettyHandler.java:112)


报的是dubbo的线程池满了,那到底是什么业务阻塞了?
ps 查看了进程号,然后jstack dump出了线程堆栈,发现

"DubboServerHandler-10.8.64.57:20880-thread-111" daemon prio=10 tid=0x00007fd574193800 nid=0x16cf1 waiting for monitor entry [0x00007fd691887000..0x00007fd691888810]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:40)
	- waiting to lock <0x00007fd6c9fa4ba8> (a java.lang.Object)
	at org.apache.activemq.transport.ResponseCorrelator.oneway(ResponseCorrelator.java:60)
	at org.apache.activemq.ActiveMQConnection.doAsyncSendPacket(ActiveMQConnection.java:1265)
	at org.apache.activemq.ActiveMQConnection.asyncSendPacket(ActiveMQConnection.java:1259)
	at org.apache.activemq.ActiveMQSession.<init>(ActiveMQSession.java:250)
	at org.apache.activemq.ActiveMQConnection.createSession(ActiveMQConnection.java:318)
	at sun.reflect.GeneratedMethodAccessor222.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)

线程状态:java.lang.Thread.State: BLOCKED
上面分析过了,是synchronized拿不到锁导致阻塞的。

通过

at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:40)

找到代码block的位置, 原因就是 其它的地方没有释放writeMutex监视器锁。
在这里插入图片描述
搜索 这个 writeMutex ,发现用到的都是 mq的发送消息方法。
在这里插入图片描述
问题原因就知道了,mq的同步发送消息一直阻塞着,导致客户端这边卡死在发送方法上,导致dubbo线程池耗尽。

mq发生了瓶颈,叫运维处理了。

(2) 下面这个案例相当经典

public class Main {
    static final AtomicInteger threadNumber = new AtomicInteger(1);
    static ThreadFactory factory = new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
            return new Thread(r, "DubboServerHandler-10.0.105.140:29997-thread-" + threadNumber.getAndIncrement());
        }
    };
    private static Lock lock = new ReentrantLock();

    public static void main(String[] args) throws Exception {
        ExecutorService service = createDubboThreadPools();
        for (int i = 0; i < 10; i++) {
            try {
                service.execute(new Runnable() {
                    @Override
                    public void run() {
                        try {
                            Thread.sleep(100);
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        }
                        doService();
                    }
                });
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        //第二波提交任务
        for (int i = 0; i < 10; i++) {
            service.execute(new Runnable() {
                @Override
                public void run() {
                    while (true) {
                    	System.err.println(Thread.currentThread().getName() + "  第二波执行==========");
               		 }
                }
            });
            Thread.sleep(100);
        }
    }

    public static void doService() {
        try {
            lock.lock();
            boolean ret = new Random().nextBoolean();
            if (ret || !ret) {
                // 模拟业务出异常了
                System.err.println(Thread.currentThread().getName() + "  开始执行业务异常,没有unlock");
                throw new RuntimeException("业务异常了");
            }
            lock.unlock();
            System.err.println("业务执行正常 , unlock");
        } catch (Exception e) {
        }
    }

    public static ExecutorService createDubboThreadPools() {
        ExecutorService s =
            new ThreadPoolExecutor(3, 4, 1, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(1000), factory);
        return s;
    }
}

上面代码解释:

1.模拟创建了dubbo线程池,指定了线程名称。
2.先提交了10个任务执行doService方法,注意:doService模拟了业务直接抛出异常,unlock失败。
3.在提交10个任务执行业务。

上述代码执行结果十分恐怖:

DubboServerHandler-10.0.105.140:29997-thread-2  开始执行业务异常,没有unlock
DubboServerHandler-10.0.105.140:29997-thread-2  开始执行业务异常,没有unlock
DubboServerHandler-10.0.105.140:29997-thread-2  开始执行业务异常,没有unlock
DubboServerHandler-10.0.105.140:29997-thread-2  开始执行业务异常,没有unlock
DubboServerHandler-10.0.105.140:29997-thread-2  开始执行业务异常,没有unlock
DubboServerHandler-10.0.105.140:29997-thread-2  开始执行业务异常,没有unlock
DubboServerHandler-10.0.105.140:29997-thread-2  开始执行业务异常,没有unlock
DubboServerHandler-10.0.105.140:29997-thread-2  开始执行业务异常,没有unlock
DubboServerHandler-10.0.105.140:29997-thread-2  第二波执行==========
DubboServerHandler-10.0.105.140:29997-thread-2  第二波执行==========
DubboServerHandler-10.0.105.140:29997-thread-2  第二波执行==========
DubboServerHandler-10.0.105.140:29997-thread-2  第二波执行==========
DubboServerHandler-10.0.105.140:29997-thread-2  第二波执行==========
DubboServerHandler-10.0.105.140:29997-thread-2  第二波执行==========
DubboServerHandler-10.0.105.140:29997-thread-2  第二波执行==========
DubboServerHandler-10.0.105.140:29997-thread-2  第二波执行==========
DubboServerHandler-10.0.105.140:29997-thread-2  第二波执行==========
DubboServerHandler-10.0.105.140:29997-thread-2  第二波执行==========

可以看到 整个过程就只有DubboServerHandler-xx-thread-2 一个线程在执行。我们开的线程池线程大小是3个,真正执行的只有一个,说明我们的其它线程都阻塞了,这在线上就会导致dubbo线程池沾满。

我们用jstack分析上面线程堆栈:
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
发现 我们核心3个线程是已经开启了的,只有线程2处于RUNNABLE 运行状态,其它的都是Waiting也就是阻塞在lock.lock那里了(从Main58行代码看出)。
在这里插入图片描述

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值