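Diagnosing and resolving a deadlock at project startup
Project deadlock
Concept:
A deadlock is a blocked state in which two or more processes (or threads) cannot make progress because they are competing for resources or waiting on each other's messages. Without outside intervention none of them can ever continue. The system is then said to be deadlocked, and the processes that wait on each other forever are called deadlocked processes.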
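Detecting the deadlock:
Symptom:
After Tomcat started, it hung for a long time without making any progress. I attached jvisualvm.exe to the Tomcat process and took a thread dump to get the key log output.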
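The same check can also be made from inside the JVM. The following is a minimal sketch, assuming a HotSpot-style JVM where `ThreadMXBean.findDeadlockedThreads()` is supported; the class name `DeadlockProbe` and the report format are made up for illustration and are not part of the project.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class DeadlockProbe {

    /** Returns one line per deadlocked thread, or an empty list if the JVM reports none. */
    public static List<String> findDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock exists
        List<String> report = new ArrayList<>();
        if (ids == null) {
            return report;
        }
        // Ask for locked monitors and synchronizers so the lock owner can be named.
        for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            report.add(info.getThreadName() + " waits for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
        return report;
    }

    public static void main(String[] args) {
        List<String> report = findDeadlocks();
        System.out.println(report.isEmpty() ? "no deadlock detected" : String.join("\n", report));
    }
}
```

A probe like this could be called from a watchdog thread during startup, although a full thread dump still gives far more detail because it includes the stack frames.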
Key log output:
Found one Java-level deadlock:
=============================
"arterySchedule_Worker-10":
waiting to lock monitor 0x000000005b0166b8 (object 0x0000000081754d88, a java.util.concurrent.ConcurrentHashMap),
which is held by "arteryScheduleStarter"
"arteryScheduleStarter":
waiting to lock monitor 0x000000005b0164a8 (object 0x00000000817553a8, a java.util.concurrent.ConcurrentHashMap),
which is held by "localhost-startStop-1"
"localhost-startStop-1":
waiting to lock monitor 0x000000005b0166b8 (object 0x0000000081754d88, a java.util.concurrent.ConcurrentHashMap),
which is held by "arteryScheduleStarter"
Java stack information for the threads listed above:
===================================================
"arterySchedule_Worker-10":
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:180)
- waiting to lock <0x0000000081754d88> (a java.util.concurrent.ConcurrentHashMap)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:166)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:206)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:185)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:164)
at org.springframework.context.support.AbstractApplicationContext.getBean(AbstractApplicationContext.java:880)
at com.*.artery.util.ArterySpringUtil.getBean(ArterySpringUtil.java:29)
at com.***********.sfgk.dssz.DsszUtil.isEnabled(DsszUtil.java:23)
at com.***********.artery.module.schedule.impl.PlanInvoker.execute(PlanInvoker.java:63)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:525)
"arteryScheduleStarter":
at org.springframework.beans.factory.support.DefaultListableBeanFactory.getBeanDefinitionNames(DefaultListableBeanFactory.java:192)
- waiting to lock <0x00000000817553a8> (a java.util.concurrent.ConcurrentHashMap)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.getBeanNamesForType(DefaultListableBeanFactory.java:209)
at org.springframework.beans.factory.BeanFactoryUtils.beanNamesForTypeIncludingAncestors(BeanFactoryUtils.java:187)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.findAutowireCandidates(DefaultListableBeanFactory.java:652)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.resolveDependency(DefaultListableBeanFactory.java:610)
at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.inject(AutowiredAnnotationBeanPostProcessor.java:412)
at org.springframework.beans.factory.annotation.InjectionMetadata.injectFields(InjectionMetadata.java:105)
at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessAfterInstantiation(AutowiredAnnotationBeanPostProcessor.java:240)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.populateBean(AbstractAutowireCapableBeanFactory.java:959)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:472)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory$1.run(AbstractAutowireCapableBeanFactory.java:409)
at java.security.AccessController.doPrivileged(Native Method)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:380)
at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:264)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222)
- locked <0x0000000081754d88> (a java.util.concurrent.ConcurrentHashMap)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:261)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:185)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:164)
at org.springframework.context.support.AbstractApplicationContext.getBean(AbstractApplicationContext.java:880)
at com.***********.artery.util.ArterySpringUtil.getBean(ArterySpringUtil.java:29)
at com.***********.artery.module.schedule.cache.ScheduleSummerCache$1.run(ScheduleSummerCache.java:112)
at java.lang.Thread.run(Unknown Source)
"localhost-startStop-1":
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:180)
- waiting to lock <0x0000000081754d88> (a java.util.concurrent.ConcurrentHashMap)
at org.springframework.beans.factory.support.AbstractBeanFactory.isFactoryBean(AbstractBeanFactory.java:747)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:422)
- locked <0x00000000817553a8> (a java.util.concurrent.ConcurrentHashMap)
at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:728)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:380)
- locked <0x000000008174e1e0> (a java.lang.Object)
at org.springframework.web.context.ContextLoader.createWebApplicationContext(ContextLoader.java:255)
at org.springframework.web.context.ContextLoader.initWebApplicationContext(ContextLoader.java:199)
at org.springframework.web.context.ContextLoaderListener.contextInitialized(ContextLoaderListener.java:45)
at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4853)
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5314)
- locked <0x0000000080134a30> (a org.apache.catalina.core.StandardContext)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:145)
- locked <0x0000000080134a30> (a org.apache.catalina.core.StandardContext)
at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:753)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:729)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:717)
at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:940)
at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1816)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Found 1 deadlock.
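Thread dump analysis:
The dump lists three threads. The actual lock cycle is between localhost-startStop-1 and arteryScheduleStarter; arterySchedule_Worker-10 is blocked behind them.
1. localhost-startStop-1
This is Tomcat's startup thread. Judging by the stack, it is inside initWebApplicationContext, initializing the web application, and it got stuck while initializing the Spring beans.
When Spring creates all singleton beans (preInstantiateSingletons), it first locks beanDefinitionMap, and then, for each bean created inside the for loop, it locks Spring's first-level cache singletonObjects.
2. arteryScheduleStarter
This is the starter of our company's scheduled-task wrapper. It uses a thread to start a number of org.opensymphony.quartz scheduled tasks. It calls ArterySpringUtil.getBean to fetch a Spring bean, and because that bean has not been initialized yet, Spring creates it on this thread. Creating the bean first locks the first-level cache singletonObjects, and then, while resolving the bean's dependencies, tries to lock beanDefinitionMap. That is the opposite order from localhost-startStop-1, so each thread holds the lock the other one is waiting for.
3. arterySchedule_Worker-10
One of the 10 worker threads created under arteryScheduleStarter to run the in-memory tasks on schedule. It is an innocent bystander: its job also calls ArterySpringUtil.getBean, so it waits for singletonObjects, which arteryScheduleStarter holds. It does not hold any lock in the cycle itself, but because it is waiting on a deadlocked thread, the dump names it too.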
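The lock-order inversion described above can be reproduced outside Spring. This is a minimal sketch under stated assumptions: the two plain `Object` monitors only stand in for beanDefinitionMap and singletonObjects, and the thread names and the class `LockOrderInversionDemo` are hypothetical. It is not Spring's code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderInversionDemo {

    private static Thread worker(String name, Object first, Object second,
                                 CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await(); // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {    // blocks forever: the other thread owns this monitor
                    System.out.println(name + " got both locks");
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit although the threads stay blocked
        return t;
    }

    /** Starts two threads that lock in opposite order; returns how many the JVM reports as deadlocked. */
    public static int reproduce() throws InterruptedException {
        Object beanDefinitionMap = new Object(); // stand-in for Spring's map monitor
        Object singletonObjects = new Object();  // stand-in for the singleton cache monitor
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        worker("startStop-like", beanDefinitionMap, singletonObjects, bothHoldFirst).start();
        worker("starter-like", singletonObjects, beanDefinitionMap, bothHoldFirst).start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {          // poll for up to about 5 seconds
            long[] ids = mx.findDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            Thread.sleep(50);
        }
        return 0;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked threads: " + reproduce());
    }
}
```

The latch only makes the failure deterministic. In the real startup the same interleaving depends on timing, which is why a hang like this may not show up on every start.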
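Options for fixing a deadlock:
1. Remove the lock.
2. Refine the lock granularity.
3. Arrange the logic so that the locked sections run in a fixed order and never compete for the resources.
4. Others (none come to mind right now).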
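Option 3 can be sketched as a fixed lock order: if every thread takes the two monitors in the same order, no cycle can form. The example below is an illustration only; the class `OrderedLockingDemo`, the counter, and the thread names are assumptions, and the monitors merely borrow the names from the dump.

```java
public class OrderedLockingDemo {
    private final Object beanDefinitionMap = new Object(); // always acquired first
    private final Object singletonObjects = new Object();  // always acquired second
    private int created = 0;                               // guarded by singletonObjects

    private void createBean() {
        synchronized (beanDefinitionMap) {
            synchronized (singletonObjects) {
                created++;
            }
        }
    }

    /** Runs two threads that both lock in the same order and returns the total work done. */
    public int run() throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                createBean();
            }
        };
        Thread a = new Thread(task, "startStop-like");
        Thread b = new Thread(task, "starter-like");
        a.start();
        b.start();
        a.join(); // both joins return because no lock cycle is possible
        b.join();
        synchronized (singletonObjects) {
            return created;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("beans created: " + new OrderedLockingDemo().run());
    }
}
```

In this case the two monitors live inside Spring, so application code cannot reorder them. That leaves keeping the second thread out of the way during startup as the practical route.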
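Resolving the project deadlock:
Since the resources are locked while Spring is starting up, the fix is to make the arteryScheduleStarter thread wait a little, so that it no longer competes with the startup thread. I looked into the task starter and added a default setting to the configuration file that delays scheduled-task initialization by 200s, which resolved the problem.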
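A fixed delay works as long as the context always finishes within that time. As an alternative illustration (this is not the configuration used in the project), the starter could wait for an explicit "context ready" signal and use the delay only as an upper bound. The class `StartupGateDemo` and all of its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class StartupGateDemo {
    private final CountDownLatch contextReady = new CountDownLatch(1);
    private final List<String> events = Collections.synchronizedList(new ArrayList<>());

    /** Starts the scheduler thread; it does nothing until the context is ready. */
    Thread startScheduler() {
        Thread t = new Thread(() -> {
            try {
                // Wait for the signal, but give up after 200 seconds at most.
                if (!contextReady.await(200, TimeUnit.SECONDS)) {
                    events.add("gave up waiting");
                    return;
                }
                events.add("scheduler started");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "scheduleStarter-like");
        t.start();
        return t;
    }

    /** Called by the startup thread once all beans have been created. */
    void finishContextInit() {
        events.add("context initialized");
        contextReady.countDown();
    }

    public static List<String> run() throws InterruptedException {
        StartupGateDemo demo = new StartupGateDemo();
        Thread scheduler = demo.startScheduler();
        Thread.sleep(100);        // simulate slow bean creation on the startup thread
        demo.finishContextInit();
        scheduler.join();
        return new ArrayList<>(demo.events);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());
    }
}
```

In a Spring application the signal would typically come from a context-refreshed callback. Whether the company's scheduler wrapper exposes such a hook is not something this note can confirm.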
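Open questions:
There are a few places in the source code I would like to ask about.
1. Why does getBeanDefinitionNames need to lock beanDefinitionMap [ConcurrentHashMap] with synchronized?
DefaultListableBeanFactory.getBeanDefinitionNames
The lock looks unnecessary. I compared the code of Spring 2.5.6 and Spring 4.2.3, and 4.2.3 no longer takes it.
2. Why does getSingleton need to lock singletonObjects [ConcurrentHashMap] with synchronized?
The answer is borrowed from elsewhere, haha.
DefaultSingletonBeanRegistry.getSingleton: without the lock, two threads could enter getSingleton at the same time and both initialize the bean. When the initialized beans are put into singletonObjects the later one overwrites the earlier one, but initializing twice is still not acceptable, and bean A could end up depending on a different singleton instance than bean B. Besides the deadlock risk, this lock is also a performance problem: beanA and beanB do not conflict with each other, so could the lock granularity be made finer? The Spring team has been discussing this, see
https://github.com/spring-projects/spring-framework/issues/13117
https://github.com/spring-projects/spring-framework/issues/25667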
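To make the granularity question concrete, here is a minimal sketch of a registry that locks per bean name rather than on one global monitor. It is an illustration under assumptions, not Spring's implementation and not the design proposed in the issues above; the class `PerBeanLockRegistry` and its methods are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class PerBeanLockRegistry {
    private final ConcurrentHashMap<String, Object> singletons = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();

    /** Creates the named singleton at most once; unrelated names do not block each other. */
    public Object getSingleton(String name, Supplier<Object> factory) {
        Object bean = singletons.get(name);
        if (bean != null) {
            return bean;                       // fast path, no lock
        }
        Object lock = locks.computeIfAbsent(name, k -> new Object());
        synchronized (lock) {                  // only callers asking for this name contend
            bean = singletons.get(name);
            if (bean == null) {
                bean = factory.get();
                singletons.put(name, bean);
            }
            return bean;
        }
    }

    /** Eight threads request the same bean; returns how many times the factory ran. */
    public static int demo() throws InterruptedException {
        PerBeanLockRegistry registry = new PerBeanLockRegistry();
        AtomicInteger creations = new AtomicInteger();
        Runnable task = () -> registry.getSingleton("beanA", () -> {
            creations.incrementAndGet();
            return new Object();
        });
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(task, "caller-" + i);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return creations.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("factory invocations: " + demo());
    }
}
```

Per-name locks are not free of deadlocks either. If beanA's factory requests beanB on one thread while beanB's factory requests beanA on another, the two per-name locks form the same kind of cycle, which is presumably part of why the upstream discussion is not trivial.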