问题背景
对项目中的关键应用调用链日志需要结构化得统一吐出到mongodb中,同时项目中日志输出使用log4j,故准备使用log4j的Appender直接集成mongodb的输出,同时mongodb采用集群模式。因为对于日志有特殊结构化要求,所以没有使用log4mongo库。
mongodb驱动版本:3.5.0
log4j版本:1.2.17
问题现象
error和warn级别的日志输出没有任何问题,但是当输出info级别日志的时候,保存失败,进程被阻塞,过一阵子后控制台报出如下:
Exception in thread "main" com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting for a server that matches WritableServerSelector. Client view of cluster state is {type=UNKNOWN, servers=[{address=127.0.0.1:27017, type=UNKNOWN, state=CONNECTING}]
排查过程
首先查看mongodb进程和端口全部正常,并且切回warn和error级别再次输出后,一切正常,mongodb数据成功插入,排除了mongodb进程的原因。此步直接确定是log4j与mongodb集成驱动的原因
再次执行info级别输出,在进程无响应的时候,通过jstack拿取线程栈,发现尽然后线程阻塞引发的死锁。如下:main线程锁住了0x0000000797b8f4d0,而cluster-ClusterId再等待该锁的释放。
"cluster-ClusterId{value='59a3c74a936270211a375fc6', description='null'}-127.0.0.1:27017" #12 daemon prio=5 os_prio=31 tid=0x00007f97c2263800 nid=0x570f waiting for monitor entry [0x0000700001452000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.log4j.Category.callAppenders(Category.java:202)
- waiting to lock <0x0000000797b8f4d0> (a org.apache.log4j.spi.RootLogger)
at org.apache.log4j.Category.forcedLog(Category.java:388)
at org.apache.log4j.Category.log(Category.java:853)
at org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:300)
at com.mongodb.diagnostics.logging.SLF4JLogger.info(SLF4JLogger.java:71)
at com.mongodb.connection.InternalStreamConnection.open(InternalStreamConnection.java:110)
at com.mongodb.connection.DefaultServerMonitor$ServerMonitorRunnable.run(DefaultServerMonitor.java:111)
- locked <0x0000000797b01148> (a com.mongodb.connection.DefaultServerMonitor$ServerMonitorRunnable)
"main" #1 prio=5 os_prio=31 tid=0x00007f97c202f000 nid=0x1003 waiting on condition [0x0000700000182000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000797b03588> (a java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:114)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.(ClusterBinding.