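I. Problem Analysis
The fence system's CMS GC pauses were long enough to trigger interface alarms in the systems that depend on it. The GC log showed the following: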
2020-05-08T16:12:11.082+0800: 883881.058: [GC remark 2020-05-08T16:12:11.082+0800: 883881.058: [Finalize Marking, 0.0272172 secs] 2020-05-08T16:12:11.109+0800: 883881.085: [GC ref-proc2020-05-08T16:12:11.109+0800: 883881.085: [SoftReference, 7857 refs, 0.0041019 secs]2020-05-08T16:12:11.114+0800: 883881.089: [WeakReference, 542 refs, 0.0019797 secs]2020-05-08T16:12:11.116+0800: 883881.091: [FinalReference, 6957 refs, 0.0873560 secs]2020-05-08T16:12:11.203+0800: 883881.179: [PhantomReference, 18144 refs, 96 refs, 2.6098675 secs]2020-05-08T16:12:13.813+0800: 883883.789: [JNI Weak Reference, 0.0005067 secs], 2.7160672 secs] 2020-05-08T16:12:13.825+0800: 883883.801: [Unloading, 1.3218515 secs], 4.0970590 secs]
[Times: user=6.13 sys=2.26, real=4.10 secs]
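Processing phantom references took up most of the GC pause (about 2.61 s of the 4.10 s total). A heap dump of the fence system then showed that phantom references belonging to the MySQL driver accounted for more than 90%.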
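To make that breakdown concrete, here is a minimal sketch that pulls the per-reference-type timings out of a ref-proc log line. The class name RefProcLogParser and the regular expression are illustrative assumptions that fit the layout of the log entry above; other JVM versions and logging flags may print this line differently. The sample string in main is the ref-proc part of that entry with the timestamps removed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RefProcLogParser {
    // Matches e.g. "[PhantomReference, 18144 refs, 96 refs, 2.6098675 secs]".
    // Group 1 is the reference kind, group 2 is the elapsed time in seconds.
    private static final Pattern PHASE = Pattern.compile(
            "\\[(SoftReference|WeakReference|FinalReference|PhantomReference)"
            + "(?:, \\d+ refs)+, (\\d+\\.\\d+) secs\\]");

    /** Returns reference kind -> seconds spent, in the order found in the log line. */
    static Map<String, Double> parse(String logLine) {
        Map<String, Double> secondsByKind = new LinkedHashMap<>();
        Matcher m = PHASE.matcher(logLine);
        while (m.find()) {
            secondsByKind.put(m.group(1), Double.parseDouble(m.group(2)));
        }
        return secondsByKind;
    }

    public static void main(String[] args) {
        String line = "[GC ref-proc[SoftReference, 7857 refs, 0.0041019 secs]"
                + "[WeakReference, 542 refs, 0.0019797 secs]"
                + "[FinalReference, 6957 refs, 0.0873560 secs]"
                + "[PhantomReference, 18144 refs, 96 refs, 2.6098675 secs]"
                + "[JNI Weak Reference, 0.0005067 secs], 2.7160672 secs]";
        parse(line).forEach((kind, secs) -> System.out.println(kind + " = " + secs + " s"));
    }
}
```

Feeding real log lines through parse makes it easy to see, pause by pause, whether PhantomReference processing is the dominant phase.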
Why are so many PhantomReference objects created? The answer starts with NonRegisteringDriver#trackConnection:
public class NonRegisteringDriver implements java.sql.Driver {
    protected static final ConcurrentHashMap<ConnectionPhantomReference, ConnectionPhantomReference> connectionPhantomRefs = new ConcurrentHashMap<ConnectionPhantomReference, ConnectionPhantomReference>();
    protected static final ReferenceQueue<ConnectionImpl> refQueue = new ReferenceQueue<ConnectionImpl>();

    protected static void trackConnection(Connection newConn) {
        ConnectionPhantomReference phantomRef = new ConnectionPhantomReference((ConnectionImpl) newConn, refQueue);
        connectionPhantomRefs.put(phantomRef, phantomRef);
    }
}
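Whenever the MySQL driver creates a Connection, it calls NonRegisteringDriver#trackConnection, which creates a phantom reference and stores it in connectionPhantomRefs. In this case connectionPhantomRefs clearly held too many phantom references, and that is what made the GC take so long.
The code that cleans up connectionPhantomRefs is AbandonedConnectionCleanupThread#run: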
public void run() {
    for (;;) {
        try {
            checkContextClassLoaders();
            Reference<? extends ConnectionImpl> ref = NonRegisteringDriver.refQueue.remove(5000);
            if (ref != null) {
                try {
                    ((ConnectionPhantomReference) ref).cleanup();
                } finally {
                    NonRegisteringDriver.connectionPhantomRefs.remove(ref);
                }
            }
        } catch (InterruptedException e) {
            threadRef = null;
            return;
        } catch (Exception ex) {
            // Nowhere to really log this.
        }
    }
}
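When the GC finds that the object a phantom reference refers to is no longer reachable by any other path, it places the reference on the ReferenceQueue to await processing (neither the phantom reference nor its referent is reclaimed in that GC cycle). The daemon thread AbandonedConnectionCleanupThread watches the ReferenceQueue, takes references off it one at a time and runs cleanup() on each. Once cleanup is done, it removes the phantom reference from connectionPhantomRefs. At that point the phantom reference and its referent are completely unreachable, and the next garbage collection reclaims them.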
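The track-then-clean-up cycle described above can be sketched without the MySQL driver. Everything here is a hypothetical stand-in: the class PhantomTrackingSketch, the tracked map and the queue play the roles of connectionPhantomRefs and refQueue, and ref.clear() stands in for ConnectionPhantomReference.cleanup(). To keep the run deterministic, the sketch calls Reference.enqueue() by hand instead of waiting for a real GC to enqueue the reference.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.concurrent.ConcurrentHashMap;

final class PhantomTrackingSketch {
    // Hypothetical stand-ins for the driver's connectionPhantomRefs and refQueue.
    static final ConcurrentHashMap<Reference<?>, Reference<?>> tracked = new ConcurrentHashMap<>();
    static final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    /** Mirrors trackConnection: one phantom reference per tracked resource, held in a static map. */
    static PhantomReference<Object> track(Object resource) {
        PhantomReference<Object> ref = new PhantomReference<>(resource, queue);
        tracked.put(ref, ref);
        return ref;
    }

    /** Mirrors one iteration of the cleanup loop; returns true if a reference was processed. */
    static boolean cleanupOnce(long timeoutMillis) {
        try {
            Reference<?> ref = queue.remove(timeoutMillis);
            if (ref == null) {
                return false; // nothing was enqueued within the timeout
            }
            try {
                ref.clear(); // stand-in for the driver's cleanup() call
            } finally {
                tracked.remove(ref); // only now can the reference itself be collected
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        PhantomReference<Object> ref = track(new Object());
        System.out.println("tracked after track(): " + tracked.size());

        // Simulate the GC discovering that the referent is unreachable.
        ref.enqueue();
        cleanupOnce(100);
        System.out.println("tracked after cleanup: " + tracked.size());
    }
}
```

The point to notice is that an entry leaves the map only after its reference has passed through the queue. If the GC never gets around to enqueuing the references, nothing shrinks the map.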
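Pooled connections are long-lived, so after several young GCs (YGC) they are promoted to the old generation, where only a full GC attempts to reclaim them. The fence system runs full GCs rarely, so the accumulated phantom references are not cleaned up in time. Their number only grows, and when a full GC finally does run, processing them takes a long time.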
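One way to check how rarely the old generation is collected is to read the collector counters the JVM exposes through the standard java.lang.management API. This is only a sketch: the class name GcFrequencySketch is made up, and the collector names it prints depend on the JVM and the GC flags in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcFrequencySketch {
    /** One line per collector: its name, collections so far, and accumulated time in milliseconds. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        snapshot().forEach(System.out::println);
    }
}
```

Logging this snapshot periodically and comparing the old-generation count between samples shows how long phantom references have to accumulate between collections.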
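II. Solutions
1. A phantom reference is usually a safety net: it keeps resources from leaking when the user forgets to release them. If you have a firm grip on how the system manages its connections, you can do without the safety net and simply delete the phantom references from connectionPhantomRefs. The objects then become unreachable and are reclaimed directly at the next full GC, which cuts the time spent processing PhantomReference objects.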
// Custom workaround targeting the MySQL connection phantom references
// Add a scheduled task
@Scheduled(fixedRate = 5000)
public void configureTasks() {
    try {
        // getDeclaredField never returns null; it throws NoSuchFieldException if the field is missing
        Field connectionPhantomRefs = NonRegisteringDriver.class.getDeclaredField("connectionPhantomRefs");
        connectionPhantomRefs.setAccessible(true);
        // The field is static, so the receiver argument is ignored and null is sufficient
        Map<?, ?> map = (Map<?, ?>) connectionPhantomRefs.get(null);
        log.info("connectionPhantomRefs size=[{}]", map.size());
        // Runtime switch, so the cleanup can be turned on and off without a redeploy
        boolean cleanSwitch = SwitchDicts.getBoolean("clean_connection_task", false);
        if (cleanSwitch && map.size() > 50) {
            map.clear();
        }
    } catch (Exception e) {
        log.warn("connectionPhantomRefs clear error!", e);
    }
}
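The same reflective technique can be tried in isolation, without Spring or the MySQL driver on the classpath. In this sketch FakeDriver is a hypothetical stand-in for NonRegisteringDriver, and clearIfAbove is an illustrative helper rather than anything the driver provides. It assumes the named field is static and holds a Map.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ReflectiveClearSketch {
    /** Hypothetical stand-in for NonRegisteringDriver with its private static map. */
    static final class FakeDriver {
        private static final Map<Object, Object> connectionPhantomRefs = new ConcurrentHashMap<>();

        static void track(Object ref) {
            connectionPhantomRefs.put(ref, ref);
        }

        static int size() {
            return connectionPhantomRefs.size();
        }
    }

    /**
     * Clears the named static Map field once it holds more than threshold entries.
     * Returns the size observed before any clearing, or -1 if reflection failed.
     */
    static int clearIfAbove(Class<?> owner, String fieldName, int threshold) {
        try {
            Field field = owner.getDeclaredField(fieldName);
            field.setAccessible(true);
            Map<?, ?> map = (Map<?, ?>) field.get(null); // null receiver: the field is static
            int size = map.size();
            if (size > threshold) {
                map.clear();
            }
            return size;
        } catch (ReflectiveOperationException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            FakeDriver.track(new Object());
        }
        int seen = clearIfAbove(FakeDriver.class, "connectionPhantomRefs", 2);
        System.out.println("size seen = " + seen + ", size after = " + FakeDriver.size());
    }
}
```

Because the helper reaches into a private field by name, it is tied to the internals of one particular class version; a renamed or relocated field surfaces here as the -1 return value.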
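2. The approach above is clearly heavy-handed. Too many phantom references in connectionPhantomRefs means that too many Connections are being created and too few are being reused. A more elegant fix is to tune the connection pool parameters.
Focus on raising idleTimeout and maxLifetime. Setting maxLifetime to 0 means an unlimited lifetime; the value should be set below the MySQL server's wait_timeout (8 hours by default). For the reason, see the article "Solutions to MySQL wait_timeout connection timeout errors".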
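The constraint between maxLifetime and wait_timeout can be written down as a small check. This is a sketch under stated assumptions: PoolLifetimeCheck, isSafe and the 30-second margin are all hypothetical, and treating a maxLifetime of 0 (unlimited) as failing the check is one reading of the rule above, not something a pool library enforces.

```java
import java.time.Duration;

final class PoolLifetimeCheck {
    /** MySQL's default wait_timeout as stated in the text: 8 hours. */
    static final Duration DEFAULT_WAIT_TIMEOUT = Duration.ofHours(8);

    /**
     * Returns true when maxLifetime is finite (non-zero) and ends at least
     * `margin` before the server would drop the connection.
     */
    static boolean isSafe(Duration maxLifetime, Duration waitTimeout, Duration margin) {
        if (maxLifetime.isZero() || maxLifetime.isNegative()) {
            return false; // 0 means "never retire", which cannot stay below wait_timeout
        }
        return maxLifetime.plus(margin).compareTo(waitTimeout) <= 0;
    }

    public static void main(String[] args) {
        Duration margin = Duration.ofSeconds(30); // hypothetical safety margin
        System.out.println(isSafe(Duration.ofHours(7), DEFAULT_WAIT_TIMEOUT, margin)); // true
        System.out.println(isSafe(Duration.ofHours(8), DEFAULT_WAIT_TIMEOUT, margin)); // false
        System.out.println(isSafe(Duration.ZERO, DEFAULT_WAIT_TIMEOUT, margin));       // false
    }
}
```

A longer maxLifetime means fewer new Connections and therefore fewer tracked phantom references, but the pool still has to retire each connection before the server closes it.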
III. Phantom Reference Primer
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

public class Main {
    public static void main(String[] args) throws InterruptedException {
        Main.printlnMemory("1. Initial free and total memory");
        byte[] object = new byte[10 * Main.M];
        Main.printlnMemory("2. After allocating the 10M array");
        // Create the phantom reference
        ReferenceQueue<Object> referenceQueue = new ReferenceQueue<Object>();
        PhantomReference<Object> phantomReference = new PhantomReference<Object>(object, referenceQueue);
        Main.printlnMemory("3. After creating the phantom reference");
        System.out.println("phantomReference : " + phantomReference);
        // get() on a phantom reference never returns the referent
        System.out.println("phantomReference.get() : " + phantomReference.get());
        System.out.println("referenceQueue.poll() : " + referenceQueue.poll());
        // Drop the strong reference to the byte[10 * Main.M] array
        object = null;
        Main.printlnMemory("4. After object = null (strong reference dropped)");
        System.gc();
        Thread.sleep(1000 * 5);
        Main.printlnMemory("5. After GC");
        System.out.println("phantomReference : " + phantomReference);
        System.out.println("phantomReference.get() : " + phantomReference.get());
        // The first full GC only adds the phantom reference to referenceQueue; the byte[] is not actually reclaimed yet
        System.out.println("referenceQueue.poll() : " + referenceQueue.poll());
        System.out.println("referenceQueue.poll() : " + referenceQueue.poll());
        // Drop the phantom reference so the referent becomes unreachable; the second full GC reclaims the byte[]
        phantomReference = null;
        System.gc();
        Main.printlnMemory("6. GC after dropping the phantom reference");
        System.out.println("phantomReference : " + phantomReference);
        System.out.println("referenceQueue.poll() : " + referenceQueue.poll());
        // Summary: once the GC finds that the referent has no other references, it puts the phantom
        // reference (not the referent itself) on the ReferenceQueue, where referenceQueue.poll()
        // can fetch and remove it.
    }

    public static int M = 1024 * 1024;

    public static void printlnMemory(String tag) {
        Runtime runtime = Runtime.getRuntime();
        int M = Main.M;
        System.out.println("\n" + tag + ":");
        System.out.println(runtime.freeMemory() / M + "M(free)/" + runtime.totalMemory() / M + "M(total)");
    }
}
Output:
1. Initial free and total memory:
55M(free)/57M(total)
2. After allocating the 10M array:
45M(free)/57M(total)
3. After creating the phantom reference:
45M(free)/57M(total)
phantomReference : java.lang.ref.PhantomReference@7852e922
phantomReference.get() : null
referenceQueue.poll() : null
4. After object = null (strong reference dropped):
45M(free)/57M(total)
5. After GC:
46M(free)/57M(total)
phantomReference : java.lang.ref.PhantomReference@7852e922
phantomReference.get() : null
referenceQueue.poll() : java.lang.ref.PhantomReference@7852e922
referenceQueue.poll() : null
6. GC after dropping the phantom reference:
56M(free)/57M(total)
phantomReference : null
referenceQueue.poll() : null
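The article closes with a passage from the PhantomReference Javadoc:
Unlike soft and weak references, phantom references are not automatically cleared by the garbage collector as they are enqueued. An object that is reachable via phantom references will remain so until all such references are cleared or themselves become unreachable.
In other words, enqueuing a phantom reference does not clear it. The referent is kept until every phantom reference to it has been cleared or has itself become unreachable, which matches the demo above: the 10M array was freed only after the phantom reference was dropped.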