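1. Symptom
While rolling out an internal company system, we found that it responded very slowly and stalled for long periods.
2. Troubleshooting
In a situation like this the first suspect is frequent GC, so the first step was to check the logs (with the JVM flag -XX:+PrintGCApplicationStoppedTime enabled). The output was: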
Total time for which application threads were stopped: 0.0050077 seconds, Stopping threads took: 0.0009924
Total time for which application threads were stopped: 0.0048003 seconds, Stopping threads took: 0.000722
Total time for which application threads were stopped: 0.0042130 seconds, Stopping threads took: 0.0006124
Total time for which application threads were stopped: 0.0167637 seconds, Stopping threads took: 0.0002226
Total time for which application threads were stopped: 0.0041316 seconds, Stopping threads took: 0.0003492
Total time for which application threads were stopped: 0.0116966 seconds, Stopping threads took: 0.0004518
Total time for which application threads were stopped: 0.0031024 seconds, Stopping threads took: 0.0002000
Total time for which application threads were stopped: 0.0031730 seconds, Stopping threads took: 0.0001507
Total time for which application threads were stopped: 0.0643528 seconds, Stopping threads took: 0.0005159
Total time for which application threads were stopped: 0.0051813 seconds, Stopping threads took: 0.001691
Total time for which application threads were stopped: 0.0041419 seconds, Stopping threads took: 0.0005032
Total time for which application threads were stopped: 0.0036222 seconds, Stopping threads took: 0.0005483
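Reading pause lines like these by eye gets tedious once the log grows. The following is a minimal sketch, not part of the original investigation, of how the pause count, total and maximum could be extracted from such lines. The class name StoppedTimeSummary and the method summarize are hypothetical, and the sample input is three of the lines shown above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StoppedTimeSummary {
    // Captures the pause duration from a -XX:+PrintGCApplicationStoppedTime line.
    private static final Pattern STOPPED = Pattern.compile(
            "application threads were stopped: ([0-9.]+) seconds");

    /** Returns {count, totalSeconds, maxSeconds}; non-matching lines are ignored. */
    static double[] summarize(List<String> lines) {
        int count = 0;
        double total = 0.0;
        double max = 0.0;
        for (String line : lines) {
            Matcher m = STOPPED.matcher(line);
            if (m.find()) {
                double seconds = Double.parseDouble(m.group(1));
                count++;
                total += seconds;
                max = Math.max(max, seconds);
            }
        }
        return new double[] {count, total, max};
    }

    public static void main(String[] args) {
        // Three lines copied from the log excerpt above.
        List<String> sample = Arrays.asList(
            "Total time for which application threads were stopped: 0.0050077 seconds, Stopping threads took: 0.0009924",
            "Total time for which application threads were stopped: 0.0167637 seconds, Stopping threads took: 0.0002226",
            "Total time for which application threads were stopped: 0.0643528 seconds, Stopping threads took: 0.0005159");
        double[] r = summarize(sample);
        System.out.printf("pauses=%d total=%.7fs max=%.7fs%n", (int) r[0], r[1], r[2]);
    }
}
```

Many pauses of only a few milliseconds each, as in this excerpt, point toward frequent safepoints rather than a single long collection.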
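The log shows that the stalls come from overly frequent stop-the-world (STW) pauses. We reproduced the problem in the test environment and added the following JVM flags to print the reason for each pause.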
-XX:+PrintSafepointStatistics
-XX:PrintSafepointStatisticsCount=1
-XX:+SafepointTimeout
-XX:SafepointTimeoutDelay=200
-XX:+UnlockDiagnosticVMOptions
-XX:-DisplayVMOutput
-XX:+LogVMOutput
-XX:LogFile=/XXXX.log
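In the end, the log showed that frequent RevokeBias operations (biased lock revocation) were making the STW time too long, which caused the system stalls.
3. Resolution
1) Why does frequent biased lock revocation increase STW time? The biased locking source code gives the answer: revoking a biased lock has to wait for a global safepoint, pause the thread that holds the bias, and inspect that thread's state. The JVM first walks all of its threads. If it finds the biased thread, that thread is still alive, and the JVM then checks whether it is currently executing code inside the synchronized block. If it is, the lock is upgraded to a lightweight lock, which threads then compete for with CAS. Revoking a biased lock therefore causes a stop-the-world pause.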
static BiasedLocking::Condition revoke_bias(oop obj, bool allow_rebias, bool is_bulk, JavaThread* requesting_thread) {
  markOop mark = obj->mark();
  // If the object is not biased, return NOT_BIASED immediately
  if (!mark->has_bias_pattern()) {
    ...
    return BiasedLocking::NOT_BIASED;
  }
  uint age = mark->age();
  markOop biased_prototype = markOopDesc::biased_locking_prototype()->set_age(age);  // anonymously biased pattern (101)
  markOop unbiased_prototype = markOopDesc::prototype()->set_age(age);  // unlocked pattern (001)
  ...
  JavaThread* biased_thread = mark->biased_locker();
  if (biased_thread == NULL) {  // Anonymously biased
    if (!allow_rebias) {
      obj->set_mark(unbiased_prototype);  // Rebiasing is not allowed, so set the mark word to the unlocked pattern
    }
    ...
    return BiasedLocking::BIAS_REVOKED;
  }
  /* Check whether the biased thread is still alive */
  bool thread_is_alive = false;
  if (requesting_thread == biased_thread) {
    // The requesting thread is the biased thread
    thread_is_alive = true;
  } else {
    // The requesting thread is not the biased thread: walk all JVM threads;
    // if the biased thread is found, it is still alive
    for (JavaThread* cur_thread = Threads::first(); cur_thread != NULL; cur_thread = cur_thread->next()) {
      if (cur_thread == biased_thread) {
        thread_is_alive = true;
        break;
      }
    }
  }
  // The biased thread is no longer alive
  if (!thread_is_alive) {
    // If rebiasing is allowed, set the mark word to the anonymously biased state;
    // otherwise set it to the unlocked state
    if (allow_rebias) {
      obj->set_mark(biased_prototype);
    } else {
      obj->set_mark(unbiased_prototype);
    }
    ...
    return BiasedLocking::BIAS_REVOKED;
  }
  /* The thread is alive: walk all lock records on its stack */
  GrowableArray<MonitorInfo*>* cached_monitor_info = get_or_compute_monitor_info(biased_thread);
  BasicLock* highest_lock = NULL;
  for (int i = 0; i < cached_monitor_info->length(); i++) {
    MonitorInfo* mon_info = cached_monitor_info->at(i);
    // A matching lock record means the bias owner currently holds the lock
    if (mon_info->owner() == obj) {
      ...
      /* Upgrade to a lightweight lock by rewriting every lock record on the stack
         that refers to this lock. Handle the reentrant cases first: a lightweight
         lock whose displaced mark word is NULL denotes a reentrant acquisition */
      markOop mark = markOopDesc::encode((BasicLock*) NULL);
      highest_lock = mon_info->lock();
      highest_lock->set_displaced_header(mark);
    } else {
      ...
    }
  }
  /* If highest_lock is non-null, it is the earliest lock record associated with
     this lock, i.e. the last one released when the thread fully exits the lock.
     Set its displaced mark word to the unlocked mark word, and make the object's
     mark word point to this lock record */
  if (highest_lock != NULL) {
    highest_lock->set_displaced_header(unbiased_prototype);
    obj->release_set_mark(markOopDesc::encode(highest_lock));
    ...
  } else {
    // The bias owner does not currently hold the lock
    ...
    if (allow_rebias) {
      obj->set_mark(biased_prototype);  // Set to the anonymously biased state
    } else {
      obj->set_mark(unbiased_prototype);  // Set the mark word to the unlocked state
    }
  }
  return BiasedLocking::BIAS_REVOKED;
}
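To make the trigger concrete, here is a minimal Java sketch of an access pattern that can lead to revocation: one thread locks an object repeatedly (so the JVM may bias the lock toward it), then a second thread locks the same object. The class BiasRevocationDemo and its run method are hypothetical names. Whether a RevokeBias operation actually occurs is an assumption that depends on the JVM: biased locking must be enabled, JDK 8 only starts biasing after a startup delay (BiasedLockingStartupDelay), and JDK 15 and later disable biased locking by default.

```java
public class BiasRevocationDemo {
    private static int counter = 0;

    /** Locks one object from two threads in sequence and returns the final count. */
    static int run(final int iterations) {
        final Object lock = new Object();
        counter = 0;
        Runnable work = () -> {
            for (int i = 0; i < iterations; i++) {
                synchronized (lock) {   // every acquisition goes through the same monitor
                    counter++;
                }
            }
        };
        try {
            // Only the first thread touches the lock at first, so it may become biased toward it.
            Thread first = new Thread(work, "first");
            first.start();
            first.join();
            // A different thread now acquires the same lock; on a JVM with biased
            // locking active, this is the kind of access that can require revocation.
            Thread second = new Thread(work, "second");
            second.start();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println("final count = " + run(100_000));
    }
}
```

Running a program like this with the safepoint statistics flags listed earlier is one way to check whether RevokeBias entries show up on a given JVM.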
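2) With the cause identified, there are two options: first, reduce the number of threads competing for the lock; second, disable biased locking with the JVM flag -XX:-UseBiasedLocking. After changing the code, system pause times dropped significantly.
To sum up: the benefit of biased locking has to be evaluated against the concurrency profile of your own application. Biased locking mainly shows up as load spikes for a short period after a restart; under smooth traffic its impact is small.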
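As an illustration of the first option, the sketch below shows one possible way to keep threads from competing for a monitor at all: a shared counter that uses java.util.concurrent.atomic.LongAdder instead of a synchronized block. This is an assumed example, not the change actually made to the system described here, and the class ContentionFreeCounter and its methods are hypothetical.

```java
import java.util.concurrent.atomic.LongAdder;

public class ContentionFreeCounter {
    // LongAdder updates without a monitor, so no object lock is taken on this path.
    private final LongAdder hits = new LongAdder();

    void record() {
        hits.increment();
    }

    long total() {
        return hits.sum();
    }

    /** Starts the given number of threads, each recording perThread hits, and returns the total. */
    static long runWorkers(int threads, final int perThread) {
        final ContentionFreeCounter counter = new ContentionFreeCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.record();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.total();
    }

    public static void main(String[] args) {
        System.out.println("total = " + runWorkers(4, 100_000));
    }
}
```

Because no synchronized block is involved, there is no lock to bias and nothing to revoke on this code path. Whether such a rewrite is practical depends on what the original critical section protects.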