Source Analysis of Java Tools (jmap, jstack) on Linux (4): Safepoints

raintungli

License: CC 4.0 BY-SA

Article link: https://blog.csdn.net/raintungli/article/details/7162468

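This article takes a close look at the JVM safepoint mechanism and explains how it keeps important operations such as garbage collection safe while threads are running. It describes the three safepoint states and, for each thread state (interpreting, native code, compiled code, blocked), how a thread is brought to a safepoint.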

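A safepoint, as the name suggests, is a point at which it is safe to act. When the JVM needs to perform certain operations, it first has to bring the currently running threads to a safepoint (in effect, a stopped state). Only then can it safely do things such as dumping threads or collecting stack information.

In the JVM, operations carried out by the vm_thread (the thread we have been discussing throughout this series, which handles the VM's own housekeeping) and the cms_thread (the garbage collection thread) require the other threads to enter or leave the safepoint state. This is done by calling SafepointSynchronize::begin and SafepointSynchronize::end.

A safepoint is in one of three states:

_not_synchronized	No operation is interrupting the running threads; the VM thread and CMS thread have not received an operation request.
_synchronizing	The VM thread or CMS thread has received an operation request and is waiting for all threads to reach the safepoint.
_synchronized	All threads have reached the safepoint; the VM thread or CMS thread can start its operation.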
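
The three states can be read as a small state machine. The following is a minimal sketch of that idea, not HotSpot's implementation: the namespace `toy_states`, the class `ToySafepoint`, and the method `thread_arrived` are hypothetical names made up for illustration, and real threads are replaced by a simple counter.

```cpp
#include <cassert>

namespace toy_states {

// Hypothetical stand-in for the three states described above.
enum SynchronizeState { _not_synchronized, _synchronizing, _synchronized };

// Toy model: counts how many threads still have to reach the safepoint.
class ToySafepoint {
 public:
  explicit ToySafepoint(int running_threads)
      : _state(_not_synchronized), _waiting_for(0), _threads(running_threads) {
    assert(running_threads >= 0);
  }

  // The requester (for example the VM thread) asks for a safepoint.
  void begin() {
    _waiting_for = _threads;
    _state = (_waiting_for == 0) ? _synchronized : _synchronizing;
  }

  // Called once for each thread as it reaches the safepoint.
  void thread_arrived() {
    if (_state == _synchronizing && --_waiting_for == 0) {
      _state = _synchronized;
    }
  }

  // The requester has finished its operation; threads may run again.
  void end() { _state = _not_synchronized; }

  SynchronizeState state() const { return _state; }

 private:
  SynchronizeState _state;
  int _waiting_for;
  int _threads;
};

}  // namespace toy_states
```

With two running threads, `begin()` moves the model to `_synchronizing`, the second call to `thread_arrived()` moves it to `_synchronized`, and `end()` returns it to `_not_synchronized`.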

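Java thread states

Java threads in a Java process can be in several different states, and the JVM uses a different mechanism to bring threads in each state to a safepoint.

a. Interpreting bytecode

When a thread is interpreting Java bytecode, it relies on a dispatch table that records the address to jump to for each bytecode. This makes stopping the thread fairly easy: replacing the dispatch table is enough to let the thread know that a safepoint has been requested.

HotSpot sets up three DispatchTables: _active_table, _normal_table, and _safept_table.

_active_table: the dispatch table used by threads that are currently interpreting.

_normal_table: the dispatch table initialized for normal execution.

_safept_table: the dispatch table needed for a safepoint.

Interpreting threads always go through _active_table. The key point is that on entering a safepoint, _safept_table is copied over _active_table, and on leaving the safepoint, _normal_table is copied over _active_table.

The implementation can be seen in the source: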

void TemplateInterpreter::notice_safepoints() {
  if (!_notice_safepoints) {
    // switch to safepoint dispatch table
    _notice_safepoints = true;
    copy_table((address*)&_safept_table, (address*)&_active_table, sizeof(_active_table) / sizeof(address));
  }
}

// switch from the dispatch table which notices safepoints back to the
// normal dispatch table.  So that we can notice single stepping points,
// keep the safepoint dispatch table if we are single stepping in JVMTI.
// Note that the should_post_single_step test is exactly as fast as the
// JvmtiExport::_enabled test and covers both cases.
void TemplateInterpreter::ignore_safepoints() {
  if (_notice_safepoints) {
    if (!JvmtiExport::should_post_single_step()) {
      // switch to normal dispatch table
      _notice_safepoints = false;
      copy_table((address*)&_normal_table, (address*)&_active_table, sizeof(_active_table) / sizeof(address));
    }
  }
}
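
To make the table swap concrete, here is a minimal sketch with a two-instruction toy interpreter. It is an illustration under stated assumptions, not HotSpot code: the namespace `toy_interp`, the handler functions, the `poll_count` counter, and the `interpret` loop are all hypothetical. In this toy, the safepoint-aware handlers only bump a counter before doing the normal work, where a real VM would call into the runtime and possibly block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace toy_interp {

typedef int (*Handler)(int);          // hypothetical bytecode handler type
const std::size_t kNumBytecodes = 2;  // toy instruction set: 0 = inc, 1 = dec

int poll_count = 0;  // incremented whenever a safepoint-aware handler runs

int do_inc(int v) { return v + 1; }
int do_dec(int v) { return v - 1; }

// Safepoint variants: notice the safepoint request, then do the normal work.
int safept_inc(int v) { ++poll_count; return do_inc(v); }
int safept_dec(int v) { ++poll_count; return do_dec(v); }

Handler normal_table[kNumBytecodes] = { do_inc, do_dec };
Handler safept_table[kNumBytecodes] = { safept_inc, safept_dec };
Handler active_table[kNumBytecodes] = { do_inc, do_dec };
bool notice = false;

// Copy the safepoint table over the active table.
void notice_safepoints() {
  if (!notice) {
    notice = true;
    std::memcpy(active_table, safept_table, sizeof(active_table));
  }
}

// Copy the normal table back over the active table.
void ignore_safepoints() {
  if (notice) {
    notice = false;
    std::memcpy(active_table, normal_table, sizeof(active_table));
  }
}

// The interpreter loop only ever consults active_table.
int interpret(const unsigned char* code, std::size_t len, int acc) {
  for (std::size_t i = 0; i < len; ++i) {
    acc = active_table[code[i]](acc);
  }
  return acc;
}

}  // namespace toy_interp
```

The design point this shows is that the interpreter loop itself contains no safepoint check. Swapping the table contents changes where every bytecode lands, so threads pay nothing extra while no safepoint is pending.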

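b. Running native code

If a thread is running native code, the VM thread does not need to wait for it to finish. The thread only has to check _state when it returns from native code.

The check is SafepointSynchronize::do_call_back(), which already appeared in an earlier post in this series: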

  inline static bool do_call_back() {
    return (_state != _not_synchronized);
  }

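It returns true whenever _state is anything other than _not_synchronized.

For a thread returning from native code to Java to read and set the correct thread state, the usual solution is a memory barrier. HotSpot uses OrderAccess::fence(), which in assembly is __asm__ volatile ("lock; addl $0,0(%%rsp)" : : : "cc", "memory"); and guarantees that the correct value is read from memory. This approach hurts performance badly, however, so HotSpot instead uses a dedicated memory page for setting thread state. The -XX:+UseMembar option switches to the memory barrier approach. It is off by default, so the dedicated memory page is what is normally used.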
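
The check on return from native code can be sketched as follows. This is a simplified model of the memory barrier variant (the one selected by -XX:+UseMembar), written with standard C++ atomics rather than HotSpot's OrderAccess. The namespace `toy_native`, the struct `ToyThread`, the function `return_from_native`, and the `times_blocked` counter are hypothetical. Where a real VM would block the thread until the safepoint ends, the toy only records that it would have blocked.

```cpp
#include <atomic>
#include <cassert>

namespace toy_native {

enum SynchronizeState { _not_synchronized, _synchronizing, _synchronized };
enum ThreadState { _thread_in_native, _thread_in_native_trans, _thread_in_Java };

// Global safepoint state, written by the requester and read by every thread.
std::atomic<int> global_state(_not_synchronized);

struct ToyThread {
  std::atomic<int> state;
  int times_blocked;  // how often this thread would have blocked
  ToyThread() : state(_thread_in_native), times_blocked(0) {}
};

inline bool do_call_back() {
  return global_state.load(std::memory_order_relaxed) != _not_synchronized;
}

// Transition executed when a thread comes back from native code.
void return_from_native(ToyThread& t) {
  t.state.store(_thread_in_native_trans, std::memory_order_relaxed);
  // Full fence: our state change must be visible before we read the
  // global safepoint state, and we must not read a stale value.
  std::atomic_thread_fence(std::memory_order_seq_cst);
  if (do_call_back()) {
    // A real VM would block here until the safepoint is over.
    ++t.times_blocked;
  }
  t.state.store(_thread_in_Java, std::memory_order_relaxed);
}

}  // namespace toy_native
```

The fence sits between the thread's own state write and its read of the global state. That store-then-load ordering is the expensive part, and it is what the dedicated memory page approach is meant to avoid.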

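c. Running compiled code

1. The polling page

The polling page is a single memory page set up while the JVM starts. It is the key to stopping threads that are running compiled code.

On Linux it is created with mmap. The source is: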

address polling_page = (address) ::mmap(NULL, Linux::page_size(), PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
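
Outside the JVM, the same allocation can be sketched as below. This assumes a Linux system and uses sysconf(_SC_PAGESIZE) in place of HotSpot's Linux::page_size(). The namespace `toy_poll` and the functions `create_polling_page` and `poll` are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

namespace toy_poll {

// Map one read-only anonymous page, mirroring the mmap call above.
// Returns NULL if the mapping fails.
char* create_polling_page() {
  long page_size = sysconf(_SC_PAGESIZE);
  void* p = mmap(NULL, static_cast<std::size_t>(page_size), PROT_READ,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) {
    return NULL;
  }
  return static_cast<char*>(p);
}

// A "poll" is just a load from the page. The loaded value is irrelevant;
// what matters is whether the load succeeds or faults.
int poll(const volatile char* page) { return *page; }

}  // namespace toy_poll
```

While the page stays readable, `poll` is an ordinary cheap load. An anonymous mapping is zero-filled, so the load returns 0.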

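2. Compilation

HotSpot's JIT compiles hot code straight to machine code, which then runs directly instead of being interpreted, and this improves performance. In compiled code, a function or method accesses the polling page when it returns.

On x86: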

void LIR_Assembler::return_op(LIR_Opr result) {
  assert(result->is_illegal() || !result->is_single_cpu() || result->as_register() == rax, "word returns are in rax,");
  if (!result->is_illegal() && result->is_float_kind() && !result->is_xmm_register()) {
    assert(result->fpu() == 0, "result must already be on TOS");
  }

  // Pop the stack before the safepoint code
  __ remove_frame(initial_frame_size_in_bytes());

  bool result_is_oop = result->is_valid() ? result->is_oop() : false;

  // Note: we do not need to round double result; float result has the right precision
  // the poll sets the condition code, but no data registers
  AddressLiteral polling_page(os::get_polling_page() + (SafepointPollOffset % os::vm_page_size()),
                              relocInfo::poll_return_type);

  // NOTE: the requires that the polling page be reachable else the reloc
  // goes to the movq that loads the address and not the faulting instruction
  // which breaks the signal handler code

  __ test32(rax, polling_page);

  __ ret(0);
}

In the source of SafepointSynchronize::begin, mentioned earlier:

  if (UseCompilerSafepoints && DeferPollingPageLoopCount < 0) {
    // Make polling safepoint aware
    guarantee (PageArmed == 0, "invariant") ;
    PageArmed = 1 ;
    os::make_polling_page_unreadable();
  }

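Two flags appear here, UseCompilerSafepoints and DeferPollingPageLoopCount. By default they are true and -1 respectively.

The code then calls os::make_polling_page_unreadable(). On Linux, the implementation calls mprotect(bottom,size,prot) to make the polling page unreadable.

3. Signals

When compiled code next tries to access the now unreadable polling page, the operating system raises the SIGSEGV signal. An earlier post of mine covered signal handling in the JVM; the handler that deals with SIGSEGV on x86 is shown in the following source: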

JVM_handle_linux_signal(int sig,
                        siginfo_t* info,
                        void* ucVoid,
                        int abort_if_unrecognized){
   ....
   if (sig == SIGSEGV && os::is_poll_address((address)info->si_addr)) {
        stub = SharedRuntime::get_poll_stub(pc);
      } 
   ....
}

On 64-bit Linux x86, the poll stub leads to SafepointSynchronize::handle_polling_page_exception. See sharedRuntime_x86_64.cpp for the details.

Back in safepoint.cpp, SafepointSynchronize::handle_polling_page_exception fetches the thread's safepoint_state and calls void ThreadSafepointState::handle_polling_page_exception, which finally calls SafepointSynchronize::block(thread()); to block the current thread.
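
The whole chain (arm the page, fault on the poll, recognize the fault in the signal handler) can be reproduced in a small standalone sketch. It assumes Linux, where a load from a PROT_NONE page raises SIGSEGV and the faulting instruction is retried after the handler returns. Everything in the namespace `toy_sig` is hypothetical. Unlike HotSpot, which redirects the thread to a poll stub and blocks it, this toy makes the page readable again inside the handler so that the load can complete. Calling mprotect from a signal handler is not guaranteed to be async-signal-safe by POSIX, so treat this strictly as a demonstration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>

namespace toy_sig {

char* g_page = NULL;
std::size_t g_page_size = 0;
volatile sig_atomic_t g_polls_caught = 0;

// SIGSEGV handler: a fault inside the polling page counts as a safepoint
// poll; any other fault terminates the process.
void on_segv(int, siginfo_t* info, void*) {
  char* addr = static_cast<char*>(info->si_addr);
  if (g_page != NULL && addr >= g_page && addr < g_page + g_page_size) {
    g_polls_caught = g_polls_caught + 1;
    // Toy shortcut: disarm the page so the faulting load can be retried.
    mprotect(g_page, g_page_size, PROT_READ);
    return;
  }
  _exit(1);
}

// Create the polling page and install the handler.
bool setup() {
  g_page_size = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  void* p = mmap(NULL, g_page_size, PROT_READ,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) return false;
  g_page = static_cast<char*>(p);

  struct sigaction sa;
  std::memset(&sa, 0, sizeof(sa));
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_SIGINFO;
  sa.sa_sigaction = on_segv;
  return sigaction(SIGSEGV, &sa, NULL) == 0;
}

// Counterpart of making the polling page unreadable.
bool arm() { return mprotect(g_page, g_page_size, PROT_NONE) == 0; }

// Counterpart of the load that compiled code performs before returning.
int poll() { return *static_cast<volatile char*>(g_page); }

}  // namespace toy_sig
```

After `setup()`, a call to `poll()` is an ordinary load and the handler never runs. After `arm()`, the next `poll()` faults once, the handler recognizes the address as belonging to the polling page, and the load then completes.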