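Thanks to 寒泉子, a colleague at Alipay, for contributing this article.
What is attach
Before getting into that, let's start with something everyone knows. When a thread seems to be stuck and we want to find out where, the first thing that comes to mind is a thread dump, and the usual command is jstack <pid>.
In the output of that command, have you ever noticed two threads, "Attach Listener" and "Signal Dispatcher"? These two threads are the key to the attach mechanism discussed here. A small spoiler: the Attach Listener thread may not exist at all when the JVM starts. More on that later.
So what is the attach mechanism? Put simply, it is a facility the JVM provides for communication between JVM processes: one process can send a command to another and have it carry out some internal operation. For example, to get another JVM process to dump its threads, we run a jstack process and pass it a pid to say which process should produce the dump. Since two processes are involved, there has to be inter-process communication and a defined protocol: which operation to run, which arguments were passed, and so on.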
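To make the idea concrete, here is a minimal sketch of a client that attaches to another JVM and asks for its system properties through the public com.sun.tools.attach API. The class name AttachSketch and the helper isValidPid are made up for illustration, and the sketch assumes a JDK (not a bare JRE) so that the attach API is available.

```java
import com.sun.tools.attach.VirtualMachine;
import java.util.Properties;

// Illustrative client: attaches to a target JVM and prints its java.version.
class AttachSketch {
    // Hypothetical helper: the attach API takes the pid as a string,
    // so reject anything that is not a positive integer before attaching.
    static boolean isValidPid(String s) {
        try {
            return Integer.parseInt(s) > 0;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1 || !isValidPid(args[0])) {
            System.err.println("usage: java AttachSketch <pid>");
            return;
        }
        // attach() performs the handshake with the target process.
        VirtualMachine vm = VirtualMachine.attach(args[0]);
        try {
            // Ask the target JVM for its system properties.
            Properties props = vm.getSystemProperties();
            System.out.println("java.version of target: " + props.getProperty("java.version"));
        } finally {
            vm.detach();
        }
    }
}
```

Run it with the pid of another Java process owned by the same user; without arguments it only prints the usage line.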
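What attach can do
In short: heap dumps, thread dumps, class statistics (loaded classes, their sizes, instance counts and so on), dynamically loading an agent (familiar to anyone who has used btrace), setting VM flags at runtime (not every flag can be set this way, because some are consumed once during JVM startup), printing VM flags, reading system properties, and more. The corresponding source (attachListener.cpp) is: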
    static AttachOperationFunctionInfo funcs[] = {
      { "agentProperties",  get_agent_properties },
      { "datadump",         data_dump },
      { "dumpheap",         dump_heap },
      { "load",             JvmtiExport::load_agent_library },
      { "properties",       get_system_properties },
      { "threaddump",       thread_dump },
      { "inspectheap",      heap_inspection },
      { "setflag",          set_flag },
      { "printflag",        print_flag },
      { "jcmd",             jcmd },
      // ...
The second field of each entry is the function that handles the command.
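The table is a plain name-to-handler lookup. As a rough illustration of the same idea in Java, the sketch below maps command names to handlers and falls back to an error message for unknown names. The handlers and their reply strings are invented; only the command names come from the listing above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative command table: name -> handler, in the spirit of funcs[].
class OperationTable {
    // Hypothetical handlers: each takes one argument and returns a reply string.
    static final Map<String, Function<String, String>> FUNCS = new LinkedHashMap<>();
    static {
        FUNCS.put("threaddump", arg -> "thread dump requested");
        FUNCS.put("properties", arg -> "system properties requested");
        FUNCS.put("setflag", arg -> "set flag " + arg);
    }

    // Look up the handler by name and run it; unknown names produce an error reply.
    static String dispatch(String name, String arg) {
        Function<String, String> handler = FUNCS.get(name);
        if (handler == null) {
            return "Operation " + name + " not recognized!";
        }
        return handler.apply(arg);
    }

    public static void main(String[] args) {
        System.out.println(dispatch("threaddump", ""));
        System.out.println(dispatch("nosuchop", ""));
    }
}
```

A map keeps the lookup short; the C++ original instead walks a fixed array, which is just as reasonable for a dozen entries.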
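How attach is implemented in the JVM
Creating the Attach Listener thread
As mentioned earlier, the JVM may not start the Attach Listener thread during startup, although it can be told to do so with JVM flags. The code (Threads::create_vm) is: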
    if (!DisableAttachMechanism) {
      if (StartAttachListener || AttachListener::init_at_startup()) {
        AttachListener::init();
    // ...
    bool AttachListener::init_at_startup() {
      if (ReduceSignalUsage) {
    // ...
DisableAttachMechanism, StartAttachListener and ReduceSignalUsage all default to false (globals.hpp):
    product(bool, DisableAttachMechanism, false,                         \
            "Disable mechanism that allows tools to attach to this VM")  \
    product(bool, StartAttachListener, false,                            \
            "Always start Attach Listener at VM startup")                \
    product(bool, ReduceSignalUsage, false,                              \
            "Reduce the use of OS signals in Java and/or the VM")
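Taken together, the startup check and these defaults amount to a small boolean expression. The sketch below restates it in Java as a rough illustration. It assumes that init_at_startup() returns true only when ReduceSignalUsage is set (its body is cut off in the excerpt above); the class and method names are made up.

```java
// Illustrative restatement of the startup decision; parameters mirror the three VM flags.
class StartupDecision {
    // Assumption: init_at_startup() is true only when ReduceSignalUsage is set.
    static boolean initAtStartup(boolean reduceSignalUsage) {
        return reduceSignalUsage;
    }

    // Mirrors the nested ifs in Threads::create_vm.
    static boolean listenerStartedAtStartup(boolean disableAttachMechanism,
                                            boolean startAttachListener,
                                            boolean reduceSignalUsage) {
        if (disableAttachMechanism) {
            return false;
        }
        return startAttachListener || initAtStartup(reduceSignalUsage);
    }

    public static void main(String[] args) {
        // All three flags at their default of false.
        System.out.println(listenerStartedAtStartup(false, false, false));
        // Equivalent of running the JVM with -XX:+StartAttachListener.
        System.out.println(listenerStartedAtStartup(false, true, false));
    }
}
```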
So AttachListener::init() is not executed at startup, and that method is exactly where the Attach Listener thread gets created:
    void AttachListener::init() {
      // ...
      klassOop k = SystemDictionary::resolve_or_fail(vmSymbols::java_lang_Thread(), true, CHECK);
      instanceKlassHandle klass (THREAD, k);
      instanceHandle thread_oop = klass->allocate_instance_handle(CHECK);
      // ...
      const char thread_name[] = "Attach Listener";
      Handle string = java_lang_String::create_from_str(thread_name, CHECK);
      // ...
      Handle thread_group (THREAD, Universe::system_thread_group());
      JavaValue result(T_VOID);
      JavaCalls::call_special(&result, thread_oop,
                              // ...
                              vmSymbols::object_initializer_name(),
                              vmSymbols::threadgroup_string_void_signature(),
                              // ...
      KlassHandle group(THREAD, SystemDictionary::ThreadGroup_klass());
      JavaCalls::call_special(&result,
                              // ...
                              vmSymbols::add_method_name(),
                              vmSymbols::thread_void_signature(),
                              // ...
      { MutexLocker mu(Threads_lock);
        JavaThread* listener_thread = new JavaThread(&attach_listener_thread_entry);
        // ...
        if (listener_thread == NULL || listener_thread->osthread() == NULL) {
          vm_exit_during_initialization("java.lang.OutOfMemoryError",
                                        "unable to create new native thread");
        // ...
        java_lang_Thread::set_thread(thread_oop(), listener_thread);
        java_lang_Thread::set_daemon(thread_oop());
        // ...
        listener_thread->set_threadObj(thread_oop());
        Threads::add(listener_thread);
        Thread::start(listener_thread);
      // ...
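Since this thread is not created at startup, how did the one we saw earlier come into being? For that we need to look at another thread, "Signal Dispatcher". As the name suggests, it handles signals, and it is created when the JVM starts; I won't go through that code here.
Let's use the implementation of jstack to show how the attach mechanism gets triggered. The jstack command is implemented by a class called JStack.java, and following its code leads to this method: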
    private static void runThreadDump(String pid, String args[]) throws Exception {
        VirtualMachine vm = null;
        // ...
            vm = VirtualMachine.attach(pid);
        } catch (Exception x) {
            String msg = x.getMessage();
            // ...
                System.err.println(pid + ": " + msg);
            // ...
            if ((x instanceof AttachNotSupportedException) &&
                (loadSAClass() != null)) {
                System.err.println("The -F option can be used when the target " +
                    "process is not responding");
            // ...
        InputStream in = ((HotSpotVirtualMachine)vm).remoteDataDump((Object[])args);
        // ...
        byte b[] = new byte[256];
        // ...
                String s = new String(b, 0, n, "UTF-8");
        // ...
Note the line VirtualMachine.attach(pid); this is what triggers the attach to the pid. On Linux it ends up in the following constructor:
    LinuxVirtualMachine(AttachProvider provider, String vmid)
        throws AttachNotSupportedException, IOException
    // ...
        super(provider, vmid);
        // ...
            pid = Integer.parseInt(vmid);
        } catch (NumberFormatException x) {
            throw new AttachNotSupportedException("Invalid process identifier");
        // ...
        path = findSocketFile(pid);
        // ...
            File f = createAttachFile(pid);
            // ...
                    mpid = getLinuxThreadsManager(pid);
                } catch (IOException x) {
                    throw new AttachNotSupportedException(x.getMessage());
                // ...
                sendQuitToChildrenOf(mpid);
                // ...
                int retries = (int)(attachTimeout() / delay);
                // ...
                    } catch (InterruptedException x) { }
                    path = findSocketFile(pid);
                    // ...
                } while (i <= retries && path == null);
                // ...
                    throw new AttachNotSupportedException(
                        "Unable to open socket file: target process not responding " +
                        "or HotSpot VM not loaded");
        // ...
        checkPermissions(path);
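The retry loop near the end of the constructor is a bounded poll: sleep, look for the file, and give up after timeout / delay attempts. Taken out of context, it could be sketched as below. The class name, method name and parameters are illustrative and not part of the JDK; the sketch checks for an arbitrary file rather than calling findSocketFile.

```java
import java.io.File;

// Illustrative bounded poll: wait for a file to appear, or give up after a timeout.
class SocketFileWaiter {
    // Returns the file once it exists, or null if it did not appear in time.
    static File waitForFile(File candidate, long timeoutMs, long delayMs) {
        int retries = (int) (timeoutMs / delayMs);
        int i = 0;
        File found = null;
        do {
            try {
                Thread.sleep(delayMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and stop waiting.
                Thread.currentThread().interrupt();
                return null;
            }
            if (candidate.exists()) {
                found = candidate;
            }
            i++;
        } while (i <= retries && found == null);
        return found;
    }

    public static void main(String[] args) {
        File missing = new File("/definitely/not/here/.java_pid0");
        System.out.println(waitForFile(missing, 100, 10));
    }
}
```

Sleeping before the first check follows the order of the original loop, which gives the target process a moment to react to the signal before the client looks for the file.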
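This constructor deserves some explanation. It first calls createAttachFile, which creates a file in the target process's working directory, /proc/<pid>/cwd/.attach_pid<pid>. The signal handling code later looks for this file and checks it (as a safety measure).
We also know that on Linux threads are implemented as lightweight processes. The JVM creates many threads during startup, such as the signal thread mentioned above, so many pids show up (strictly speaking, LWP ids). How, then, is the signal handling thread found? Judging from the code above, the client looks up the parent of the pid we passed in (getLinuxThreadsManager) and sends SIGQUIT to all of its children. Inside the JVM every thread except the VM thread blocks this signal and so never receives it, and the signal ends up being handed to "Signal Dispatcher".
After sending the signal, the client polls to see whether the target process has created a particular file. attachTimeout defaults to 5000 ms and can be changed with the system property sun.tools.attach.attachTimeout. Here is the entry function of the Signal Dispatcher thread: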
    static void signal_thread_entry(JavaThread* thread, TRAPS) {
      os::set_priority(thread, NearMaxPriority);
      // ...
        sig = os::signal_wait();
        // ...
        if (sig == os::sigexitnum_pd()) {
        // ...
            if (!DisableAttachMechanism && AttachListener::is_init_trigger()) {
            // ...
            VMThread::execute(&op);
            // ...
            VMThread::execute(&jni_op);
            VM_FindDeadlocks op1(tty);
            VMThread::execute(&op1);
            Universe::print_heap_at_SIGBREAK();
            if (PrintClassHistogram) {
              VM_GC_HeapInspection op1(gclog_or_tty, true
              // ...
              VMThread::execute(&op1);
            // ...
            if (JvmtiExport::should_post_data_dump()) {
              JvmtiExport::post_data_dump();
      // ...
When the signal is SIGBREAK (a #define in the JVM; it is really SIGQUIT), AttachListener::is_init_trigger() gets executed:
    bool AttachListener::is_init_trigger() {
      if (init_at_startup() || is_initialized()) {
      // ...
      sprintf(fn, ".attach_pid%d", os::current_process_id());
      // ...
      RESTARTABLE(::stat64(fn, &st), ret);
      // ...
        snprintf(fn, sizeof(fn), "%s/.attach_pid%d",
                 os::get_temp_directory(), os::current_process_id());
        RESTARTABLE(::stat64(fn, &st), ret);
      // ...
        if (st.st_uid == geteuid()) {
      // ...
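Restated in Java, the lookup in is_init_trigger() might look like the sketch below: try the working directory first, then the temp directory, and accept the file only if it belongs to the expected user. This is an approximation under stated assumptions. HotSpot compares numeric uids with geteuid(), whereas the sketch compares owner names through java.nio.file, and every class, method and parameter name here is invented.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Illustrative version of the trigger-file check; not HotSpot code.
class TriggerFileCheck {
    // Looks for .attach_pid<pid> in cwd, then in tmpDir, and accepts it
    // only when the file owner matches expectedOwner.
    static Optional<Path> findTrigger(long pid, Path cwd, Path tmpDir, String expectedOwner) {
        String name = ".attach_pid" + pid;
        Path candidate = cwd.resolve(name);
        if (!Files.exists(candidate)) {
            candidate = tmpDir.resolve(name);   // fall back to the temp directory
            if (!Files.exists(candidate)) {
                return Optional.empty();
            }
        }
        try {
            String owner = Files.getOwner(candidate).getName();
            return owner.equals(expectedOwner) ? Optional.of(candidate) : Optional.empty();
        } catch (IOException e) {
            return Optional.empty();
        }
    }

    // Small self-contained demo using throwaway directories.
    static boolean demo() {
        try {
            Path cwd = Files.createTempDirectory("cwd");
            Path tmp = Files.createTempDirectory("tmp");
            Path trigger = Files.createFile(tmp.resolve(".attach_pid42"));
            String me = Files.getOwner(trigger).getName();
            boolean accepted = findTrigger(42, cwd, tmp, me).isPresent();
            boolean rejected = !findTrigger(42, cwd, tmp, me + "-someone-else").isPresent();
            boolean missing = !findTrigger(43, cwd, tmp, me).isPresent();
            Files.delete(trigger);
            Files.delete(tmp);
            Files.delete(cwd);
            return accepted && rejected && missing;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```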
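It first checks whether the process's current working directory contains a .attach_pid<pid> file (the one mentioned earlier). If not, it falls back to looking for /tmp/.attach_pid<pid>. Only when the file's uid matches the process's own effective uid (again for safety) does it call the init method: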
    void AttachListener::init() {
      // ...
      klassOop k = SystemDictionary::resolve_or_fail(vmSymbols::java_lang_Thread(), true, CHECK);
      instanceKlassHandle klass (THREAD, k);
      instanceHandle thread_oop = klass->allocate_instance_handle(CHECK);
      // ...
      const char thread_name[] = "Attach Listener";
      Handle string = java_lang_String::create_from_str(thread_name, CHECK);
      // ...
      Handle thread_group (THREAD, Universe::system_thread_group());
      JavaValue result(T_VOID);
      JavaCalls::call_special(&result, thread_oop,
                              // ...
                              vmSymbols::object_initializer_name(),
                              vmSymbols::threadgroup_string_void_signature(),
                              // ...
      KlassHandle group(THREAD, SystemDictionary::ThreadGroup_klass());
      JavaCalls::call_special(&result,
                              // ...
                              vmSymbols::add_method_name(),
                              vmSymbols::thread_void_signature(),
                              // ...
      { MutexLocker mu(Threads_lock);
        JavaThread* listener_thread = new JavaThread(&attach_listener_thread_entry);
        // ...
        if (listener_thread == NULL || listener_thread->osthread() == NULL) {
          vm_exit_during_initialization("java.lang.OutOfMemoryError",
                                        "unable to create new native thread");
        // ...
        java_lang_Thread::set_thread(thread_oop(), listener_thread);
        java_lang_Thread::set_daemon(thread_oop());
        // ...
        listener_thread->set_threadObj(thread_oop());
        Threads::add(listener_thread);
        Thread::start(listener_thread);
      // ...
Now the mystery is solved: a thread is created and named Attach Listener. Next, look at the init method of the platform-specific LinuxAttachListener:
    int LinuxAttachListener::init() {
      char path[UNIX_PATH_MAX];
      char initial_path[UNIX_PATH_MAX];
      // ...
      ::atexit(listener_cleanup);
      // ...
      int n = snprintf(path, UNIX_PATH_MAX, "%s/.java_pid%d",
                       os::get_temp_directory(), os::current_process_id());
      if (n < (int)UNIX_PATH_MAX) {
        n = snprintf(initial_path, UNIX_PATH_MAX, "%s.tmp", path);
      // ...
      if (n >= (int)UNIX_PATH_MAX) {
      // ...
      listener = ::socket(PF_UNIX, SOCK_STREAM, 0);
      // ...
      struct sockaddr_un addr;
      addr.sun_family = AF_UNIX;
      strcpy(addr.sun_path, initial_path);
      ::unlink(initial_path);
      int res = ::bind(listener, (struct sockaddr*)&addr, sizeof(addr));
      // ...
        RESTARTABLE(::close(listener), res);
      // ...
      res = ::listen(listener, 5);
      // ...
        RESTARTABLE(::chmod(initial_path, S_IREAD|S_IWRITE), res);
        // ...
          res = ::rename(initial_path, path);
      // ...
        RESTARTABLE(::close(listener), res);
        ::unlink(initial_path);
      // ...
      set_listener(listener);
      // ...
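It creates a listening socket and binds it to a file that ends up as /tmp/.java_pid<pid>. This is the file the client has been polling for; once it appears, the attach handshake has completed successfully.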
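One detail worth noticing in that function is that the socket is bound under a temporary name (.java_pid<pid>.tmp), given owner-only permissions, and only then renamed to its final name, so a polling client never sees a half-prepared socket file. A rough Java equivalent is sketched below. It assumes Java 16 or later (for Unix domain socket channels) and a POSIX file system; the class and method names are made up, and error handling is reduced to the minimum.

```java
import java.io.IOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.PosixFilePermissions;

// Illustrative "bind to a temp name, then rename" listener setup.
class ListenerSocketSketch {
    static ServerSocketChannel listenAt(Path finalPath) throws IOException {
        Path initialPath = finalPath.resolveSibling(finalPath.getFileName() + ".tmp");
        Files.deleteIfExists(initialPath);                       // like ::unlink(initial_path)
        ServerSocketChannel server = ServerSocketChannel.open(StandardProtocolFamily.UNIX);
        try {
            server.bind(UnixDomainSocketAddress.of(initialPath)); // bind + listen
            // Owner read/write only, like chmod(S_IREAD|S_IWRITE).
            Files.setPosixFilePermissions(initialPath, PosixFilePermissions.fromString("rw-------"));
            // Publish the socket under its final name in one step.
            Files.move(initialPath, finalPath, StandardCopyOption.ATOMIC_MOVE);
            return server;
        } catch (IOException e) {
            server.close();
            Files.deleteIfExists(initialPath);
            throw e;
        }
    }

    // Demo: set up the listener, connect to the final path, accept the connection.
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("attach-sketch");
            Path finalPath = dir.resolve(".java_pid42");
            try (ServerSocketChannel server = listenAt(finalPath)) {
                boolean renamed = Files.exists(finalPath)
                        && !Files.exists(dir.resolve(".java_pid42.tmp"));
                try (SocketChannel client = SocketChannel.open(UnixDomainSocketAddress.of(finalPath));
                     SocketChannel accepted = server.accept()) {
                    return renamed && client.isConnected() && accepted != null;
                }
            } finally {
                Files.deleteIfExists(finalPath);
                Files.deleteIfExists(dir);
            }
        } catch (IOException | UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The rename works because the socket stays bound to the same file system object; only the name clients use to reach it changes.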
Attach Listener receives requests
Here is its entry function, attach_listener_thread_entry:
    static void attach_listener_thread_entry(JavaThread* thread, TRAPS) {
      os::set_priority(thread, NearMaxPriority);
      // ...
      thread->record_stack_base_and_size();
      // ...
      if (AttachListener::pd_init() != 0) {
      // ...
      AttachListener::set_initialized();
      // ...
        AttachOperation* op = AttachListener::dequeue();
        // ...
        if (strcmp(op->name(), AttachOperation::detachall_operation_name()) == 0) {
          AttachListener::detachall();
        // ...
          AttachOperationFunctionInfo* info = NULL;
          for (int i = 0; funcs[i].name != NULL; i++) {
            const char* name = funcs[i].name;
            assert(strlen(name) <= AttachOperation::name_length_max, "operation <= name_length_max");
            if (strcmp(op->name(), name) == 0) {
              info = &(funcs[i]);
          // ...
            info = AttachListener::pd_find_operation(op->name());
          // ...
            res = (info->func)(op, &st);
          // ...
            st.print("Operation %s not recognized!", op->name());
        // ...
        op->complete(res, &st);
      // ...
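As the code shows, the thread keeps taking AttachOperation objects off a queue and runs the function that matches the requested command. For the jstack command we started with, it finds the { "threaddump", thread_dump } mapping and calls thread_dump. Now look at AttachListener::dequeue(), which it calls: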
    AttachOperation* AttachListener::dequeue() {
      JavaThread* thread = JavaThread::current();
      ThreadBlockInVM tbivm(thread);
      // ...
      thread->set_suspend_equivalent();
      // ...
      AttachOperation* op = LinuxAttachListener::dequeue();
      // ...
      thread->check_and_wait_while_suspended();
      // ...
What ultimately gets called is LinuxAttachListener::dequeue():
    LinuxAttachOperation* LinuxAttachListener::dequeue() {
      // ...
        socklen_t len = sizeof(addr);
        RESTARTABLE(::accept(listener(), &addr, &len), s);
        // ...
        struct ucred cred_info;
        socklen_t optlen = sizeof(cred_info);
        if (::getsockopt(s, SOL_SOCKET, SO_PEERCRED, (void*)&cred_info, &optlen) == -1) {
          // ...
          RESTARTABLE(::close(s), res);
        // ...
        uid_t euid = geteuid();
        gid_t egid = getegid();
        // ...
        if (cred_info.uid != euid || cred_info.gid != egid) {
          // ...
          RESTARTABLE(::close(s), res);
        // ...
        LinuxAttachOperation* op = read_request(s);
        // ...
          RESTARTABLE(::close(s), res);
      // ...
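We can see that when there are no requests the thread simply blocks in accept. When a request arrives, accept returns a new connected socket; the peer's uid and gid are checked against the JVM's own, the request data is read from the socket, and a LinuxAttachOperation is built, returned and executed.
That is the whole process, from creating the attach thread to receiving and handling requests. I hope it helps.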