1. arthas是什么
Arthas
是Alibaba开源的Java诊断工具,深受开发者喜爱。
当你遇到以下类似问题而束手无策时,Arthas
可以帮助你解决:
- 这个类从哪个 jar 包加载的?为什么会报各种类相关的 Exception?
- 我改的代码为什么没有执行到?难道是我没 commit?分支搞错了?
- 遇到问题无法在线上 debug,难道只能通过加日志再重新发布吗?
- 线上遇到某个用户的数据处理有问题,但线上同样无法 debug,线下无法重现!
- 是否有一个全局视角来查看系统的运行状况?
- 有什么办法可以监控到JVM的实时运行状态?
- 怎么快速定位应用的热点,生成火焰图?
Arthas
支持JDK 6+,支持Linux/Mac/Windows,采用命令行交互模式,同时提供丰富的 Tab
自动补全功能,进一步方便进行问题的定位和诊断。
2. 在线教程
3. 快速开始
3.1. 使用arthas-boot
(推荐)
下载arthas-boot.jar
,然后用java -jar
的方式启动:
curl -O https://arthas.aliyun.com/arthas-boot.jar
java -jar arthas-boot.jar
# 第一次启动,会自动下载相关的包
# 建议将这些下载的包大包,可以直接用salt进行分发
打印帮助信息:
java -jar arthas-boot.jar -h
- 如果下载速度比较慢,可以使用aliyun的镜像:
java -jar arthas-boot.jar --repo-mirror aliyun --use-http
- 推荐参数
--arthas-home 指定arthas的home目录,特别有用
3.2. 使用注意事项
- 选择了java进程回车后报错
[ERROR] Start arthas failed, exception stack trace: com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:106) at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:78) at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:250) at com.taobao.arthas.core.Arthas.attachAgent(Arthas.java:85) at com.taobao.arthas.core.Arthas.<init>(Arthas.java:28) at com.taobao.arthas.core.Arthas.main(Arthas.java:123)
- 解决:在对应JAVA进程启动时添加JVM参数 -XX:+StartAttachListener
- 选择了java进程回车后报错
[ERROR] Start arthas failed, exception stack trace: java.io.IOException: well-known file /tmp/.java_pid87449 is not secure: file should be owned by the current user (which is 0) but is owned by 30001 at sun.tools.attach.LinuxVirtualMachine.checkPermissions(Native Method) at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:117) at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:78) at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:250) at com.taobao.arthas.core.Arthas.attachAgent(Arthas.java:85) at com.taobao.arthas.core.Arthas.<init>(Arthas.java:28) at com.taobao.arthas.core.Arthas.main(Arthas.java:123) [ERROR] attach fail, targetPid: 87449
- 该问题发生的原因是执行该程序的用户需要和目标进程具有相同的权限
- 使用普通用户启动的业务进程时,最好指定
--arthas-home
参数
- 选择需要attach的进程后,即进入arthas专有交互界面
3.3. 案例展示
3.3.1. Dashboard
3.3.2. Thread
一目了然的了解系统的状态,哪些线程比较占cpu?他们到底在做什么?
$ thread -n 3
"as-command-execute-daemon" Id=29 cpuUsage=75% RUNNABLE
at sun.management.ThreadImpl.dumpThreads0(Native Method)
at sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:440)
at com.taobao.arthas.core.command.monitor200.ThreadCommand$1.action(ThreadCommand.java:58)
at com.taobao.arthas.core.command.handler.AbstractCommandHandler.execute(AbstractCommandHandler.java:238)
at com.taobao.arthas.core.command.handler.DefaultCommandHandler.handleCommand(DefaultCommandHandler.java:67)
at com.taobao.arthas.core.server.ArthasServer$4.run(ArthasServer.java:276)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Number of locked synchronizers = 1
- java.util.concurrent.ThreadPoolExecutor$Worker@6cd0b6f8
"as-session-expire-daemon" Id=25 cpuUsage=24% TIMED_WAITING
at java.lang.Thread.sleep(Native Method)
at com.taobao.arthas.core.server.DefaultSessionManager$2.run(DefaultSessionManager.java:85)
"Reference Handler" Id=2 cpuUsage=0% WAITING on java.lang.ref.Reference$Lock@69ba0f27
at java.lang.Object.wait(Native Method)
- waiting on java.lang.ref.Reference$Lock@69ba0f27
at java.lang.Object.wait(Object.java:503)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
3.3.3. jad
对类进行反编译:
$ jad javax.servlet.Servlet
ClassLoader:
+-java.net.URLClassLoader@6108b2d7
+-sun.misc.Launcher$AppClassLoader@18b4aac2
+-sun.misc.Launcher$ExtClassLoader@1ddf84b8
Location:
/Users/xxx/work/test/lib/servlet-api.jar
/*
* Decompiled with CFR 0_122.
*/
package javax.servlet;
import java.io.IOException;
import javax.servlet.ServletConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
public interface Servlet {
public void init(ServletConfig var1) throws ServletException;
public ServletConfig getServletConfig();
public void service(ServletRequest var1, ServletResponse var2) throws ServletException, IOException;
public String getServletInfo();
public void destroy();
}
3.3.4. mc
Memory Compiler/内存编译器,编译.java
文件生成.class
。
mc /tmp/Test.java
3.3.5. redefine
加载外部的.class
文件,redefine jvm已加载的类。
redefine /tmp/Test.class
redefine -c 327a647b /tmp/Test.class /tmp/Test\$Inner.class
3.3.6. sc
查找JVM中已经加载的类
$ sc -d org.springframework.web.context.support.XmlWebApplicationContext
class-info org.springframework.web.context.support.XmlWebApplicationContext
code-source /Users/xxx/work/test/WEB-INF/lib/spring-web-3.2.11.RELEASE.jar
name org.springframework.web.context.support.XmlWebApplicationContext
isInterface false
isAnnotation false
isEnum false
isAnonymousClass false
isArray false
isLocalClass false
isMemberClass false
isPrimitive false
isSynthetic false
simple-name XmlWebApplicationContext
modifier public
annotation
interfaces
super-class +-org.springframework.web.context.support.AbstractRefreshableWebApplicationContext
+-org.springframework.context.support.AbstractRefreshableConfigApplicationContext
+-org.springframework.context.support.AbstractRefreshableApplicationContext
+-org.springframework.context.support.AbstractApplicationContext
+-org.springframework.core.io.DefaultResourceLoader
+-java.lang.Object
class-loader +-org.apache.catalina.loader.ParallelWebappClassLoader
+-java.net.URLClassLoader@6108b2d7
+-sun.misc.Launcher$AppClassLoader@18b4aac2
+-sun.misc.Launcher$ExtClassLoader@1ddf84b8
classLoaderHash 25131501
3.3.7. stack
查看方法 test.arthas.TestStack#doGet
的调用堆栈:
$ stack test.arthas.TestStack doGet
Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 286 ms.
ts=2018-09-18 10:11:45;thread_name=http-bio-8080-exec-10;id=d9;is_daemon=true;priority=5;TCCL=org.apache.catalina.loader.ParallelWebappClassLoader@25131501
@test.arthas.TestStack.doGet()
at javax.servlet.http.HttpServlet.service(HttpServlet.java:624)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:110)
...
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:451)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1121)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:637)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:745)
3.3.8. Trace
观察方法执行的时候哪个子调用比较慢:
3.3.9. Watch
观察方法 test.arthas.TestWatch#doGet
执行的入参,仅当方法抛出异常时才输出。
$ watch test.arthas.TestWatch doGet {params[0], throwExp} -e
Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 65 ms.
ts=2018-09-18 10:26:28;result=@ArrayList[
@RequestFacade[org.apache.catalina.connector.RequestFacade@79f922b2],
@NullPointerException[java.lang.NullPointerException],
]
3.3.10. Monitor
监控某个特殊方法的调用统计数据,包括总调用次数,平均rt,成功率等信息,每隔5秒输出一次。
$ monitor -c 5 org.apache.dubbo.demo.provider.DemoServiceImpl sayHello
Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 109 ms.
timestamp class method total success fail avg-rt(ms) fail-rate
----------------------------------------------------------------------------------------------------------------------------
2018-09-20 09:45:32 org.apache.dubbo.demo.provider.DemoServiceImpl sayHello 5 5 0 0.67 0.00%
timestamp class method total success fail avg-rt(ms) fail-rate
----------------------------------------------------------------------------------------------------------------------------
2018-09-20 09:45:37 org.apache.dubbo.demo.provider.DemoServiceImpl sayHello 5 5 0 1.00 0.00%
timestamp class method total success fail avg-rt(ms) fail-rate
----------------------------------------------------------------------------------------------------------------------------
2018-09-20 09:45:42 org.apache.dubbo.demo.provider.DemoServiceImpl sayHello 5 5 0 0.43 0.00%
3.3.11. Time Tunnel(tt)
记录方法调用信息,支持事后查看方法调用的参数,返回值,抛出的异常等信息,仿佛穿越时空隧道回到调用现场一般。
$ tt -t org.apache.dubbo.demo.provider.DemoServiceImpl sayHello
Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 75 ms.
INDEX TIMESTAMP COST(ms) IS-RET IS-EXP OBJECT CLASS METHOD
-------------------------------------------------------------------------------------------------------------------------------------
1000 2018-09-20 09:54:10 1.971195 true false 0x55965cca DemoServiceImpl sayHello
1001 2018-09-20 09:54:11 0.215685 true false 0x55965cca DemoServiceImpl sayHello
1002 2018-09-20 09:54:12 0.236303 true false 0x55965cca DemoServiceImpl sayHello
1003 2018-09-20 09:54:13 0.159598 true false 0x55965cca DemoServiceImpl sayHello
1004 2018-09-20 09:54:14 0.201982 true false 0x55965cca DemoServiceImpl sayHello
1005 2018-09-20 09:54:15 0.214205 true false 0x55965cca DemoServiceImpl sayHello
1006 2018-09-20 09:54:16 0.241863 true false 0x55965cca DemoServiceImpl sayHello
1007 2018-09-20 09:54:17 0.305747 true false 0x55965cca DemoServiceImpl sayHello
1008 2018-09-20 09:54:18 0.18468 true false 0x55965cca DemoServiceImpl sayHello
3.3.12. Classloader
了解当前系统中有多少类加载器,以及每个加载器加载的类数量,帮助您判断是否有类加载器泄露。
$ classloader
name numberOfInstances loadedCountTotal
BootstrapClassLoader 1 3346
com.taobao.arthas.agent.ArthasClassloader 1 1262
java.net.URLClassLoader 2 1033
org.apache.catalina.loader.ParallelWebappClassLoader 1 628
sun.reflect.DelegatingClassLoader 166 166
sun.misc.Launcher$AppClassLoader 1 31
com.alibaba.fastjson.util.ASMClassLoader 6 15
sun.misc.Launcher$ExtClassLoader 1 7
org.jvnet.hk2.internal.DelegatingClassLoader 2 2
sun.reflect.misc.MethodUtil 1 1
3.3.13. Web Console
3.3.14. Profiler/FlameGraph/火焰图
$ profiler start
Started [cpu] profiling
$ profiler stop
profiler output file: /tmp/demo/arthas-output/20191125-135546.svg
OK
通过浏览器查看profiler结果——或者下载下来查看:
3.3.15. Arthas Spring Boot Starter
参考:
https://github.com/alibaba/arthas
https://www.jianshu.com/p/c1fc60d48dd7