Diagnosing Java Memory Leaks

Posting an article I wrote a year ago. It was written for a submission and managed to win a small prize.


Looking back, the method still has a few shortcomings:

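1. If several classes reference the same class, statistics alone cannot tell which one is the problem node. For example, if classes A and B both reference C, we can see that C is leaking, but we cannot tell whether A or B is at fault. Other analysis is needed, such as counting the edges in the dump. (I meant to write a program for this, but without an index, traversing the dump by direct query takes too long, and later I got busy and never got around to it...)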

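2. Sometimes the problem node lies outside the code you wrote yourself. For example, one bug was eventually traced to ThreadLocalMap. Until further analysis is done, no conclusion can be drawn directly.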


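Of course, the method has its advantages too: it is simple and quite stable, and it really does catch problems. For example, on my last project it quickly turned up a missing-commit bug, wahaha~ XD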


1.    Introduction

The Java virtual machine’s heap stores all objects created by a running Java application. Objects are created by the programmer's code but never freed explicitly by it. Garbage collection is the process of automatically freeing objects that are no longer referenced by the program.

The garbage collector eliminates memory-related errors such as dangling pointers and memory leaks caused by lost pointers. However, a memory leak may still occur when a Java program maintains references to objects that are no longer needed, preventing the garbage collector from reclaiming the space.

In the worst case, unnecessary references refer to a growing data structure, parts of which are no longer in use. These types of leaks can eventually cause the program to run out of memory and crash. In long-running programs, even small leaks can cause significant performance issues after days or weeks.

With automatic garbage collection, memory leaks are relatively difficult to diagnose because programmers have less control over memory allocation. A typical garbage collector also moves objects and changes their addresses to avoid heap fragmentation, which makes it harder to track a particular object instance.

As we will see in Section 2, a number of tools exist that help the user look inside the black box to determine the root cause of a leak. But using these tools to solve memory leaks in large Java applications is a little tricky.

The authors of paper [1] summarized a few difficulties they encountered when diagnosing leaks in large Java applications.

Perturbation: Acquiring full heap dumps can cause a system with a large heap size to pause for tens of seconds. Tracking the call stack of every allocation will introduce unacceptable overhead, reducing the throughput of the application by five to ten times. For servers, these slowdowns or pauses can cause timeouts, significantly changing the behavior of the application.

Noise: Given a persisting object, it is difficult to determine whether it has a legitimate reason for persisting. For example, caches and resource pools intentionally retain objects for long periods of time, even though the objects may no longer be needed.

Data Structure Complexity: Knowing the type of leaking object that predominates, often a low-level type such as String, does not help explain why the leak occurs. Presented with the context of low-level leaking objects, it is easy to get lost quickly in extracting a reason for leakage.

To address these three problems, this article proposes a method. It provides a step-by-step guide to diagnosing memory leaks with the tools described below.

2.    Tools

Before introducing the method, this section briefly describes a few common memory diagnosis tools.

2.1 JDK Troubleshooting Tool

Various diagnostic and monitoring tools are shipped with the Java Platform, Standard Edition Development Kit (JDK) [2]. The jmap and jhat utilities are generally used for analyzing memory issues.

The jmap command-line utility prints memory-related statistics for a running VM or core file. It can also dump the Java heap in binary HPROF format to a specified file.
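One of the statistics jmap can print is a per-class histogram (`jmap -histo <pid>`). As a rough illustration of how such output could be turned into data for the method in Section 3, the sketch below parses text in that shape. The column layout (rank, #instances, #bytes, class name) is assumed from typical HotSpot output and may differ between JDK versions; HistoParser and its method are names made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class HistoParser {
    /** Parses histogram text into a map of class name -> instance count. */
    static Map<String, Long> parse(String histoText) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String line : histoText.split("\\R")) {
            String[] cols = line.trim().split("\\s+");
            // Assumed data row shape: "1:  1234  56789  java.lang.String".
            // Header, separator and total lines do not start with "<number>:" and are skipped.
            if (cols.length < 4 || !cols[0].matches("\\d+:")) {
                continue;
            }
            counts.put(cols[3], Long.parseLong(cols[1]));
        }
        return counts;
    }
}
```

Only the instance count is kept here because that is what the ranking in Section 3.2 needs; the byte column could be captured the same way.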

The jhat tool provides a convenient means to browse the object topology in a heap snapshot. The tool parses a heap dump in binary format, for example, a heap dump produced by jmap. The tool provides a number of standard queries to find unnecessary object retention.

2.2 YourKit Java Profiler

YourKit is a smart and powerful tool for CPU and memory profiling [3]. It integrates several useful functions for gathering and analyzing heap information. Unfortunately, this tool is not free and the user needs to purchase a license.

2.3 IBM HeapAnalyzer

IBM HeapAnalyzer [4] is a graphical tool for discovering possible Java heap leaks. HeapAnalyzer allows the finding of a possible Java™ heap leak area through its heuristic search engine and analysis of the Java heap dump in Java applications. It analyzes Java heap dumps by parsing the Java heap dump, creating directional graphs, transforming them into directional trees, and executing the heuristic search engine.

2.4 Memory Analyzer Tool (MAT)

The Eclipse Memory Analyzer is a fast and feature-rich Java heap analyzer that helps you find memory leaks and reduce memory consumption [5]. Like IBM HeapAnalyzer, MAT is a convenient GUI tool for examining heap dumps.

3.    Method

3.1 Steps

A typical procedure for diagnosing memory leaks may include the following steps:

a)     Spot the symptoms of a memory leak.

b)     Look for a set of candidate data structures/objects that are likely to have problems.

c)      Identify the root cause in the code.

Memory leaks are relatively easy to spot. Turning on the GC log and monitoring the heap size after each round of GC is the most common way to do so. Normally, with a memory leak, a downward-sawtooth pattern of free space (every collection frees less and less) will be observed [1, 6] until the application runs into out-of-memory errors. It is also very helpful if such a pattern can be reproduced by a particular set of operations (such as a few test cases).
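Besides reading the GC log, the same signal can be sampled from inside the JVM. The sketch below is one possible way to do it, not part of the original method: it reads the heap usage recorded after the most recent collection through the standard java.lang.management API, and counts how often successive samples rise. PostGcHeapSampler and risingSteps are hypothetical names, and the choice of "count the rising steps" as the indicator is an assumption made for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

final class PostGcHeapSampler {
    /** Sums, over all heap pools, the bytes still in use after each pool's most recent GC. */
    static long usedAfterLastGc() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            // getCollectionUsage() returns null when the pool does not support it.
            MemoryUsage usage = pool.getCollectionUsage();
            if (usage != null) {
                total += usage.getUsed();
            }
        }
        return total;
    }

    /** Counts how many samples are larger than the sample before them. */
    static int risingSteps(long[] samples) {
        int rising = 0;
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] > samples[i - 1]) {
                rising++;
            }
        }
        return rising;
    }
}
```

If usedAfterLastGc() is recorded periodically and nearly every step rises, the used space left after GC keeps growing, which is the mirror image of the shrinking free space described above.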

Finding the leak candidates is often a little more difficult. Section 3.2 discusses several methods in more detail.

The last task, locating the bug in the code, is even harder for anyone other than the developers who own the code. The heap dump usually does not contain allocation information for each object unless it is explicitly enabled. Enabling allocation tracking introduces heavy overhead, especially for large applications. Moreover, it swamps the user with too much low-level detail about individual objects, so analyzing the heap dump requires a lot of expertise and time. A more lightweight approach is introduced in Section 3.3, which provides sufficient information for the developers to find the root cause of a leak.

3.2 Finding Leak Candidates

There are several different ways to determine the leaking objects.

One of the most common methods is to acquire a heap dump, manually or automatically, when an out-of-memory error occurs. The engineer can use offline tools [3, 4, 5] to analyze the dump file and find the object(s) with the biggest retained size. The retained size of an object X is the sum of the sizes of all objects kept alive by X. The main disadvantage of this method is that it can produce false-negative results, because the ‘noise’ described in Section 1 (for example, a legitimately large cache) can mask the real leak.
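To make the definition of retained size concrete, here is a toy sketch on a hand-built object graph. It treats "kept alive by X" as "reachable from the roots only through X", which is computed by comparing reachability with and without X. The graph representation (object names as strings, a map of outgoing references) and the class name RetainedSize are assumptions for this example; real analyzers such as those in [3, 4, 5] work on heap dumps and use more efficient algorithms.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RetainedSize {
    /** Objects reachable from the roots without ever passing through 'excluded' (may be null). */
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots, String excluded) {
        Set<String> seen = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        for (String root : roots) {
            if (!root.equals(excluded)) {
                stack.push(root);
            }
        }
        while (!stack.isEmpty()) {
            String obj = stack.pop();
            if (!seen.add(obj)) {
                continue; // already visited
            }
            for (String next : refs.getOrDefault(obj, Collections.emptyList())) {
                if (!next.equals(excluded)) {
                    stack.push(next);
                }
            }
        }
        return seen;
    }

    /** Total size of the objects (x included) that become unreachable once x is removed. */
    static long retainedSize(Map<String, List<String>> refs, Map<String, Long> sizes,
                             List<String> roots, String x) {
        Set<String> keptAliveByX = reachable(refs, roots, null);
        keptAliveByX.removeAll(reachable(refs, roots, x));
        long total = 0;
        for (String obj : keptAliveByX) {
            total += sizes.get(obj); // assumes every object has an entry in 'sizes'
        }
        return total;
    }
}
```

An object that is also referenced from elsewhere is not counted: it stays reachable when X is removed, so it is not kept alive by X alone.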

Another approach is based on heap differencing [3, 8]. The basic idea is to take two snapshots of the heap, before and after the ‘problem’ operation. The user can then distinguish between the ‘old’ objects, which existed before the operation, and the ‘new’ objects, which were created during the operation and cannot be released at the end. The drawback of this method is that a large number of objects (false positives) may be created during the period, and these take a lot of time to examine.
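A much simplified, class-level version of heap differencing can be sketched as follows. The tools in [3, 8] compare individual objects; this example only compares per-class instance counts from two snapshots, which is an assumption made to keep the sketch short. HistogramDiff and grown are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;

final class HistogramDiff {
    /** Returns class name -> increase in instance count, for classes that grew between snapshots. */
    static Map<String, Long> grown(Map<String, Long> before, Map<String, Long> after) {
        Map<String, Long> growth = new TreeMap<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            // A class absent from the first snapshot is treated as having zero instances.
            long delta = e.getValue() - before.getOrDefault(e.getKey(), 0L);
            if (delta > 0) {
                growth.put(e.getKey(), delta);
            }
        }
        return growth;
    }
}
```

Even in this reduced form the drawback mentioned above is visible: every class that happened to allocate between the two snapshots shows up, whether or not it leaks.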

This article proposes a new method. It relies on the simple assumption that the number of leaking objects will continuously increase in the long term [7]. To find these objects, a series of heap histograms (three or more) is captured at run time. Acquiring a heap histogram is usually a much more lightweight operation than acquiring a full heap dump. The heap histogram contains the number of live instances of each class. The growth trend of each class can be calculated as a rank by the Least Squares Method [9]. Classes above a rank threshold (Rthres) are reported as leaking. The advantage of this method is that the false-positive and false-negative rates can be controlled by choosing a moderate threshold. Also, a set of leak-related classes is reported, not just one or two dominating objects, which makes further analysis easier. Lastly, acquiring heap histograms and calculating the rank is quite easy and can be done automatically in real time.
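The ranking step could look roughly like the sketch below. The article does not spell out the exact rank formula, so this example assumes the rank of a class is simply the least-squares slope of its instance count against the snapshot index (instances gained per snapshot). LeakRank, slope and candidates are hypothetical names, and rThres stands in for Rthres.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class LeakRank {
    /** Least-squares slope of counts[i] fitted against the snapshot index i. */
    static double slope(long[] counts) {
        int n = counts.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two snapshots");
        }
        double meanX = (n - 1) / 2.0; // mean of 0, 1, ..., n-1
        double meanY = 0;
        for (long c : counts) {
            meanY += c;
        }
        meanY /= n;
        double num = 0;
        double den = 0;
        for (int i = 0; i < n; i++) {
            num += (i - meanX) * (counts[i] - meanY);
            den += (i - meanX) * (i - meanX);
        }
        return num / den;
    }

    /** Classes whose rank (assumed here to be the slope) is above the threshold. */
    static List<String> candidates(Map<String, long[]> series, double rThres) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, long[]> e : series.entrySet()) {
            if (slope(e.getValue()) > rThres) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

A class whose count fluctuates around a constant gets a slope near zero and drops out, while a steadily growing class keeps a positive slope no matter how many snapshots are added. That is why more snapshots tend to make the report more selective.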

3.3 Finding Allocation Sites

The goal is to find the head of a data structure that is leaking in one or more ways. For this, a heap dump is necessary in order to know the reference relationships between the objects.

The key idea of the algorithm is to traverse the object graph until it finds the boundary of the leak candidates detected in the previous step. The process can start from any class in the candidate set. If the classes that reference that leak candidate are not inside the candidate set, they are the boundary classes. Otherwise, mark these classes as ‘visited’ and repeat the process on them until the boundary classes are reached.
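The traversal described above can be sketched as a breadth-first walk over a class-level referrer graph. The input format is an assumption: referrers maps each class name to the set of classes whose instances reference it, which in practice would have to be extracted from a heap dump. BoundaryFinder and boundary are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class BoundaryFinder {
    /** Returns the referrer classes that lie just outside the candidate set. */
    static Set<String> boundary(Map<String, Set<String>> referrers,
                                Set<String> candidates, String start) {
        Set<String> boundary = new TreeSet<>();
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        visited.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            String cls = queue.remove();
            for (String ref : referrers.getOrDefault(cls, Collections.emptySet())) {
                if (!candidates.contains(ref)) {
                    boundary.add(ref);   // first class outside the leaking region
                } else if (visited.add(ref)) {
                    queue.add(ref);      // still a candidate: keep walking upward
                }
            }
        }
        return boundary;
    }
}
```

The visited set is what keeps the walk from looping on self-referencing structures such as linked entries.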

Sometimes the boundary classes are low-level types, and the backtracking should continue until it encounters a high-level type with a meaningful class name.
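A possible way to automate this extra backtracking is shown below. What counts as "low-level" is a judgment call; this sketch assumes that classes in java.* packages and array types are low-level, and it reuses the same hypothetical referrer map format (class name -> classes that reference it). LowLevelBacktrack and its methods are names invented for the example.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class LowLevelBacktrack {
    /** Assumption for this sketch: JDK classes and arrays are "low-level". */
    static boolean isLowLevel(String className) {
        return className.startsWith("java.") || className.endsWith("[]");
    }

    /** Follows referrers upward from cls until classes that are not low-level are reached. */
    static Set<String> meaningfulOwners(Map<String, Set<String>> referrers, String cls) {
        Set<String> owners = new TreeSet<>();
        collect(referrers, cls, new HashSet<>(), owners);
        return owners;
    }

    private static void collect(Map<String, Set<String>> referrers, String cls,
                                Set<String> visited, Set<String> owners) {
        if (!visited.add(cls)) {
            return; // already handled, avoids cycles
        }
        if (!isLowLevel(cls)) {
            owners.add(cls); // stop at the first class with a meaningful name
            return;
        }
        for (String ref : referrers.getOrDefault(cls, Collections.emptySet())) {
            collect(referrers, ref, visited, owners);
        }
    }
}
```

In a real code base the low-level test would probably need tuning, for example to also skip third-party collection libraries.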

4.    Example

The proposed method is evaluated with a real bug from JIRA.

From the GC log, one can easily spot the memory leak, as shown in the following figure.

 

To diagnose the problem with the proposed technique, ten snapshots of heap histogram information were collected while running the ‘leaking’ operations for a few hours. The following table lists the number of detected leak candidates with a moderate threshold (Rthres=100). In this case, even a small number of snapshots generates a quite selective report. In total, 42 classes are suggested as leak candidates with 6 snapshots.

Number of snapshots     3     4     5     6     10
Number of candidates    47    45    45    42    42

 

The next step is to identify the leaking data structure. Starting from a randomly selected candidate and backtracking through the reference graph, the boundary class java.util.concurrent.ConcurrentHashMap$HashEntry[] was detected within a few minutes. The class name indicates little about the cause of the leak, so the search continued. In a short time, a class with a meaningful name, *****.SimpleFixedDelayCleanupCache, was found to hold these unnecessary objects. With this information, the developer will spend much less time fixing the problem.

5.    Conclusion

Java memory leaks are a serious issue that degrades system performance and may even crash the server. Detecting and diagnosing memory leaks is one of the duties of performance testing, especially longevity testing. This article introduces an easy and repeatable method. The method generally has very low overhead and is shown to be effective on an example from a real product. I think this approach is useful in performance work for tracking down memory issues.

6.    References

[1] Nick Mitchell and Gary Sevitsky. LeakBot: An Automated and Lightweight Tool for Diagnosing Memory Leaks in Large Java Applications.

[2] Sun JDK Troubleshooting Tool. http://docs.oracle.com/javase/

[3] YourKit. http://www.yourkit.com/

[4] IBM HeapAnalyzer. https://www.ibm.com/developerworks/mydeveloperworks/groups/service/html/communityview?communityUuid=4544bafe-c7a2-455f-9d43-eb866ea60091

[5] Memory Analyzer (MAT). http://eclipse.org/mat/

[6] How to Fix Memory Leaks in Java. http://olex.openlogic.com/wazi/2009/how-to-fix-memory-leaks-in-java/

[7] Maria Jump and Kathryn S. McKinley. Cork: Dynamic Memory Leak Detection for Java.

[8] Wim De Pauw and Gary Sevitsky. Visualizing Reference Patterns for Solving Memory Leaks in Java.

[9] Least squares, regression analysis and statistics. http://en.wikipedia.org/wiki/Least_squares#cite_note-brertscher-0

 

