Java 5.0 Garbage Collection Performance Tuning: 3. Generations, 3.1 Performance Considerations [translated by Vange]

Generations

One strength of the J2SE platform is that it shields the developer from the complexity of memory allocation and garbage collection. However, once garbage collection becomes the principal bottleneck, it is worth understanding some aspects of this hidden implementation. Garbage collectors make assumptions about the way applications use objects, and these assumptions are reflected in tunable parameters that can be adjusted for improved performance without sacrificing the power of the abstraction.
----------------------------------------------

Attribution:

author:Vange
url:http://blog.csdn.net/Vange

----------------------------------------------



An object is considered garbage when it can no longer be reached from any pointer in the running program. The most straightforward garbage collection algorithms simply iterate over every reachable object. Any objects left over are then considered garbage. The time this approach takes is proportional to the number of live objects, which is prohibitive for large applications maintaining lots of live data.
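The naive approach described above can be sketched in a few lines. This is only an illustration, not how the virtual machine is implemented: the `Node` class and the explicit `heap` and `roots` lists are hypothetical stand-ins for real objects and root pointers.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical toy object: a name plus outgoing references.
class Node {
    final String name;
    final List<Node> refs = new ArrayList<Node>();
    Node(String name) { this.name = name; }
}

class ReachabilitySketch {
    // Visits every object reachable from the roots.
    // The work done grows with the number of live objects.
    static Set<Node> reachable(List<Node> roots) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<Node>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (seen.add(n)) {
                pending.addAll(n.refs);
            }
        }
        return seen;
    }

    // Anything in the toy heap that was not reached is garbage.
    static List<Node> garbage(List<Node> heap, List<Node> roots) {
        Set<Node> live = reachable(roots);
        List<Node> dead = new ArrayList<Node>();
        for (Node n : heap) {
            if (!live.contains(n)) {
                dead.add(n);
            }
        }
        return dead;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        a.refs.add(b); // a -> b, both reachable from the root a
        c.refs.add(a); // c points at live data, but nothing points at c
        for (Node n : garbage(Arrays.asList(a, b, c), Arrays.asList(a))) {
            System.out.println("garbage: " + n.name);
        }
    }
}
```

Note that `c` is garbage even though it references a live object: what matters is whether an object can be reached, not what it refers to.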

Beginning with the J2SE Platform version 1.2, the virtual machine incorporated a number of different garbage collection algorithms that are combined using generational collection. While naive garbage collection examines every live object in the heap, generational collection exploits several empirically observed properties of most applications to avoid extra work.

The most important of these observed properties is infant mortality. The blue area in the diagram below is a typical distribution for the lifetimes of objects. The X axis is object lifetimes measured in bytes allocated. The byte count on the Y axis is the total bytes in objects with the corresponding lifetime. The sharp peak at the left represents objects that can be reclaimed (i.e., have "died") shortly after being allocated. Iterator objects, for example, are often alive for the duration of a single loop.

[Figure: histogram of object lifetimes, with collections]

Some objects do live longer, and so the distribution stretches out to the right. For instance, there are typically some objects allocated at initialization that live until the process exits. Between these two extremes are objects that live for the duration of some intermediate computation, seen here as the lump to the right of the infant mortality peak. Some applications have very different looking distributions, but a surprisingly large number possess this general shape. Efficient collection is made possible by focusing on the fact that a majority of objects "die young".

To optimize for this scenario, memory is managed in generations, or memory pools holding objects of different ages. Garbage collection occurs in each generation when the generation fills up. Objects are allocated in the young generation, the pool for younger objects, and because of infant mortality most objects die there. When the young generation fills up it causes a minor collection. Minor collections can be optimized assuming a high infant mortality rate. The costs of such collections are, to the first order, proportional to the number of live objects being collected. A young generation full of dead objects is collected very quickly. Some surviving objects are moved to a tenured generation. When the tenured generation needs to be collected there is a major collection that is often much slower because it involves all live objects.


The diagram below shows minor collections occurring at intervals long enough to allow many of the objects to die between collections. It is well-tuned in the sense that the young generation is large enough (and thus the period between minor collections long enough) that the minor collection can take advantage of the high infant mortality rate. This situation can be upset by applications with unusual lifetime distributions, or by poorly sized generations that cause collections to occur before objects have had time to die.
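To see why the size of the young generation matters, here is a deliberately simplified toy model, not the real collector. Everything in it is an assumption made for illustration: time is counted in allocations, each object has a fixed made-up lifetime, survivors are promoted straight to the tenured list (there are no survivor spaces), and the class name `ToyGenerations` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ToyGenerations {
    // A toy object that is "dead" once the clock passes diesAt.
    static final class Obj {
        final long diesAt;
        Obj(long diesAt) { this.diesAt = diesAt; }
    }

    final int youngCapacity;
    final List<Obj> young = new ArrayList<Obj>();
    final List<Obj> tenured = new ArrayList<Obj>();
    long clock = 0;           // time measured in allocations
    int minorCollections = 0;
    long objectsCopied = 0;   // stands in for the cost of minor collections

    ToyGenerations(int youngCapacity) { this.youngCapacity = youngCapacity; }

    void allocate(long lifetime) {
        if (young.size() == youngCapacity) {
            minorCollect(); // the young generation is full
        }
        clock++;
        young.add(new Obj(clock + lifetime));
    }

    // Only objects that are still live are copied; dead ones cost nothing.
    void minorCollect() {
        minorCollections++;
        for (Obj o : young) {
            if (o.diesAt > clock) {
                tenured.add(o);
                objectsCopied++;
            }
        }
        young.clear();
    }

    // Made-up workload: one object in ten is long-lived, the rest die quickly.
    static ToyGenerations run(int youngCapacity) {
        ToyGenerations heap = new ToyGenerations(youngCapacity);
        for (int i = 0; i < 1000; i++) {
            heap.allocate(i % 10 == 0 ? 10000 : 2);
        }
        return heap;
    }

    public static void main(String[] args) {
        for (int capacity : new int[] {100, 10}) {
            ToyGenerations heap = run(capacity);
            System.out.println("young capacity " + capacity
                + ": minor collections=" + heap.minorCollections
                + ", objects copied=" + heap.objectsCopied);
        }
    }
}
```

With these made-up lifetimes, a capacity of 100 triggers 9 minor collections that copy 108 objects in total, while a capacity of 10 triggers 99 minor collections that copy 297. The smaller young generation collects more often and promotes more short-lived objects that simply had not yet had time to die.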

As noted in section 2, ergonomics now makes different choices of garbage collector in order to provide good performance on a variety of applications. The serial garbage collector is meant to be used by small applications. Its default parameters were designed to be effective for most small applications. The throughput garbage collector is meant to be used by large applications. The heap size parameters selected by ergonomics plus the features of the adaptive size policy are meant to provide good performance for server applications. These choices work well for many applications but do not always work. This leads to the central tenet of this document:

If the garbage collector has become a bottleneck, you may wish to customize the generation sizes. Check the verbose garbage collector output, and then explore the sensitivity of your individual performance metric to the garbage collector parameters.

(Translator's note: the jstat tool can also be used to inspect garbage collection status.)


 

The default arrangement of generations (for all collectors with the exception of the throughput collector) looks something like this.

[Figure: space usage by generations]

At initialization, a maximum address space is virtually reserved but not allocated to physical memory unless it is needed. The complete address space reserved for object memory can be divided into the young and tenured generations.



The young generation consists of eden plus two survivor spaces. Objects are initially allocated in eden. One survivor space is empty at any time, and serves as the destination of the next copying collection of any live objects in eden and the other survivor space. Objects are copied between survivor spaces in this way until they are old enough to be tenured, or copied to the tenured generation.
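One way to see these spaces on a running virtual machine is the standard `java.lang.management` API (available since J2SE 5.0). The sketch below simply lists the heap memory pools. The pool names it prints are an assumption that depends on the virtual machine and the selected collector, so they may not literally read "eden", "survivor" and "tenured". The class name `HeapPools` is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class HeapPools {
    // Prints each heap memory pool and returns how many were listed.
    static int printHeapPools() {
        int count = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as code caches
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            System.out.printf("%-30s used=%dK committed=%dK%n",
                pool.getName(), usage.getUsed() / 1024, usage.getCommitted() / 1024);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        printHeapPools();
    }
}
```

The `committed` figure reflects memory actually obtained from the operating system, which can be smaller than the address space reserved at initialization.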

Other virtual machines, including the production virtual machine for the J2SE Platform version 1.2 for the Solaris Operating System, used two equally sized spaces for copying rather than one large eden plus two small spaces. This means the options for sizing the young generation are not directly comparable; see the Performance FAQ for an example.


A third generation closely related to the tenured generation is the permanent generation. The permanent generation is special because it holds data needed by the virtual machine to describe objects that do not have an equivalent at the Java language level. For example, objects describing classes and methods are stored in the permanent generation.

 


3.1 Performance Considerations


There are two primary measures of garbage collection performance. Throughput is the percentage of total time not spent in garbage collection, considered over long periods of time. Throughput includes time spent in allocation (but tuning for speed of allocation is generally not needed). Pauses are the times when an application appears unresponsive because garbage collection is occurring.
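The throughput definition above is simple arithmetic, and a rough figure can be obtained from inside a running program using the standard `java.lang.management` API. Treat the sketch below as an approximation under stated assumptions: it takes the accumulated collection time reported by `GarbageCollectorMXBean.getCollectionTime()` as the time spent in garbage collection, which may not match application pause time exactly for every collector. The class and method names are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcThroughput {
    // Throughput: percentage of total time NOT spent in garbage collection.
    static double throughputPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 100.0; // nothing measured yet
        }
        return 100.0 * (uptimeMillis - gcMillis) / uptimeMillis;
    }

    // Sums the approximate accumulated collection time over all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if undefined for this collector
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("gc=%d ms, uptime=%d ms, throughput=%.2f%%%n",
            gc, uptime, throughputPercent(gc, uptime));
    }
}
```

For example, 250 ms of collection time over 1000 ms of uptime gives a throughput of 75%. As the text says, the figure is only meaningful over long periods of time.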

Users have different requirements of garbage collection. For example, some consider the right metric for a web server to be throughput, since pauses during garbage collection may be tolerable, or simply obscured by network latencies. However, in an interactive graphics program even short pauses may negatively affect the user experience.


Some users are sensitive to other considerations. Footprint is the working set of a process, measured in pages and cache lines. On systems with limited physical memory or many processes, footprint may dictate scalability. Promptness is the time between when an object becomes dead and when the memory becomes available, an important consideration for distributed systems, including remote method invocation (RMI).



In general, a particular generation sizing chooses a trade-off between these considerations. For example, a very large young generation may maximize throughput, but does so at the expense of footprint, promptness, and pause times. Young generation pauses can be minimized by using a small young generation at the expense of throughput. To a first approximation, the sizing of one generation does not affect the collection frequency and pause times for another generation.


There is no one right way to size generations. The best choice is determined by the way the application uses memory as well as user requirements. For this reason the virtual machine's choice of a garbage collector is not always optimal, and may be overridden by the user in the form of command line options, described below.

3.2 Measurement

Throughput and footprint are best measured using metrics particular to the application. For example, throughput of a web server may be tested using a client load generator, while footprint of the server might be measured on the Solaris Operating System using the pmap command. On the other hand, pauses due to garbage collection are easily estimated by inspecting the diagnostic output of the virtual machine itself.

The command line argument -verbose:gc prints information at every collection. Note that the format of the -verbose:gc output is subject to change between releases of the J2SE platform. For example, here is output from a large server application:

  [GC 325407K->83000K(776768K), 0.2300771 secs]
  [GC 325816K->83372K(776768K), 0.2454258 secs]
  [Full GC 267628K->83769K(776768K), 1.8479984 secs]

Here we see two minor collections and one major one.


325407K->83000K (in the first line)

The numbers before and after the arrow indicate the combined size of live objects before and after garbage collection, respectively. After minor collections the count includes objects that aren't necessarily alive but can't be reclaimed, either because they are directly alive, or because they are within or referenced from the tenured generation.

(776768K)(in the first line)

The number in parentheses is the total available space, not counting the space in the permanent generation; that is, the total heap minus one of the survivor spaces. The minor collection took about a quarter of a second.

0.2300771 secs (in the first line)
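The three fields just described can be pulled out of such a line mechanically. The sketch below is an assumption-laden illustration: it matches only the exact format of the three sample lines shown above, and since the -verbose:gc format is subject to change between releases, it should not be read as a general parser. The class name `GcLogLine` is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GcLogLine {
    // Matches e.g. "[GC 325407K->83000K(776768K), 0.2300771 secs]"
    private static final Pattern LINE = Pattern.compile(
        "\\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    final boolean full;     // true for a major ("Full GC") collection
    final long beforeKb;    // size before the collection
    final long afterKb;     // size after the collection
    final long availableKb; // total available space (number in parentheses)
    final double seconds;   // duration of the collection

    private GcLogLine(boolean full, long beforeKb, long afterKb, long availableKb, double seconds) {
        this.full = full;
        this.beforeKb = beforeKb;
        this.afterKb = afterKb;
        this.availableKb = availableKb;
        this.seconds = seconds;
    }

    // Returns null when the line is not in the expected sample format.
    static GcLogLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new GcLogLine("Full GC".equals(m.group(1)),
            Long.parseLong(m.group(2)), Long.parseLong(m.group(3)),
            Long.parseLong(m.group(4)), Double.parseDouble(m.group(5)));
    }

    long reclaimedKb() {
        return beforeKb - afterKb;
    }

    public static void main(String[] args) {
        GcLogLine first = parse("[GC 325407K->83000K(776768K), 0.2300771 secs]");
        System.out.println((first.full ? "major" : "minor") + " collection reclaimed "
            + first.reclaimedKb() + "K in " + first.seconds + " secs");
    }
}
```

For the first sample line this reports a minor collection that reclaimed 242407K (325407K minus 83000K).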

The format for the major collection in the third line is similar. The flag -XX:+PrintGCDetails prints additional information about the collections. This additional output is liable to change with each version of the virtual machine, since it follows the needs of the development of the Java Virtual Machine. An example of the output with -XX:+PrintGCDetails for the J2SE Platform version 1.5 using the serial garbage collector is shown here.

[GC [DefNew: 64575K->959K(64576K), 0.0457646 secs] 196016K->133633K(261184K), 0.0459067 secs]

indicates that the minor collection recovered about 98% of the young generation,

DefNew: 64575K->959K(64576K)

and took about 46 milliseconds.

0.0457646 secs

The usage of the entire heap was reduced from about 75% to about 51%

196016K->133633K(261184K)

and that there was some slight additional overhead for the collection (over and above the collection of the young generation) as indicated by the final time:



0.0459067 secs
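The percentages quoted for this -XX:+PrintGCDetails line follow directly from the numbers in it. The short program below just redoes that arithmetic with the values copied from the sample line; the class name `GcDetailsArithmetic` is hypothetical.

```java
class GcDetailsArithmetic {
    static double percent(double part, double whole) {
        return 100.0 * part / whole;
    }

    public static void main(String[] args) {
        // DefNew: 64575K->959K(64576K)
        double youngRecovered = percent(64575 - 959, 64575);
        // 196016K->133633K(261184K)
        double heapBefore = percent(196016, 261184);
        double heapAfter = percent(133633, 261184);
        // 0.0459067 secs in total, of which 0.0457646 secs in the young generation
        double overheadMillis = (0.0459067 - 0.0457646) * 1000.0;

        System.out.printf("young generation recovered: %.1f%%%n", youngRecovered);
        System.out.printf("heap usage: %.1f%% -> %.1f%%%n", heapBefore, heapAfter);
        System.out.printf("overhead beyond the young collection: %.4f ms%n", overheadMillis);
    }
}
```

The young generation figure comes out at roughly 98.5%, heap usage drops from roughly 75.0% to 51.2%, and the extra overhead is about 0.14 milliseconds.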

The flag -XX:+PrintGCTimeStamps will additionally print a time stamp at the start of each collection.

111.042: [GC 111.042: [DefNew: 8128K->8128K(8128K), 0.0000505 secs]111.042: [Tenured: 18154K->2311K(24576K), 0.1290354 secs] 26282K->2311K(32704K), 0.1293306 secs]

The collection starts about 111 seconds into the execution of the application. The minor collection starts at about the same time. Information is also shown for a major collection, delineated by Tenured. The tenured generation usage was reduced to about 10%

18154K->2311K(24576K)

and took about 0.13 seconds.

0.1290354 secs
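Each "before->after(capacity)" group in the time-stamped line can be read the same way. As a rough sketch that assumes exactly the sample format shown above (which may differ between virtual machine versions), the program below extracts every such group and prints the occupancy after collection. The class name `GcSegments` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GcSegments {
    // Matches groups such as "18154K->2311K(24576K)"
    private static final Pattern SEGMENT =
        Pattern.compile("(\\d+)K->(\\d+)K\\((\\d+)K\\)");

    // Returns {beforeKb, afterKb, capacityKb} for each group, in order of appearance.
    static List<long[]> segments(String line) {
        List<long[]> result = new ArrayList<long[]>();
        Matcher m = SEGMENT.matcher(line);
        while (m.find()) {
            result.add(new long[] {
                Long.parseLong(m.group(1)),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3))
            });
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "111.042: [GC 111.042: [DefNew: 8128K->8128K(8128K), 0.0000505 secs]"
            + "111.042: [Tenured: 18154K->2311K(24576K), 0.1290354 secs] "
            + "26282K->2311K(32704K), 0.1293306 secs]";
        for (long[] s : segments(line)) {
            System.out.printf("%dK -> %dK of %dK (%.1f%% used after collection)%n",
                s[0], s[1], s[2], 100.0 * s[1] / s[2]);
        }
    }
}
```

In the sample line the groups appear in the order young generation (DefNew), tenured generation, whole heap. The tenured group gives 2311K of 24576K, or about 9.4%, which is the "about 10%" quoted in the text.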
