doesn't equal a big enough hole. Did I mention that the low pause collector maintains free lists for the space available in the tenured generation and that fragmentation can become a problem? If you're using the low pause collector and things are going just peachy for days and days and then there is a huge (relatively speaking) pause, the cause may be fragmentation in the tenured generation.

In 1.4.2 and older releases, in order to do a young generation collection there was a requirement that there be a contiguous chunk of free space in the tenured generation that was big enough to hold all of the young generation. In the GC tuning documents at http://java.sun.com/docs/hotspot/ this is referred to as the young generation guarantee. Basically, during a young generation collection any data that survives may have to be promoted into the tenured generation, and we just don't know how much is going to survive. Being our usual conservative selves, we assumed all of it would survive, and so there needed to be room in the tenured generation for all of it.

How does this cause a big pause? If the young generation is full and needs to be collected but there is not enough room in the tenured generation, then a full collection of both the young generation and the tenured generation is done. And this collection is a stop-the-world collection, not a concurrent collection, so you generally see a pause much longer than you want to. By the way, this full collection is also a compacting collection, so there is no fragmentation at the end of the full collection.

In 5.0 we added the ability in the low pause collector to start a young generation collection and then to back out of it if there was not enough space in the tenured generation. Being able to back out of a young generation collection allowed us to make a couple of changes.
We now keep an average of the amount of space that is used for promotions and use that (with some appropriate padding to be on the safe side) as the requirement for the space needed in the tenured generation. Additionally, we no longer need a single contiguous chunk of space for the promotions, so we look at the total amount of free space in the tenured generation in deciding if we can do a young generation collection. Not having to have a single contiguous chunk of space to support promotions is where fragmentation comes in (or rather where it doesn't come in as often). Yes, sometimes using the averages for the amount promoted and the total amount of free space in the tenured generation tells us to go ahead and do a young generation collection and we get surprised (there really isn't enough space in the tenured generation). In that situation we have to back out of the young generation collection. It's expensive to back out of a collection, but it's doable.

That's a very long way of saying that fragmentation is less of a problem in 5.0. It still occurs, but we have better ways of dealing with it.

What should you do if you run into a fragmentation problem? Try 5.0.

Or you could try a larger total heap and/or a smaller young generation. If your application is on the edge, it might give you just enough extra space to fit all your live data. But often it just delays the problem.

Or you can try to make your application do a full, compacting collection at a time when it will not disturb your users. If your application can go for a day without hitting a fragmentation problem, try a System.gc() in the middle of the night. That will compact the heap, and you can hopefully go another day without hitting the fragmentation problem. Clearly that is no help for an application that does not have a logical "middle of the night".
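For the "System.gc() in the middle of the night" idea, here is a minimal sketch of one way an application might schedule it. The class name, the thread name, and the 3:00 quiet hour are hypothetical, and the code uses java.time and lambdas, so it assumes a JVM much newer than 5.0. Keep in mind that System.gc() is only a request: if the JVM runs with -XX:+DisableExplicitGC the call is ignored, and with -XX:+ExplicitGCInvokesConcurrent it triggers a concurrent cycle rather than the compacting one you are after.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class NightlyCompaction {
    // Hypothetical quiet hour; pick whatever time disturbs the fewest users.
    static final LocalTime QUIET_TIME = LocalTime.of(3, 0);

    // Time from 'now' until the next occurrence of 'target' (later today or tomorrow).
    static Duration delayUntil(LocalDateTime now, LocalTime target) {
        LocalDateTime next = now.toLocalDate().atTime(target);
        if (!next.isAfter(now)) {
            next = next.plusDays(1);
        }
        return Duration.between(now, next);
    }

    // Requests a full collection once every 24 hours, starting at the next quiet hour.
    static ScheduledExecutorService scheduleNightlyGc() {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "nightly-gc");
                    t.setDaemon(true); // do not keep the JVM alive just for this task
                    return t;
                });
        long initialDelayMs = delayUntil(LocalDateTime.now(), QUIET_TIME).toMillis();
        // A fixed 24 hour period drifts by an hour across daylight saving changes.
        scheduler.scheduleAtFixedRate(System::gc, initialDelayMs,
                TimeUnit.DAYS.toMillis(1), TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

Calling NightlyCompaction.scheduleNightlyGc() once during startup would be enough; the returned scheduler can be shut down if the application wants to stop the nightly request.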
Or if by chance most of the data in the tenured generation is read in when your application first starts up and you can do a System.gc() after you complete initialization, that might help by compacting all the data into a single chunk, leaving the rest of the tenured generation available for promotions. Depending on the allocation pattern of the application, that might be adequate.

Or you might want to start the concurrent collections earlier. The low pause collector tries to start a concurrent collection just in time (with some safety factor) to collect the tenured generation before it is full. If you are doing concurrent collections and freeing enough space, you can try starting a concurrent collection sooner so that it finishes before the fragmentation becomes a problem. The concurrent collections don't do a compaction, but they do coalesce adjacent free blocks, so larger chunks of free space can result from a concurrent collection. One of the triggers for starting a concurrent collection is the amount of free space in the tenured generation. You can cause a concurrent collection to occur early by setting the option -XX:CMSInitiatingOccupancyFraction=<NNN>, where NNN is the percentage of the tenured generation that is in use above which a concurrent collection is started. This will increase the overall time you spend doing GC but may avoid the fragmentation problem. And this will be more effective with 5.0 because a single contiguous chunk of space is not required for promotions.

By the way, I've increased the comment period for my blogs. I hadn't realized it was so short.
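If you want a rough idea of what value to try for -XX:CMSInitiatingOccupancyFraction, it helps to know how full the tenured generation normally runs. Here is a small sketch that reads that figure through the standard java.lang.management API. The class and helper names are hypothetical. The match on pool names ("Old Gen", "Tenured") is an assumption that varies by collector and JVM version, and the percentage printed here is not guaranteed to use the same denominator the collector uses for its own trigger.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class TenuredOccupancy {
    // Percentage of 'capacity' that is in use, or -1 when the capacity is unknown.
    static int percentUsed(long used, long capacity) {
        if (capacity <= 0) {
            return -1;
        }
        return (int) (100 * used / capacity);
    }

    // Assumption: tenured generation pools have names containing "Old Gen" or "Tenured".
    static boolean looksLikeTenured(String poolName) {
        return poolName.contains("Old Gen") || poolName.contains("Tenured");
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP || !looksLikeTenured(pool.getName())) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() is -1 when no maximum is defined; fall back to committed memory.
            long capacity = usage.getMax() > 0 ? usage.getMax() : usage.getCommitted();
            System.out.println(pool.getName() + ": "
                    + percentUsed(usage.getUsed(), capacity) + "% used");
        }
    }
}
```

Sampling this periodically, especially right after concurrent collections finish, gives a baseline for the live data; a threshold set comfortably above that baseline but below the point where the big pauses show up is a reasonable first thing to try.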