Garbage Collection: How it’s done

If you are familiar with the basics of memory allocation in programming languages, you know that a program works with two main regions of memory: the heap and the stack.

Stack memory is used for the execution of a thread. When a function is called, a block of memory is allocated on the stack to store the local variables of the function, and that block is freed when the function returns. In contrast to the stack, heap memory is used for dynamic allocation (typically when creating objects with the “new” keyword or a call to “malloc”), and its deallocation has to be handled separately.

Object myObject = new Object();


myObject = null;

If, at some point in the program, a reference to another object or “null” is assigned to the “myObject” variable, the variable no longer refers to the “Object” created earlier. However, the memory allocated for that “Object” is not freed automatically, even though the object is no longer in use. In older languages such as C or C++, the programmer has to keep track of such heap-allocated objects and delete them when they are no longer needed. Failing to do so results in a memory leak. On the other hand, mistakenly deleting an object that is still referenced leaves a dangling pointer, and accessing the deleted object through the old reference later in the code leads to undefined behavior, such as a crash or corrupted data.

However, in languages like Java and C#, this memory management is handled by a separate entity known as the Garbage Collector.

With a Garbage Collector in place, we can allocate an object in memory and use it; once no reference to the object remains, it becomes eligible for collection, and the Garbage Collector frees the allocated memory. A Garbage Collector also guarantees that any live object that still has an active reference will not be removed from memory.

Reference Counted Garbage Collection

Reference counted garbage collection keeps track of the number of references to each object in memory. Let’s look at the following code segment.

Object a = new Object(); // Reference Count(OB1) starts at 1
Object b = a;         // Reference Count (OB1) incremented to 2 as a new reference is added
Object c = new Object();

b = null;        // Reference Count(OB1) decremented to 1 as reference goes away
a = c;           // Reference Count(OB1) decremented to 0

When the line Object a = new Object() executes, a new object (let’s call it OB1) is created in memory and its reference count starts at 1.

When the reference to OB1 held in the variable “a” is copied to “b”, the reference count increases by one, as two variables now refer to OB1.

When “b” is assigned null, the reference count for OB1 decreases by one, leaving “a” as the only variable that refers to OB1.

When “a” is assigned the value of “c” (which refers to an entirely different object), the reference count for OB1 drops to zero, making OB1 available for garbage collection.
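The bookkeeping described above can be sketched as a toy model. Java does not expose reference counts, so the class CountedObject and its retain() and release() methods are hypothetical names introduced only for this illustration; they stand in for work a reference counting runtime would do automatically.

```java
// Toy model of reference counting (hypothetical names, not a real JVM API).
final class CountedObject {
    private final String name;
    private int refCount = 0;
    private boolean freed = false;

    CountedObject(String name) {
        this.name = name;
    }

    // Called whenever a variable starts referring to this object.
    void retain() {
        refCount++;
    }

    // Called whenever a variable stops referring to this object.
    void release() {
        if (refCount == 0) {
            throw new IllegalStateException(name + " has no references left to release");
        }
        refCount--;
        if (refCount == 0) {
            freed = true; // a real runtime would reclaim the memory here
        }
    }

    int refCount() {
        return refCount;
    }

    boolean isFreed() {
        return freed;
    }
}

class RefCountDemo {
    public static void main(String[] args) {
        CountedObject ob1 = new CountedObject("OB1");
        ob1.retain();  // Object a = new Object();  count becomes 1
        ob1.retain();  // Object b = a;             count becomes 2
        ob1.release(); // b = null;                 count becomes 1
        ob1.release(); // a = c;                    count becomes 0, OB1 is freed
        System.out.println("count=" + ob1.refCount() + ", freed=" + ob1.isFreed());
        // prints "count=0, freed=true"
    }
}
```

The object is reclaimed at the exact moment its count reaches zero, which is why reference counting needs no separate collection pass for cases like this one.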

Drawbacks in Reference Counted GC

The main disadvantage of reference counted garbage collection is its inability to reclaim objects that take part in circular references. To understand circular references, let’s look at the code segment below.

Consider two classes A and B having each other’s references.

class A {
    private B b;

    public void setB(B b) {
        this.b = b;
    }
}

class B {
    private A a;

    public void setA(A a) {
        this.a = a;
    }
}

Now in the main method, we can create new objects for both of these classes and assign the references.

public class Main {
    public static void main(String[] args) {
        A one = new A();
        B two = new B();

        // Make the objects refer to each other (creates a circular reference)
        one.setB(two);
        two.setA(one);

        // Throw away the references from the main method; the two objects are
        // still referring to each other
        one = null;
        two = null;
    }
}

When we assign null to the two variables one and two, the external references to the objects created at the beginning (the instances of “A” and “B”) are removed. Still, the objects do not become eligible for garbage collection, because their reference counts never reach zero: the “A” object is still referenced from inside the “B” object, and the “B” object is still referenced from inside the “A” object.
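To make the stuck counts concrete, here is a minimal sketch that tracks the counts by hand. CountedNode and its fields are hypothetical names used only for this illustration; a real reference counting runtime would adjust the counts automatically.

```java
// Toy model: each node records how many references currently point at it.
final class CountedNode {
    int refCount = 0;
    CountedNode peer; // the reference this object holds to the other object

    void setPeer(CountedNode other) {
        this.peer = other;
        other.refCount++; // the field is one more reference to `other`
    }
}

class CycleLeakDemo {
    public static void main(String[] args) {
        CountedNode a = new CountedNode();
        a.refCount++; // referenced by the variable `one`
        CountedNode b = new CountedNode();
        b.refCount++; // referenced by the variable `two`

        a.setPeer(b); // corresponds to one.setB(two)
        b.setPeer(a); // corresponds to two.setA(one)

        a.refCount--; // one = null
        b.refCount--; // two = null

        // Neither count reaches zero, so a pure reference counting
        // collector would never reclaim these two objects.
        System.out.println(a.refCount + " " + b.refCount); // prints "1 1"
    }
}
```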

Mark and Sweep Garbage Collection

As the name suggests, Mark and Sweep garbage collectors have two phases:
1. Mark Phase
2. Sweep Phase

Mark Phase
During the Mark phase, the GC identifies the objects that are still in use and sets their “mark bit” to true. The search starts from a root set of references held in local variables on the stack or in global variables. Starting from these roots, the GC traverses the object graph (typically depth-first) and marks every object it can reach. Any reachable object that holds a reference to another object keeps that object alive.

Note that in a basic Mark and Sweep collector, the application threads are stopped during the Mark phase so that the object graph cannot change while it is being traversed.


Cyclic references are not an issue for a Mark and Sweep GC. In the diagram above, a cyclic reference exists (shown by the square), but it is unreachable from the root. Such objects are therefore never marked as live, which allows the GC to collect them as garbage.

Sweep Phase
In the sweep phase, all the objects left unmarked by the Mark phase are removed from memory, freeing up space.
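The two phases can be sketched over a toy heap as follows. This is an illustration under simplifying assumptions, not how any particular JVM implements it: ToyHeap and HeapObject are hypothetical names, the "heap" is just a list of objects, and the root set is given explicitly.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

final class HeapObject {
    final String name;
    final List<HeapObject> references = new ArrayList<>();
    boolean marked = false; // the "mark bit"

    HeapObject(String name) {
        this.name = name;
    }
}

final class ToyHeap {
    final List<HeapObject> objects = new ArrayList<>();
    final List<HeapObject> roots = new ArrayList<>();

    HeapObject allocate(String name) {
        HeapObject o = new HeapObject(name);
        objects.add(o);
        return o;
    }

    // Mark phase: depth-first traversal starting from the root set.
    void mark() {
        Deque<HeapObject> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            HeapObject current = stack.pop();
            if (current.marked) {
                continue; // already visited, which also makes cycles harmless
            }
            current.marked = true;
            for (HeapObject ref : current.references) {
                stack.push(ref);
            }
        }
    }

    // Sweep phase: drop every unmarked object and reset the mark bits
    // of the survivors for the next cycle. Returns the number freed.
    int sweep() {
        int freed = 0;
        Iterator<HeapObject> it = objects.iterator();
        while (it.hasNext()) {
            HeapObject o = it.next();
            if (o.marked) {
                o.marked = false;
            } else {
                it.remove();
                freed++;
            }
        }
        return freed;
    }
}

class MarkSweepDemo {
    public static void main(String[] args) {
        ToyHeap heap = new ToyHeap();
        HeapObject root = heap.allocate("root");
        HeapObject child = heap.allocate("child");
        HeapObject x = heap.allocate("x");
        HeapObject y = heap.allocate("y");

        heap.roots.add(root);
        root.references.add(child);
        x.references.add(y); // x and y form a cycle that is
        y.references.add(x); // unreachable from the root

        heap.mark();
        int freed = heap.sweep();
        System.out.println("freed=" + freed + ", live=" + heap.objects.size());
        // prints "freed=2, live=2"
    }
}
```

Because liveness is decided by reachability from the roots rather than by counts, the x/y cycle is collected even though each of the two objects is still referenced by the other.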


As you can observe from the above diagram, plenty of free regions may exist after the sweep phase. However, because this free space is fragmented, the next memory allocation may fail if it is bigger than every individual free region.

To overcome this problem, an additional Compact phase was added.

Mark-Sweep-Compact Garbage Collection

After the sweep phase, the surviving objects are moved next to each other so that the free memory forms one contiguous region. The downside of this approach is a longer GC pause, as the collector has to copy the objects to their new locations and update all references to them.
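A minimal sketch of the compact step, assuming a toy heap made of array slots: heap[i] holds an object (here just a name) or null for a free slot, and refs[i] holds the slot index that object i points to, or -1 for no reference. The class CompactDemo and this array layout are assumptions made for the illustration.

```java
import java.util.Arrays;

class CompactDemo {
    // Slides the live objects to the front of the heap and rewrites
    // their references. Returns the number of live objects.
    static int compact(String[] heap, int[] refs) {
        // Pass 1: compute the new slot ("forwarding address") of each live object.
        int[] forwarding = new int[heap.length];
        Arrays.fill(forwarding, -1);
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                forwarding[i] = next++;
            }
        }

        // Pass 2: move each object and update the reference it holds.
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == null) {
                continue;
            }
            String obj = heap[i];
            int ref = refs[i];
            heap[i] = null;
            refs[i] = -1;
            int target = forwarding[i]; // always <= i, so nothing live is overwritten
            heap[target] = obj;
            refs[target] = (ref == -1) ? -1 : forwarding[ref];
        }
        return next;
    }

    public static void main(String[] args) {
        // A -> B -> C -> A, with free slots in between (fragmentation).
        String[] heap = {"A", null, "B", null, "C"};
        int[] refs = {2, -1, 4, -1, 0};

        int live = compact(heap, refs);
        System.out.println(live + " " + Arrays.toString(heap) + " " + Arrays.toString(refs));
        // prints "3 [A, B, C, null, null] [1, 2, 0, -1, -1]"
    }
}
```

After compaction the two free slots are adjacent, so an allocation needing two slots would now succeed. The extra pass over the heap and the rewriting of references are where the longer pause comes from.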


Mark and Copy Garbage Collector

This is similar to the Mark and Sweep collector, but the memory space is divided into two. Initially, the objects are allocated to one space (fromspace), and the live objects are marked.


During the copy phase, the marked objects are copied into the other space (tospace) and at the same time compacted. Then, the fromspace is cleared out.


After that, the roles of the two spaces are swapped, so that new allocations go into the “new fromspace” (the old tospace becomes the new fromspace). When the “new fromspace” becomes full, the whole process is repeated.
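The copy-and-swap cycle can be sketched like this. Note the simplifications: marking and copying are combined into a single traversal from one root, each object holds at most one reference, and the two spaces are plain lists. SemiSpaceHeap and Cell are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class Cell {
    final String name;
    Cell next; // single outgoing reference, may be null

    Cell(String name) {
        this.name = name;
    }
}

final class SemiSpaceHeap {
    List<Cell> fromSpace = new ArrayList<>();
    List<Cell> toSpace = new ArrayList<>();

    // New objects are always allocated in the current fromspace.
    Cell allocate(String name) {
        Cell c = new Cell(name);
        fromSpace.add(c);
        return c;
    }

    // Copies everything reachable from root into tospace, clears
    // fromspace, then swaps the two spaces. Returns the new root.
    Cell collect(Cell root) {
        Map<Cell, Cell> forwarding = new IdentityHashMap<>();
        Cell newRoot = copy(root, forwarding);
        fromSpace.clear();
        List<Cell> tmp = fromSpace;
        fromSpace = toSpace; // the old tospace becomes the new fromspace
        toSpace = tmp;
        return newRoot;
    }

    private Cell copy(Cell old, Map<Cell, Cell> forwarding) {
        if (old == null) {
            return null;
        }
        Cell existing = forwarding.get(old);
        if (existing != null) {
            return existing; // already copied, so cycles terminate
        }
        Cell moved = new Cell(old.name);
        forwarding.put(old, moved);
        toSpace.add(moved); // survivors end up densely packed in tospace
        moved.next = copy(old.next, forwarding);
        return moved;
    }
}

class SemiSpaceDemo {
    public static void main(String[] args) {
        SemiSpaceHeap heap = new SemiSpaceHeap();
        Cell a = heap.allocate("a");
        Cell b = heap.allocate("b");
        heap.allocate("garbage"); // never referenced from the root
        a.next = b;
        b.next = a;

        Cell root = heap.collect(a);
        System.out.println(heap.fromSpace.size() + " " + root.name + " " + root.next.name);
        // prints "2 a b"
    }
}
```

Only live objects are ever touched, so the cost is proportional to the survivors rather than to the heap size. The price is that half of the memory sits idle at any time.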

Generational Garbage Collector

In generational garbage collection, the memory space is divided into different generations (e.g. young generation and old generation). Initially, all the objects reside in the young generation. When a garbage collection cycle happens, the objects that survive it are promoted to the old generation.



Now, the objects left in the young generation can be cleared as all the live objects are moved to the old generation.

Garbage collection cycles occur less frequently in the old generation than in the young generation. The key idea behind this approach is that most objects die young, while objects that survive a garbage collection tend to live much longer. The frequency of garbage collection can therefore be reduced for objects in the older generations. The number of generations differs between runtimes. For example, the generational collectors in Java’s HotSpot VM use two generations (young and old), while .NET uses three (generations 0, 1 and 2).
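The promotion scheme described above can be sketched with two lists. This is a simplified model: objects are represented by their names, the set of live objects is passed in rather than computed by marking, and survivors are promoted after a single collection as in the description above. GenerationalHeap and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class GenerationalHeap {
    final List<String> young = new ArrayList<>();
    final List<String> old = new ArrayList<>();

    // New objects always start in the young generation.
    void allocate(String obj) {
        young.add(obj);
    }

    // Minor collection: survivors are promoted to the old generation,
    // after which the whole young generation can be cleared.
    void minorCollect(Set<String> live) {
        for (String obj : young) {
            if (live.contains(obj)) {
                old.add(obj);
            }
        }
        young.clear();
    }

    // Major collection: also examines the old generation; meant to run
    // far less often than minorCollect.
    void majorCollect(Set<String> live) {
        old.removeIf(obj -> !live.contains(obj));
    }
}

class GenerationalDemo {
    public static void main(String[] args) {
        GenerationalHeap heap = new GenerationalHeap();
        Set<String> live = new HashSet<>(Arrays.asList("config"));

        heap.allocate("config");
        heap.allocate("temp1");
        heap.allocate("temp2");
        heap.minorCollect(live); // "config" survives and is promoted

        heap.allocate("temp3");
        heap.minorCollect(live); // only the young generation is scanned

        System.out.println("young=" + heap.young + " old=" + heap.old);
        // prints "young=[] old=[config]"
    }
}
```

The second minor collection never looks at "config" again, which is the saving the generational approach is after: long-lived objects stop being re-examined on every cycle.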

It is up to the programmer to decide which garbage collector best suits the application. We can research the garbage collectors that are implemented for the language runtime we use, along with their properties. However, choosing a garbage collector may require thorough testing, as garbage collection directly affects the performance of an application.







garbage collection


//: c04:Garbage.java
// Demonstration of the garbage
// collector and finalization

class Chair {
    static boolean gcrun = false;
    static boolean f = false;
    static int created = 0;
    static int finalized = 0;
    int i;
    Chair() {
        i = ++created;
        if(created == 47)
            System.out.println("Created 47");
    }
    public void finalize() {
        if(!gcrun) {
            // The first time finalize() is called:
            gcrun = true;
            System.out.println(
                "Beginning to finalize after " +
                created + " Chairs have been created");
        }
        if(i == 47) {
            System.out.println(
                "Finalizing Chair #47, " +
                "Setting flag to stop Chair creation");
            f = true;
        }
        finalized++;
        if(finalized >= created)
            System.out.println(
                "All " + finalized + " finalized");
    }
}

public class Garbage {
    public static void main(String[] args) {
        // As long as the flag hasn't been set,
        // make Chairs and Strings:
        while(!Chair.f) {
            new Chair();
            new String("To take up space");
        }
        System.out.println(
            "After all Chairs have been created:\n" +
            "total created = " + Chair.created +
            ", total finalized = " + Chair.finalized);
        // Optional arguments force garbage
        // collection & finalization:
        if(args.length > 0) {
            if(args[0].equals("gc") ||
               args[0].equals("all")) {
                System.out.println("gc():");
                System.gc();
            }
            if(args[0].equals("finalize") ||
               args[0].equals("all")) {
                System.out.println("runFinalization():");
                System.runFinalization();
            }
        }
        System.out.println("bye!");
    }
} ///:~

This is an example about garbage collection from TIJ (Thinking in Java). There are a few small points I don't understand, and I would appreciate some guidance:

What exactly does finalized++; do inside finalize()? Is it incremented once for each object that is cleaned up?

Even when the program is run without a command-line argument, all the objects seem to get cleaned up. On my machine the output looks like this:

Created 47
Beginning to finalize after 3496 Chairs have been created
Finalizing Chair #47, Setting flag to stop Chair creation
All 3496 finalized
After all Chairs have been created:
total created = 3497, total finalized = 3496
bye!

With the command-line argument gc:

Created 47
Beginning to finalize after 3495 Chairs have been
Finalizing Chair #47, Setting flag to stop Chair c
All 3495 finalized
After all Chairs have been created:
total created = 3496, total finalized = 3495
gc():
All 3496 finalized
bye!

With the command-line argument finalize:

Created 47
Beginning to finalize after 3495 Chairs have been created
Finalizing Chair #47, Setting flag to stop Chair creation
All 3495 finalized
After all Chairs have been created:
total created = 3495, total finalized = 3495
runFinalization():
bye!

Why is there no "All 3495 finalized" here? After all, finalized >= created!

If I don't write finalize() myself, how does garbage collection proceed?

Some questions about ClassLoader and Garbage Collection


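1. I came across the following statement in some material:

"In JDK 1.2 and later versions, Sun tightened the garbage collection rules for classes once more. It stipulates that classes loaded through the local and system class loaders are never collected, and that classes loaded through other class loaders can be collected only after the loader itself has been collected."

a. How are the local class loader, the system class loader and other class loaders precisely defined and told apart?

b. Is it really as the statement above describes?

2. The material also says:

"If you are not aware of this characteristic of this version of the Java language, you may well run into the strange problem of classes disappearing (ClassNotFoundException). So that your singleton class can be used in all versions of the Java environment, the author provides a 'keeper' class. It guarantees that your singleton class, or indeed any other object, once handed over to the 'keeper' object, will not be inexplicably collected by the garbage collector until you release it from the 'keeper'."

Listing 6. An implementation of the keeper class.

package com.javapatterns.singleton.demos;
import java.util.Vector;
/**
* This class keeps your objects from garbage collected
*/
public class ObjectKeeper extends Thread
{
    private ObjectKeeper()
    {
        new Thread(this).start();
    }
    public void run()
    {
        try
        {
            join();
        }
        catch (InterruptedException e) {}
    }
    /**
    * Any object passed here will be kept until you call discardObject()
    */
    public static void keepObject(Object myObject)
    {
        System.out.println(" Total number of kept objects: " +
            m_keptObjects.size());
        m_keptObjects.add(myObject);
        System.out.println(" Total number of kept objects: " +
            m_keptObjects.size());
    }
    /**
    * This method will remove the protect of the object you pass in and make it
    * available for Garbage Collector to collect.
    */
    public static void discardObject(Object myObject)
    {
        System.out.println(" Total number of kept objects: " +
            m_keptObjects.size());
        m_keptObjects.remove(myObject);
        System.out.println(" Total number of kept objects: " +
            m_keptObjects.size());
    }
    private static ObjectKeeper m_keeper = new ObjectKeeper();
    private static Vector m_keptObjects = new Vector();
}

"The keeper class should instantiate itself, and only one instance is needed per system. This means that the keeper class should itself be a singleton. Of course, the class-disappearing problem must never happen to the keeper itself. The example provided by the author happens to satisfy all of these requirements."

c. Why does the example above guarantee that the class will not be collected?

d. Why does it say that "the class-disappearing problem must never happen to the keeper itself"?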