Java doesn't have a corresponding operator. In reality, Java doesn't need one. The size of primitive types in Java is defined by the language specification, whereas in C and C++ it depends on the platform. Java has its own IO infrastructure built around serialization. And neither pointer arithmetic nor bulk memory copy applies, because Java doesn't have pointers.
But every Java developer has at some point wondered how much memory is used by a Java object. The answer, it turns out, is not so simple.
The first distinction to be made is between shallow size and deep size. The shallow size of an object is the space occupied by the object alone, not taking into account the size of other objects that it references. The deep size, on the other hand, is the shallow size of the object plus the deep size of each object it references, recursively. Most of the time you will be interested in knowing the deep size of an object, but in order to know that, you need to know how to calculate the shallow size first, which is what I'm going to talk about here.
One complication is that the runtime in-memory structure of Java objects is not mandated by the virtual machine specification, which means that virtual machine providers can implement it as they please. The consequence is that you can write a class, and instances of that class in one VM can occupy a different amount of memory than instances of that same class in another VM. Most of the world, including myself, uses the Sun HotSpot virtual machine though, which simplifies things a lot. The remainder of the discussion will focus on the 32-bit Sun JVM. I will lay down a few rules that will help explain how the JVM organizes the objects' layout in memory.
Memory layout of classes that have no instance attributes
In the Sun JVM, every object (except arrays) has a two-word header. The first word contains the object's identity hash code plus some flags like lock state and age, and the second word contains a reference to the object's class. Also, every object is aligned to an 8-byte granularity; this is the first rule of object memory layout.
Now we know that if we call new Object(), we will be using 8 bytes of the heap for the two header words and nothing else, since the Object class doesn't have any fields.
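The 8-byte alignment rule amounts to rounding a raw size up to the next multiple of 8. Here is a minimal sketch of that arithmetic, assuming the 8-byte header of the 32-bit VM described here. The class and method names are hypothetical, and the code models the calculation rather than measuring a real JVM:

```java
final class ObjectAlignment {
    static final int HEADER_BYTES = 8;      // two 4-byte header words on the 32-bit VM
    static final int OBJECT_ALIGNMENT = 8;  // rule 1: objects are aligned to 8 bytes

    // Rounds size up to the next multiple of alignment (alignment must be a power of two).
    static int alignUp(int size, int alignment) {
        return (size + alignment - 1) & -alignment;
    }

    public static void main(String[] args) {
        // new Object(): header only, already a multiple of 8
        System.out.println(alignUp(HEADER_BYTES, OBJECT_ALIGNMENT));     // 8
        // a header plus one extra byte is padded up to the next boundary
        System.out.println(alignUp(HEADER_BYTES + 1, OBJECT_ALIGNMENT)); // 16
    }
}
```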
Memory layout of classes that extend Object
After the 8 bytes of header, the class attributes follow. Attributes are always aligned in memory to their size. For instance, ints are aligned to a 4-byte granularity, and longs are aligned to an 8-byte granularity. There is a performance reason to do it this way: reading a 4-byte word from memory into a 4-byte processor register is usually much cheaper if the word is aligned to a 4-byte granularity.
In order to save some memory, the Sun VM doesn't lay out object's attributes in the same order they are declared. Instead, the attributes are organized in memory in the following order:
- doubles and longs
- ints and floats
- shorts and chars
- booleans and bytes
- references
This scheme allows for a good optimization of memory usage. For example, imagine you declared the following class:
class MyClass {
    byte a;
    int c;
    boolean d;
    long e;
    Object f;
}
If the JVM didn't reorder the attributes, the object memory layout would be like this:
[HEADER: 8 bytes] 8
[a: 1 byte ] 9
[padding: 3 bytes] 12
[c: 4 bytes] 16
[d: 1 byte ] 17
[padding: 7 bytes] 24
[e: 8 bytes] 32
[f: 4 bytes] 36
[padding: 4 bytes] 40
Notice that 14 bytes would have been wasted on padding and the object would use 40 bytes of memory. By reordering the fields using the order above, the in-memory structure of the object becomes:
[HEADER: 8 bytes] 8
[e: 8 bytes] 16
[c: 4 bytes] 20
[a: 1 byte ] 21
[d: 1 byte ] 22
[padding: 2 bytes] 24
[f: 4 bytes] 28
[padding: 4 bytes] 32
This time, only 6 bytes are used for padding and the object uses only 32 bytes of memory.
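The two layouts of MyClass can be reproduced with a small calculator. This is a sketch only: it assumes the 8-byte header, 4-byte references and the field ordering described above. The class FieldLayoutSketch and its helpers are hypothetical names, and nothing here inspects a running JVM.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class FieldLayoutSketch {
    static final int HEADER_BYTES = 8;

    // A field reduced to what matters for layout: its size and its sort group.
    static final class Field {
        final String name;
        final int size;   // bytes
        final int group;  // 0 = long/double, 1 = int/float, 2 = short/char, 3 = boolean/byte, 4 = reference

        Field(String name, int size, int group) {
            this.name = name;
            this.size = size;
            this.group = group;
        }
    }

    static int alignUp(int n, int alignment) {
        return (n + alignment - 1) & -alignment;
    }

    // Places the fields in the given order, aligning each to its own size,
    // then pads the whole object to an 8-byte boundary.
    static int sizeInGivenOrder(List<Field> fields) {
        int offset = HEADER_BYTES;
        for (Field f : fields) {
            offset = alignUp(offset, f.size) + f.size;
        }
        return alignUp(offset, 8);
    }

    // Sorts the fields by group first (the sort is stable), then lays them out.
    static int sizeReordered(List<Field> fields) {
        List<Field> sorted = new ArrayList<>(fields);
        sorted.sort(Comparator.comparingInt((Field f) -> f.group));
        return sizeInGivenOrder(sorted);
    }

    // The fields of MyClass, in declaration order.
    static List<Field> myClassFields() {
        return Arrays.asList(
            new Field("a", 1, 3), new Field("c", 4, 1), new Field("d", 1, 3),
            new Field("e", 8, 0), new Field("f", 4, 4));
    }

    public static void main(String[] args) {
        System.out.println(sizeInGivenOrder(myClassFields())); // declaration order: 40
        System.out.println(sizeReordered(myClassFields()));    // reordered: 32
    }
}
```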
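So here is rule 2 of object memory layout: attributes are laid out in the order listed above (doubles and longs, then ints and floats, then shorts and chars, then booleans and bytes, and finally references), and each attribute is aligned to its own size.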
Now we know how to calculate the memory used by any instance of a class that extends Object directly. One practical example is the java.lang.Boolean class. Here is its memory layout:
[HEADER: 8 bytes] 8
[value: 1 byte ] 9
[padding: 7 bytes] 16
An instance of the Boolean class takes 16 bytes of memory! Surprised? (Notice the padding at the end to align the object size to an 8 bytes granularity.)
Memory layout of subclasses of other classes
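The next three rules are followed by the JVM to organize the fields of classes that have superclasses. Rule 3 of object memory layout is the following: fields that belong to different classes of the hierarchy are never mixed together; the fields of the superclass come first, laid out according to rule 2, followed by the fields of the subclass.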
Here is an example:
class A {
    long a;
    int b;
    int c;
}

class B extends A {
    long d;
}
An instance of B looks like this in memory:
[HEADER: 8 bytes] 8
[a: 8 bytes] 16
[b: 4 bytes] 20
[c: 4 bytes] 24
[d: 8 bytes] 32
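The next rule applies when the fields of the superclass do not end on a 4-byte boundary. Here is what it says: padding is inserted between the last field of the superclass and the first field of the subclass, so that the subclass fields start on a 4-byte boundary.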
Here is an example:
class A {
    byte a;
}

class B extends A {
    byte b;
}
[HEADER: 8 bytes] 8
[a: 1 byte ] 9
[padding: 3 bytes] 12
[b: 1 byte ] 13
[padding: 3 bytes] 16
Notice the 3 bytes of padding after field a to align b to a 4-byte granularity. That space is lost and cannot be used by fields of class B.
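The final rule is applied to save some space when the first field of the subclass is a long or double and the parent class doesn't end on an 8-byte boundary: the JVM breaks rule 2 and places smaller fields of the subclass ahead of the long or double to fill the gap.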
Here is an example:
class A {
    byte a;
}

class B extends A {
    long b;
    short c;
    byte d;
}
Here is the memory layout:
[HEADER: 8 bytes] 8
[a: 1 byte ] 9
[padding: 3 bytes] 12
[c: 2 bytes] 14
[d: 1 byte ] 15
[padding: 1 byte ] 16
[b: 8 bytes] 24
At byte 12, which is where class A 'ends', the JVM broke rule 2 and stuck a short and a byte before a long, to save 3 out of 4 bytes that would otherwise have been wasted.
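The three subclass examples above can be checked with a rough model of the layout procedure. This sketch assumes that superclass fields come first, that each subclass starts on a 4-byte boundary, and that a 4-byte gap in front of a leading long or double is filled with smaller subclass fields. The order in which smaller fields are tried (the sorted order) is an assumption of the sketch, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class HierarchyLayoutSketch {
    static final class Field {
        final int size;   // bytes
        final int group;  // 0 = long/double, 1 = int/float, 2 = short/char, 3 = boolean/byte, 4 = reference
        Field(int size, int group) { this.size = size; this.group = group; }
    }

    static int alignUp(int n, int alignment) {
        return (n + alignment - 1) & -alignment;
    }

    // classes: one field list per class, ordered from the topmost superclass to the leaf.
    @SafeVarargs
    static int shallowSize(List<Field>... classes) {
        int offset = 8; // header
        for (List<Field> declared : classes) {
            offset = alignUp(offset, 4); // each class starts on a 4-byte boundary
            List<Field> fields = new ArrayList<>(declared);
            fields.sort(Comparator.comparingInt((Field f) -> f.group)); // rule 2 order
            if (!fields.isEmpty() && fields.get(0).size == 8 && offset % 8 != 0) {
                // Gap before the first long/double: fill it with smaller fields that fit.
                int gapEnd = alignUp(offset, 8);
                List<Field> rest = new ArrayList<>();
                for (Field f : fields) {
                    int pos = alignUp(offset, f.size);
                    if (pos + f.size <= gapEnd) {
                        offset = pos + f.size; // field placed inside the gap
                    } else {
                        rest.add(f);           // laid out after the gap
                    }
                }
                offset = gapEnd;
                fields = rest;
            }
            for (Field f : fields) {
                offset = alignUp(offset, f.size) + f.size;
            }
        }
        return alignUp(offset, 8);
    }

    // A { long a; int b; int c; }  B extends A { long d; }
    static int firstExample() {
        return shallowSize(
            Arrays.asList(new Field(8, 0), new Field(4, 1), new Field(4, 1)),
            Arrays.asList(new Field(8, 0)));
    }

    // A { byte a; }  B extends A { byte b; }
    static int secondExample() {
        return shallowSize(Arrays.asList(new Field(1, 3)), Arrays.asList(new Field(1, 3)));
    }

    // A { byte a; }  B extends A { long b; short c; byte d; }
    static int thirdExample() {
        return shallowSize(
            Arrays.asList(new Field(1, 3)),
            Arrays.asList(new Field(8, 0), new Field(2, 2), new Field(1, 3)));
    }

    public static void main(String[] args) {
        System.out.println(firstExample());  // 32
        System.out.println(secondExample()); // 16
        System.out.println(thirdExample());  // 24
    }
}
```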
Memory layout of arrays
Arrays have an extra header field that contains the value of the 'length' variable. The array elements follow, and arrays, like any regular object, are also aligned to an 8-byte boundary.
Here is the layout of a byte array with 3 elements:
[HEADER: 12 bytes] 12
[[0]: 1 byte ] 13
[[1]: 1 byte ] 14
[[2]: 1 byte ] 15
[padding: 1 byte ] 16
And here is the layout of a long array with 3 elements:
[HEADER: 12 bytes] 12
[padding: 4 bytes] 16
[[0]: 8 bytes] 24
[[1]: 8 bytes] 32
[[2]: 8 bytes] 40
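Both array layouts follow from the same arithmetic: a 12-byte header, padding so that the first element is aligned to the element size, the elements themselves, and a final round-up to 8 bytes. The sketch below assumes exactly that. The element-alignment step is inferred from the long array example, and the names are hypothetical.

```java
final class ArraySizeSketch {
    static final int ARRAY_HEADER_BYTES = 12; // 8-byte object header + 4-byte length field

    static int alignUp(int n, int alignment) {
        return (n + alignment - 1) & -alignment;
    }

    // elementSize: 1, 2, 4 or 8 bytes; length: number of elements.
    static int arraySize(int elementSize, int length) {
        int firstElement = alignUp(ARRAY_HEADER_BYTES, elementSize); // pad so elements are aligned
        return alignUp(firstElement + elementSize * length, 8);      // pad the whole array to 8 bytes
    }

    public static void main(String[] args) {
        System.out.println(arraySize(1, 3)); // byte[3]: 16
        System.out.println(arraySize(8, 3)); // long[3]: 40
    }
}
```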
Memory layout of inner classes
Non-static inner classes have an extra 'hidden' field that holds a reference to the enclosing instance of the outer class. This field is a regular reference and follows the in-memory layout rules for references. For this reason, instances of inner classes cost an extra 4 bytes.
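The hidden field can be observed through reflection, since getDeclaredFields() also returns compiler-generated (synthetic) fields. The sketch below uses hypothetical classes Outer and Inner. The inner class deliberately touches a field of the outer instance, because some recent compilers may omit the hidden reference when it is never used. javac conventionally names the field this$0, but the code only relies on its type and synthetic flag.

```java
import java.lang.reflect.Field;

final class Outer {
    private int counter;

    // Non-static inner class that uses its enclosing instance.
    final class Inner {
        void bump() {
            counter++;
        }
    }

    // Counts compiler-generated fields of c whose type is Outer.
    static int countHiddenOuterReferences(Class<?> c) {
        int count = 0;
        for (Field f : c.getDeclaredFields()) {
            if (f.isSynthetic() && f.getType() == Outer.class) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Inner declares no fields in source, yet one field shows up here.
        for (Field f : Inner.class.getDeclaredFields()) {
            System.out.println(f.getType().getSimpleName() + " " + f.getName()
                + " synthetic=" + f.isSynthetic());
        }
        System.out.println(countHiddenOuterReferences(Inner.class));
    }
}
```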
Final thoughts
We have learned how to calculate the shallow size of any Java object in the 32 bit Sun JVM. Knowing how memory is structured can help you understand how much memory is used by instances of your classes.
In the next post I will show code that puts it all together and uses reflection to calculate the deep size of an object.
Understanding What Memory Alignment Means
Most CPUs require that objects and variables reside at particular offsets in the system's memory. For example, 32-bit processors require a 4-byte integer to reside at a memory address that is evenly divisible by 4. This requirement is called "memory alignment". Thus, a 4-byte int can be located at memory address 0x2000 or 0x2004, but not at 0x2001. On most Unix systems, an attempt to use misaligned data results in a bus error, which terminates the program altogether. On Intel processors, the use of misaligned data is supported but at a substantial performance penalty. Therefore, most compilers automatically align data variables according to their type and the particular processor being used. This is why the size that structs and classes occupy is often larger than the sum of their members' size:
struct Employee
{
    int ID;
    char state[3]; // CA, NY etc. + terminating null
    int salary;
};
At first glance, Employee should occupy 11 bytes (4+3+4). However, most compilers add an unused padding byte after the field 'state' so that the next field, 'salary', aligns on a 4-byte boundary. Consequently, Employee occupies 12 bytes rather than 11. You can examine the actual size of an aggregate by using the expression sizeof(Employee).