Measuring How Much Memory a Java Object Occupies (repost)

Don't pay the price for hidden class fields
By Vladimir Roubtsov, JavaWorld.com, 08/16/02

Recently, I helped design a Java server application that resembled an in-memory database. That is, we biased the design toward caching tons of data in memory to provide super-fast query performance.

Once we got the prototype running, we naturally decided to profile the data memory footprint after it had been parsed and loaded from disk. The unsatisfactory initial results, however, prompted me to search for explanations.

Since Java purposefully hides many aspects of memory management, discovering how much memory your objects consume takes some work. You could use the Runtime.freeMemory() method to measure heap size differences before and after several objects have been allocated. Several articles, such as Ramchander Varadarajan's "Question of the Week No. 107" (Sun Microsystems, September 2000) and Tony Sintes's "Memory Matters" (JavaWorld, December 2001), detail that idea. Unfortunately, the former article's solution fails because the implementation employs a wrong Runtime method, while the latter article's solution has its own imperfections:

   
  • A single call to Runtime.freeMemory() proves insufficient because a JVM may decide to increase its current heap size at any time (especially when it runs garbage collection). Unless the total heap size is already at the -Xmx maximum size, we should use Runtime.totalMemory()-Runtime.freeMemory() as the used heap size.
   
  • Executing a single Runtime.gc() call may not prove sufficiently aggressive for requesting garbage collection. We could, for example, request object finalizers to run as well. And since Runtime.gc() is not documented to block until collection completes, it is a good idea to wait until the perceived heap size stabilizes.

   
  • If the profiled class creates any static data as part of its per-class class initialization (including static class and field initializers), the heap memory used for the first class instance may include that data. We should ignore heap space consumed by the first class instance.


Considering those problems, I present Sizeof, a tool with which I snoop at various Java core and application classes:

public class Sizeof
{
    public static void main (String [] args) throws Exception
    {
        // Warm up all classes/methods we will use
        runGC ();
        usedMemory ();
        // Array to keep strong references to allocated objects
        final int count = 100000;
        Object [] objects = new Object [count];
        
        long heap1 = 0;
        // Allocate count+1 objects, discard the first one
        for (int i = -1; i < count; ++ i)
        {
            Object object = null;
            
            // Instantiate your data here and assign it to object
            
            object = new Object ();
            //object = new Integer (i);
            //object = new Long (i);
            //object = new String ();
            //object = new byte [128][1];
            
            if (i >= 0)
                objects [i] = object;
            else
            {
                object = null; // Discard the warm up object
                runGC ();
                heap1 = usedMemory (); // Take a before heap snapshot
            }
        }
        runGC ();
        long heap2 = usedMemory (); // Take an after heap snapshot:
        
        final int size = Math.round (((float)(heap2 - heap1))/count);
        System.out.println ("'before' heap: " + heap1 +
                            ", 'after' heap: " + heap2);
        System.out.println ("heap delta: " + (heap2 - heap1) +
            ", {" + objects [0].getClass () + "} size = " + size + " bytes");
        for (int i = 0; i < count; ++ i) objects [i] = null;
        objects = null;
    }
    private static void runGC () throws Exception
    {
        // It helps to call Runtime.gc()
        // using several method calls:
        for (int r = 0; r < 4; ++ r) _runGC ();
    }
    private static void _runGC () throws Exception
    {
        long usedMem1 = usedMemory (), usedMem2 = Long.MAX_VALUE;
        for (int i = 0; (usedMem1 < usedMem2) && (i < 500); ++ i)
        {
            s_runtime.runFinalization ();
            s_runtime.gc ();
            Thread.yield (); // yield() is static, so call it on the class
            
            usedMem2 = usedMem1;
            usedMem1 = usedMemory ();
        }
    }
    private static long usedMemory ()
    {
        return s_runtime.totalMemory () - s_runtime.freeMemory ();
    }
    
    private static final Runtime s_runtime = Runtime.getRuntime ();
} // End of class



Sizeof's key methods are runGC() and usedMemory(). I use a runGC() wrapper method to call _runGC() several times because it appears to make the method more aggressive. (I am not sure why, but it's possible creating and destroying a method call-stack frame causes a change in the reachability root set and prompts the garbage collector to work harder. Moreover, consuming a large fraction of the heap space to create enough work for the garbage collector to kick in also helps. In general, it is hard to ensure everything is collected. The exact details depend on the JVM and garbage collection algorithm.)

Note carefully the places where I invoke runGC(). You can edit the code between the heap1 and heap2 declarations to instantiate anything of interest.

Also note how Sizeof prints the object size: the transitive closure of data required by all count class instances, divided by count. For most classes, the result will be memory consumed by a single class instance, including all of its owned fields. That memory footprint value differs from data provided by many commercial profilers that report shallow memory footprints (for example, if an object has an int[] field, its memory consumption will appear separately).
The results

Let's apply this simple tool to a few classes, then see if the results match our expectations.

Note: The following results are based on Sun's JDK 1.3.1 for Windows. Due to what is and is not guaranteed by the Java language and JVM specifications, you cannot apply these specific results to other platforms or other Java implementations.
java.lang.Object

Well, the root of all objects just had to be my first case. For java.lang.Object, I get:

'before' heap: 510696, 'after' heap: 1310696
heap delta: 800000, {class java.lang.Object} size = 8 bytes



So, a plain Object takes 8 bytes; of course, no one should expect the size to be 0, as every instance must carry around fields that support base operations like equals(), hashCode(), wait()/notify(), and so on.
java.lang.Integer

My colleagues and I frequently wrap native ints into Integer instances so we can store them in Java collections. How much does it cost us in memory?

'before' heap: 510696, 'after' heap: 2110696
heap delta: 1600000, {class java.lang.Integer} size = 16 bytes




The 16-byte result is a little worse than I expected because an int value can fit into just 4 extra bytes. Using an Integer costs me a 300 percent memory overhead compared to when I can store the value as a primitive type.
java.lang.Long

Long should take more memory than Integer, but it does not:

'before' heap: 510696, 'after' heap: 2110696
heap delta: 1600000, {class java.lang.Long} size = 16 bytes


Clearly, actual object size on the heap is subject to low-level memory alignment done by a particular JVM implementation for a particular CPU type. It looks like a Long is 8 bytes of Object overhead, plus 8 bytes more for the actual long value. In contrast, Integer had an unused 4-byte hole, most likely because the JVM I use forces object alignment on an 8-byte word boundary.
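
The alignment arithmetic described above can be sketched in a few lines. This is only a model, assuming the 8-byte object header and 8-byte alignment inferred from the JDK 1.3.1 measurements; the class name AlignmentModel and its methods are hypothetical, and other JVMs may use different values.

```java
final class AlignmentModel {
    // Assumptions taken from the measurements above, not from any specification.
    static final int HEADER_BYTES = 8;
    static final int ALIGNMENT = 8;

    // Round a byte count up to the next multiple of ALIGNMENT.
    static int alignUp(int bytes) {
        return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Estimated heap size of an instance whose fields occupy fieldBytes.
    static int instanceSize(int fieldBytes) {
        return alignUp(HEADER_BYTES + fieldBytes);
    }

    public static void main(String[] args) {
        System.out.println("Object:  " + instanceSize(0) + " bytes");
        System.out.println("Integer: " + instanceSize(4) + " bytes"); // one int field
        System.out.println("Long:    " + instanceSize(8) + " bytes"); // one long field
    }
}
```

Under these assumptions an Integer needs 12 bytes, which rounds up to 16, while a Long needs exactly 16, which is why the two measure the same.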
Arrays

Playing with primitive type arrays proves instructive, partly to discover any hidden overhead and partly to justify another popular trick: wrapping primitive values in a size-1 array to use them as objects. By modifying Sizeof.main() to have a loop that increments the created array length on every iteration, I get for int arrays:

length: 0, {class [I} size = 16 bytes
length: 1, {class [I} size = 16 bytes
length: 2, {class [I} size = 24 bytes
length: 3, {class [I} size = 24 bytes
length: 4, {class [I} size = 32 bytes
length: 5, {class [I} size = 32 bytes
length: 6, {class [I} size = 40 bytes
length: 7, {class [I} size = 40 bytes
length: 8, {class [I} size = 48 bytes
length: 9, {class [I} size = 48 bytes
length: 10, {class [I} size = 56 bytes



and for char arrays:

length: 0, {class [C} size = 16 bytes
length: 1, {class [C} size = 16 bytes
length: 2, {class [C} size = 16 bytes
length: 3, {class [C} size = 24 bytes
length: 4, {class [C} size = 24 bytes
length: 5, {class [C} size = 24 bytes
length: 6, {class [C} size = 24 bytes
length: 7, {class [C} size = 32 bytes
length: 8, {class [C} size = 32 bytes
length: 9, {class [C} size = 32 bytes
length: 10, {class [C} size = 32 bytes




Above, the evidence of 8-byte alignment pops up again. Also, in addition to the inevitable Object 8-byte overhead, a primitive array adds another 8 bytes (out of which at least 4 bytes support the length field). And using int[1] appears to not offer any memory advantages over an Integer instance, except maybe as a mutable version of the same data.
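
One model that reproduces both tables is a 12-byte array header (the 8-byte object header plus a 4-byte length field) followed by the elements, with the total rounded up to a multiple of 8. The sketch below is an inference from the measured numbers, not a documented JVM layout; ArraySizeModel and its methods are hypothetical names.

```java
final class ArraySizeModel {
    // Assumed: 8-byte object header + 4-byte length field (inferred, not documented).
    static final int ARRAY_HEADER_BYTES = 12;

    // Round up to the next multiple of 8.
    static int alignUp8(int bytes) {
        return (bytes + 7) & ~7;
    }

    // Estimated heap size of a primitive array with the given element width.
    static int arraySize(int length, int elementBytes) {
        return alignUp8(ARRAY_HEADER_BYTES + length * elementBytes);
    }

    public static void main(String[] args) {
        for (int length = 0; length <= 10; ++length) {
            System.out.println("length: " + length
                + ", int[] = " + arraySize(length, 4) + " bytes"
                + ", char[] = " + arraySize(length, 2) + " bytes");
        }
    }
}
```

With these assumptions int[1] costs 16 bytes, the same as the measured Integer.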
Multidimensional arrays

Multidimensional arrays offer another surprise. Developers commonly employ constructs like int[dim1][dim2] in numerical and scientific computing. In an int[dim1][dim2] array instance, every nested int[dim2] array is an Object in its own right. Each adds the usual 16-byte array overhead. When I don't need a triangular or ragged array, that represents pure overhead. The impact grows when array dimensions greatly differ. For example, an int[128][2] instance takes 3,600 bytes. Compared to the 1,040 bytes an int[256] instance uses (which has the same capacity), 3,600 bytes represent a 246 percent overhead. In the extreme case of byte[256][1], the overhead factor is almost 19! Compare that to the C/C++ situation in which the same syntax does not add any storage overhead.
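
The figures quoted above can be re-derived with a little arithmetic. The sketch below assumes 4-byte object references, a 12-byte array header, and 8-byte alignment, all inferred from the JDK 1.3.1 measurements rather than guaranteed by any specification; NestedArrayCost is a hypothetical name.

```java
final class NestedArrayCost {
    static final int ARRAY_HEADER_BYTES = 12; // assumed: 8-byte header + 4-byte length
    static final int REFERENCE_BYTES = 4;     // assumed: 32-bit JVM references

    static int alignUp8(int bytes) {
        return (bytes + 7) & ~7;
    }

    // Estimated size of a one-dimensional array.
    static int arraySize(int length, int elementBytes) {
        return alignUp8(ARRAY_HEADER_BYTES + length * elementBytes);
    }

    // Estimated size of T[dim1][dim2]: one array of references plus dim1 nested arrays.
    static int nestedSize(int dim1, int dim2, int elementBytes) {
        int outer = arraySize(dim1, REFERENCE_BYTES);
        int inner = arraySize(dim2, elementBytes);
        return outer + dim1 * inner;
    }

    public static void main(String[] args) {
        System.out.println("int[128][2]:  " + nestedSize(128, 2, 4) + " bytes");
        System.out.println("int[256]:     " + arraySize(256, 4) + " bytes");
        System.out.println("byte[256][1]: " + nestedSize(256, 1, 1) + " bytes");
        System.out.println("byte[256]:    " + arraySize(256, 1) + " bytes");
    }
}
```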
java.lang.String

Let's try an empty String, first constructed as new String():

'before' heap: 510696, 'after' heap: 4510696
heap delta: 4000000, {class java.lang.String} size = 40 bytes




The result proves quite depressing. An empty String takes 40 bytes—enough memory to fit 20 Java characters.

Before I try Strings with content, I need a helper method to create Strings guaranteed not to get interned. Merely using literals as in:

    object = "string with 20 chars";




will not work because all such object handles will end up pointing to the same String instance. The language specification dictates such behavior (see also the java.lang.String.intern() method). Therefore, to continue our memory snooping, try:

    public static String createString (final int length)
    {
        char [] result = new char [length];
        for (int i = 0; i < length; ++ i) result [i] = (char) i;
        
        return new String (result);
    }




After arming myself with this String creator method, I get the following results:

length: 0, {class java.lang.String} size = 40 bytes
length: 1, {class java.lang.String} size = 40 bytes
length: 2, {class java.lang.String} size = 40 bytes
length: 3, {class java.lang.String} size = 48 bytes
length: 4, {class java.lang.String} size = 48 bytes
length: 5, {class java.lang.String} size = 48 bytes
length: 6, {class java.lang.String} size = 48 bytes
length: 7, {class java.lang.String} size = 56 bytes
length: 8, {class java.lang.String} size = 56 bytes
length: 9, {class java.lang.String} size = 56 bytes
length: 10, {class java.lang.String} size = 56 bytes




The results clearly show that a String's memory growth tracks its internal char array's growth. However, the String class adds another 24 bytes of overhead. For a nonempty String of size 10 characters or less, the added overhead cost relative to useful payload (2 bytes for each char plus 4 bytes for the length), ranges from 100 to 400 percent.

Of course, the penalty depends on your application's data distribution. Somehow I suspected that 10 characters represents the typical String length for a variety of applications. To get a concrete data point, I instrumented the SwingSet2 demo (by modifying the String class implementation directly) that came with JDK 1.3.x to track the lengths of the Strings it creates. After a few minutes playing with the demo, a data dump showed that about 180,000 Strings were instantiated. Sorting them into size buckets confirmed my expectations:
[0-10]:  96481
[10-20]: 27279
[20-30]: 31949
[30-40]: 7917
[40-50]: 7344
[50-60]: 3545
[60-70]: 1581
[70-80]: 1247
[80-90]: 874
...



That's right, more than 50 percent of all String lengths fell into the 0-10 bucket, the very hot spot of String class inefficiency!

In reality, Strings can consume even more memory than their lengths suggest: Strings generated out of StringBuffers (either explicitly or via the '+' concatenation operator) likely have char arrays with lengths larger than the reported String lengths because StringBuffers typically start with a capacity of 16, then double it on append() operations. So, for example, createString(1) + ' ' ends up with a char array of size 16, not 2.
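
The spare capacity is easy to observe on the buffer itself. The sketch below only inspects StringBuffer; whether a String created from the buffer keeps that spare room depends on the JDK version (the observation above applies to the JDK 1.3.1 implementation). BufferCapacityDemo is a hypothetical name.

```java
final class BufferCapacityDemo {
    // Capacity of a freshly constructed, empty StringBuffer.
    static int initialCapacity() {
        return new StringBuffer().capacity();
    }

    // Unused char slots left in a buffer after appending the given text.
    static int spareCapacity(String text) {
        StringBuffer buffer = new StringBuffer();
        buffer.append(text);
        return buffer.capacity() - buffer.length();
    }

    public static void main(String[] args) {
        System.out.println("initial capacity: " + initialCapacity());
        System.out.println("spare slots after 2 chars:  " + spareCapacity("ab"));
        // 17 chars exceed the initial capacity and force the buffer to grow.
        System.out.println("spare slots after 17 chars: " + spareCapacity("0123456789abcdefg"));
    }
}
```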
What do we do?

"This is all very well, but we don't have any choice but to use Strings and other types provided by Java, do we?" I hear you ask. Let's find out.
Wrapper classes

Wrapper classes like java.lang.Integer seem a bad choice for storing large data amounts in memory. If you strive to be memory-economic, avoid them altogether. Rolling your own vector class for primitive ints isn't difficult. Of course, it would be great if the Java core API already contained such libraries. Perhaps the situation will improve when Java has generic types.
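
A growable vector of primitive ints might look like the sketch below. It is a minimal illustration, not a tuned implementation: the class name IntVector, the doubling growth policy, and the default capacity are all assumptions made for the example.

```java
import java.util.Arrays;

final class IntVector {
    private int[] data;
    private int size;

    IntVector() {
        this(16);
    }

    IntVector(int initialCapacity) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("negative capacity: " + initialCapacity);
        data = new int[initialCapacity];
    }

    // Append a value, doubling the backing array when it is full.
    void add(int value) {
        if (size == data.length)
            data = Arrays.copyOf(data, Math.max(1, data.length * 2));
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size)
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        return data[index];
    }

    int size() {
        return size;
    }

    // Drop unused capacity once the vector has stopped growing.
    void trimToSize() {
        data = Arrays.copyOf(data, size);
    }

    public static void main(String[] args) {
        IntVector v = new IntVector();
        for (int i = 0; i < 100; ++i) v.add(i * i);
        v.trimToSize();
        System.out.println("size = " + v.size() + ", last = " + v.get(v.size() - 1));
    }
}
```

Each element then costs 4 bytes in the backing array instead of a reference plus a separate Integer instance.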
Multidimensional arrays

For large data structures built with multidimensional arrays, you can oftentimes reduce the extra dimension overhead by an easy indexing change: convert every int[dim1][dim2] instance to an int[dim1*dim2] instance and change all expressions like a[i][j] to a[i*dim2 + j]. Of course, you pay a price: you lose the separate index-range check on each dimension (although dropping it also boosts performance).
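
The flattening can be hidden behind a small wrapper. The sketch below uses row-major indexing, in which element (i, j) lives at i * cols + j, where cols is the second dimension; IntMatrix is a hypothetical name.

```java
final class IntMatrix {
    private final int rows;
    private final int cols;
    private final int[] cells; // one flat array instead of rows nested arrays

    IntMatrix(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.cells = new int[rows * cols];
    }

    int get(int i, int j) {
        return cells[i * cols + j];
    }

    void set(int i, int j, int value) {
        cells[i * cols + j] = value;
    }

    int rows() { return rows; }
    int cols() { return cols; }

    public static void main(String[] args) {
        IntMatrix m = new IntMatrix(3, 4);
        m.set(1, 2, 42);
        System.out.println(m.get(1, 2));
        // Only the combined index is range-checked: (0, 6) silently aliases (1, 2).
        System.out.println(m.get(0, 6));
    }
}
```

The second call in main shows the price of the trick: an out-of-range column index is not rejected as long as the combined index stays inside the flat array.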
java.lang.String

You can try a few simple tricks to reduce your application's String static memory size.

First, you can try one common technique when an application loads and caches many Strings from a data file or a network connection, and the String value range proves limited. For example, suppose you parse an XML file in which you frequently encounter a certain attribute, but the attribute is limited to just two possible values. Your goal: filter all Strings through a hash map and reduce all equal but distinct Strings to identical object references:

    public String internString (String s)
    {
        if (s == null) return null;
        
        String is = (String) m_strings.get (s);
        if (is != null)
            return is;
        else
        {
            m_strings.put (s, s);
            return s;
        }
    }
    
    private Map m_strings = new HashMap ();


When applicable, that trick can cut your static memory requirements severalfold. An experienced reader may observe that the trick duplicates java.lang.String.intern()'s functionality. Numerous reasons exist to avoid the String.intern() method. One is that few modern JVMs can intern large amounts of data.
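
To see the effect, the same idea can be wrapped in a small pool class and checked with reference comparisons. This sketch uses generics and Map.putIfAbsent, which postdate the JDK discussed in the article; StringPool is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;

final class StringPool {
    private final Map<String, String> strings = new HashMap<>();

    // Return the canonical instance for s, registering s if none exists yet.
    String internString(String s) {
        if (s == null) return null;
        String existing = strings.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    int size() {
        return strings.size();
    }

    public static void main(String[] args) {
        StringPool pool = new StringPool();
        // new String(...) forces two equal but distinct instances.
        String a = new String("true");
        String b = new String("true");
        System.out.println("distinct before pooling: " + (a != b));
        System.out.println("identical after pooling: "
            + (pool.internString(a) == pool.internString(b)));
        System.out.println("pool entries: " + pool.size());
    }
}
```

After pooling, the second copy becomes garbage as soon as the caller drops it, so only one instance per distinct value stays cached.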

What if your Strings are all different? For the second trick, recollect that for small Strings the underlying char array takes half the memory occupied by the String that wraps it. Thus, when my application caches many distinct String values, I can just keep the arrays in memory and convert them to Strings as needed. That works well if each such String then serves as a transient, quickly discarded object. A simple experiment with caching 90,000 words taken from a sample dictionary file shows that this data takes about 5.6 MB in String form and only 3.4 MB in char[] form, a reduction of roughly 39 percent (put another way, the String form is about 65 percent larger).

The second trick contains one obvious disadvantage: you cannot convert a char[] back to a String through a constructor that would take ownership of the array without cloning it. Why? Because the entire public String API ensures that every String is immutable, so every String constructor defensively clones input data passed through its parameters.

Still, you can try a third trick when the cost of converting from char arrays to Strings proves too high. The trick exploits java.lang.String.substring()'s ability to avoid data copying: the method implementation exploits String immutability and creates a shallow String object that shares the char content array with the original String but has its internal start and end indices adjusted correspondingly. For example, new String("smiles").substring(1,5) is a String configured to start at index 1 and end at index 4 within a char buffer "smiles" shared by reference with the originally constructed String. You can exploit that fact as follows: given a large String set, you can merge its char content into one large char array, create a String out of it, and recreate the original Strings as substrings of this master String, as the following method illustrates:
    public static String [] compactStrings (String [] strings)
    {
        String [] result = new String [strings.length];
        int offset = 0;
        
        for (int i = 0; i < strings.length; ++ i)
            offset += strings [i].length ();
        
        // Can't use StringBuffer due to how it manages capacity
        char [] allchars = new char [offset];
        
        offset = 0;
        for (int i = 0; i < strings.length; ++ i)
        {
            strings [i].getChars (0, strings [i].length (), allchars, offset);
            offset += strings [i].length ();
        }
        
        String allstrings = new String (allchars);
        
        offset = 0;
        for (int i = 0; i < strings.length; ++ i)
            result [i] = allstrings.substring (offset,
                                             offset += strings [i].length ());
        
        return result;
    }


The above method returns a new set of Strings equivalent to the input set but more compact in memory. Recollect from earlier measurements that every char[] adds 16 bytes of overhead; this method effectively removes that overhead. The savings could be significant when cached data comprises mostly short Strings. When you apply this trick to the same 90,000-word dictionary mentioned above, the memory size drops from 5.6 MB to 4.2 MB, a 25 percent reduction. (An astute reader will observe that in this particular example the Strings tend to share many prefixes, so the compactStrings() method could be further optimized to reduce the merged char array's size.)

As a side effect, compactStrings() also removes the StringBuffer-related inefficiencies mentioned earlier.
Is it worth the effort?

To many, the techniques I presented may seem like micro-optimizations not worth the time it takes to implement them. However, remember the applications I had in mind: server-side applications that cache massive amounts of data in memory to achieve performance impossible when data comes from a disk or database. Several hundred megabytes of cached data represents a noticeable fraction of maximum heap sizes of today's 32-bit JVMs. Shaving 30 percent or more off is nothing to scoff at; it could push an application's scalability limits quite noticeably. Of course, these tricks cannot substitute for beginning with well-designed data structures and profiling your application to determine its actual hot spots. In any case, you're now more aware of how much memory your objects consume.
Author Bio
Vladimir Roubtsov has programmed in a variety of languages for more than 12 years, including Java since 1995. Currently, he develops enterprise software as a senior developer for Trilogy in Austin, Texas. When coding for fun, Vladimir develops software tools based on Java byte code or source code instrumentation.  