Java performance programming, Part 2: The cost of casting

http://www.javaworld.com/javaworld/jw-12-1999/jw-12-performance.html

Sidebar: Casting issues and JVM performance

Interfaces and method calls

Table 1 shows how different method-call types affect the performance of several JVMs. The values in this table were obtained by running a test program that executes a small calculation (integer multiply and add) with various method-call wrappers. Comparing the time for a test that uses a particular type of method call with the basic calculation time gives an idea of the overhead that type of call adds.

**Table 1. Method call performance (time in seconds; lower scores are better)**
| Test | JRE 1.1.8 (Sun) | JRE 1.1.8 (IBM) | JRE 1.2.2 (Classic) | JRE 1.2.2 (HotSpot 2.0 beta) |
| --- | --- | --- | --- | --- |
| 1. Inline calculation | 0.6 | 0.3 | 0.3 | 0.6 |
| 2. Called calculation method | 0.9 | 1.7 | 0.9 | 0.8 |
| 3. Accessor method call | 2.0 | 2.8 | 2.1 | 0.7 |
| 4. Overridden method call | 3.2 | 3.7 | 3.2 | 5.6 |
| 5. Interface method call | 5.5 | 5.2 | 5.5 | 6.5 |
| 6. Object allocation time | 51.4 | 8.6 | 34.0 | 16.1 |

Line 1 gives the time for the calculation loop (20 million iterations in this test) run as a direct loop using a local variable, while line 2 gives the time for a simple method call with the initial value passed as a parameter and the result value returned. Given the simple nature of the calculation step, it's not surprising that adding a method call around it can slow the loop considerably. Only the IBM JVM shows a dramatic decrease in speed for this change, though, suggesting that it may not optimize this type of method call as well as the other JVMs do.
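To put these timings on a per-iteration scale, the difference from the line 1 baseline can be divided by the 20 million iterations of the loop. The sketch below does that arithmetic for two columns of Table 1. The class and method names are hypothetical; only the timings and the iteration count come from the article.

```java
// Hypothetical helper: turns two Table 1 timings into per-iteration overhead.
final class CallOverhead {
    // Iteration count stated in the article for the calculation loop.
    static final long ITERATIONS = 20_000_000L;

    /** Extra nanoseconds per loop iteration relative to the baseline time. */
    static double overheadNanos(double baselineSeconds, double measuredSeconds) {
        return (measuredSeconds - baselineSeconds) * 1e9 / ITERATIONS;
    }

    public static void main(String[] args) {
        // Line 1 (inline) versus line 2 (called method), values copied from Table 1.
        System.out.printf("Sun JRE 1.1.8: %.0f ns%n", overheadNanos(0.6, 0.9));
        System.out.printf("IBM JRE 1.1.8: %.0f ns%n", overheadNanos(0.3, 1.7));
    }
}
```

This works out to roughly 15 ns per iteration for the Sun JRE 1.1.8 and roughly 70 ns for the IBM JRE 1.1.8, which is the gap the text describes as dramatic.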

Line 3 adds a twist to the simple method-call test of line 2, adding get and set method accesses to a variable in an object for each calculation step. This again slows the calculation, but not by a large amount.

These first three lines show almost identical times when using the HotSpot Server JVM, suggesting that it's converting the method calls into inline code. In the test on line 4, this breaks down, though. For this test, get and set methods that are overridden in a subclass are used. Even though the test uses the parent class implementation (and not the subclass with the overriding method defined), this form of call runs slower on all JVMs than calls to methods that are never overridden. The performance degradation is especially marked for the HotSpot version tested.

Line 5 shows the results of using an interface for the get and set method calls, rather than class-method calls. An interface method call requires more work by the JVM than a class-method call, and this shows in the test results. All JVMs tested ran this test slower than either a base or overridden class-method call, and, for all but HotSpot, the difference was substantial even compared to the slower overridden class-method call.
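The article does not include the source of its test program, so the following is only a sketch of what the five call forms in Table 1 might look like. Every name here (`CallForms`, `PlainHolder`, `BaseHolder`, `SubHolder`, `ValueAccess`) and the exact calculation `v * 3 + 1` are assumptions made for illustration.

```java
// Hypothetical reconstruction: the article does not show its test source.
final class CallForms {
    // Assumed stand-in for the article's "integer multiply and add" step.
    static int step(int v) {
        return v * 3 + 1;
    }

    // Line 1: the calculation written directly in the loop.
    static int inline(int n) {
        int v = 1;
        for (int i = 0; i < n; i++) {
            v = v * 3 + 1;
        }
        return v;
    }

    // Line 2: the same calculation behind a plain method call.
    static int called(int n) {
        int v = 1;
        for (int i = 0; i < n; i++) {
            v = step(v);
        }
        return v;
    }

    // Line 3: get and set accessors that no loaded class overrides.
    static int accessor(PlainHolder h, int n) {
        h.setValue(1);
        for (int i = 0; i < n; i++) {
            h.setValue(step(h.getValue()));
        }
        return h.getValue();
    }

    // Line 4: accessors overridden in a subclass, called on a parent instance.
    static int overridden(BaseHolder h, int n) {
        h.setValue(1);
        for (int i = 0; i < n; i++) {
            h.setValue(step(h.getValue()));
        }
        return h.getValue();
    }

    // Line 5: the accessors reached through an interface reference.
    static int viaInterface(ValueAccess h, int n) {
        h.setValue(1);
        for (int i = 0; i < n; i++) {
            h.setValue(step(h.getValue()));
        }
        return h.getValue();
    }

    public static void main(String[] args) {
        int n = 20_000_000;  // iteration count given in the article
        new SubHolder();     // make sure the overriding subclass is loaded
        long t0 = System.nanoTime();
        int a = inline(n);
        long t1 = System.nanoTime();
        int b = viaInterface(new PlainHolder(), n);
        long t2 = System.nanoTime();
        System.out.println("inline:    " + (t1 - t0) / 1e6 + " ms, result " + a);
        System.out.println("interface: " + (t2 - t1) / 1e6 + " ms, result " + b);
        System.out.println("same result for the other forms: "
                + (called(n) == a && accessor(new PlainHolder(), n) == a
                        && overridden(new BaseHolder(), n) == a));
    }
}

interface ValueAccess {
    int getValue();
    void setValue(int value);
}

class PlainHolder implements ValueAccess {
    private int value;
    public int getValue() { return value; }
    public void setValue(int value) { this.value = value; }
}

class BaseHolder {
    private int value;
    int getValue() { return value; }
    void setValue(int value) { this.value = value; }
}

// Exists only so that BaseHolder's accessors have an override loaded in the JVM.
class SubHolder extends BaseHolder {
    @Override int getValue() { return super.getValue(); }
    @Override void setValue(int value) { super.setValue(value); }
}
```

Timings from a sketch like this on a modern JVM should not be expected to resemble Table 1; the point is only to show which call form each table line refers to.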

To offer some perspective on these results, the last line of the table shows the time taken to allocate the equivalent of one object (a java.lang.Integer instance, in this case) for each calculation step in the other tests. These times show that, although method-call overhead can slow performance considerably in high-usage code, its impact is in most cases much smaller than the object allocation overhead discussed in last month's article.

The costs of casting

JVMs use a variety of special techniques to minimize the overhead of the runtime checks required for downcasting, but there will always be situations where these techniques fail. Table 2 lists some results of an investigation into this issue with current JVMs, showing how various casting operations affect performance.

**Table 2. Casting performance (time in seconds; lower scores are better)**
| Test | JRE 1.1.8 (Sun) | JRE 1.1.8 (IBM) | JRE 1.2.2 (Classic) | JRE 1.2.2 (HotSpot 2.0 beta) |
| --- | --- | --- | --- | --- |
| 1. Method call | 0.9 | 1.7 | 0.9 | 0.8 |
| 2. Direct member variable | 1.2 | 1.2 | 0.7 | 0.7 |
| 3. Downcast member variable | 2.4 | 2.2 | 2.4 | 1.3 |
| 4. Downcast parent member variable | 5.2 | 2.2 | 8.7 | 1.3 |
| 5. Downcast method call | 3.0 | 3.5 | 3.1 | 1.3 |
| 6. Downcast parent method call | 5.9 | 3.4 | 9.6 | 1.3 |
| 7. Downcast parent overridden method call | 7.5 | 4.1 | 10.3 | 6.0 |
| 8. Downcast to interface method call | 9.6 | 5.6 | 9.7 | 7.0 |
| 9. Tested downcast to interface method call | 9.7 | 6.2 | 9.9 | 7.7 |
| 10. Mixed cast parent method call | 6.2 | 3.5 | 8.8 | 23.1 |
| 11. Object allocation time | 51.4 | 8.6 | 34.0 | 16.1 |

Line 1 gives the time for the calculation step executed as a simple method call with the initial value passed as a parameter and the result value returned (the same as line 2 in the prior table). These times are the basic comparison values for this set of test results, since all the other tests are built around this method-call form.

Lines 2, 3, and 4 show different variations of accessing a member variable of an object. In line 2, the access is without casting. In line 3, an object reference is passed as a generic java.lang.Object and then downcast to the actual class of the object. In line 4, a generic object reference is passed and then downcast to a parent class of the object.
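As a rough illustration of these three access forms, the sketch below uses two hypothetical classes, `ParentValue` and `ChildValue`, and an assumed calculation of `value * 3 + 1`; none of these names come from the article. It also shows the runtime check that every downcast carries: a cast to the wrong type fails with a `ClassCastException`.

```java
// Hypothetical sketch of Table 2, lines 2 to 4; not the article's test code.
final class MemberCasts {
    // Line 2: the parameter already has the right static type, so no cast.
    static void direct(ChildValue obj) {
        obj.value = obj.value * 3 + 1;
    }

    // Line 3: generic reference downcast to the actual class of the object.
    static void castToActual(Object obj) {
        ChildValue c = (ChildValue) obj;
        c.value = c.value * 3 + 1;
    }

    // Line 4: generic reference downcast to a parent class of the object.
    static void castToParent(Object obj) {
        ParentValue p = (ParentValue) obj;
        p.value = p.value * 3 + 1;
    }

    public static void main(String[] args) {
        ChildValue c = new ChildValue();
        c.value = 1;
        direct(c);
        castToActual(c);
        castToParent(c);
        System.out.println("value after three steps: " + c.value);

        // The runtime check is what the JVM has to pay for on every downcast.
        try {
            castToActual("not a ChildValue");
        } catch (ClassCastException e) {
            System.out.println("runtime check rejected the cast");
        }
    }
}

class ParentValue {
    int value;
}

class ChildValue extends ParentValue {
}
```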

For these member-variable accesses, the adaptive JVMs (IBM JRE 1.1.8 and HotSpot) do a very good job of minimizing the overhead of a cast. In the last of these cases, where the cast is not to the actual class of the object, the standard JVMs (Sun JRE 1.2.2 and 1.1.8) are starting to show high levels of overhead for the cast operation.

Lines 5, 6, and 7 show different casting variations combined with method calls. In line 5, this is a simple downcast of a generic object reference to the actual class of the object. In line 6 the cast is to a parent class of the object, and in line 7 the method being called is one which is overridden by a loaded class. All the tested JVMs except HotSpot show fairly high overhead for the cast to the parent class of the object, and, for the case in which the called method is overridden, HotSpot shows fairly high overhead as well (though not nearly as high as the base Sun JRE 1.2.2).
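The three method-call variations might be sketched as follows. The class names (`Base`, `Derived`, `Overrider`) and method names are hypothetical, and the way the overriding class gets loaded here (by simply instantiating it) is an assumption about how such a test could be set up.

```java
// Hypothetical sketch of Table 2, lines 5 to 7; not the article's test code.
final class CastCalls {
    // Line 5: downcast to the actual class of the object, then call.
    static int castToActual(Object obj, int v) {
        return ((Derived) obj).next(v);
    }

    // Line 6: downcast to a parent class of the object, then call.
    static int castToParent(Object obj, int v) {
        return ((Base) obj).next(v);
    }

    // Line 7: downcast to the parent, calling a method some loaded class overrides.
    static int castToParentOverridden(Object obj, int v) {
        return ((Base) obj).nextOverridable(v);
    }

    public static void main(String[] args) {
        new Overrider();            // ensure the overriding class is loaded
        Object obj = new Derived(); // passed around as a generic reference
        int v = 1;
        v = castToActual(obj, v);
        v = castToParent(obj, v);
        v = castToParentOverridden(obj, v);
        System.out.println("value after three steps: " + v);
    }
}

class Base {
    int next(int v) { return v * 3 + 1; }
    int nextOverridable(int v) { return v * 3 + 1; }
}

class Derived extends Base {
}

// Never used as a cast target; it only provides an override of nextOverridable.
class Overrider extends Base {
    @Override int nextOverridable(int v) { return super.nextOverridable(v); }
}
```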

Lines 8 and 9 show casting to an interface with method calls through the interface. These are again high-overhead operations, though less so for the adaptive JVMs. The difference between these two lines is a test before the cast in line 9, duplicating the common coding idiom of a check before a cast:

    if (obj instanceof IValue) {
        IValue iobj = (IValue) obj;
        ...
    }

This test was specifically designed to determine if the JVMs were able to translate this coding idiom efficiently. Since the cast operation is checked before it's executed, the hope was that there would be no need to check it again when actually performing the cast. Unfortunately, judging from the test times, it looks like the JVMs aren't able to make full use of this information in the generated code and do go through some duplicated effort. On the brighter side, however, the duplicated effort isn't as much as for the equivalent cast operation.
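For comparison, the two variants measured on lines 8 and 9 could look something like this. `IValue` is the interface name from the snippet above; its `next` method, the `ValueImpl` class, and the fallback of returning the input unchanged are assumptions added for illustration.

```java
// Hypothetical sketch of Table 2, lines 8 and 9; not the article's test code.
final class InterfaceCasts {
    // Line 8: cast straight to the interface and call through it.
    static int untested(Object obj, int v) {
        return ((IValue) obj).next(v);
    }

    // Line 9: the same call guarded by an instanceof test.
    static int tested(Object obj, int v) {
        if (obj instanceof IValue) {
            IValue iobj = (IValue) obj;
            return iobj.next(v);
        }
        return v; // assumed fallback: leave the value unchanged
    }

    public static void main(String[] args) {
        Object obj = new ValueImpl();
        System.out.println("untested: " + untested(obj, 1));
        System.out.println("tested:   " + tested(obj, 1));
    }
}

interface IValue {
    int next(int v);
}

class ValueImpl implements IValue {
    public int next(int v) { return v * 3 + 1; }
}
```

The tested form performs two type checks in the source, one for `instanceof` and one for the cast, which is the duplication the timing comparison was meant to expose.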

Line 10 shows the results of a test that used a different usage pattern. All the other tests used a single target type for the casts. In this test, two target types (one a subclass of the other) were used in alternation, with calls to a method inherited by both target types. This test only presented a problem for the HotSpot JVM, but in this case the problem was a substantial one, giving by far the worst performance of any of the tests.

It appears from these tests that part of HotSpot's performance advantage for casting operations is some type of caching of the cast last performed for an object. If the same cast is done repeatedly on an object, the performance is very good, but if a cast to another type is done, there's a large performance hit. This trade-off is probably good in general, but can result in unexpected performance bottlenecks when usage does not fit the expected pattern.
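The alternating pattern behind line 10 can be sketched as below. The article does not say whether one object or several were cast, so this version assumes a single object cast alternately to two types; the class names `Root`, `Mid`, and `Leaf` are hypothetical.

```java
// Hypothetical sketch of Table 2, line 10; not the article's test code.
final class MixedCasts {
    // Alternates the cast target between a class and its subclass on each step.
    static int run(Object obj, int n) {
        int v = 1;
        for (int i = 0; i < n; i++) {
            if ((i & 1) == 0) {
                v = ((Mid) obj).next(v);   // even steps: cast to the parent type
            } else {
                v = ((Leaf) obj).next(v);  // odd steps: cast to the subclass
            }
        }
        return v;
    }

    public static void main(String[] args) {
        // A Leaf instance is a valid target for both casts.
        System.out.println("result: " + run(new Leaf(), 10));
    }
}

class Root {
    // Inherited unchanged by both cast target types.
    int next(int v) { return v * 3 + 1; }
}

class Mid extends Root {
}

class Leaf extends Mid {
}
```

If a JVM remembers only the last successful cast for an object, as the text speculates for HotSpot, every step of this loop would miss that cache.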

Finally, the last line of the table shows the time taken to allocate the equivalent of one object for each calculation step in the other tests. As with the Table 1 times for method call tests, these times show that casting is generally not as significant a performance factor as object allocations.