Weighing in on Java native compilation

The pros and cons of generating native code from Java source

Martyn Honeyford (martynh@uk.ibm.com)
Software Engineer, IBM UK Labs
01 Jan 2002

When it was first introduced, it seemed that Java native compilation would surely topple the JVM, taking with it the Java platform's hard-fought platform independence. But even with its growing popularity and the increasing number of native compilers on the market, native compilation has a way to go before it poses a real threat to Java code's portability. Unfortunately, it also may be a while before the technology is mature enough to resolve the Java performance issues so many of us struggle with today.

Despite its many strengths, the Java language still has several issues that rule out its use in key projects. These include execution speed, memory footprint, disk footprint, and JVM availability. JIT compilers do much to improve the platform's execution speed and J2ME greatly reduces its memory footprint, but in many domains Java applications simply cannot compete with their native (typically C/C++) counterparts. To resolve these problems, many developers have turned to Java native compilers, which allow applications to be written in the Java language and then compiled into native executables. This solution will cost you in terms of platform independence, but it can result in the faster execution and smaller footprint essential to so many of today's applications.

To bring you up to speed on Java native compilation technology, we'll start with a discussion of the basics of code compilation, including a brief overview of why many developers are employing Java native compilers for their applications. Next, we'll test the results of Java native compilation, using a free software compiler and two different applications (one very simple, the other more sophisticated). These examples and the resulting metrics will serve as a first-hand look at how the recent Java native compilers compare with the JVM.

Code compilation basics
To follow the discussion in this article, you should be familiar with the three most common methods of code compilation:

  • Compiling Java code with a Java compiler such as javac

  • Compiling native code such as C/C++ targeted to a specific hardware/operating system (OS) platform

  • Compiling Java code using a Java native compiler targeted to a specific hardware/OS platform

Compiling Java code using a Java compiler is straightforward. We simply write the source code in the Java language, use a Java compiler to compile the source into Java bytecode, and execute the results on any hardware/OS platform that has a JVM installed. Java's reliance on the JVM for its signature "write once, run anywhere" portability is its downside; not only must a JVM be available for any platform on which you want to run your Java apps, but there must be significant system resources (memory and disk space) to support that JVM. As a result, many developers continue to rely on less flexible but more targeted languages such as C/C++.

Compiling source in C/C++ is similar to doing so in Java. Once the code is written, we run it through a compiler and linker targeted to a specific hardware/OS platform. The resulting application will be executable only on the targeted platform, but will not require that a JVM be installed (though it may require some supporting shared libraries, depending on the language used). All but the simplest applications developed using this method must be tailored individually to each hardware/OS platform on which you want them to run.

The third method attempts to combine the best of the two solutions above, allowing developers to write applications in the Java language and compile them into native executables. Once the Java code is written, it can either be run through a Java compiler to produce Java bytecode, which is then compiled into native code, or be fed directly to a Java native compiler. The number of steps involved depends on the requirements of the compiler used.

The advantage of this approach is that the resulting code can be executed on the targeted platform without the JVM. This is intended to result in Java applications that execute at much improved speeds and require significantly less disk space and memory to run (though it may be necessary to provide supporting libraries for the Java native compiler).

Compilers vary in the platforms they target, the level of Java support they provide, and the amount of system resources they use. You'll find a listing of some of the currently available native compilers in this article's Resources section.

About the test setup
It is well beyond the scope of this article to compare the features and performance of every native compiler on the market. Instead, I've used one compiler, the GNU Compiler for the Java Programming Language (GCJ), as an example to detail the process and results of native compilation. GCJ is one of the compilers developed for the GNU Compiler Collection (GCC), which is part of the GNU project. As is true of all the software that comes out of the GNU project, GCJ is free software in both senses of the term, and therefore can easily be obtained (see Resources). If you're seriously considering the native compilation route for your product, you should obviously evaluate as many compilers as you can, perhaps using the criteria established in this article.

My test system is a PC with a 450 MHz Pentium II processor and 320 MB of memory. The OS is a recent install of the Mandrake 8.1 Linux distribution, which ships with GCC 3.0.1 and therefore with version 3.0.1 of GCJ.

I've run two separate applications, one very simple and one that is more complex. To compare the native executables' performance against that of the Java platform, I also compiled the applications into Java bytecode using the Sun JDK version 1.3.1_02 for Linux, then tested the resulting classes on the following JVMs:

  • Kaffe 1.0.6

  • Sun JVM 1.3.1_02

  • IBM JRE 1.3.1

For the purpose of this article, I've measured execution speed, execution memory overhead, and disk space.

Test 1: Prime.java
The first test application is very simple, consisting of a single class, prime.java. This application implements a very basic algorithm to search for prime numbers. Listing 1 shows the source code for prime.java.


import java.io.*;

class prime
{
    // Brute-force test: try every divisor from 2 up to i - 1.
    // As written, 0 and 1 have no divisors to try, so they are counted as prime.
    private static boolean isPrime(long i)
    {
        for (long test = 2; test < i; test++)
        {
            if (i % test == 0)
            {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException
    {
        long start_time = System.currentTimeMillis();

        long n_loops = 50000;
        long n_primes = 0;

        // Test every candidate in the range [0, n_loops).
        for (long i = 0; i < n_loops; i++)
        {
            if (isPrime(i))
            {
                n_primes++;
            }
        }

        long end_time = System.currentTimeMillis();

        System.out.println(n_primes + " primes found");
        System.out.println("Time taken = " + (end_time - start_time));
    }
}

As you can see, the code loops over every number from 0 up to, but not including, 50000. As it goes, it divides each number it encounters by every integer from 2 up to one less than that number, and rejects the number as soon as a division leaves no remainder. (This is, admittedly, the brute-force method of ferreting out primes, but it will suffice for the example.)

I compiled prime.java into a native executable with the command:

gcj prime.java -O3 --main=prime -o prime

The argument -O3 means "optimize for speed"; argument --main tells GCJ which class contains the main method to be used when the application is run; and argument -o prime names the resulting executable. For a full set of command-line arguments, see the GCJ documentation.

To compile the Java bytecode test, I used the command:

/usr/java/jdk1.3.1_02/bin/javac -O prime.java

Next, I ran the native executable and then the bytecode on each of our test JVMs, using the following commands:

  • Native: ./prime

  • Kaffe: /usr/bin/java prime

  • Sun JDK: /usr/java/jdk1.3.1_02/bin/java prime

  • IBM JRE: /opt/IBMJava2-13/jre/bin/java prime

Test results for prime.java
As previously mentioned, I tested for execution speed, memory use, and disk space use. The following tables detail the results of the first test.

Table 1. Prime.java: Execution speed

Implementation    Time in milliseconds (average of three runs -- lower score is better)
Native            40180
Kaffe             75456
Sun JDK           67315
IBM JRE           18188
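
The "average of three runs" figure in Table 1 could be gathered with a small harness along the following lines. This is only a minimal sketch, not the measurement procedure actually used for the article: the class name RunTimer and the helpers averageMillis and countPrimes are hypothetical, and the loop limit in main is deliberately smaller than the article's 50000.

```java
public class RunTimer {
    /** Runs the task the given number of times and returns the mean wall-clock time in ms. */
    public static double averageMillis(Runnable task, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long total = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.currentTimeMillis();
            task.run();
            total += System.currentTimeMillis() - start;
        }
        return (double) total / runs;
    }

    /** Same brute-force divisor test as Listing 1 (0 and 1 pass it, as they do there). */
    static boolean isPrime(long i) {
        for (long test = 2; test < i; test++) {
            if (i % test == 0) {
                return false;
            }
        }
        return true;
    }

    /** Counts the values in [0, limit) that pass isPrime. */
    public static long countPrimes(long limit) {
        long found = 0;
        for (long i = 0; i < limit; i++) {
            if (isPrime(i)) {
                found++;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // A smaller limit than the article's 50000 keeps this demonstration quick.
        double averageMs = averageMillis(() -> countPrimes(5000), 3);
        System.out.println("Average time over 3 runs (ms) = " + averageMs);
    }
}
```

Note that repeating the work inside one process lets a JIT-based JVM warm up between runs, so the result is not directly comparable to launching the program three separate times and averaging the times it prints.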

Table 2. Prime.java: Memory usage

Implementation    VM size (KB)    VM RSS (KB)
Native            7024            3528
Kaffe             8888            3564
Sun JDK           169560          6636
IBM JRE           81936           6288

Note that the VM size equals the total size of the image of the process. This includes all code, data, and shared libraries used by the process, including pages that have been swapped out. VM resident set size (RSS) is equal to the size of the part of the process (code and data) that actually resides in RAM, including shared libraries. This gives a fair approximation of how much RAM a process is using.

In simple terms, if a process allocates a large amount of memory it will show up in the VM size, but it won't show up in VM RSS until it is actually being used (for example, read or written). VM RSS is the more important measure, because it gives a better indication of the performance hit on the system.
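
On Linux, the two figures described above are reported as the VmSize and VmRSS fields of /proc/<pid>/status. The article does not say which tool produced the memory tables, so the following is only a sketch of one way to read those fields from inside a running Java process. The class name MemoryProbe and its helpers parseKb and readField are hypothetical, and the code assumes a Linux /proc filesystem.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class MemoryProbe {
    /**
     * Extracts the value from a status line such as "VmRSS:     3528 kB".
     * Returns -1 if the line does not belong to the requested field.
     */
    public static long parseKb(String line, String field) {
        if (line == null || !line.startsWith(field + ":")) {
            return -1;
        }
        // Drop the "Field:" prefix, leaving something like "3528 kB".
        String rest = line.substring(field.length() + 1).trim();
        String digits = rest.split("\\s+")[0];
        return digits.isEmpty() ? -1 : Long.parseLong(digits);
    }

    /** Scans a status file and returns the first value found for the field, or -1. */
    public static long readField(String path, String field) throws IOException {
        try (BufferedReader in = new BufferedReader(new FileReader(path))) {
            String line;
            while ((line = in.readLine()) != null) {
                long kb = parseKb(line, field);
                if (kb >= 0) {
                    return kb;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: Linux only; /proc/self refers to the current process.
        String status = "/proc/self/status";
        System.out.println("VM size (KB) = " + readField(status, "VmSize"));
        System.out.println("VM RSS (KB)  = " + readField(status, "VmRSS"));
    }
}
```

Reading the values from inside the process measures the JVM running the probe itself; to inspect a separately launched native executable, the same fields would be read from that process's own status file.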

Table 3. Prime.java: Disk space usage

Implementation    Total compiled size (bytes)
Native            22268
Java classes      962

Note that the measurements shown in Table 3 exclude shared libraries and the JVM, and are measured with the executable stripped.

Test 2: SciMark 2
For the second test I employed a more complicated Java application, the SciMark 2 Java benchmark. The command-line version used for this article is available for free (see Resources). SciMark 2 is quite a sophisticated application. It implements a number of benchmarks that are intended to accurately measure the efficiency of a JVM.

I used the following command to compile SciMark 2 into a native executable:


gcj-3.0.1 -O3 commandline.java Random.java FFT.java SOR.java Stopwatch.java \
  SparseCompRow.java LU.java kernel.java MonteCarlo.java \
  --main=jnt.scimark2.commandline -o scimark

And I used this command to compile the application into Java bytecode:


/usr/java/jdk1.3.1_02/bin/javac -O *.java

The SciMark 2 benchmark can be run in two modes, normal and large. The mode you use determines the size of the problem sets used. I've run the tests in both modes.

To invoke the code in normal mode, I used the following commands:

  • Native: ./scimark

  • Kaffe: /usr/bin/java jnt.scimark2.commandline

  • Sun JDK: /usr/java/jdk1.3.1_02/bin/java jnt.scimark2.commandline

  • IBM JRE: /opt/IBMJava2-13/jre/bin/java jnt.scimark2.commandline

For the larger problem sets I used these commands:

  • Native: ./scimark -large

  • Kaffe: /usr/bin/java jnt.scimark2.commandline -large

  • Sun JDK: /usr/java/jdk1.3.1_02/bin/java jnt.scimark2.commandline -large

  • IBM JRE: /opt/IBMJava2-13/jre/bin/java jnt.scimark2.commandline -large

Test results for SciMark 2
The following tables show the results of running SciMark 2. Note the difference in results for the normal and large modes.

Table 4. SciMark 2, mode normal: Execution speed

Implementation    Composite score (average of three runs -- higher score is better)
Native            15.22
Kaffe             7.01
Sun JDK           22.86
IBM JRE           25.29

Table 5. SciMark 2, mode normal: Memory usage

Implementation    VM size (KB)    VM RSS (KB)
Native            9788            5956
Kaffe             8888            4092
Sun JDK           169692          7428
IBM JRE           81964           7408

Table 6. SciMark 2, mode large: Execution speed

Implementation    Composite score (average of three runs -- higher score is better)
Native            8.78
Kaffe             5.72
Sun JDK           12.04
IBM JRE           15.04

Table 7. SciMark 2, mode large: Memory usage

Implementation    VM size (KB)    VM RSS (KB)
Native            62888           59072
Kaffe             58056           56988
Sun JDK           169692          64624
IBM JRE           81964           57704

Table 8. SciMark 2: Disk space usage for both modes

Implementation    Compiled size (bytes)
Native            49588
Java classes      16318

Once again, the measurements in Table 8 exclude shared libraries and the JVM, and are measured with the executable stripped.

Pros and cons of native compilation
As should be apparent from the above test results, the success or failure of Java native compilation is far from clear cut. In the prime.java test, the natively compiled executable (40180 ms) was faster than Kaffe and the Sun JDK but took more than twice as long as the IBM JRE (18188 ms). In both SciMark 2 modes, the native executable outscored only Kaffe and trailed the Sun and IBM JVMs. The speed of the same operations also varies wildly between different JVMs. The resident set sizes (VM RSS) measured show that there isn't a vast amount of difference in actual memory usage during execution, even though the total VM sizes differ considerably. Tests employing different garbage collection schemes on both the native and JVM tests could further explore this area.
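
To compare the execution-speed tables at a glance, the figures can be normalized against the native build. The sketch below does this for Table 1 (times, where lower is better) and Table 4 (scores, where higher is better). The class name SpeedRatios and both helper methods are hypothetical, and the input numbers are copied from the tables above.

```java
public class SpeedRatios {
    /** For elapsed times (lower is better): values above 1.0 mean native was faster. */
    public static double nativeAdvantageFromTimes(double nativeMs, double otherMs) {
        return otherMs / nativeMs;
    }

    /** For benchmark scores (higher is better): values above 1.0 mean native was faster. */
    public static double nativeAdvantageFromScores(double nativeScore, double otherScore) {
        return nativeScore / otherScore;
    }

    public static void main(String[] args) {
        String[] names = {"Kaffe", "Sun JDK", "IBM JRE"};

        // Table 1: prime.java, time in milliseconds.
        double nativePrimeMs = 40180;
        double[] primeMs = {75456, 67315, 18188};

        // Table 4: SciMark 2 normal mode, composite score.
        double nativeScore = 15.22;
        double[] scores = {7.01, 22.86, 25.29};

        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-8s prime.java: %.2fx   SciMark 2 (normal): %.2fx%n",
                    names[i],
                    nativeAdvantageFromTimes(nativePrimeMs, primeMs[i]),
                    nativeAdvantageFromScores(nativeScore, scores[i]));
        }
    }
}
```

A ratio above 1.0 means the native executable came out ahead of that JVM, and a ratio below 1.0 means it fell behind.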

The native version is a clear winner over the JVM version only when it comes to disk space, and this is true only when the size of the JVM is taken into account. While the classes themselves are very small, the JVMs tested were huge (a recursive directory listing in the jre subdirectory of both the IBM and Sun JVMs shows that the JREs alone take up over 50 MB of disk space). By comparison, the GCJ runtime library, libgcj.so, is under 3 MB, so a native executable plus its runtime is far smaller than a JVM plus a single application. But bear in mind that there are much smaller JVMs available, and that each native executable was much larger than the corresponding class files (22268 bytes versus 962 for prime.java, and 49588 versus 16318 for SciMark 2). Thus, in situations where a large number of applications are required, the JVM version may ultimately be the winner.
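
The disk-space trade-off can be made concrete with a rough break-even calculation. The sketch below asks how many applications must be installed before the JVM deployment becomes the smaller one. It is an illustration under stated assumptions rather than a measurement: it treats the runtimes as exactly 50 MB and 3 MB (the article gives only "over 50 MB" and "under 3 MB"), takes 1 MB as 1024 * 1024 bytes, and assumes every application is the same size as one of the two test programs. The class name DiskBreakEven and the method breakEvenApps are hypothetical.

```java
public class DiskBreakEven {
    private static final long MB = 1024L * 1024L; // assumption: binary megabytes

    /**
     * Returns the smallest number of applications n for which
     *   jvmRuntimeBytes + n * classBytes < nativeRuntimeBytes + n * nativeBytes,
     * that is, the point at which the JVM deployment takes less disk space.
     */
    public static long breakEvenApps(long jvmRuntimeBytes, long nativeRuntimeBytes,
                                     long classBytes, long nativeBytes) {
        long runtimeGap = jvmRuntimeBytes - nativeRuntimeBytes; // extra cost of shipping a JVM
        long perAppGap = nativeBytes - classBytes;              // extra cost of each native executable
        if (runtimeGap <= 0 || perAppGap <= 0) {
            throw new IllegalArgumentException("expects a larger JVM runtime and larger native executables");
        }
        // n must be strictly greater than runtimeGap / perAppGap.
        return runtimeGap / perAppGap + 1;
    }

    public static void main(String[] args) {
        // Per-application sizes come from Tables 3 and 8; runtime sizes are rounded assumptions.
        long primeSized = breakEvenApps(50 * MB, 3 * MB, 962, 22268);
        long sciMarkSized = breakEvenApps(50 * MB, 3 * MB, 16318, 49588);
        System.out.println("Break-even, prime.java-sized apps: " + primeSized);
        System.out.println("Break-even, SciMark 2-sized apps:  " + sciMarkSized);
    }
}
```

With these rounded inputs the break-even point is in the low thousands of applications for prime.java-sized programs and around fifteen hundred for SciMark 2-sized ones. Because the real JREs were larger than 50 MB and libgcj.so smaller than 3 MB, the true figures for the tested JVMs would be somewhat higher, while a much smaller JVM would lower them sharply.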

In addition to these somewhat nebulous results, a number of potential problems can arise from the use of Java native compilers. They are as follows:

  • Loss of platform independence: In reality, this isn't so much of a problem. Because the source is written in the Java language, you still have the option to produce a Java bytecode version that will run anywhere, then use native compilers on certain platforms as required.

  • Class support/compiler maturity: Some of the compilers are still relatively immature and may not support all the Java classes required by your application. For example, while GCJ supports most Java language constructs up to v1.1 of the specification, it doesn't support all the Java class libraries that typically ship with a JVM. Most notably, there is very little support for AWT, making GCJ unsuitable for GUI applications. Different compilers support differing levels of class library; Excelsior JET is one compiler that claims to completely support AWT and Swing.

  • Support/complexity: As this field is a relatively new one, it is often not very well understood by developers. Diagnostic tools can be somewhat thin on the ground, which makes it potentially more difficult to diagnose problems that occur in natively compiled Java apps (particularly if the error doesn't occur in the Java bytecode version!).

Conclusion
As is generally the case when it comes to application development, the only way to really determine if Java native compilation is the answer to your particular set of circumstances is to run through a problem-solving cycle:

  1. Determine exactly what problem (or problems) you are hoping to solve with native compilation.

  2. Take a look at the available native compilers and come up with a handful that look like they could solve your problem.

  3. Try all the compilers you've selected with your application and see what happens.

Despite the relative immaturity of the technology and the lack of clear-cut results, Java native compilation is an exciting new area for the Java language. The best way to take advantage of the existing options is to research and test them yourself, perhaps using some of the methods and criteria established in this article.

While native compilation isn't the JVM killer that many people thought it would be, it has proven to be just the right solution for some applications and environments. Native compilation extends the use of the Java language into domains where it simply wasn't applicable just a few short years ago. This can only be a good thing for the Java language and for the Java community as a whole.

Resources

  • To learn more about GCJ and the GNU Compiler Collection, visit the GNU Compiler for the Java Programming Language homepage.

  • SciMark 2.0 is a composite Java benchmark for measuring the performance of numerical code in scientific and engineering applications. See the SciMark 2.0 home page to learn more about this sophisticated application.

  • Learn how to reuse code that wasn't written in the Java language in "Bridging the gap to COM" (developerWorks, October 2001).

  • When you cannot employ a pure Java language solution in an application, you can still effectively debug the Java/C hybrid. Matthew White explains how in "Debugging integrated Java and C/C++ code" (developerWorks, November 2001).

  • Find more Java resources on the developerWorks Java technology zone.

About the author
Martyn Honeyford graduated from Nottingham University with a BS in Computer Science in 1996. He has worked as a software engineer at IBM UK Labs in Hursley, England, ever since. His current role is as a developer in the WebSphere MQ Everyplace development team. When not working, Martyn can usually be found either playing the electric guitar (badly) or playing video games more than most people would consider healthy. You can contact Martyn at martynh@uk.ibm.com.