Inside Class Loaders

by Andreas Schaefer
11/12/2003

This series of articles started when I wanted to write a weblog about the impact of class loaders in a J2EE server. But the log entry grew, due to the fact that a few basic rules can still produce a complex system, as you see in physics, where a few basic components and forces can build up something like our universe with all of the stars, black holes, pulsars, galaxies, and planets.

In this part, I want to lay the groundwork on which we can start a discussion about dynamic and modular software systems. Class loaders may seem to be a dry topic, but I think it is one of the topics that separate the junior from the senior software engineer, so bear with me for an exciting journey into the darker corners of Java.

Now you may ask yourself, "Why should I deal with multiple class loaders and their limitations and problems?" The short answer is that you probably have to, one way or the other. Even when you write a simple servlet or JSP program and deploy it within a servlet container, your code is loaded by your very own class loader, preventing you from accessing other web applications' classes. In addition, many "container-type" applications such as J2EE servers, web containers, NetBeans, and others use custom class loaders in order to limit the impact of classes provided by a component, and this affects the developers of such components.

As we will see later, even with dynamic class loading, a given class type can be loaded only once by any one class loader, so a JVM with a single class loader holds exactly one definition of it. Additional class loaders enable a developer to partition the JVM so that the reduced visibility of a class makes it possible to have multiple, different definitions of the same class type loaded.

The class loaders work like the federal bank of each country, issuing their own currency. The border of each country defines the visibility and usability of the currency and makes it possible to have multiple currencies in the world.

First we need to explain some definitions:

CL:	Class loader.
Initial CL:	The CL that initiated the loading of the class.
Effective CL:	The CL that actually loaded the class.
Class type:	The fully qualified class name (package plus class name).
Class:	A combination of the class type and effective class loader.
`java.lang.Class`:	A class in the JDK that represents a class (name, fields, methods, etc.).
Symbolic Link:	A class type used within the source code, such as superclasses, extended interfaces, variables, parameters, return values, instanceofs, and upcasts.

Class loaders and their usage follow a few simple rules:

Class loaders are hierarchically organized, where each one has a parent class loader, except the bootstrap class loader (the root).
Class loaders should (practically: must) delegate the loading of a class to the parent, but a custom class loader can define for itself when it should do so.
A class is defined by its class type and the effective class loader.
A class is only loaded once and then cached in the class loader to ensure that the byte code cannot change.
Any symbolic links are loaded by the effective class loader (or one of its ancestors), if this is not already done. The JVM can defer this resolution until the class is actually used.
An upcast of an instance to another class fails when the class of the instance and the class of the symbolic link do not match (meaning their class loaders do not match).
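The delegation and caching rules can be observed from ordinary code. The following is a minimal sketch that is not part of the article's sample code; the class name DelegationDemo and the method loadViaChild() are made up for illustration. A child loader without any class locations of its own acts as the Initial CL, and the class it returns turns out to be the one an ancestor already defined.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical demo class; not part of the article's sample code.
final class DelegationDemo {

    // Asks a fresh child loader (the Initial CL) to load the named class.
    // The child has no URLs of its own, so it can only delegate upward.
    static Class<?> loadViaChild(String className) throws Exception {
        ClassLoader parent = DelegationDemo.class.getClassLoader();
        try (URLClassLoader child = new URLClassLoader(new URL[0], parent)) {
            return child.loadClass(className);
        }
    }

    public static void main(String[] args) throws Exception {
        Class<?> first = loadViaChild("java.lang.String");
        Class<?> second = loadViaChild("java.lang.String");

        // Two different Initial CLs, one Effective CL: the very same class object.
        System.out.println(first == second && first == String.class); // prints "true"

        // getClassLoader() reports the bootstrap loader as null.
        System.out.println(first.getClassLoader()); // prints "null"

        // DelegationDemo is served by the loader that already defined it.
        String ownName = DelegationDemo.class.getName();
        System.out.println(loadViaChild(ownName) == DelegationDemo.class); // prints "true"
    }
}
```

Because every request ends at the same Effective CL, the child loaders never get a chance to define a second copy of either class.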

Now I want to put some meat on these bare-bones rules to provide a better understanding.

Class Loader Organization and Delegation

Before we start, let's look at a typical class loader hierarchy, as illustrated by Figure 1:

Figure 1. Class loader hierarchy example

As shown in Figure 1, the bootstrap class loader (BS) loads the classes from the JVM, as well as extensions to the JDK. The system class loader (CP) loads all of the classes provided by the CLASSPATH environment variable or passed using the -classpath argument to the java command. Finally we have several additional class loaders, where A1-3 are children of the CP, and B1-2 are children of A3. Every class loader (except BS) has a parent class loader, even if no parent is provided explicitly; in the latter case, the CP is automatically set as the parent.
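To inspect the hierarchy of a running JVM, the parent chain can simply be walked. This is a small sketch rather than anything from the article's sample code; the class LoaderChain and its method chainOf() are hypothetical names. It also checks the default described above: a loader created without an explicit parent gets the system class loader (CP) as its parent.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the article's sample code.
final class LoaderChain {

    // Walks from the given loader up to the root and collects the loader class names.
    // getParent() returns null once the bootstrap loader (BS) is reached.
    static List<String> chainOf(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("<bootstrap>");
        return names;
    }

    public static void main(String[] args) throws Exception {
        // No parent is passed, so the system class loader (CP) becomes the parent.
        try (URLClassLoader custom = new URLClassLoader(new URL[0])) {
            System.out.println(custom.getParent() == ClassLoader.getSystemClassLoader());
            System.out.println(chainOf(custom));
        }
    }
}
```

The loader class names printed depend on the JDK in use; recent JDKs, for example, show a platform class loader between the system class loader and the bootstrap loader.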

Class Linking

After a class is defined with defineClass(), it must be linked by the final resolveClass() method in order to be usable. Between this method call and the first usage of a symbolic link, the class type behind that link is loaded with the class loader of the containing class acting as the Initial CL. If any linked class (type) cannot be loaded, a linkage error (java.lang.NoClassDefFoundError) is thrown. Keep in mind that the timing of symbolic link resolution is up to the JVM: it can happen anywhere between the loading of the containing class (eager resolution, or C-style) and the first actual usage of the symbolic link (lazy resolution). A class can therefore contain a symbolic link whose linked class is never loaded because the link is never used, as in this example with JDK 1.4.2 on Windows 2000:

public class M {
    // In JDK 1.4.2 on W2K this class can be used
    // fine even if class O is not available.
    public O mMyInstanceOfO;
}

whereas this class will fail with a linkage error if the class O cannot be loaded:

public class M {
    // In JDK 1.4.2 and W2K the creation of an
    // instance of M will FAIL with
    // a NoClassDefFoundError if class O is not
    // available
    public O mMyInstanceOfO = new O();
}

and to make matters a little bit more complicated, it only fails when an instance is created:

    // Fine because in JDK 1.4.2 on W2K class
    // linking is done lazily
    Class lClassM = Class.forName("M");
    // Fails with NoClassDefFoundError
    Object lObject = lClassM.newInstance();

For more information, please read Chapter 12: "Execution" in the Java Language Specification.
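Whether a particular JVM resolves symbolic links lazily can be checked with a loader that records every request it receives. The sketch below is an illustration under stated assumptions, not the article's sample code: the names LazyLinkDemo, RecordingLoader, and observe() are hypothetical, and the code assumes the compiled classes sit in a directory or JAR on disk so that a second loader can read them. The nested classes M and O mirror the first example above.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo; assumes the compiled classes are in a directory or JAR on disk.
final class LazyLinkDemo {

    public static class O { }

    public static class M {
        public O mMyInstanceOfO;                    // symbolic link that is never used
        public Object create() { return new O(); }  // first real use of O
    }

    // Child of the bootstrap loader only, so it defines M and O itself
    // and sees every load request made on behalf of M.
    static final class RecordingLoader extends URLClassLoader {
        final List<String> requested =
                Collections.synchronizedList(new ArrayList<>());

        RecordingLoader(URL classesLocation) {
            super(new URL[] { classesLocation }, null);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            requested.add(name);
            return super.loadClass(name, resolve);
        }
    }

    // Returns { was O requested after creating an M, was O requested after create() }.
    static boolean[] observe() throws Exception {
        URL location = LazyLinkDemo.class
                .getProtectionDomain().getCodeSource().getLocation();
        String nameOfM = LazyLinkDemo.class.getName() + "$M";
        String nameOfO = LazyLinkDemo.class.getName() + "$O";
        try (RecordingLoader loader = new RecordingLoader(location)) {
            Class<?> classM = Class.forName(nameOfM, true, loader);
            Object instance = classM.getConstructor().newInstance();
            boolean afterNew = loader.requested.contains(nameOfO);
            classM.getMethod("create").invoke(instance);
            boolean afterUse = loader.requested.contains(nameOfO);
            return new boolean[] { afterNew, afterUse };
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] result = observe();
        System.out.println("O requested after new M():  " + result[0]);
        System.out.println("O requested after create(): " + result[1]);
    }
}
```

On a JVM that resolves lazily, the first value should be false and the second true. A JVM that resolves eagerly is allowed to report true for both.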

Class Definition

To a beginner, a class is identified solely by the class type. As soon as you start to deal with class loaders, this is no longer the case. Provided that class type M is not available to CP, A1 and A2 could load the same class type M with different byte code. Even if the byte code is identical, the JVM treats the two as different classes. To avoid ambiguities, a class is identified by its class type as well as the Effective CL, and I will use the notation <Class Name>-<Class Loader>. So for this case, we have classes M-A1 and M-A2. Imagine we also have another class, Test-A1, with a method upcastM() that looks like this:

public void upcastM(Object pInstance)
        throws Exception {
    M lM = (M) pInstance;
}

Because the class Test is loaded by A1, its symbolic link M is also loaded by A1. So we are going to upcast a given object to M-A1. When this method is called with an instance of the class M-A1 as an argument, it will return successfully, but if it is called with an instance of M-A2, it will throw a ClassCastException because it is not the same class, according to the JVM. Even with reflection this rule is enforced, because both java.lang.Class.newInstance() and java.lang.reflect.Constructor.newInstance() return a reference typed as java.lang.Object-BS. Unless only reflection is used during the lifetime of this object, the instance has to be upcast at some point. Even when only reflection is used to avoid conflicts, any argument of a method is still subject to an upcast to the class in the method signature, so the classes must match; otherwise you get a java.lang.IllegalArgumentException due to the ClassCastException.
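The M-A1 versus M-A2 situation can be reproduced in a few lines. This is a minimal sketch and not the article's sample code; SiblingLoaderDemo, newIsolatedLoader(), and crossLoaderCastFails() are hypothetical names. It assumes the compiled classes sit in a directory or JAR on disk, and it stands in for A1 and A2 with two URLClassLoader instances whose only parent is the bootstrap loader, so that each must define M itself.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical demo; assumes the compiled classes are in a directory or JAR on disk.
final class SiblingLoaderDemo {

    public static class M { }

    // Creates a loader whose only parent is the bootstrap loader, so it has to
    // define M itself instead of delegating to the loader that started the program.
    static URLClassLoader newIsolatedLoader() {
        URL location = SiblingLoaderDemo.class
                .getProtectionDomain().getCodeSource().getLocation();
        return new URLClassLoader(new URL[] { location }, null);
    }

    // Returns true when casting an instance of M-A2 to M-A1 fails.
    static boolean crossLoaderCastFails() throws Exception {
        String typeName = M.class.getName();
        try (URLClassLoader a1 = newIsolatedLoader();
             URLClassLoader a2 = newIsolatedLoader()) {
            Class<?> mA1 = a1.loadClass(typeName);
            Class<?> mA2 = a2.loadClass(typeName);
            Object instanceOfMA1 = mA1.getConstructor().newInstance();
            Object instanceOfMA2 = mA2.getConstructor().newInstance();

            System.out.println("same class type:   "
                    + mA1.getName().equals(mA2.getName()));
            System.out.println("same class:        " + (mA1 == mA2));
            System.out.println("M-A1 accepts M-A1: " + mA1.isInstance(instanceOfMA1));
            System.out.println("M-A1 accepts M-A2: " + mA1.isInstance(instanceOfMA2));
            try {
                mA1.cast(instanceOfMA2);   // the reflective form of (M) pInstance
                return false;
            } catch (ClassCastException expected) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("cast of M-A2 to M-A1 fails: " + crossLoaderCastFails());
    }
}
```

Class.cast() is used here only because the demo class itself cannot name M-A1 in a cast expression; the check the JVM performs is the same one that makes upcastM() throw.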

Test

The sample code may help the reader to better understand the concepts described above and, later, to do their own investigations. In order to run the sample code, just extract it in the directory of your choice and execute the ant build script in the classloader.part1.basics directory.

It has three directories: main, version_a, and version_b. The main directory contains the startup class Main.java as well as the custom class loader that will load classes from a given directory. The other two directories each contain one version of M.java and Test.java. The class Main will first create two custom class loaders, each of which, after delegating to the parent class loader, loads classes from either the version_a or the version_b directory. Then it will load the class M with each of these two class loaders and create an instance through reflection:

// Create two class loaders: one for each dir.
ClassLoader lClassLoader_A =
   new MyClassLoader(
      "./build/classes/version_a" );
ClassLoader lClassLoader_B =
   new MyClassLoader(
      "./build/classes/version_b" );
// Load Class M from first CL and
// create instance
Object lInstance_M_A =
   createInstance( lClassLoader_A, "M" );
// Load Class M from second CL and
// create instance
Object lInstance_M_B =
   createInstance( lClassLoader_B, "M" );
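The source of MyClassLoader is not reproduced in this article. As a rough idea of what a directory-based loader involves, here is a minimal sketch under the hypothetical name DirectoryClassLoader; it is an assumption about the general technique, not the author's implementation. It overrides only findClass(), which the inherited loadClass() calls after the parent chain has failed to provide the class.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical stand-in for the article's MyClassLoader, whose source is not shown.
final class DirectoryClassLoader extends ClassLoader {

    private final Path classesDir;

    DirectoryClassLoader(String classesDir) {
        // The implicit no-argument ClassLoader constructor makes the
        // system class loader (CP) the parent of this loader.
        this.classesDir = Paths.get(classesDir);
    }

    // Called by the inherited loadClass() only after the parent chain
    // could not find the class.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        Path classFile = classesDir.resolve(name.replace('.', '/') + ".class");
        try {
            byte[] byteCode = Files.readAllBytes(classFile);
            return defineClass(name, byteCode, 0, byteCode.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    public static void main(String[] args) throws Exception {
        ClassLoader loader = new DirectoryClassLoader("./build/classes/version_a");

        // Served by an ancestor, so this loader is not the Effective CL.
        System.out.println(loader.loadClass("java.lang.String").getClassLoader());

        // Defined by this loader if ./build/classes/version_a/M.class exists
        // and no ancestor can provide a class named M.
        try {
            System.out.println(loader.loadClass("M").getClassLoader());
        } catch (ClassNotFoundException e) {
            System.out.println("M.class not found in the directory");
        }
    }
}
```

Keeping the inherited loadClass() untouched preserves the parent-first delegation that the rules at the beginning of the article call for.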

In order to test an upcast, I need a class whose Effective CL is one of the custom class loaders. Because Main is loaded by the CP, it cannot upcast the instances itself, so I use reflection to invoke a method of such a class:

// Check the upcast of an instance of M-A1
// to class M-A1. This test must succeed
// because the CLs match.
try {
    checkUpcast(
        lClassLoader_A, lInstance_M_A );
    System.err.println(
        "OK: Upcast of instance of M-A1"
        + " succeeded to a class of M-A1" );
} catch (ClassCastException cce) {
    System.err.println(
       "ERROR: Upcast of instance of M-A1"
       + " failed to a class of M-A1" );
}
// Check the upcast of an instance of M-A2 to
// class M-A1. This test must fail because
// the CLs do not match.
try {
    checkUpcast(
       lClassLoader_A, lInstance_M_B );
    System.err.println(
       "ERROR: upcast of instance of M-A2"
       + " succeeded to a class of M-A1" );
} catch (ClassCastException cce) {
    System.err.println(
       "OK: upcast of instance of M-A2 failed"
       + " to a class of M-A1" );
}

The checkUpcast() method loads the class Test through reflection and calls the Test.checkUpcast() method, which makes a simple upcast:

private static void checkUpcast(
   ClassLoader pTestCL, Object pInstance )
      throws Exception {
    try {
        Object lTestInstance =
           createInstance( pTestCL, "Test" );
        Method lCheckUpcastMethod =
           lTestInstance.getClass().getMethod(
              "checkUpcast",
              new Class[] { Object.class } );
        lCheckUpcastMethod.invoke(
           lTestInstance,
           new Object[] { pInstance } );
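    } catch( InvocationTargetException ite ) {
        // Rethrow the ClassCastException raised
        // inside Test.checkUpcast(); pass any
        // other failure on unchanged.
        Throwable lCause = ite.getCause();
        if( lCause instanceof ClassCastException ) {
            throw (ClassCastException) lCause;
        }
        throw ite;
    }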
}

Afterwards, there are some tests that do the same thing, but check the upcast restriction against reflection to ensure that reflection cannot compromise the rules posted at the beginning of the article. The last test checks the linking of symbolic links. On Windows 2000 and JDK 1.4.2, it will also show the lazy loading of classes because the loading of the class succeeds, whereas the creation of the instance eventually fails:

// Load a class N that has a symbolic link to
// class O that was removed so that the class
// resolving must fail
try {
    // Upcast ClassLoader to our version in
    // order to access the normally protected
    // loadClass() method with the resolve
    // flag. Even if the resolve flag is set
    // to true, the missing symbolic link is
    // only detected in W2K and JDK 1.4.2
    // when the instance is created.
    Class lClassN = ( (MyClassLoader)
       lClassLoader_A).loadClass( "N", true );
    // Finally when the instance is created
    // any used symbolic link must be resolved
    // and the creation must fail
    lClassN.newInstance();
    System.err.println(
       "ERROR: Linkage error not thrown even"
       + " though class O is not available for"
       + " class N" );
} catch( NoClassDefFoundError ncdfe ) {
    System.err.println(
       "OK: Linkage error because class O"
       + " could not be found for class N" );
}

Please note that in the directory version_a there is a class named O.java, because in order to compile the class N.java, this class is needed. However, the ant build script will remove the compiled class O.class before the test is started.
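The reflection variant of the upcast tests mentioned above can be sketched in the same style. Again this is an illustration with hypothetical names (ReflectionMismatchDemo, mismatchRejected()), not the article's sample code, and it assumes the compiled classes sit in a directory or JAR on disk. A method of Test-A1 that expects an M-A1 is invoked reflectively, once with a matching argument and once with an instance of M-A2.

```java
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical demo; assumes the compiled classes are in a directory or JAR on disk.
final class ReflectionMismatchDemo {

    public static class M { }

    public static class Test {
        // The parameter type resolves to the M of whichever loader defined Test.
        public void accept(M pInstance) { }
    }

    // Loader with the bootstrap loader as its only parent, so it defines
    // M and Test itself.
    static URLClassLoader newIsolatedLoader() {
        URL location = ReflectionMismatchDemo.class
                .getProtectionDomain().getCodeSource().getLocation();
        return new URLClassLoader(new URL[] { location }, null);
    }

    // Returns true when Method.invoke() rejects an M-A2 argument
    // for a method of Test-A1 that expects M-A1.
    static boolean mismatchRejected() throws Exception {
        String nameOfM = ReflectionMismatchDemo.class.getName() + "$M";
        String nameOfTest = ReflectionMismatchDemo.class.getName() + "$Test";
        try (URLClassLoader a1 = newIsolatedLoader();
             URLClassLoader a2 = newIsolatedLoader()) {
            Class<?> testA1 = a1.loadClass(nameOfTest);
            Class<?> mA1 = a1.loadClass(nameOfM);
            Class<?> mA2 = a2.loadClass(nameOfM);

            Object test = testA1.getConstructor().newInstance();
            Method accept = testA1.getMethod("accept", mA1);

            // Classes match: the call goes through.
            accept.invoke(test, mA1.getConstructor().newInstance());

            try {
                // Same class type, different Effective CL: rejected.
                accept.invoke(test, mA2.getConstructor().newInstance());
                return false;
            } catch (IllegalArgumentException expected) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("M-A2 argument rejected: " + mismatchRejected());
    }
}
```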

Conclusion

As long as a Java developer does not deal with his or her own class loader, all of the classes are loaded by the bootstrap and system class loaders, and there will never be a conflict. Thus, it seems that a class is defined only by the fully qualified class name. As soon as there are sibling class loaders -- neither a parent of the other -- a class type can be loaded multiple times with or without different byte code. The class loader also defines the visibility of a class type because any upcast checks the class name as well as the class loader.

To use the currency analogy, this is expressed by the fact that you can have several currencies in your wallet, but as soon as you want to use one, the cashier will check if your money is of the local currency. Still, you can carry these currencies in your pocket wherever you go, and likewise, you can carry around instances of classes even when they are unknown or not compatible in a particular class, as long as the class of the reference is compatible there. Luckily in Java, java.lang.Object is the superclass of all classes and is loaded by the BS, which is the parent of all class loaders no matter what. This means a reference of class java.lang.Object is always compatible. I think of this as a "tunneling through" of classes from one compatible island to the next -- something that is very important in J2EE, as will be shown in a future installment.

My analogy with the currencies is very simplified, because it implies that all classes have the same visibility due to the single border of a country. The analogy is based on the two-dimensional world map, whereas with Java class loaders, each level within the hierarchy of the class loaders is adding a new layer and building up a three-dimensional space.

Additional class loaders enable a Java developer to write modular applications where the visibility of classes is restricted, and therefore, multiple class types can be loaded and managed. Nevertheless, it takes effort to understand which class loaders are in use and how the classes and class loaders are organized. As with threads, class loading is a runtime behavior that is not obviously visible to the developer, and requires experience and testing to understand and utilize.

Now that the groundwork is laid, we can finally delve into the usage of class loaders. In the next article, we will see how class loaders can be used in a J2EE application server to manage deployments and what the effects are on invocations through local or remote interfaces. Afterwards, we will see how advanced class loaders make it possible to drop class types or massage the code in order to add "Advices" (AOP) at runtime without changing or recompiling your code.

Andreas Schaefer is a system architect for J2EE at SeeBeyond Inc., where he leads application server development.

Comments on this article

Currency is not the right analogy
2008-03-23 04:35:49 j0h4n

Hi,
I'd like to linger on your analogy of class loading with a federal bank's authority to issue currency.

I think that classes, just like many other things that developers work with, often change over time. OTOH, currency by its nature doesn't.

The dollar, from the perspective of an American, never changes; only when we compare it to another currency do we (care to) see differences in exchange value.
But to an American, the dollar stays the same dollar, be it today or tomorrow.

Classes, OTOH, from the perspective of a developer, change over time. This happens throughout the industry.

So I think what the JVM has to do is to fully "embrace change", i.e. to give the developer a means to let different implementations of classes loaded by different classloaders interact with each other while still living in the same JVM, and not via RMI/externalizing methods.

The current JVM is not exactly doing that.

So I think a more appropriate analogy would be the senate in the role of the JVM, the classloaders, and their compiler counterpart, with legislation as the class.

regards,
johan

Where
2004-06-08 07:57:23 packer-creator

That's really cool. So where or when does this series of articles continue?

Casting classes loaded from different effective classloaders
2003-11-21 13:27:39 anonymous2

One way around your problem is to create a proxy class that implements the same interface. The proxy can delegate method calls via introspection to the instance of itself loaded via a different classloader.
- Casting classes loaded from different effective classloaders
  2004-02-10 17:03:23 nancy_sandoval
  
  But you have to create a new class.

Casting classes loaded from different effective classloaders
2003-11-17 10:56:57 sasho

In normal circumstances the ClassCastException thrown when casting classes with the same type but with different effective classloaders is the desired behavior but in some cases we may need the cast to succeed. Just like in the examples in the article, we may need to invoke a method on a class loaded with a classloader different from the effective classloader of the instance we pass as an argument (in this case the class names will match but the classloaders will not). One possible workaround, if the class is serializable, is to serialize the instance and then deserialize it within a class loaded by the second classloader (thus creating an instance that has the correct type and the correct effective classloader) and pass the new instance as the argument. Although this works just fine, it's not really a very elegant answer to that problem and maybe you could suggest a better solution?
- Casting classes loaded from different effective classloaders
  2003-11-19 04:39:16 anonymous2
  
  I know that this doesn't always help, but as I see it there are two possible solutions.
  
  One is to use RMI to communicate and let the class be downloaded via the stub's codebase. Here the interface, and not the implementation, must be on the client side.
  
  The other is to use JMX, where you "tunnel" via the JMX framework.
- Casting classes loaded from different effective classloaders
  2003-11-18 12:40:57 schaefera
  
  Hi
  
  As we will see in the next installment, this is how J2EE invocations on remote interfaces get around the problem, even when you call them within the server. However, the client that receives the object will actually load the class locally (with its own class loader) and then incorporate the values from the object stream. That requires the client to have a compatible class definition available; if not, deserialization will fail. Even if you have a compatible version, the class behind the newly created instance is NOT compatible with the original class (if the client uses a different class loader), because the two class types have different class loaders. Thus, to send the data back you need to serialize/deserialize the object again.
  In the next part of this article series I will explain in a little more detail how J2EE deals with multiple class loaders and talk about serialization/deserialization.
  Nevertheless, I do not have a more elegant solution to propose, even though I would like Java to offer a way to force an upcast as long as the class types are compatible, to avoid wasting a lot of time.
  
  -Andy
  - Casting classes loaded from different effective classloaders
    2008-03-23 02:26:42 j0h4n
    
    I'm in favor of this change.
    When will we see one?
    
    Maybe making a proxy like so:
    class A loaded by a classloader C1, denoted <A,C1>;
    another impl of class A loaded by a second classloader C2, denoted <A,C2>;
    
    Whenever we want two classes loaded by different classloaders to communicate, the JVM allows that by automatically creating proxies. In the case where an instance of class B loaded by C1, denoted <B,C1>, wants to communicate with <A,C2>, the JVM creates a proxy <A2,C1> that resides in the C1 classloader.
    
    I dunno, maybe it's easier said than done?
    
    regards,
    johan
