Learnitweb

Class loading process in Java

The class loading process in Java 21 has evolved significantly from earlier versions, especially with the introduction of modules in Java 9, but it still follows three fundamental phases: Loading, Linking, and Initialization. Class loaders locate classes using a parent-delegation model. Two changes are crucial in Java 21: the Extension Class Loader was replaced by the Platform Class Loader, and core classes are loaded from modules in a runtime image instead of rt.jar.

Let’s break down each phase in detail, considering the modern Java 21 environment.

Class Loading Process in Java 21

1. The Class Loaders

Before diving into the process, it’s essential to understand the key class loaders in Java 21:

  • Bootstrap Class Loader (Primordial Class Loader):
    • Purpose: This is the ultimate parent of all class loaders. It loads the core Java classes (such as java.lang.Object, java.lang.String, and the java.util classes) from the JDK's runtime image (the lib/modules file, which stores the Java runtime modules and is also exposed through the jrt:/ file system).
    • Implementation: It is not a Java object; it is built into the JVM and implemented in native code (C++ in HotSpot). The API represents it as null, so String.class.getClassLoader() returns null and there is no ClassLoader instance to obtain for it.
    • Module System Context: It loads classes from the bootstrap modules (e.g., java.base, java.logging, and java.xml).
  • Platform Class Loader:
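    • Purpose: Introduced in Java 9 as the replacement for the Extension Class Loader, this class loader loads classes from the platform modules. These are Java SE and JDK modules that are not defined to the Bootstrap Class Loader (e.g., java.sql, java.compiler, and java.scripting in the JDK's default mapping). You can obtain a reference to it using ClassLoader.getPlatformClassLoader().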
    • Parent: Its parent is the Bootstrap Class Loader.
    • Module System Context: It handles modules that are part of the Java platform but are not strictly “bootstrapped.”
  • Application Class Loader (System Class Loader):
    • Purpose: This is the default class loader for applications. It loads classes from the application’s classpath (defined by the -cp or --class-path option or the CLASSPATH environment variable) and from modules resolved at application startup.
    • Parent: Its parent is the Platform Class Loader.
    • Module System Context: It’s responsible for loading classes from named modules that are part of your application or third-party libraries, as well as from the unnamed module (the traditional classpath).
    • Reference: You can obtain a reference to it using ClassLoader.getSystemClassLoader().
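
The hierarchy described above can be inspected at run time. The following is a minimal sketch; the class name ClassLoaderHierarchy and the helper chainFrom are made up for illustration, and the label "bootstrap" is simply our own stand-in for the null parent that marks the Bootstrap Class Loader.

```java
import java.util.ArrayList;
import java.util.List;

class ClassLoaderHierarchy {

    /** Walks from the given loader up its parent chain, collecting a label for each step. */
    static List<String> chainFrom(ClassLoader loader) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            // getName() (Java 9+) may be null for unnamed loaders, so fall back to the class name.
            chain.add(cl.getName() != null ? cl.getName() : cl.getClass().getName());
        }
        chain.add("bootstrap"); // our label: the API represents the bootstrap loader as null
        return chain;
    }

    public static void main(String[] args) {
        System.out.println(chainFrom(ClassLoader.getSystemClassLoader()));
        // Core classes report a null loader because the bootstrap loader defines them.
        System.out.println("String loader: " + String.class.getClassLoader());
    }
}
```

On a stock JDK 21 the first line is expected to list the application loader, then the platform loader, then our "bootstrap" label; the exact loader names are a JDK implementation detail.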

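Delegation Model: The key principle is the delegation model. When a class loader is asked to load a class, it first delegates the request to its parent class loader. This continues up the hierarchy until the Bootstrap Class Loader is reached. If a parent can load the class, it does so. Only if no parent can load the class does the current class loader attempt to load it itself. This prevents duplicate class loading and ensures that core Java classes are always loaded by the Bootstrap Class Loader rather than by application code. Since Java 9 the built-in loaders take a shortcut for classes in named modules: they look up which loader the class's module is mapped to and delegate to it directly, but the result is generally the same as with the parent-first model.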
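
To see delegation in action, here is a small sketch built on the standard java.lang.ClassLoader base class, whose inherited loadClass() asks the parent first and calls findClass() only when the parent chain fails. RecordingLoader and the two helper methods are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class DelegationDemo {

    /** Hypothetical loader that records every request that falls through to it. */
    static class RecordingLoader extends ClassLoader {
        final List<String> fallbacks = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        // The inherited loadClass() calls this only after the parent chain has failed.
        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            fallbacks.add(name);
            throw new ClassNotFoundException(name);
        }
    }

    /** True if a core class is served by the parent chain without consulting findClass(). */
    static boolean coreClassComesFromParent() throws ClassNotFoundException {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());
        Class<?> viaCustom = loader.loadClass("java.lang.String");
        return viaCustom == String.class && loader.fallbacks.isEmpty();
    }

    /** Names our loader was asked to find itself when no parent knew the class. */
    static List<String> fallbacksForUnknownClass() {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());
        try {
            loader.loadClass("no.such.pkg.Missing");
        } catch (ClassNotFoundException expected) {
            // Every loader in the chain failed, including ours.
        }
        return loader.fallbacks;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(coreClassComesFromParent());   // true
        System.out.println(fallbacksForUnknownClass());   // [no.such.pkg.Missing]
    }
}
```

Overriding findClass() rather than loadClass() keeps the parent-first behavior intact, which is why the request for java.lang.String never reaches the custom loader.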

The Three Phases of Class Loading

Now, let’s detail the three main phases:

Phase 1: Loading

This is the process of finding the bytecode for a class and loading it into memory.

  1. Request: A class loader receives a request to load a class (e.g., MyClass). This typically happens when:
    • A new instance of a class is created using new.
    • A static method or field of a class is accessed.
    • A class is referenced by its name, for example, using Class.forName().
    • A subclass is loaded, requiring its superclass to be loaded first.
  2. Delegation: The current class loader (e.g., Application Class Loader) delegates the request to its parent (Platform Class Loader).
  3. Recursive Delegation: The Platform Class Loader delegates to its parent (Bootstrap Class Loader).
  4. Bootstrap Class Loader Attempts: The Bootstrap Class Loader first checks if the class has already been loaded. If not, it tries to find the class within the core Java runtime image (JRT). If found, it reads the bytecode and defines the class, creating its runtime representation inside the JVM (class metadata is kept in the method area, which HotSpot implements as Metaspace).
  5. Platform Class Loader Attempts (if Bootstrap Fails): If the Bootstrap Class Loader cannot find or load the class, the Platform Class Loader attempts to find it within the platform modules.
  6. Application Class Loader Attempts (if Platform Fails): If the Platform Class Loader also fails, the Application Class Loader tries to find the class within the application’s classpath or resolved application modules.
  7. Bytecode Loading: Once found, the raw bytes of the class are loaded into memory.
  8. Class Object Creation: The JVM then creates a java.lang.Class object for the loaded class. This Class object acts as a blueprint or metadata for the class, containing information like its name, superclass, interfaces, fields, methods, and annotations.

Key Point: The Class object is created in this phase, but the static fields are not initialized yet.
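
The difference between loading a class and initializing it can be observed with the three-argument Class.forName, whose second parameter controls initialization. This is a sketch under our own naming (LoadWithoutInit, Lazy, and run are illustrative), and run() is meant to be called once per JVM because a class is initialized only once.

```java
import java.util.ArrayList;
import java.util.List;

class LoadWithoutInit {

    static final List<String> events = new ArrayList<>();

    static class Lazy {
        static {
            events.add("Lazy initialized");
        }

        static void touch() { }
    }

    /** Loads Lazy without initializing it, then triggers initialization by an active use. */
    static List<String> run() throws ClassNotFoundException {
        events.clear();
        String name = LoadWithoutInit.class.getName() + "$Lazy";
        ClassLoader loader = LoadWithoutInit.class.getClassLoader();

        // initialize = false: the Class object exists afterwards, but no static code has run.
        Class<?> c = Class.forName(name, false, loader);
        events.add("loaded " + c.getSimpleName());

        Lazy.touch(); // first active use: the static initializer runs now
        events.add("after first use");
        return new ArrayList<>(events);
    }

    public static void main(String[] args) throws Exception {
        // [loaded Lazy, Lazy initialized, after first use]
        System.out.println(run());
    }
}
```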

Phase 2: Linking

Linking is the process of integrating the loaded class into the runtime state of the JVM. It consists of three sub-steps:

a. Verification
  • Purpose: This is a crucial security step. The JVM bytecode verifier checks the integrity and correctness of the loaded bytecode. It ensures that the bytecode adheres to the Java Virtual Machine Specification and doesn’t pose any security threats.
  • Checks Include:
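    • Structural Constraints: Ensures that the code of each method is well-formed (e.g., every branch targets a valid instruction). A class file that is malformed as a whole is rejected earlier, during loading, with a ClassFormatError.
    • Stack Underflow/Overflow: Checks that operand stack operations are valid.
    • Type Safety: Verifies that instructions operate on values of the right type (e.g., an int on the operand stack is never used as an object reference, and method arguments match the declared parameter types).
    • Access Control: Checks that protected members are accessed legally. Other access checks, such as those on private members, are performed during resolution and fail with an IllegalAccessError.
    • Final Classes and Methods: Ensures that final classes are not subclassed and final methods are not overridden.
  • Outcome: If verification fails, a java.lang.VerifyError is thrown, indicating a corrupted, incorrectly generated, or malicious class file.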
b. Preparation
  • Purpose: This step involves allocating memory for static fields and initializing them to their default values.
  • Process:
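    • All static fields (class variables) of the class are created and allocated memory.
    • Numeric primitive static fields are initialized to 0 (0.0 for floating-point types, '\u0000' for char).
    • boolean static fields are initialized to false.
    • Reference static fields are initialized to null.
    • final static fields that are compile-time constants (e.g., static final int MY_CONST = 10;) are a special case: their values are stored in the class file, so no initializer code runs for them. The JVM specification assigns them at the very start of initialization, before any static initializer code executes (many descriptions attribute this to preparation), and javac additionally inlines such constants wherever they are used.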

Key Point: Static initializers (blocks of code starting with static { ... }) are not executed in this phase. That happens during initialization.
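
The default values assigned during preparation are normally invisible, but they can be observed by reading a static field before its own initializer has run. The sketch below does this through a method call; PreparationDefaults and Holder are illustrative names. The compile-time constant behaves differently because javac inlines its value at the use site.

```java
import java.util.Arrays;

class PreparationDefaults {

    static class Holder {
        // Initializers run in textual order, so peek() executes before 'value' is assigned.
        static int seenEarly = peek();
        static int value = 42;

        // CONST is a compile-time constant: javac replaces each use with the literal 10.
        static int constSeenEarly = peekConst();
        static final int CONST = 10;

        static int peek() { return value; }          // still sees the default 0 at that point
        static int peekConst() { return CONST; }     // compiled as 'return 10'
    }

    /** Returns { seenEarly, value, constSeenEarly } after Holder has been initialized. */
    static int[] observe() {
        return new int[] { Holder.seenEarly, Holder.value, Holder.constSeenEarly };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(observe())); // [0, 42, 10]
    }
}
```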

c. Resolution (Optional)
  • Purpose: This is the process of resolving symbolic references in the constant pool of the class to direct references. Symbolic references are human-readable names that refer to other classes, methods, or fields. Direct references are actual memory addresses or offsets.
  • Process:
    • When a class is loaded, it contains symbolic references (e.g., “call method foo() in class Bar”).
    • Resolution replaces these symbolic references with direct pointers to the actual locations of the referenced entities in memory.
    • This typically happens lazily, meaning it only occurs when a symbolic reference is actually used for the first time by the JVM, rather than all at once during class loading. For example, when MyClass calls OtherClass.someMethod(), the symbolic reference to OtherClass and someMethod() will be resolved at that point.
  • Outcome: If a symbolic reference cannot be resolved, an error is thrown: java.lang.NoClassDefFoundError when a referenced class is not found, or java.lang.NoSuchMethodError/java.lang.NoSuchFieldError when a referenced method or field is missing.

Phase 3: Initialization

This is the final phase of class loading, where the class becomes ready for use.

  • Purpose: To execute the static initializers of the class and initialize static fields to their specified values (if not already handled in preparation for compile-time constants).
  • Trigger: Initialization is typically triggered the first time the class is actively used. Active uses include:
    • Creating a new instance of the class (new MyClass()).
    • Invoking a static method of the class.
    • Assigning a value to a static field of the class.
    • Using a non-compile-time constant static field of the class.
    • Calling a reflection method like Class.forName("MyClass", true, myClassLoader).
    • Initializing a subclass (which implicitly initializes its superclass first).
    • Starting the JVM with the class as the main class (e.g., via java -jar or java -cp <classpath> MainClass).
  • Process:
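    1. Superclass Initialization: Before a class is initialized, its direct superclass must be initialized first, if it has one and it is not yet initialized (java.lang.Object has no superclass). Superinterfaces are initialized at this point only if they declare default methods.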
    2. Synchronization: The JVM ensures that only one thread can initialize a class at a time. If multiple threads try to initialize the same class concurrently, one thread will perform the initialization, and the others will block until it’s complete.
    3. Static Initializer Execution: The JVM executes all static field assignments and static initializer blocks (static { ... }) in the order they appear in the class definition. This is where the actual values specified in the source code are assigned to static variables.
  • Outcome: After initialization, the class is fully ready for use by the application.
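
The triggers and ordering rules above can be traced with a small experiment. In this sketch (InitOrderDemo, Base, Derived, and the log list are names chosen for illustration), reading a compile-time constant does not initialize the class, while the first new does, superclass first. Call run() once per JVM, since initialization never repeats.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {

    static final List<String> log = new ArrayList<>();

    static class Base {
        static { log.add("Base static init"); }
    }

    static class Derived extends Base {
        static final int CONSTANT = 7;                        // compile-time constant
        static int counter = record("Derived.counter assigned");
        static { log.add("Derived static block"); }

        static int record(String msg) {
            log.add(msg);
            return 1;
        }
    }

    static List<String> run() {
        log.add("read constant " + Derived.CONSTANT); // inlined by javac: not an active use
        log.add("before new");
        new Derived();                                // first active use: Base, then Derived
        new Derived();                                // already initialized: nothing re-runs
        log.add("after new");
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        // [read constant 7, before new, Base static init,
        //  Derived.counter assigned, Derived static block, after new]
        System.out.println(run());
    }
}
```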

Class Loading with Modules (Java 9+)

The Java Platform Module System (JPMS) significantly changed how classes are found and loaded, but the fundamental three-phase process remains.

  • Module Path: Instead of just a classpath, Java 21 also uses a module path (--module-path or -p).
  • Module Descriptors (module-info.java): These files explicitly declare which packages are exported, which modules are required, and which services are provided/consumed. The JVM uses this information at startup, when it resolves the module graph (a separate step from the resolution sub-step of linking), and later to decide which types are accessible across module boundaries.
  • Named Modules vs. Unnamed Module:
    • Classes from explicit modules (your application modules, JDK modules, third-party modules on the module path) are loaded into their respective named modules.
    • Classes on the traditional classpath are treated as belonging to the unnamed module. The unnamed module “reads” all named modules, allowing classpath code to access their exported packages. However, explicit named modules cannot declare a dependency on the unnamed module and do not read it by default; only automatic modules do, or modules granted the edge with --add-reads <module>=ALL-UNNAMED.
  • Encapsulation: Modules enforce strong encapsulation. Even if a class is loaded, it might not be accessible if its package is not exported by its module or if the consuming module does not read the providing module. The compiler enforces this, and the JVM checks it again at run time when symbolic references are resolved and when reflection is used.
  • Layered Class Loaders: The JPMS organizes modules into layers. In the boot layer, each module is mapped to one of the three built-in class loaders (Bootstrap, Platform, Application). Applications such as plugin containers can create additional layers with their own class loaders through the java.lang.ModuleLayer API.
  • jlink: The jlink tool allows creating custom runtime images that contain only the necessary modules. This affects what the Bootstrap and Platform class loaders load, making the runtime environment more compact.
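
The relationship between classes, modules, and class loaders can be queried through the reflection API (Class.getModule(), Module.isNamed(), ModuleLayer.boot()). The following sketch uses illustrative names (ModuleInspection, describe) and our own labels "<unnamed>" and "bootstrap"; the java.sql lookup is optional because that module may be absent from a custom jlink image.

```java
class ModuleInspection {

    /** Describes which module and which class loader define the given class. */
    static String describe(Class<?> type) {
        Module module = type.getModule();
        String moduleName = module.isNamed() ? module.getName() : "<unnamed>";

        ClassLoader loader = type.getClassLoader();
        String loaderName;
        if (loader == null) {
            loaderName = "bootstrap"; // our label for the null loader
        } else if (loader.getName() != null) {
            loaderName = loader.getName();
        } else {
            loaderName = loader.getClass().getSimpleName();
        }
        return type.getName() + " -> module " + moduleName + ", loader " + loaderName;
    }

    public static void main(String[] args) {
        // java.lang.String -> module java.base, loader bootstrap
        System.out.println(describe(String.class));

        // Classpath code lands in the unnamed module of its defining loader.
        System.out.println(describe(ModuleInspection.class));

        // A platform module, if present in this runtime image.
        ModuleLayer.boot().findModule("java.sql")
                .ifPresent(m -> System.out.println("java.sql loader: " + m.getClassLoader()));
    }
}
```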

Example Flow

Let’s say your main method in com.example.app.MyApp tries to instantiate com.example.data.MyData (which is in a different module, myapp.data, that your myapp.app module requires):

  1. Request: JVM needs com.example.app.MyApp. The Application Class Loader is asked to load it.
  2. Delegation: Application CL -> Platform CL -> Bootstrap CL.
  3. Bootstrap CL: Cannot find MyApp.
  4. Platform CL: Cannot find MyApp.
  5. Application CL: Finds MyApp.class in the myapp.app module on the module path. Loads bytecode, creates Class<MyApp>. (LOADING)
  6. Verification: JVM verifies MyApp’s bytecode. (LINKING – Verification)
  7. Preparation: Static fields of MyApp are allocated and default-initialized. (LINKING – Preparation)
  8. Initialization: Static initializers of MyApp are run. (INITIALIZATION)
  9. Inside MyApp’s main method: new MyData(); is encountered.
  10. Request: JVM needs com.example.data.MyData. Application CL is asked.
  11. Delegation: Application CL -> Platform CL -> Bootstrap CL.
  12. Bootstrap CL: Cannot find MyData.
  13. Platform CL: Cannot find MyData.
  14. Application CL: Finds MyData.class in the myapp.data module (which was resolved because myapp.app requires it). Loads bytecode, creates Class<MyData>. (LOADING)
  15. Verification: JVM verifies MyData’s bytecode. (LINKING – Verification)
  16. Preparation: Static fields of MyData are allocated and default-initialized. (LINKING – Preparation)
  17. Resolution: If MyData references other classes/methods, their symbolic references are resolved as they are accessed. For example, if MyData has String name;, the reference to java.lang.String (which is in java.base, loaded by Bootstrap CL) is resolved. (LINKING – Resolution)
  18. Initialization: Static initializers of MyData are run. (INITIALIZATION)
  19. Instance Creation: Now that MyData is fully initialized, an instance of MyData can be created.

Understanding this process is crucial for debugging ClassNotFoundException and NoClassDefFoundError, for comprehending how static initializers work, and for leveraging the benefits of the Java Module System effectively.
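
As a debugging aid, the sketch below reproduces the typical failures in one place: a class that cannot be found by name, a class whose static initialization fails, and a later use of that same failed class. LoadingFailures, Broken, and the helper methods are hypothetical names; the missing class name is deliberately nonexistent.

```java
class LoadingFailures {

    static class Broken {
        static int value = compute();

        static int compute() {
            throw new IllegalStateException("boom"); // makes static initialization fail
        }
    }

    static int readBroken() {
        return Broken.value;
    }

    /** Returns the simple name of the Throwable raised by the action, or "none". */
    static String failureOf(Runnable action) {
        try {
            action.run();
            return "none";
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    static String lookupMissing() {
        try {
            Class.forName("no.such.pkg.Missing");
            return "none";
        } catch (ClassNotFoundException e) {
            return e.getClass().getSimpleName();
        }
    }

    static String[] run() {
        String first = failureOf(() -> readBroken());  // initialization is attempted and fails
        String second = failureOf(() -> readBroken()); // class is now marked as unusable
        return new String[] { lookupMissing(), first, second };
    }

    public static void main(String[] args) {
        // ClassNotFoundException, ExceptionInInitializerError, NoClassDefFoundError
        System.out.println(String.join(", ", run()));
    }
}
```

The last case is a common source of confusion: a NoClassDefFoundError with a "Could not initialize class" message does not mean the class file is missing, only that an earlier initialization attempt failed.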