JEP 457 Hello World Translator

March 16, 2024 - 14:22

If you're new to Java class files and Java bytecode, check out JEP 457 Hello World, which is an introduction to the topic.

Often, when you start to learn a new language you'll write a "Hello World" application as a first step. In this post, as a first step to transforming Java class files, we'll write a "Hello World" translator.

We'll see how, in just a few lines of code using the JEP 457 Java Class-File API, we can create an application that can apply a code transformation to automatically translate "Hello World" to some other language such as "Hallo Wereld".

Hello World

If you compile and run the following HelloWorld class, you'll of course see "Hello World".

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello World");
    }
}

Our goal is to take as input the original class file, and modify it to print something else:

$ java --enable-preview HelloWorldTranslator HelloWorld.class "Hallo Wereld"
$ java HelloWorld
Hallo Wereld

Using the Class-File API

To use the Class-File API you'll need:

JDK 22 - the easiest way to install this on Linux is with sdkman:

$ sdk install java 22-open

The --release and --enable-preview flags when compiling, and the --enable-preview flag when running, since the API is currently a preview feature:

$ javac --release 22 --enable-preview HelloWorldTranslator.java
$ java --enable-preview HelloWorldTranslator

Reading and Writing a class file

We'll start with a main method that sets up the inputs:

public static void main(String[] args) throws IOException {
    if (args.length != 2) throw new RuntimeException("Expected input and translation");

    var input = Path.of(args[0]);
    var translation = args[1];
}

Then add some code to read the input class, apply a transform and write it out again. We'll start with an identity transform:

public static void main(String[] args) throws IOException {
    if (args.length != 2) throw new RuntimeException("Expected input and translation");

    var input = Path.of(args[0]);
    var translation = args[1];

    // Create a ClassFile object and parse the input into a ClassModel.
    var cf = ClassFile.of();
    var classModel = cf.parse(input);

    // For now, apply a transform that just passes through all class elements unchanged.
    byte[] newBytes = cf.transform(classModel, (classBuilder, classElement) -> {
        // TODO: transform specific instructions.
        classBuilder.with(classElement);
    });

    // Overwrite the input file with the new bytes.
    Files.write(input, newBytes);
}

If you run this now, the input will be read and overwritten, but no changes will be applied.

Transforming Methods

The ClassFile.transform method takes a ClassTransform instance to implement transformations on classes. In the current snippet, we simply pass the ClassElement (which is some component of a class such as a method, field, attribute etc) to the ClassBuilder; this does not make any changes to the class.

We can transform methods with the ClassBuilder.transformMethod method, which (similar to ClassFile.transform) allows implementing transformations on method elements. A MethodElement, like a ClassElement, is some component of a method, such as its code or attributes.

And since we want to transform code of a method, we'll need to use the MethodBuilder.transformCode method (you might notice a pattern here!) which provides a way to transform code elements (instructions and attributes).

So we can use the following transformer to drill down to the instructions that we want to modify:

byte[] newBytes = cf.transform(classModel, (classBuilder, classElement) -> {
    if (classElement instanceof MethodModel mm) {
        classBuilder.transformMethod(mm, (methodBuilder, me) -> {
            if (me instanceof CodeModel cm) {
                methodBuilder.transformCode(cm, (codeBuilder, e) -> {
                    // TODO: Transform specific instructions.
                    codeBuilder.with(e);
                });
            } else {
                // Leave everything else as it is.
                methodBuilder.with(me);
            }
        });
    } else {
        // Leave everything else as it is.
        classBuilder.with(classElement);
    }
});

This is a lot of boilerplate when we just want to transform some instructions! Fortunately, there are helper methods that remove it: we'll use the static ClassTransform.transformingMethodBodies method (statically imported in the snippet below) to create the class transform for us. This method takes a CodeTransform directly and handles the rest, greatly reducing the complexity:

byte[] newBytes = cf.transform(classModel, transformingMethodBodies((codeBuilder, codeElement) -> {
    // TODO: transform specific instructions.
    codeBuilder.with(codeElement);
}));

Transforming Instructions

The transformingMethodBodies method takes a CodeTransform, a function that receives a CodeBuilder and a CodeElement. Our goal is to transform specific instructions and leave the other code elements as they are. We can use the instanceof operator to check for ldc instructions, which load a constant onto the stack and are modelled by LoadConstantInstruction (a nested interface of ConstantInstruction). For those we generate a new instruction with a different constant, and we pass any other element through untouched.

if (codeElement instanceof LoadConstantInstruction ldc
        && ldc.constantValue().equals("Hello World")) {
    codeBuilder.constantInstruction(translation);
} else {
    codeBuilder.with(codeElement);
}
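Putting the pieces together, the whole translator might look like the following sketch. It assumes JDK 22 with --enable-preview, as described above; the method names used here are the ones from the JDK 22 preview of the API, so treat the listing as specific to that release. The translate helper is a hypothetical addition for illustration (it is not part of the snippets above) that works on byte arrays so the transform can be exercised without touching the file system.

```java
import java.io.IOException;
import java.lang.classfile.ClassFile;
import java.lang.classfile.instruction.ConstantInstruction.LoadConstantInstruction;
import java.nio.file.Files;
import java.nio.file.Path;

import static java.lang.classfile.ClassTransform.transformingMethodBodies;

public class HelloWorldTranslator {

    // Hypothetical helper (an assumption, not from the original snippets):
    // parses the class bytes, swaps the "Hello World" constant, returns the new bytes.
    static byte[] translate(byte[] classBytes, String translation) {
        var cf = ClassFile.of();
        var classModel = cf.parse(classBytes);

        return cf.transform(classModel, transformingMethodBodies((codeBuilder, codeElement) -> {
            if (codeElement instanceof LoadConstantInstruction ldc
                    && ldc.constantValue().equals("Hello World")) {
                // Emit a new load-constant instruction with the translated string.
                codeBuilder.constantInstruction(translation);
            } else {
                // Pass every other code element through unchanged.
                codeBuilder.with(codeElement);
            }
        }));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) throw new RuntimeException("Expected input and translation");

        var input = Path.of(args[0]);
        var translation = args[1];

        // Overwrite the input file with the transformed bytes.
        Files.write(input, translate(Files.readAllBytes(input), translation));
    }
}
```

Keeping the transform in a method that maps bytes to bytes is only a design choice for this sketch; the file-based main behaves the same way as the version built up step by step above.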

That's it! You can now use the HelloWorldTranslator on any class file and it will transform all the ldc instructions with the "Hello World" constant string to use the provided translation instead:

$ javac HelloWorld.java
$ java HelloWorld
Hello World
$ javac --enable-preview --release 22 HelloWorldTranslator.java
$ java --enable-preview HelloWorldTranslator HelloWorld.class "Hallo Wereld"
$ java HelloWorld
Hallo Wereld

Leftover Constants

If you compare the generated class files, you'll notice only minimal differences (ignoring the filename, checksum and timestamp): we added a new string constant and used that new constant for the ldc instruction:

$ diff <(javap -c -v -p original/HelloWorld.class | tail -n +4) <(javap -c -v -p HelloWorld.class | tail -n +4)
36a37,38
>   #28 = Utf8   Hallo Wereld
>   #29 = String #28 // Hallo Wereld
55c57
<          3: ldc #13 // String Hello World
---
>          3: ldc #29 // String Hallo Wereld

By default, to optimize processing time and minimize changes between the original and transformed class, the original constant pool is used for the transformed class, which means that the original "Hello World" string is still there even though it's not used:

$ javap -c -v -p HelloWorld.class
public class HelloWorld
  minor version: 0
  major version: 66
  flags: (0x0021) ACC_PUBLIC, ACC_SUPER
  this_class: #21                         // HelloWorld
  super_class: #2                         // java/lang/Object
  interfaces: 0, fields: 0, methods: 2, attributes: 1
Constant pool:
...
  #13 = String             #14            // Hello World

If you prefer not to have the original constant hanging around in the transformed class file, you can use the ConstantPoolSharingOption.NEW_POOL option, which ensures that a new constant pool is created:

byte[] newBytes = cf
        .withOptions(ClassFile.ConstantPoolSharingOption.NEW_POOL)
        .transform(classModel, ...);

Using this option, the "Hello World" string will no longer appear in the transformed class file:

$ strings HelloWorld.class | grep "Hello World" | wc -l
0
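The same check can be done from Java without the preview API. The following is a minimal sketch, not part of the original translator: the class name Utf8Constants and the method utf8Constants are made up for illustration. It walks the constant pool by hand, following the entry sizes defined in the class file format, and collects every Utf8 entry so you can see whether the "Hello World" string is still present.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class Utf8Constants {

    /** Returns every CONSTANT_Utf8 entry found in the constant pool of the given class bytes. */
    public static List<String> utf8Constants(byte[] classBytes) {
        var result = new ArrayList<String>();
        try (var in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) throw new IllegalArgumentException("Not a class file");
            in.readUnsignedShort(); // minor version
            in.readUnsignedShort(); // major version
            int count = in.readUnsignedShort(); // number of entries + 1
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1 -> result.add(in.readUTF());            // Utf8: u2 length + modified UTF-8
                    case 3, 4 -> in.skipNBytes(4);                 // Integer, Float
                    case 5, 6 -> { in.skipNBytes(8); i++; }        // Long, Double take two slots
                    case 7, 8, 16, 19, 20 -> in.skipNBytes(2);     // Class, String, MethodType, Module, Package
                    case 9, 10, 11, 12, 17, 18 -> in.skipNBytes(4); // refs, NameAndType, Dynamic, InvokeDynamic
                    case 15 -> in.skipNBytes(3);                   // MethodHandle
                    default -> throw new IllegalArgumentException("Unknown constant pool tag " + tag);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java Utf8Constants.java <class file> <string to look for>
        byte[] bytes = Files.readAllBytes(Path.of(args[0]));
        System.out.println(utf8Constants(bytes).contains(args[1]) ? "present" : "absent");
    }
}
```

Running it as java Utf8Constants.java HelloWorld.class "Hello World" against a class transformed with the default options should report the string as present, and as absent once NEW_POOL is used, matching the strings and grep check above.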

Next steps

We've successfully created a simple Java program that uses the new Class-File API to transform existing classes, but we've only changed the constant operand of a single type of instruction. There is, of course, a lot more that can be done with the Class-File API! To continue further, dive into the API: JEP 457 and the follow-up JEP 466 give a good overview, and the API is documented in the JDK 22 early access documentation. Brian Goetz also has a great talk about the design of the new API over on YouTube.

If you haven't already read my previous posts about generating classes (rather than transforming them), check those out too.

The code from this post can be found on GitHub here.