Writing a “Hello World” program is often a rite of passage for a software engineer when learning a new language.
If you’re a Java developer, you might even remember the first time you typed public static void main(String[] args) in your editor of choice. But did you ever wonder what’s inside that “.class” file that the compiler spits out? Let’s look at how we can write a JVM “Hello World” by creating a class file programmatically.
We’ll work through creating a class file for the following simple Java Hello World application.
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello World");
    }
}
By the end of this post you’ll have made your first steps into the world of Java bytecode: being able to generate a Java class file without a Java compiler (OK, technically we’ll still need a Java compiler, since we’re going to write Java code to generate the class file!).
What is a class file anyway?
A Java class file is a container for a compiled Java class, interface, enum or record definition along with its members, such as fields and methods. The methods in turn contain the Java bytecode instructions that will be executed by a Java Virtual Machine (JVM).
At a high level, a Java class file, as defined in the Java Virtual Machine Specification, contains the following structure:
- The magic number 0xCAFEBABE used to identify the file as a Java class file
- The major and minor version of the class file
- A constant pool containing the constants used within the class file, such as string literals and the names of classes and members
- Access flags indicating whether the class is public, abstract, etc.
- The name of the class and its superclass
- The list of interfaces implemented by the class
- Fields and methods
- Attributes
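To make the first two items of that list concrete, here is a minimal sketch that reads the magic number and the version from the first eight bytes of a class file. It only uses standard java.io classes; the class name ClassFileHeader and the method readVersion are hypothetical names chosen for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ClassFileHeader {

    /** Returns {minor, major} read from the start of a class file; rejects a wrong magic number. */
    static int[] readVersion(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            int magic = in.readInt();               // u4 magic
            if (magic != 0xCAFEBABE) {
                throw new IllegalArgumentException("Not a class file: 0x" + Integer.toHexString(magic));
            }
            int minor = in.readUnsignedShort();     // u2 minor_version
            int major = in.readUnsignedShort();     // u2 major_version
            return new int[] {minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The first eight bytes of a class file with major version 66 (Java 22).
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 66};
        int[] version = readVersion(header);
        System.out.println("major=" + version[1] + ", minor=" + version[0]);
    }
}
```

Note that the minor version is stored before the major version, and that all multi-byte values are big-endian, which is exactly what DataInputStream reads.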
In this post, we’re going to write code that generates a class with a main method whose bytecode instructions print “Hello World”.
Creating a class
So, how can we create a Java class file without starting from Java source code? Technically, a class file is just a bunch of bytes so we could just start writing out a stream of bytes:
DataOutputStream dataOutputStream =
    new DataOutputStream(
        new FileOutputStream("HelloWorld.class"));
dataOutputStream.writeInt(0xCAFEBABE);
//...
dataOutputStream.close();
But once we get past the magic number things get more complicated and we’d benefit from a higher-level API to help us out.
This is traditionally where a library like ProGuardCORE, ASM or ByteBuddy would come in handy. But the new JEP 457 Class-File API brings the ability to build, read and transform class files to Java without the need for a third-party library.
JEP 457: Class-File API
JEP 457 Class-File API is a preview API that aims to provide a standard API for parsing, generating, and transforming Java class files.
Unlike libraries such as ProGuardCORE, ASM or ByteBuddy, its scope is smaller: only parsing, generating and transforming are in scope, while code analysis features are explicitly out of scope. The biggest benefit is that it will be a standard Java API that stays in sync with the Java class file specification and evolves alongside the compiler and virtual machine.
And the API is pretty nice to use! Unlike older libraries like ASM and ProGuardCORE (both 20+ years old), the new API avoids the use of the visitor pattern and instead uses newer Java idioms and features that weren't present when the older libraries were designed.
Using JEP 457
Since this is currently a preview API in JDK 22, you'll need:
- JDK 22 - the easiest way to install this is with sdkman
$ sdk install java 22-open
- A few flags to enable access to the preview feature - to simplify this, you can use a script like this:
#!/bin/bash
JAVA_MAJOR_VERSION=$(java -version 2>&1 | sed -E -n 's/.* version "([^.-]*).*"/\1/p' | cut -d' ' -f1)
if [[ "$JAVA_MAJOR_VERSION" -lt 22 ]]; then
    echo "Java version 22 required"
    exit 1
fi
javac --release 22 --enable-preview Main.java
java -cp . --enable-preview Main
Writing a Java class file
We'll start with the ClassFile interface (java.lang.classfile.ClassFile in JDK 22): first we create a ClassFile instance, which we can then use to build a class containing a method and the code to print "Hello World". We don't need to provide any information up front when creating the instance; we can simply call the static factory method of:
ClassFile helloWorldClass = ClassFile.of();
The class can be written to a file HelloWorld.class using the buildTo method as follows:
helloWorldClass.buildTo(Path.of("HelloWorld.class"), ClassDesc.of("HelloWorld"), classBuilder -> {});
You can now use the command line tool javap to check that we’ve created a valid class file:
$ javap -c -v -p HelloWorld.class
Classfile HelloWorld.class
Last modified 5 Oct 2023; size 62 bytes
SHA-256 checksum c022c0ad1292c5d717efd2a0b2967ba9fe376d6d369d1d326e09fa80d52a33bc
public class HelloWorld
minor version: 0
major version: 66
flags: (0x0001) ACC_PUBLIC
this_class: #2 // HelloWorld
super_class: #4 // java/lang/Object
interfaces: 0, fields: 0, methods: 0, attributes: 0
Constant pool:
#1 = Utf8 HelloWorld
#2 = Class #1 // HelloWorld
#3 = Utf8 java/lang/Object
#4 = Class #3 // java/lang/Object
{
}
Notice that the generated file already contains the class name, version, superclass and a small constant pool containing the strings representing the class and superclass names.
There are, however, no fields or methods in the class!
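For comparison, the structure that javap reports above can be written out by hand with a DataOutputStream. The following is a minimal sketch, not what the Class-File API does internally: it emits the same four constant pool entries in the order javap lists them, followed by the flags, class references and four zero counts. MinimalClassBytes is a hypothetical name for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class MinimalClassBytes {

    /** Writes an empty "public class HelloWorld extends Object" field by field. */
    static byte[] build() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(0xCAFEBABE);           // magic
            out.writeShort(0);                  // minor_version
            out.writeShort(66);                 // major_version (Java 22)
            out.writeShort(5);                  // constant_pool_count = number of entries + 1
            out.writeByte(1);                   // #1: CONSTANT_Utf8 tag
            out.writeUTF("HelloWorld");         //     u2 length followed by the bytes
            out.writeByte(7);                   // #2: CONSTANT_Class tag
            out.writeShort(1);                  //     name_index -> #1
            out.writeByte(1);                   // #3: CONSTANT_Utf8 tag
            out.writeUTF("java/lang/Object");
            out.writeByte(7);                   // #4: CONSTANT_Class tag
            out.writeShort(3);                  //     name_index -> #3
            out.writeShort(0x0001);             // access_flags: ACC_PUBLIC
            out.writeShort(2);                  // this_class -> #2
            out.writeShort(4);                  // super_class -> #4
            out.writeShort(0);                  // interfaces_count
            out.writeShort(0);                  // fields_count
            out.writeShort(0);                  // methods_count
            out.writeShort(0);                  // attributes_count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(build().length + " bytes");
    }
}
```

The result is 62 bytes long, the same size javap reports for the generated file. Even for an empty class, the index bookkeeping is already tedious, which is the main argument for using a builder API instead.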
Adding a main method
If you were paying careful attention, you might have noticed that the third parameter to the buildTo method is a function that receives a ClassBuilder. This is a key feature of the new API: builders for elements (such as class files) provide other builders to build sub-elements (such as methods).
helloWorldClass.buildTo(Path.of("HelloWorld.class"), ClassDesc.of("HelloWorld"), classBuilder -> {
    // Here we can use classBuilder to add or transform methods or fields.
});
Adding a method using the ClassBuilder is easy with the withMethod builder method. You must provide the name, the descriptor in the form of a MethodTypeDesc instance (see “type descriptors”) and the access flags:
ClassFile helloWorldClass = ClassFile.of();
helloWorldClass.buildTo(Path.of("HelloWorld.class"), ClassDesc.of("HelloWorld"), classBuilder -> classBuilder
    .withMethod("main", MethodTypeDesc.ofDescriptor("([Ljava/lang/String;)V"), ACC_PUBLIC | ACC_STATIC, methodBuilder -> {}));
If you try to run the generated class file now, you’ll receive an error:
$ java HelloWorld
Error: LinkageError occurred while loading main class HelloWorld
java.lang.ClassFormatError: Absent Code attribute in method that is not native or abstract in class file HelloWorld
We added a method, but the method doesn’t contain any code!
Type descriptors
As you may have noticed, the descriptor doesn’t look like a method signature as you would write it in Java source code.
In class files, types are encoded using characters that represent the JVM types, and class names are always fully qualified with "/" as the separator instead of ".".
For example, the descriptor for the main method in Java (public static void main(String[] args)) is ([Ljava/lang/String;)V: the parentheses enclose the parameter types, here a single array ([) of java.lang.String, and the trailing V is the return type void.
Character | Java type | Character | Java type |
B | byte | C | char |
D | double | F | float |
I | int | J | long |
LClassName; | class | S | short |
Z | boolean | [ | array |
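Descriptors don't have to be written by hand. As a small sketch, the standard java.lang.constant package (available since Java 12, and the source of the ClassDesc and MethodTypeDesc types used in this post) can both build and parse them. DescriptorDemo and mainDescriptor are hypothetical names for this illustration.

```java
import java.lang.constant.ClassDesc;
import java.lang.constant.ConstantDescs;
import java.lang.constant.MethodTypeDesc;

public class DescriptorDemo {

    /** Builds the descriptor of main(String[]) from type constants instead of a raw string. */
    static String mainDescriptor() {
        ClassDesc stringArray = ConstantDescs.CD_String.arrayType();    // [Ljava/lang/String;
        return MethodTypeDesc.of(ConstantDescs.CD_void, stringArray).descriptorString();
    }

    public static void main(String[] args) {
        System.out.println(mainDescriptor());                           // ([Ljava/lang/String;)V
        System.out.println(ClassDesc.of("java.io.PrintStream").descriptorString()); // Ljava/io/PrintStream;
        System.out.println(int[][].class.descriptorString());           // [[I

        // Parsing goes the other way: from a descriptor string to structured types.
        MethodTypeDesc parsed = MethodTypeDesc.ofDescriptor("(Ljava/lang/String;)V");
        System.out.println(parsed.parameterCount());                    // 1
        System.out.println(parsed.returnType().descriptorString());     // V
    }
}
```

Building descriptors from constants such as CD_String avoids typos in the raw string form, at the cost of being a little more verbose.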
Java bytecode instructions
We’ll need to add some code to our main method to actually get our Hello World program to print “Hello World”. The code that we need to generate is, of course, Java bytecode.
Since our Hello World program is very simple we’ll just need a few instructions to:
- load the string “Hello World”
- execute System.out.println
A Java virtual machine is a stack-based machine: many of the instructions deal with pushing and popping from the operand stack. For example, the instruction ldc is used to load a constant onto the stack and the invoke instructions will pop their operands from the stack.
In order to execute an instance method, such as println, we can use the invokevirtual instruction. The first operand for invokevirtual is a reference to the instance on which the method will be called: in our case a reference to System.out. The System.out instance and the string “Hello World” will be popped from the stack and the method will be executed.
In total, for our Hello World program, we’ll need 4 different bytecode instructions:
Instruction | Stack before | Stack after | Example | Example Description |
getstatic | …, | …, value | getstatic Ljava/lang/System; out | Pushes a reference to the System.out instance onto the stack |
ldc | …, | …, value | ldc “Hello World” | Pushes the constant “Hello World” onto the stack |
invokevirtual | …, objectref, [arg1, arg2, argN] | …, [return value] | invokevirtual Ljava/io/PrintStream; println(Ljava/lang/String;)V | Pops the reference to System.out and the “Hello World” string, and executes println |
return | …, | empty | return | Returns from a method |
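The stack effects in the table can be mimicked in plain Java with an explicit stack. This is only an illustrative model of the operand stack, not how the JVM is implemented; OperandStackDemo and run are hypothetical names.

```java
import java.io.PrintStream;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class OperandStackDemo {

    /** Mimics the four instructions and returns the stack depth after each of the first three. */
    static List<Integer> run(PrintStream systemOut) {
        Deque<Object> stack = new ArrayDeque<>();
        List<Integer> depths = new ArrayList<>();

        stack.push(systemOut);                            // getstatic System.out
        depths.add(stack.size());

        stack.push("Hello World");                        // ldc "Hello World"
        depths.add(stack.size());

        String argument = (String) stack.pop();           // invokevirtual pops the argument first...
        PrintStream receiver = (PrintStream) stack.pop(); // ...then the object reference
        receiver.println(argument);
        depths.add(stack.size());                         // println returns void, so nothing is pushed

        return depths;                                    // return: the method ends here
    }

    public static void main(String[] args) {
        System.out.println(run(System.out));
    }
}
```

Running it prints Hello World followed by [1, 2, 0]: the stack grows by one for getstatic and ldc, and invokevirtual consumes both values.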
CodeBuilder
We’ve already added a main method to our program using the withMethod builder method, which provided us with a MethodBuilder, but so far the method doesn't have any code.
The ClassBuilder also provides a withMethodBody builder method, a shortcut for adding code directly when there is no need to modify other method attributes. The final parameter of withMethodBody is a lambda function that receives a CodeBuilder, which allows us to build the code for the method.
As we learnt in the previous section, we’ll need to generate four instructions: getstatic, ldc, invokevirtual and return. The CodeBuilder methods closely resemble the JVM instruction set, so our snippet to print “Hello World” calls four methods with familiar names (return_ has a trailing underscore because return is a reserved word in Java):
import java.lang.classfile.ClassFile;
import java.lang.constant.ClassDesc;
import java.lang.constant.MethodTypeDesc;
import java.nio.file.Path;
import static java.lang.classfile.ClassFile.*;
ClassFile.of().buildTo(Path.of("HelloWorld.class"), ClassDesc.of("HelloWorld"), classBuilder -> classBuilder
    .withMethodBody("main", MethodTypeDesc.ofDescriptor("([Ljava/lang/String;)V"), ACC_PUBLIC | ACC_STATIC, codeBuilder -> codeBuilder
        .getstatic(ClassDesc.of("java.lang.System"), "out", ClassDesc.of("java.io.PrintStream"))
        .ldc("Hello World")
        .invokevirtual(ClassDesc.of("java.io.PrintStream"), "println", MethodTypeDesc.ofDescriptor("(Ljava/lang/String;)V"))
        .return_()));
Finally, “Hello World”
Using the new JEP 457 Class-File API, we’ve written a Java program that produces a Java class file which, when executed, prints “Hello World”.
You should be able to execute the generated HelloWorld.class file and see the result yourself:
$ java HelloWorld
Hello World
Congratulations! You’ve taken your first step into the world of Java bytecode and learnt your first 4 Java bytecode instructions!
Next steps
We’ve only just scratched the surface of Java class files, Java bytecode and the new JEP 457 Class-File API. We've shown how to create a class file, but the new API also has the tools to read and transform class files.
For your next steps, take a look at the JEP, the preview API documentation, the JDK source code and, to learn more Java bytecode instructions, the Java Virtual Machine Specification. I'd also recommend watching Brian Goetz's great talk introducing the new API.
The full code for the Hello World example can be found here on GitHub.
This article was originally written using ProGuardCORE to build the Java class file and corresponding bytecode, rather than the new JEP 457 Class-File API.
ProGuardCORE provides tools for reading, writing and transforming class files, just like the new Class-File API. The CodeBuilder API is particularly similar to the CompactCodeAttributeComposer API in ProGuardCORE because both are closely aligned with the Java bytecode specification, i.e. they provide methods whose names match almost 1:1 with the Java bytecode instruction names.
A big difference between the APIs, though, is that ProGuardCORE (like ASM) makes heavy use of the visitor pattern. The new JEP 457 API avoids it because Java now has powerful features such as pattern matching and sealed classes, which render the visitor pattern unnecessary.
Additionally, unlike the new API, ProGuardCORE also provides code analysis tools to analyse Java bytecode which are used by software such as the open-source ProGuard shrinker, the Android security solution DexGuard and the application security testing tool AppSweep.
The original article was first published as part of Java Advent 2022.