Theo's Blog

Published

- 7 min read

Bytecode manipulation in JVM

img of Bytecode manipulation in JVM

Bytecode manipulation?

Bytecode manipulation is a thing that allows you to dynamically inject/modify/overwrite code in JVM apps. So yes, this will work for not just Java, but also Kotlin, Scala and Groovy.

This doesn’t seem very practical, but it’s very useful in games running on JVM. For example: Minecraft! Minecraft’s modding tools (Mixins) use bytecode manipulation to modify Minecraft’s code. Also, it’s important to know how JVM works behind the wall… i think.

The structure

TODO

Fine. But how?

Well, first, you’d need to understand how JVM works. When you compile your .java files you get .class files. Class files are the bytecode. Bytecode is a sequence of bytes, that describes the… well, code.

Each byte is a different instruction (or character, if it’s a string). Don’t worry, you won’t have to memorize what byte corresponds to what instruction.

So what you need to do, to modify bytecode of a class, you just need to parse the bytes, change a few bytes and write it back! Sounds rather simple, but how do you make JVM load the modifed class instead of the original? And is there a more convenient way of parsing bytecode? I mean, it’s gonna get nasty when you’ll try to build a house with a stone hatchet. I’ll break this article into multiple parts:

Libraries

There are not many bytecode manipulation libraries we can use out there. Here are some of the most popular solutions:

ObjectWeb ASM:

  • Active development.
  • License: BSD 3-Clause "New" or "Revised" License.
  • Apparently, used by Java’s compiler, SpongePowered’s Mixins and various build tools.
  • Low level.

Javassist:

  • Isn’t developed actively anymore, but mostly works.
  • Some features do not work (after Java 11 and/or Java 17).
  • License: MPL.
  • High level.

cglib:

  • Not supported anymore.
  • License: Apache License 2.0.
  • Low level.

ByteBuddy:

  • Active development.
  • Abstracts bytecode manipulation a lot.
  • License: Apache License 2.0.
  • High level.

Which one of these to use? The answer is, as always, it depends. If you want to generate some code (if you’re making a gradle plugin for example), then you’ll likely find ByteBuddy the easiest to use. If you want full control - you’ll find that ASM is your only option. If you want to hack something quickly without putting too much effort, Javassist may be the tool for you.

In this blog post I’m going to cover the use of ObjectWeb ASM and some of the ByteBuddy’s tooling (more on that later).

How to load the modified class?

Let’s say that we modified a class, or we loaded it from memory. How to make JVM load the class from bytes? Really, there are 2 solutions, and you should choose which one to use depending on your use-case.

Here’s a quick comparison between the solutions. I would still recommend to read the rest of the section, because it’s important to know how they work.

ClassLoadersAgents
Re-define loaded classes
Intercept JVM class loading
Define classes

1. ClassLoaders

ClassLoaders are special classes that handle class loading, and, conveniently, they have a protected method, called defineClass which accepts an array of bytes (the bytecode), it’s length, offset and the class’ name. Unfortunately, in the latest Java versions you can’t use reflections to invoke that method, so you have to subclass the ClassLoader and override the method.

Defining classes dynamically:

If you override the method and make it public, it will allow you to define classes dynamically at runtime!

   // This classloader allows other classes to define classes at runtime!
public class CustomClassLoader extends ClassLoader {

    @Override
    public Class<?> defineClass(String name, byte[] bytes, int offset, int length) {
        return super.defineClass(name, bytes, offset, length);
    }

    // An utility method, a shortcut for the method above.
    public Class<?> defineClass(String name, byte[] bytes) {
        return this.defineClass(name, bytes, 0, bytes.length);
    }
}

CustomClassLoader loader = ...;
byte[] bytes = getBytecode();

Class<?> clazz = loader.defineClass("name/of/the/Class", bytes);

UPD: You can also use MethodHandles to define classes dynamically!

   MethodHandles.Lookup lookup = MethodHandles.lookup(); // any other way of getting the lookup class will do
byte[] bytes = getBytecode();

lookup.defineClass(bytes);

Intercepting class loading:

JVM calls the defineClass method as well! So if you just insert a check there, you can get away with modifying some class’ code.

Warning: you either need to force load the classes to load from the custom class loader, or make it the context class loader for this thread.

   // This classloader allows other classes to define classes at runtime!
public class CustomClassLoader extends ClassLoader {

    @Override
    public Class<?> defineClass(String name, byte[] bytes, int offset, int length) {
        if (name.equals("name/of/the/Class"))
            super.define(name, transform(bytes), offset, length);

        return super.defineClass(name, bytes, offset, length);
    }
}

CustomClassLoader loader = ...;
Thread.currentThread().setContextClassLoader(loader);
// or
loader.loadClass("name.of.the.Class");

2. Agents

Agents are a wonderful thing. They were meant for debugging & profiling tools, but we can use it’s functionality to our advantage!

Warning: you can’t use agents without the JDK!

Agents can’t define new classes, but they can re-define classes and intercept class loading without having to do class loader juggling. Quite amazing if you ask me.

One slight problem though: to get your agent running, you need to put it in the run arguments of JVM, like this:

   java -javaagent:agent.jar -jar app.jar

If you can’t have that (for example in Minecraft), then you’ll likely want to use a self-attaching agent. If you don’t mind using the command argument, you can move on to Using an Agent section.

Self-attaching an agent

There are ways to screw around with agents and force the JVM to self-attach it, but it’s unnecessary complicated and really based on hopes and dreams.

There is a way to do this more conveniently though. Remember when I said I’ll cover ByteBuddy’s tooling earlier? The time has come! Introducing, ByteBuddy Agent! It’s a library made for ByteBuddy, but very convenient for our use case as well!

To use it, just put it in the dependencies in your build system and hook the agent:

   Instrumentation instrumentation = ByteBuddyAgent.install();

Implementing an agent

Agents have 2 entrypoints: premain and agentmain. The first one gets executed after JVM got loaded, but before the app did, and the latter gets executed after JVM initializes the app.

   public class Agent {

    public static void premain(String args, Instrumentation instrumentation) {
        // ...
    }

    public static void agentmain(String args, Instrumentation instrumentation) {
        // ...
    }
}

(You don’t have to add both entrypoints if you use just one though)

Make sure to add these entries to your manifest file, or it won’t work!

   Agent-Class: path.to.your.Agent
Premain-Class: path.to.your.Agent
Can-Redefine-Classes: true
Can-Retransform-Classes: true

(The last 2 entries are optional, but make sure to include them as well, or you might lose some functionality!)

Using an agent

Once you get an Instrumentation object, you can use it to re-define classes and register a ClassFileTransformer, which in turn allows you to intercept class loading.

   public class CustomTransformer implements ClassFileTransformer {

    @Override
    public byte[] transform(ClassLoader loader, String name, Class<?> clazz, ProtectionDomain domain, byte[] bytes) {
        // ...
        return bytes;
    }
}

instrumentation.addTransformer(new CustomTransformer(), true);

(You can actually return a null in the transform method for this specific case, since we don’t do any changes to the array… yet.)

Changing the bytecode

Now that you have a byte array and a place to put it - let’s do something with it! First, we need to read that array then you do something with it and write it back to a byte array!

   ClassReader reader = new ClassReader(bytes);
ClassWriter writer = new ClassWriter(ClassWriter.COMPUTE_FRAMES);

// ...

return writer.toByteArray();

Now, you can use the potential of ASM to do whatever you want! If you want a guide on how to use ObjectWeb ASM please let me know in the comments.