deserialization in Java

Serialization and deserialization in Java: explaining the Java deserialize vulnerability

Java serialization is a mechanism to transform an object into a byte stream. Java deserialization is exactly the other way around and allows us to recreate an object from a byte stream. Java serialization—and more specifically deserialization in Java—is also known as “the gift that keeps on giving”. This relates to the many security issues and other problems it has produced over the years.

What is serialization in Java?

In Java, we create objects. These objects live in memory and are removed by the garbage collector once they are not used anymore. If we want to transfer an object, for instance, store it on a disk or send it over a network, we need to transform it into a byte stream. To do this, the class of that object needs to implement the interface Serializable. Serialization is converting the state of an object into a byte stream. This byte stream does not contain the actual code.

What is deserialization in Java?

Deserialization is precisely the opposite of serialization. With deserialization, you have a byte stream and you recreate the object in the same state as when you serialized it. This means that you need to have the actual definition of the object to accomplish the recreation.

How does Java serialization work?

Java serialization uses reflection to scrape all the data from the object’s fields that need to be serialized. This includes private and final fields. If a field contains an object, that object is serialized recursively. Even though you might have getters and setters, these functions are not used when serializing an object in Java.

How does Java deserialization work?

When deserializing a byte stream back to an object it does not use the constructor. It creates an empty object and uses reflection to write the data to the fields. Just like with serialization, private and final fields are also included.

What is a Java deserialize vulnerability?

A Java deserialize vulnerability is a security vulnerability that occurs when a malicious user tries to insert a modified serialized object into the system that eventually compromises the system or its data. Think of arbitrary code execution that can be triggered when deserializing a serialized object. To better explain Java deserialize vulnerabilities, we first need to explore how deserialization works in Java.

Explaining Java deserialize vulnerabilities

A serialized object in Java is a byte array with information of the state. It contains the name of the object it refers to and the data of the field. If you look at a stored serialized object with a hex-editor, you can enclose and manipulate the information quickly.

We already know that Java deserialization does not use the constructor to create an object but rather uses reflection to load the fields. This means that any validation checks done in the constructor are never called when recreating the object. You can think about checks like start-date before end-date when describing a period. When deserializing a Java object, this new object can have an invalid state.

Let’s look at the following example of Java deserialize vulnerability where we serialize an object from a serializable class ValueObject:

public class ValueObject implements Serializable {

   private String value;
   private String sideEffect;

   public ValueObject() {
       this("empty");
   }

   public ValueObject(String value) {
       this.value = value;
       this.sideEffect = java.time.LocalTime.now().toString();
   }
}


ValueObject vo1 = new ValueObject("Hi");
FileOutputStream fileOut = new FileOutputStream("ValueObject.ser");
ObjectOutputStream out = new ObjectOutputStream(fileOut);
out.writeObject(vo1);
out.close();
fileOut.close();

When reading the file ValueObject.ser containing the serialized object with a hex-editor the output is this.

serialization in Java example:  file “ValueObject.ser” containing the serialized object with a hex-editor the output is this

Now I can easily manipulate the string value. Below I change it from Hi to Hallo.

FileInputStream fileIn = new FileInputStream("ValueObject2.ser");
ObjectInputStream in = new ObjectInputStream(fileIn);
ValueObject vo2 = (ValueObject) in.readObject();

When deserializing the adjusted binary file, we find out that the object’s value changed. We also see that the timestamp didn’t change, proving that the constructor is never called. So, if an application accepts serialized objects, it is relatively easy to temper with the values. By altering the serialized objects, we can create invalid objects, mess with the data’s integrity, or even worse.

Arbitrary code execution, gadgets, and chains

Tampering with the data in an object is harmful already. However, this can also lead to code execution if the correct set of objects is deserialized. To explain this I first have to explain gadgets and chains.

Gadgets

A gadget—as used by Lawrence & Frohoff in their talk Marschalling Pickle at AppSecCali 2015—is a class or function that has already existing executable code present in the vulnerable process. This existing executable code can be reused for malicious purposes. If we look at Java serializable objects, some magic methods—like the private readObject() method—are reflectively called when deserializing. 

Let’s look at the simplified gadget below:

public class Gadget implements Serializable {

   private Runnable command;

   public Gadget(Command command) {
       this.command = command;
   }

   private final void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
       in.defaultReadObject();
       command.run();
   }
}

This gadget class overrides the default readObject method. As a result, every time an Object of class Gadget gets deserialized, the Runnable object command is executed. When a command class looks something like the example below, it is easy to manipulate this serialized object and perform code injection.

public class Command implements Runnable, Serializable {

   private String command;

   public Command(String command) {
       this.command = command;
   }

   @Override
   public void run() {
       try {
           Runtime.getRuntime().exec(command);
       } catch (IOException e) {
           throw new RuntimeException(e);
       }
   }

Also, note that if an application accepts serialized objects, the object is deserialized first before it is cast to the desired type. This means that even if casting fails, deserialization is already completed and the readObject() method is executed. 

FileInputStream fileIn = new FileInputStream("Gadget.ser");
ObjectInputStream in = new ObjectInputStream(fileIn);
var obj = (ValueObject)in.readObject();

Gadget chains deserialization attack

A typical deserialization attack consists of a cleverly crafted chain of gadgets. An attacker searches for a gadget that is usable for launching an attack and chains several executions that end with arbitrary code execution, for instance.

In our example:

Gadget -> readObject() -> command.run() -> Runtime.getRuntime().exec()

For a more real-life example, take a look at the implementation of java.util.HashMap. This class has a custom implementation of the readObject() method that triggers every key’s hashcode() function. 

Libraries

It is good to know that whatever gadget chains are available in your application, is not related to your code. Because we import lots of code from libraries and frameworks, the number of classes imported by your (transitive) dependencies influences certain gadget chains’ possibility. Although creating such a malicious gadget chain is very hard and labor-intensive, Java deserialization vulnerabilities are a genuine and dangerous security risk.

How to prevent a Java deserialize vulnerability?

The best way to prevent a Java deserialize vulnerability is to prevent Java serialization overall. If your application does not accept serialized objects at all, it cannot harm you.

However, if you do need to implement the Serializable interface due to inheritance, you can override the readObject(), as seen below, to prevent actual deserialization.

private final void readObject(ObjectInputStream in) throws java.io.IOException {
   throw new java.io.IOException("Deserialized not allowed");
}

If your application relies on serialized objects, you can consider inspecting your ObjectInputStream before deserializing. A library that can help you with this is the Apache Commons IO library. This library provides a ValidatedObjectInputStream where you can explicitly allow the objects you want to deserialize. Now you prevent that unexpected types are deserialized at all.

FileInputStream fileIn = new FileInputStream("Gadget.ser");
ValidatingObjectInputStream in = new ValidatingObjectInputStream(fileIn);
in.accept(ValueObject.class);
var obj = (ValueObject)in.readObject();

A tool like ysoserial is also extremely useful in finding Java deserialize vulnerabilities in your code. It is a tool that generates payload to discover gadget chains in common Java libraries that can, under the right conditions, exploit Java applications performing unsafe deserialization of objects.

Note that Java deserialization vulnerabilities are not exclusive to Java’s custom serialization implementation. Although this article merely focuses on this part, the same vulnerabilities exist in serialization or marshaling frameworks that handle this for you. If there is a framework that magically creates POJO’s out of XML, JSON, yaml, or similar formats, it probably uses reflection in the same way as described above. This means that the same problems exist. 

To prevent these kinds of Java deserialize vulnerabilities in your external libraries, scan your libraries with Snyk Open Source early on and often. It’s free!