How does deserialization work internally in Java?

Deserialization is the process by which a previously serialized object is reconstructed into its original form, i.e. an object instance. The input to the deserialization process is a stream of bytes, which we receive at the other end of a network connection or simply read from a file system, a database or memory. One question arises immediately: what is written inside this stream of bytes?


Stream Elements: A basic structure is required to represent objects in a stream. Every aspect of the object needs to be represented: the class metadata, the type information of the instance fields, and the values of the instance fields. The representation of objects in the stream can be described with a grammar. There are special representations for null objects, new objects, classes, arrays, strings, and back references to any object already in the stream. Each object written to the stream is assigned a handle that is used to refer back to the object. Handles are assigned sequentially starting from 0x7E0000. The handles restart at 0x7E0000 when the stream is reset.
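To make the stream layout a little more concrete, here is a minimal sketch that serializes a string into memory and looks at the first bytes of the result. The class name StreamHeaderDemo and its helper methods are made up for this illustration; ObjectStreamConstants is the JDK interface that defines the stream constants, including the first handle value mentioned above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the original example.
class StreamHeaderDemo {

    // Serializes any object into an in-memory byte array.
    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads a big-endian unsigned 16-bit value starting at the given offset.
    static int readUnsignedShort(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes("hello");
        // Every stream starts with a magic number and a version.
        System.out.printf("magic   = 0x%04X%n", readUnsignedShort(bytes, 0)); // 0xACED
        System.out.printf("version = %d%n", readUnsignedShort(bytes, 2));     // 5
        // The next byte is a type code; for a new string it is TC_STRING.
        System.out.println("string tag? " + (bytes[4] == ObjectStreamConstants.TC_STRING));
        // Handles for back references start at this value.
        System.out.printf("first handle = 0x%X%n", ObjectStreamConstants.baseWireHandle);
    }
}
```

The header is followed by the grammar elements described above; the sketch only inspects the first few bytes and does not try to parse the whole stream.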

This information is needed when the object is reconstructed into a new instance. While deserializing an object, the JVM reads its class metadata from the stream of bytes; among other things, this metadata specifies whether the class of the object implements the ‘Serializable’ or the ‘Externalizable’ interface.
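The class metadata can also be inspected from ordinary code. The sketch below uses java.io.ObjectStreamClass, the JDK's serialization descriptor for a class, to print the name, the serialVersionUID and the serializable fields. The classes DescriptorDemo, Point and Plain are hypothetical and exist only for this illustration.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;

// Hypothetical demo class, not part of the original example.
class DescriptorDemo {

    // Hypothetical serializable class.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        private int x;
        private int y;
        private transient int cachedHash; // transient: not part of the serialized form
    }

    // Hypothetical class that is not serializable.
    static class Plain { }

    public static void main(String[] args) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(Point.class);
        System.out.println("class = " + desc.getName());
        System.out.println("serialVersionUID = " + desc.getSerialVersionUID());
        // Only non-static, non-transient fields are listed.
        for (ObjectStreamField field : desc.getFields()) {
            System.out.println("field " + field.getName() + " : " + field.getType().getSimpleName());
        }
        // lookup returns null for a class that is neither Serializable nor Externalizable.
        System.out.println("Plain descriptor = " + ObjectStreamClass.lookup(Plain.class));
    }
}
```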


To create an object from a byte stream, the bytecode of the class whose object is being deserialized must be available to the JVM performing the deserialization. Otherwise, a ‘ClassNotFoundException’ is thrown. If the class implements the Serializable interface, then an instance of the class is created without invoking any of its constructors. Let us check how an object can be created without calling its constructor.

 

public class TestDeserialization
{
    public static void main(String[] args)
    {
        System.out.println("Hello World!");
    }
}

Byte code:

public class TestDeserialization extends java.lang.Object{
public TestDeserialization();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return
 
public static void main(java.lang.String[]);
  Code:
   0:   getstatic   #2; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:   ldc #3; //String Hello World!
   5:   invokevirtual   #4; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   8:   return
}

In the first instruction of the constructor's bytecode above, a value is pushed from the “local variable table” onto the stack. In this case, it is the implicit reference to “this”. The second instruction is the important one. It invokes the constructor of the direct superclass, which in this case is the Object class, so the chain of constructor calls always ends in Object. Once the superclass constructor has returned, the rest of the bytecode executes the instructions written in the constructor body (here, only a return).

In the deserialization process, every superclass of the instance must either be Serializable or, if it is not, have an accessible no-arg constructor. So, during deserialization the JVM walks up the class hierarchy until it finds the first non-serializable class. If all superclasses are serializable, the JVM ends up at the Object class itself and uses the constructor of Object. If a non-serializable class is found on the way up, its no-arg constructor is used to initialize the newly allocated instance. If that non-serializable superclass does not have an accessible no-arg constructor, a java.io.InvalidClassException is thrown.
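The rule above can be observed with a small experiment. The following is a minimal sketch, assuming two hypothetical classes (Animal, which is not serializable, and Dog, which is) and a list that records which constructors run. It is only an illustration, not the code used elsewhere in this article.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original example.
class ConstructorCallDemo {

    // Records which constructors actually run.
    static final List<String> calls = new ArrayList<>();

    // Hypothetical non-serializable superclass with a no-arg constructor.
    static class Animal {
        Animal() {
            calls.add("Animal()");
        }
    }

    // Hypothetical serializable subclass.
    static class Dog extends Animal implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;

        Dog(String name) {
            this.name = name;
            calls.add("Dog(String)");
        }
    }

    // Serializes the object to memory and reads it back.
    static Dog roundTrip(Dog dog) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(dog);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (Dog) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Dog original = new Dog("Rex");
        System.out.println("new Dog(...)    ran: " + calls); // [Animal(), Dog(String)]
        calls.clear();
        Dog copy = roundTrip(original);
        System.out.println("deserialization ran: " + calls); // [Animal()]
        System.out.println("copy.name = " + copy.name);      // Rex
    }
}
```

Only the constructor of the non-serializable superclass runs during deserialization; the Dog constructor is skipped, and the name field is restored from the stream instead.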

Consider the following example:

I have a class Person:

public class Person {
    private final String name;
    public Person(String name) {
        super();
        this.name = name;
    }
}

Now I have a child class Employee extending the Person class and implementing the Serializable interface.

package com.tuturself.serialization;

import java.io.Serializable;

public class Employee extends Person implements Serializable {
    private final Integer employeeId;
    public Employee(Integer employeeId, String name) {
        super(name);
        this.employeeId = employeeId;
    }
    @Override
    public String toString() {
        return "Employee [employeeId=" + employeeId + "]";
    }
}

Now the question is: can I serialize an Employee object, given that Employee implements the Serializable interface? The answer is that serialization itself works fine, but we get an exception when we try to deserialize the object. Let us check why this happens.

package com.tuturself.serialization;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class SerializeDemo {

  public static void serialize() throws IOException {
	Employee employee = new Employee(10,"Ninja Panda");
	System.out.println("After creation:=" + employee);
	System.out.println("Before serialization:=" + employee);
	FileOutputStream fos = new FileOutputStream("serial.ser");
	ObjectOutputStream oos = new ObjectOutputStream(fos);
	oos.writeObject(employee);
	oos.close();
  }
  public static void deSerialize() throws IOException,ClassNotFoundException {
	FileInputStream fis = new FileInputStream("serial.ser");
	ObjectInputStream ois = new ObjectInputStream(fis);
	Employee employee = (Employee) ois.readObject();
	System.out.println("After de-serialization:=" + employee);
	ois.close();
  }

  // Our Client code which is serializing and deserializing the object.
  public static void main(String[] args) {
    try {
	serialize();
	deSerialize();
    } catch (ClassNotFoundException ce) {
	ce.printStackTrace();
    } catch (IOException io) {
	io.printStackTrace();
    }
  }
}

The output of the program is :

After creation:=Employee [employeeId=10]
Before serialization:=Employee [employeeId=10]
java.io.InvalidClassException: com.tuturself.serialization.Employee; no valid constructor
	at java.io.ObjectStreamClass$ExceptionInfo.newInvalidClassException(ObjectStreamClass.java:150)
	at java.io.ObjectStreamClass.checkDeserialize(ObjectStreamClass.java:790)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1782)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
	at com.tuturself.serialization.SerializeDemo.deSerialize(SerializeDemo.java:23)
	at com.tuturself.serialization.SerializeDemo.main(SerializeDemo.java:32)

Let us check why we are not able to deserialize the object. The deserialization process needs to reconstruct the entire object state, and that includes the state of its superclasses as well. If a superclass is not serializable, then deserialization needs to initialize that part of the object by running a constructor of the superclass. In order to do that, the superclass needs to have an accessible no-arg constructor. Serialization does not create any instance, so writing the object succeeds. The problem is detected only during deserialization: as the stack trace shows, the InvalidClassException is thrown from ObjectStreamClass.checkDeserialize, which is called by ObjectInputStream.readObject. So, in order to solve our current problem, we supply the Person class with a no-arg constructor.
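To see at which step the exception surfaces, the two halves can be separated. Here is a minimal in-memory sketch, assuming hypothetical classes Base (not serializable, no no-arg constructor) and Child (serializable) that mirror Person and Employee. The helper names are invented for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the original example.
class NoValidConstructorDemo {

    // Hypothetical superclass: not Serializable and without a no-arg constructor.
    static class Base {
        final String label;

        Base(String label) {
            this.label = label;
        }
    }

    // Hypothetical serializable subclass.
    static class Child extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;

        Child(int id, String label) {
            super(label);
            this.id = id;
        }
    }

    // Writes the object into a byte array.
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Tries to read an object back; returns the exception that was thrown, or null on success.
    static Exception deserializationFailure(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            in.readObject();
            return null;
        } catch (IOException | ClassNotFoundException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new Child(10, "Ninja Panda"));
        System.out.println("serialized " + bytes.length + " bytes without an exception");

        Exception failure = deserializationFailure(bytes);
        System.out.println("failed with InvalidClassException? "
                + (failure instanceof InvalidClassException));
        System.out.println("message: " + (failure == null ? "none" : failure.getMessage()));
    }
}
```

Writing succeeds and only the read fails, which matches the stack trace shown earlier.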

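public class Person {
    private final String name;
    // no-arg constructor added; deserialization of Employee runs this one
    public Person() {
        super();
        this.name = null;
    }
    public Person(String name) {
        super();
        this.name = name;
    }
}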

So far, the instance has been allocated in memory and initialized using the no-arg constructor of the first non-serializable superclass. Note that after this, no other constructor is called for any class. After executing that superclass constructor, the JVM reads the byte stream and uses the class metadata stored in it to determine the type information and the fields of the instance.

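After the blank instance is created, the JVM restores its non-static, non-transient fields from the byte stream. Static fields belong to the class, not to the instance, so they are never written to the stream and are not touched during deserialization. For each serializable class in the hierarchy, the default field-reading logic is used, unless the class declares its own private readObject(ObjectInputStream) method, in which case that method is called instead. Either way, this step is responsible for copying the values from the byte stream into the blank instance. Once it has completed, the deserialization process is done and you are ready to work with the new deserialized instance.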
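The field-restoring step can be sketched as follows. The class Session and its fields are hypothetical and chosen only to show three cases side by side: a regular field, a transient field and a static field. The private readObject method follows the signature that the serialization mechanism looks for, and it delegates to defaultReadObject() for the actual field values.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class, not part of the original example.
class ReadObjectDemo {

    // Hypothetical serializable class.
    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;
        static int liveCount = 0;         // static: belongs to the class, never in the stream
        String user;                      // regular field: written and restored
        transient String token;           // transient: skipped by default serialization
        transient boolean customReadRan;  // set by readObject below

        Session(String user, String token) {
            this.user = user;
            this.token = token;
        }

        // Called by the serialization mechanism in place of the default field reading.
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject(); // restores the non-static, non-transient fields
            customReadRan = true;
        }
    }

    static byte[] serialize(Session session) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(session);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    static Session deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Session) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Session.liveCount = 1;
        byte[] bytes = serialize(new Session("alice", "secret"));
        Session.liveCount = 42; // changed after serialization

        Session copy = deserialize(bytes);
        System.out.println("user = " + copy.user);                   // alice
        System.out.println("token = " + copy.token);                 // null
        System.out.println("liveCount = " + Session.liveCount);      // 42, not reset to 1
        System.out.println("customReadRan = " + copy.customReadRan); // true
    }
}
```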
