High performance Serialization

Serialization is the process of converting an object into a stream of bytes. That stream can then be sent through a socket, stored in a file or database, or simply manipulated as is. This article does not attempt an in-depth description of the serialization mechanism; numerous articles already provide that kind of information. What we discuss here is our proposal for using serialization in a way that achieves high performance.

The three main performance problems with serialization are :

  • Serialization is a recursive algorithm. Starting from a single object, all the objects that can be reached from that object by following instance variables are also serialized. The default behavior can easily lead to unnecessary serialization overhead
  • Both serializing and deserializing require the serialization mechanism to discover information about the instance it is processing. The default serialization mechanism uses reflection to discover all the field values. Furthermore, if you don’t explicitly set a “serialVersionUID” class attribute, the serialization mechanism has to compute it. This involves going through all the fields and methods to generate a hash, which can be quite slow
  • With the default serialization mechanism, all the class description information of the serialized class is included in the stream, such as :
    • The description of all the serializable superclasses
    • The description of the class itself
    • The instance data associated with the specific instance of the class
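To make the first point concrete, here is a minimal sketch of how the default mechanism follows references. The `GraphSizeDemo` class and its `Node` type are hypothetical and exist only for this illustration. Serializing the last node of a chain also serializes every node reachable from it, so the stream grows with the size of the object graph.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the Employee example.
class GraphSizeDemo {

    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final Node parent; // default serialization follows this reference

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    // Serializes the object with the default mechanism and returns the stream size in bytes.
    static int serializedSize(Serializable object) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.size();
    }

    public static void main(String[] args) {
        Node alone = new Node("leaf", null);
        Node chained = new Node("leaf", new Node("middle", new Node("root", null)));
        System.out.println("single node:    " + serializedSize(alone) + " bytes");
        System.out.println("chain of three: " + serializedSize(chained) + " bytes");
    }
}
```

Marking a field `transient` is the usual way to stop the default mechanism from following a reference that does not need to travel with the object.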

To solve the aforementioned performance problems you can use Externalization instead. The major difference between the two methods is that Serialization writes out the class descriptions of all the serializable superclasses, along with the information associated with the instance when viewed as an instance of each individual superclass. Externalization, on the other hand, writes out the identity of the class (the name of the class and the appropriate “serialVersionUID” class attribute) along with the superclass structure and all the information about the class hierarchy. In other words, it stores all the metadata, but writes out only the local instance information. In short, Externalization eliminates almost all the reflective calls used by the serialization mechanism and gives you complete control over the marshalling and demarshalling algorithms, which can result in dramatic performance improvements.
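Before the full example, a minimal sketch of the interface itself may help. It assumes a hypothetical two-field `Point` class (not part of the Employee example) and shows the two methods an Externalizable class must provide, plus the public no-argument constructor that the deserializer calls before `readExternal`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical demo class used only for illustration.
class ExternalizableSketch {

    public static class Point implements Externalizable {
        private static final long serialVersionUID = 1L;
        int x;
        int y;

        // Externalizable requires a public no-argument constructor.
        public Point() { }

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            // We decide exactly which bytes go to the stream, and in which order.
            out.writeInt(x);
            out.writeInt(y);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            // Values must be read back in the same order they were written.
            x = in.readInt();
            y = in.readInt();
        }
    }

    // Writes the point with writeObject and reads it back with readObject.
    static Point roundTrip(Point point) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
                oos.writeObject(point);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(baos.toByteArray()))) {
                return (Point) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point copy = roundTrip(new Point(3, 4));
        System.out.println(copy.x + "," + copy.y); // prints 3,4
    }
}
```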

Of course, Externalization efficiency comes at a price. The default serialization mechanism adapts to application changes because its metadata is automatically extracted from the class definitions. Externalization, on the other hand, isn’t very flexible and requires you to rewrite your marshalling and demarshalling code whenever you change your class definitions.

What follows is a short demonstration of how to use Externalization in high performance applications. We start with the “Employee” object on which the serialization and deserialization operations are performed. Two flavors of the “Employee” object are used: one suitable for standard serialization operations, and another modified so that it can be externalized.

Below is the first flavor of the “Employee” object :

package com.javacodegeeks.test;

import java.io.Serializable;
import java.util.Date;
import java.util.List;

public class Employee implements Serializable {

    // Declared explicitly so the serialization mechanism does not have to compute it
    private static final long serialVersionUID = 3657773293974543890L;

    private String firstName;
    private String lastName;
    private String socialSecurityNumber;
    private String department;
    private String position;
    private Date hireDate;
    private Double salary;
    private Employee supervisor;
    private List<String> phoneNumbers;

    public Employee() {
    }

    public Employee(String firstName, String lastName,
            String socialSecurityNumber, String department, String position,
            Date hireDate, Double salary) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.socialSecurityNumber = socialSecurityNumber;
        this.department = department;
        this.position = position;
        this.hireDate = hireDate;
        this.salary = salary;
    }

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public String getLastName() {
        return lastName;
    }

    public void setLastName(String lastName) {
        this.lastName = lastName;
    }

    public String getSocialSecurityNumber() {
        return socialSecurityNumber;
    }

    public void setSocialSecurityNumber(String socialSecurityNumber) {
        this.socialSecurityNumber = socialSecurityNumber;
    }

    public String getDepartment() {
        return department;
    }

    public void setDepartment(String department) {
        this.department = department;
    }

    public String getPosition() {
        return position;
    }

    public void setPosition(String position) {
        this.position = position;
    }

    public Date getHireDate() {
        return hireDate;
    }

    public void setHireDate(Date hireDate) {
        this.hireDate = hireDate;
    }

    public Double getSalary() {
        return salary;
    }

    public void setSalary(Double salary) {
        this.salary = salary;
    }

    public Employee getSupervisor() {
        return supervisor;
    }

    public void setSupervisor(Employee supervisor) {
        this.supervisor = supervisor;
    }

    public List<String> getPhoneNumbers() {
        return phoneNumbers;
    }

    public void setPhoneNumbers(List<String> phoneNumbers) {
        this.phoneNumbers = phoneNumbers;
    }

}

Things to notice here :

  • We assume that the following fields are mandatory :
    • “firstName”
    • “lastName”
    • “socialSecurityNumber”
    • “department”
    • “position”
    • “hireDate”
    • “salary”

Following is the second flavor of the “Employee” object :

package com.javacodegeeks.test;

import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.util.Arrays;
import java.util.Date;
import java.util.List;

public class Employee implements Externalizable {

    private String firstName;
    private String lastName;
    private String socialSecurityNumber;
    private String department;
    private String position;
    private Date hireDate;
    private Double salary;
    private Employee supervisor;
    private List<String> phoneNumbers;

    // A public no-argument constructor is required for Externalizable classes
    public Employee() {
    }

    public Employee(String firstName, String lastName,
            String socialSecurityNumber, String department, String position,
            Date hireDate, Double salary) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.socialSecurityNumber = socialSecurityNumber;
        this.department = department;
        this.position = position;
        this.hireDate = hireDate;
        this.salary = salary;
    }

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public String getLastName() {
        return lastName;
    }

    public void setLastName(String lastName) {
        this.lastName = lastName;
    }

    public String getSocialSecurityNumber() {
        return socialSecurityNumber;
    }

    public void setSocialSecurityNumber(String socialSecurityNumber) {
        this.socialSecurityNumber = socialSecurityNumber;
    }

    public String getDepartment() {
        return department;
    }

    public void setDepartment(String department) {
        this.department = department;
    }

    public String getPosition() {
        return position;
    }

    public void setPosition(String position) {
        this.position = position;
    }

    public Date getHireDate() {
        return hireDate;
    }

    public void setHireDate(Date hireDate) {
        this.hireDate = hireDate;
    }

    public Double getSalary() {
        return salary;
    }

    public void setSalary(Double salary) {
        this.salary = salary;
    }

    public Employee getSupervisor() {
        return supervisor;
    }

    public void setSupervisor(Employee supervisor) {
        this.supervisor = supervisor;
    }

    public List<String> getPhoneNumbers() {
        return phoneNumbers;
    }

    public void setPhoneNumbers(List<String> phoneNumbers) {
        this.phoneNumbers = phoneNumbers;
    }

    public void readExternal(ObjectInput objectInput) throws IOException,
            ClassNotFoundException {

        // Mandatory fields, read in exactly the order they were written
        this.firstName = objectInput.readUTF();
        this.lastName = objectInput.readUTF();
        this.socialSecurityNumber = objectInput.readUTF();
        this.department = objectInput.readUTF();
        this.position = objectInput.readUTF();
        this.hireDate = new Date(objectInput.readLong());
        this.salary = objectInput.readDouble();

        // Number of non mandatory fields present in the stream
        int attributeCount = objectInput.read();

        // Positions of the non mandatory fields that follow
        byte[] attributes = new byte[attributeCount];

        objectInput.readFully(attributes);

        for (int i = 0; i < attributeCount; i++) {
            byte attribute = attributes[i];

            switch (attribute) {
            case (byte) 0:
                this.supervisor = (Employee) objectInput.readObject();
                break;
            case (byte) 1:
                this.phoneNumbers = Arrays.asList(objectInput.readUTF().split(";"));
                break;
            }
        }

    }

    public void writeExternal(ObjectOutput objectOutput) throws IOException {

        // Mandatory fields
        objectOutput.writeUTF(firstName);
        objectOutput.writeUTF(lastName);
        objectOutput.writeUTF(socialSecurityNumber);
        objectOutput.writeUTF(department);
        objectOutput.writeUTF(position);
        objectOutput.writeLong(hireDate.getTime());
        objectOutput.writeDouble(salary);

        // One marker per non mandatory field: 1 means the field has a value
        byte[] attributeFlags = new byte[2];

        int attributeCount = 0;

        if (supervisor != null) {
            attributeFlags[0] = (byte) 1;
            attributeCount++;
        }
        if (phoneNumbers != null && !phoneNumbers.isEmpty()) {
            attributeFlags[1] = (byte) 1;
            attributeCount++;
        }

        objectOutput.write(attributeCount);

        // Positions of the fields that are present, in ascending order
        // (e.g. [0, 1] when both non mandatory fields have values)
        byte[] attributes = new byte[attributeCount];

        int j = 0;

        for (int i = 0; i < 2; i++)
            if (attributeFlags[i] == (byte) 1) {
                attributes[j] = (byte) i;
                j++;
            }

        objectOutput.write(attributes);

        for (int i = 0; i < attributeCount; i++) {
            byte attribute = attributes[i];

            switch (attribute) {
            case (byte) 0:
                objectOutput.writeObject(supervisor);
                break;
            case (byte) 1:
                // Phone numbers are written as a single ";" separated String
                StringBuilder rowPhoneNumbers = new StringBuilder();
                for (int k = 0; k < phoneNumbers.size(); k++)
                    rowPhoneNumbers.append(phoneNumbers.get(k) + ";");
                rowPhoneNumbers.deleteCharAt(rowPhoneNumbers.lastIndexOf(";"));
                objectOutput.writeUTF(rowPhoneNumbers.toString());
                break;
            }
        }

    }
}

Things to notice here :

  • We implement the “writeExternal” method for marshalling the “Employee” object. All mandatory fields are written to the stream
  • For the “hireDate” field we write only the number of milliseconds represented by this Date object. That value counts milliseconds since the epoch (January 1, 1970, 00:00:00 GMT) and a Date object carries no time zone of its own, so it is all the information we need to properly deserialize the “hireDate” field, whatever time zones the marshaller and the demarshaller run in. Keep in mind that we could serialize the entire “hireDate” object by using the “objectOutput.writeObject(hireDate)” operation. In that case the default serialization mechanism would kick in, making the operation slower and the resulting stream larger
  • All the non mandatory fields (“supervisor” and “phoneNumbers”) are written to the stream only when they have actual values (non-null and, for “phoneNumbers”, non-empty). To implement this we use the “attributeFlags” and “attributes” byte arrays. Each position of the “attributeFlags” array represents a non mandatory field and holds a marker indicating whether that field has a value. We check each non mandatory field and populate “attributeFlags” with the corresponding markers. The “attributes” byte array then lists, by position, the non mandatory fields that must actually be written to the stream. For example, if both “supervisor” and “phoneNumbers” have values, “attributeFlags” is [1,1] and “attributes” is [0,1]. If only “phoneNumbers” has a value, “attributeFlags” is [0,1] and “attributes” is [1]. This algorithm keeps the size footprint of the resulting stream minimal. To properly deserialize the non mandatory parameters of the “Employee” object we must write to the stream only the following information :
    • The overall number of non mandatory parameters that will be written (aka the “attributes” byte array size – for the demarshaller to parse)
    • The “attributes” byte array (for the demarshaller to properly assign field values)
    • The actual non mandatory parameter values
  • For the “phoneNumbers” field we construct and write to the stream a String representation of its contents. Alternatively we could serialize the entire “phoneNumbers” object by using the “objectOutput.writeObject(phoneNumbers)” operation. In that case the default serialization mechanism would kick in, making the operation slower and the resulting stream larger
  • We implement the “readExternal” method for demarshalling the “Employee” object. All mandatory fields are read from the stream in the order they were written. For the non mandatory fields the demarshaller assigns the appropriate field values according to the protocol described above
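The protocol for non mandatory fields can also be sketched in isolation. The example below is a simplified illustration, not the article's code: the `OptionalFieldSketch` class and its methods are hypothetical, the optional values are plain Strings, and plain data streams stand in for the object streams. It writes the count, then the positions of the fields that are present, then the values, and reads them back in the same way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class; each array slot stands for one non mandatory field.
class OptionalFieldSketch {

    // Returns the positions of the non-null fields, in ascending order.
    static byte[] presentIndexes(String[] optionalFields) {
        byte[] indexes = new byte[optionalFields.length];
        int count = 0;
        for (int i = 0; i < optionalFields.length; i++) {
            if (optionalFields[i] != null) {
                indexes[count++] = (byte) i;
            }
        }
        return Arrays.copyOf(indexes, count);
    }

    static byte[] write(String[] optionalFields) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(baos);
            byte[] indexes = presentIndexes(optionalFields);
            out.write(indexes.length);       // 1. how many optional values follow
            out.write(indexes);              // 2. which fields they belong to
            for (byte index : indexes) {     // 3. the values themselves
                out.writeUTF(optionalFields[index]);
            }
            out.flush();
            return baos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String[] read(byte[] data, int fieldCount) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            String[] fields = new String[fieldCount]; // absent fields stay null
            byte[] indexes = new byte[in.read()];
            in.readFully(indexes);
            for (byte index : indexes) {
                fields[index] = in.readUTF();
            }
            return fields;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(presentIndexes(new String[] {"boss", "555"}))); // [0, 1]
        System.out.println(Arrays.toString(presentIndexes(new String[] {null, "555"})));   // [1]
        System.out.println(write(new String[] {null, null}).length);                       // 1
    }
}
```

When no optional field is set, the only cost is the single count byte, which is where the size saving comes from.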

For the serialization and deserialization processes we used the following four functions. These functions come in two flavors. The first pair is suitable for serializing and deserializing Externalizable object instances, whereas the second pair is suitable for serializing and deserializing Serializable object instances.

// Imports required by the four helper methods below
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Returns the class name in res[0] and the externalized instance data in res[1]
public static byte[][] serializeObject(Externalizable object) throws Exception {
    ByteArrayOutputStream baos = null;
    ObjectOutputStream oos = null;
    byte[][] res = new byte[2][];

    try {
        baos = new ByteArrayOutputStream();
        oos = new ObjectOutputStream(baos);

        object.writeExternal(oos);
        oos.flush();

        res[0] = object.getClass().getName().getBytes();
        res[1] = baos.toByteArray();

    } catch (Exception ex) {
        throw ex;
    } finally {
        try {
            if (oos != null)
                oos.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    return res;
}

// Instantiates the class named in rowObject[0] and restores its state from rowObject[1]
public static Externalizable deserializeObject(byte[][] rowObject) throws Exception {
    ObjectInputStream ois = null;
    String objectClassName = null;
    Externalizable res = null;

    try {

        objectClassName = new String(rowObject[0]);
        byte[] objectBytes = rowObject[1];

        ois = new ObjectInputStream(new ByteArrayInputStream(objectBytes));

        Class<?> objectClass = Class.forName(objectClassName);
        res = (Externalizable) objectClass.getDeclaredConstructor().newInstance();
        res.readExternal(ois);

    } catch (Exception ex) {
        throw ex;
    } finally {
        try {
            if (ois != null)
                ois.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    return res;
}

// Default serialization of a Serializable instance
public static byte[] serializeObject(Serializable object) throws Exception {
    ByteArrayOutputStream baos = null;
    ObjectOutputStream oos = null;
    byte[] res = null;

    try {
        baos = new ByteArrayOutputStream();
        oos = new ObjectOutputStream(baos);

        oos.writeObject(object);
        oos.flush();

        res = baos.toByteArray();

    } catch (Exception ex) {
        throw ex;
    } finally {
        try {
            if (oos != null)
                oos.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    return res;
}

// Default deserialization of a Serializable instance
public static Serializable deserializeObject(byte[] rowObject) throws Exception {
    ObjectInputStream ois = null;
    Serializable res = null;

    try {

        ois = new ObjectInputStream(new ByteArrayInputStream(rowObject));
        res = (Serializable) ois.readObject();

    } catch (Exception ex) {
        throw ex;
    } finally {
        try {
            if (ois != null)
                ois.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    return res;
}

Below we present a performance comparison chart between the two aforementioned approaches.

The horizontal axis represents the number of test runs and the vertical axis the average transactions per second (TPS) for each test run, so higher values are better. As you can see, the Externalizable approach achieves higher throughput when serializing and deserializing than the plain Serializable approach.

Lastly we must point out that we performed our tests providing values for all non mandatory fields of the “Employee” object. You should expect even higher performance gains if you do not use all the non mandatory parameters in your tests, both when comparing runs of the same approach and, more importantly, when cross-comparing the Externalizable and Serializable approaches.
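For readers who want to run a comparison of their own, the following is a rough sketch of a TPS measurement, not the harness used for the chart above. The `TpsSketch` class, its two sample types and the iteration count are all assumptions made for illustration, and both flavors go through `writeObject`/`readObject` here. A serious benchmark would also need warm-up runs and repeated measurements.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical measurement sketch; names and values are illustrative only.
class TpsSketch {

    static class SerSample implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "John";
        long hired = 1_000_000L;
    }

    public static class ExtSample implements Externalizable {
        private static final long serialVersionUID = 1L;
        String name = "John";
        long hired = 1_000_000L;

        public ExtSample() { } // required public no-argument constructor

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(name);
            out.writeLong(hired);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            name = in.readUTF();
            hired = in.readLong();
        }
    }

    // One "transaction": serialize the object to bytes and deserialize it again.
    static Object roundTrip(Object object) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
                oos.writeObject(object);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(baos.toByteArray()))) {
                return ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Runs the task the given number of times and returns transactions per second.
    static double transactionsPerSecond(Runnable task, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        return iterations * 1_000_000_000.0 / elapsedNanos;
    }

    public static void main(String[] args) {
        int iterations = 20_000; // arbitrary choice for this sketch
        System.out.println("Serializable   TPS: "
                + transactionsPerSecond(() -> roundTrip(new SerSample()), iterations));
        System.out.println("Externalizable TPS: "
                + transactionsPerSecond(() -> roundTrip(new ExtSample()), iterations));
    }
}
```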

Happy coding!

Justin
