Java Serialization

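I recently needed Spark's serialization. Because sc.parallelize, transformations, and actions are all distributed operations, values have to be copied between the driver and the tasks running on the workers, so I ran a quick test of serialization performance. For a Java object to travel over the network, it must first be serialized into a byte array.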

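By default Spark uses Java's built-in serialization, based on ObjectInputStream and ObjectOutputStream (chosen mainly for convenience and generality). With this default, any custom element type stored in an RDD must implement the Serializable interface. A class can also implement Externalizable to control its own encoding, which can be more efficient. The default ObjectInputStream/ObjectOutputStream mechanism tends to produce large serialized data, which costs memory, disk space, and network bandwidth, and both serialization and deserialization are fairly CPU-intensive.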
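To make "serialized into a byte array" concrete, here is a minimal sketch of a round trip through the JDK default mechanism, entirely in memory. The class names RoundTripDemo and Point are my own placeholders, not part of the test that follows.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class RoundTripDemo {
    // Hypothetical element type; any class implementing Serializable would do.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Serialize an object graph into a byte array with ObjectOutputStream.
    static byte[] toBytes(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuild the object graph from the byte array with ObjectInputStream.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(new Point(3, 4));
        Point copy = (Point) fromBytes(bytes);
        System.out.println(bytes.length + " bytes, x=" + copy.x + ", y=" + copy.y);
    }
}
```

The byte array is what would actually be sent between processes; the receiving side needs the same class on its classpath to rebuild the object.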

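This test covers a few commonly used serialization options:

1. Java default serialization: implement the Serializable interface

2. Java custom serialization: implement the Externalizable interface

3. Java serialization framework: Kryo

4. Java serialization framework: FST

There are others, such as fastjson and Jackson, which are not covered here.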

 

1. First, the default serialization test, writing 500,000 User objects

 

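import java.io.Serializable;
import java.util.Date;

// Plain data class serialized with the JDK default mechanism.
public class DefaultUser implements Serializable {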
    private String username;
    private String password;
    private int age;
    private Date birth;

    public DefaultUser(String username, String password, int age, Date birth) {
        this.username = username;
        this.password = password;
        this.age = age;
        this.birth = birth;
    }
    public DefaultUser() {
    }
}

 

 

 

 

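import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

public class JDfaultSerialize {
    public static void main(String[] args) throws IOException, ClassNotFoundException {
        ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("E:\\data1.dat"));

        // Note: i <= 500000 creates 500,001 objects.
        ArrayList<Object> stu = new ArrayList<>();
        for (int i = 0; i <= 500000; i++) {
            stu.add(new DefaultUser("gg", "123", 11, new Date()));
        }

        long start = System.currentTimeMillis();
        out.writeObject(stu);
        out.flush();
        out.close();
// ...
        // Open the input only after the file has been completely written.
        ObjectInputStream in = new ObjectInputStream(new FileInputStream("E:\\data1.dat"));
        List<Object> someObject = (List<Object>) in.readObject();
        System.out.println(someObject); // printing the list is part of the measured time
        in.close();
        System.out.println(System.currentTimeMillis() - start);
    }
}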

Time: 24503 ms      File size: 18,500,243K
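The number above is one combined figure for writing, reading, and printing. As a rough sketch of how the write and read phases could be timed separately and without disk I/O, the following serializes to a byte array in memory. InMemoryTiming, User, and measure are hypothetical names for illustration, and this is not how the figures in this post were produced.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

class InMemoryTiming {
    // Hypothetical stand-in for DefaultUser with the same four fields.
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        String username;
        String password;
        int age;
        Date birth;

        User(String username, String password, int age, Date birth) {
            this.username = username;
            this.password = password;
            this.age = age;
            this.birth = birth;
        }
    }

    // Returns {writeMillis, readMillis, byteCount, restoredElementCount}.
    static long[] measure(int count) {
        List<User> users = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            users.add(new User("gg", "123", 11, new Date()));
        }
        try {
            long t0 = System.nanoTime();
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(users);
            }
            byte[] bytes = buffer.toByteArray();
            long t1 = System.nanoTime();

            List<?> restored;
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                restored = (List<?>) in.readObject();
            }
            long t2 = System.nanoTime();

            return new long[] {
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, bytes.length, restored.size()
            };
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        long[] r = measure(500_000);
        System.out.println("write ms=" + r[0] + ", read ms=" + r[1] + ", bytes=" + r[2]);
    }
}
```

A single run like this still includes JIT warm-up, so the numbers are only indicative; a proper comparison would repeat the measurement several times.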

 

2. Next, custom serialization by implementing the Externalizable interface

 

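import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.util.Date;

// Data class that writes and reads its own fields via Externalizable.
public class ExterUser implements Externalizable {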
    private String username;
    private String password;
    private int age;
    private Date birth;

    public ExterUser(String username, String password, int age, Date birth) {
        this.username = username;
        this.password = password;
        this.age = age;
        this.birth = birth;
    }

    public ExterUser() {
    }

    @Override
    public void writeExternal(ObjectOutput stream) throws IOException {
        stream.writeObject(this.username);
        stream.writeObject(this.password);
        stream.writeInt(this.age);
        stream.writeObject(this.birth);
    }

    @Override
    public void readExternal(ObjectInput stream) throws IOException, ClassNotFoundException {
        this.username = (String)stream.readObject();
        this.password = (String)stream.readObject();
        this.age = stream.readInt();
        this.birth = (Date)stream.readObject();
    }
}
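
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

public class ExSerialize {
    public static void main(String[] args) throws IOException, ClassNotFoundException {
        ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("E:\\data1.dat"));

        // Note: i <= 500000 creates 500,001 objects.
        ArrayList<Object> stu = new ArrayList<>();
        for (int i = 0; i <= 500000; i++) {
            stu.add(new ExterUser("gg", "123", 11, new Date()));
        }

        long start = System.currentTimeMillis();
        out.writeObject(stu);
        out.flush();
        out.close();
// ...
        // Open the input only after the file has been completely written.
        ObjectInputStream in = new ObjectInputStream(new FileInputStream("E:\\data1.dat"));
        List<Object> someObject = (List<Object>) in.readObject();
        System.out.println(someObject); // printing the list is part of the measured time
        in.close();
        System.out.println(System.currentTimeMillis() - start);
    }
}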

Time: 31459 ms      File size: 20,000,163K
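The writeExternal method above still sends each String and the Date through writeObject. A variation I did not measure is to write the fields as primitives with writeUTF and writeLong. The sketch below shows that idea under the assumption that no field is ever null; CompactExternalDemo and CompactUser are hypothetical names, and I make no claim about how it would change the numbers above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.util.Date;

class CompactExternalDemo {
    // Hypothetical variant of ExterUser; assumes all fields are non-null.
    public static class CompactUser implements Externalizable {
        String username;
        String password;
        int age;
        Date birth;

        public CompactUser() {
            // Externalizable requires a public no-arg constructor.
        }

        public CompactUser(String username, String password, int age, Date birth) {
            this.username = username;
            this.password = password;
            this.age = age;
            this.birth = birth;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(username);
            out.writeUTF(password);
            out.writeInt(age);
            out.writeLong(birth.getTime()); // store the Date as epoch milliseconds
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            // Fields must be read in exactly the order they were written.
            username = in.readUTF();
            password = in.readUTF();
            age = in.readInt();
            birth = new Date(in.readLong());
        }
    }

    // Serialize to memory and read back, returning the reconstructed copy.
    static CompactUser roundTrip(CompactUser user) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(user);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (CompactUser) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CompactUser copy = roundTrip(new CompactUser("gg", "123", 11, new Date()));
        System.out.println(copy.username + " " + copy.age + " " + copy.birth);
    }
}
```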

 

3. Kryo test (Kryo requires a default constructor; the class does not even need to implement Serializable)

 

import java.util.Date;

// Plain data class for the Kryo test; no Serializable needed.
public class KyroUser {
    private String username;
    private String password;
    private int age;
    private Date birth;

    public KyroUser(String username, String password, int age, Date birth) {
        this.username = username;
        this.password = password;
        this.age = age;
        this.birth = birth;
    }

    public KyroUser() {
    }
}
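
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

public class KyroSerialize {
    public static void main(String[] args) throws FileNotFoundException {
        // Note: Kryo 5 requires class registration by default; register the classes
        // used here or call kryo.setRegistrationRequired(false) on that version.
        Kryo kryo = new Kryo();
        Output output = new Output(new FileOutputStream("F://a.txt"));

        // Note: i <= 500000 creates 500,001 objects.
        ArrayList<Object> objects = new ArrayList<>();
        for (int i = 0; i <= 500000; i++) {
            objects.add(new KyroUser("gg", "123", 11, new Date()));
        }

        long start = System.currentTimeMillis();
        kryo.writeObject(output, objects);
        output.flush();
        output.close();
// ...
        // Open the input only after the file has been completely written.
        Input input = new Input(new FileInputStream("F://a.txt"));
        List<Object> someObject = kryo.readObject(input, ArrayList.class);
        System.out.println(someObject); // printing the list is part of the measured time
        input.close();
        System.out.println(System.currentTimeMillis() - start);
    }
}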

 

Time: 2864 ms      File size: 7,500,065K

4. Next, the FST serialization framework

 

import java.io.Serializable;
import java.util.Date;

// Plain data class for the FST test.
public class FTSUser implements Serializable {
    private String username;
    private String password;
    private int age;
    private Date birth;

    public FTSUser(String username, String password, int age, Date birth) {
        this.username = username;
        this.password = password;
        this.age = age;
        this.birth = birth;
    }

    public FTSUser() {
    }
}
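
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Date;

import org.nustaq.serialization.FSTConfiguration;

public class FTSSerialize {
    public static void main(String[] args) throws IOException, ClassNotFoundException {
        FileOutputStream fos = new FileOutputStream(new File("F://d.txt"));
        FSTConfiguration configuration = FSTConfiguration.createDefaultConfiguration();

        // Note: i <= 500000 creates 500,001 objects.
        ArrayList<Object> stu = new ArrayList<>();
        for (int i = 0; i <= 500000; i++) {
            stu.add(new FTSUser("gg", "123", 11, new Date()));
        }

        long start = System.currentTimeMillis();
        // Serialize to a byte array and back, entirely in memory.
        byte[] bytes = configuration.asByteArray(stu);
        Object someObject = configuration.asObject(bytes);
        System.out.println(someObject); // printing the list is part of the measured time
        System.out.println(System.currentTimeMillis() - start);

        // The file is written after timing stops, so disk I/O is not measured here.
        fos.write(bytes);
        fos.close();
    }
}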

 

 

 

Time: 2305 ms      Byte array size: 9,500,058K (timed in memory; the file is written after the timer stops)

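Summary of results (times in ms; sizes are as recorded, and given roughly 500,000 small objects the "K" figures are most likely byte counts)

Java default Serializable: time 24503, file size 18,500,243K

Java custom Externalizable: time 31459, file size 20,000,163K

Kryo: time 2864, file size 7,500,065K

FST: time 2305, byte array size 9,500,058K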

Java's default serialization performs very poorly, and it also has some security weaknesses.
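I did not go into which unsafe behaviors are meant here. One well-known concern is that readObject will instantiate whatever classes the byte stream names, so untrusted input can be risky. As a minimal sketch of one mitigation, assuming Java 9 or later, an ObjectInputFilter can restrict which classes may be deserialized. FilterDemo and the allow-list pattern are illustrative choices of mine.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Date;

class FilterDemo {
    // Serialize a value to a byte array with the JDK default mechanism.
    static byte[] toBytes(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    // Returns true if the payload deserializes under an allow-list that
    // accepts only classes in java.lang and rejects everything else.
    static boolean acceptedByFilter(byte[] bytes) {
        ObjectInputFilter filter = ObjectInputFilter.Config.createFilter("java.lang.*;!*");
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            in.setObjectInputFilter(filter);
            in.readObject();
            return true;
        } catch (InvalidClassException e) {
            return false; // the filter rejected a class in the stream
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Integer accepted: " + acceptedByFilter(toBytes(42)));
        System.out.println("Date accepted: " + acceptedByFilter(toBytes(new Date())));
    }
}
```

In a real application the allow-list would name the application's own data classes rather than just java.lang.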

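Kryo is very efficient. In this test it was roughly 8.6 times as fast as Java's native serialization (2864 ms versus 24503 ms), and it is also very economical with space: its output was only about 0.4 times the size of the default serialization's output.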
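For anyone who wants to check these ratios, the arithmetic from the recorded figures can be sketched as follows. ResultRatios is just a throwaway name; the inputs are the times and sizes reported in this post.

```java
import java.util.Locale;

class ResultRatios {
    // Plain ratio a / b, used for both speed-up and relative size.
    static double ratio(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        double defaultMs = 24503, kryoMs = 2864, fstMs = 2305;
        double defaultSize = 18_500_243, kryoSize = 7_500_065, fstSize = 9_500_058;

        // Speed-up: how many times faster than the default serialization.
        System.out.println(String.format(Locale.ROOT, "Kryo speed-up: %.2f", ratio(defaultMs, kryoMs)));
        System.out.println(String.format(Locale.ROOT, "FST speed-up: %.2f", ratio(defaultMs, fstMs)));

        // Relative size: output size as a fraction of the default output.
        System.out.println(String.format(Locale.ROOT, "Kryo size ratio: %.2f", ratio(kryoSize, defaultSize)));
        System.out.println(String.format(Locale.ROOT, "FST size ratio: %.2f", ratio(fstSize, defaultSize)));
    }
}
```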

FST also performed very well in this test, but because it is not yet widely used in production its stability is unclear, so I recommend Kryo.
