Implementing the Writable interface of Hadoop

As we saw in the previous posts, Hadoop makes heavy use of network transmission when executing its jobs. As Doug Cutting (the creator of Hadoop) explains in a post on the Lucene mailing list, java.io.Serializable is too heavy for Hadoop's needs, so a new interface was developed: Writable. Every object you emit from mappers to reducers, or as job output, has to implement this interface so that Hadoop can transmit the data between the nodes in the cluster.

Hadoop comes with several wrappers around primitive types and widely used classes in Java: 

Java primitive    Writable implementation
boolean           BooleanWritable
byte              ByteWritable
short             ShortWritable
int               IntWritable
                  VIntWritable
float             FloatWritable
long              LongWritable
                  VLongWritable
double            DoubleWritable



Java class        Writable implementation
String            Text
byte[]            BytesWritable
Object            ObjectWritable
null              NullWritable



Java collection   Writable implementation
array             ArrayWritable
                  ArrayPrimitiveWritable
                  TwoDArrayWritable
Map               MapWritable
SortedMap         SortedMapWritable
enum              EnumWritable


For example, if we need a mapper to emit a String, we need to use a Text object wrapped around the string we want to emit. 

The interface Writable defines two methods:

  • public void write(DataOutput dataOutput) throws IOException
  • public void readFields(DataInput dataInput) throws IOException

The first method, write(), writes the object's fields to the stream, while the second, readFields(), reads them back from the stream. The wrappers listed above simply send and receive their binary representation over a stream.
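To make the write()/readFields() contract concrete, here is a minimal sketch of a round trip through a byte array. So that it runs without Hadoop on the classpath, it declares its own SketchWritable interface that copies the two method signatures listed above; SketchWritable, SketchInt and WritableRoundTrip are hypothetical names used only for this illustration, not Hadoop classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in that copies the two method signatures of Writable,
// so the sketch compiles and runs without Hadoop.
interface SketchWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical int wrapper, similar in spirit to IntWritable.
class SketchInt implements SketchWritable {
    private int value;

    SketchInt() { }                                // empty instance, filled later by readFields()
    SketchInt(int value) { this.value = value; }

    int get() { return value; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(value);                       // writes exactly 4 bytes
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        value = in.readInt();                      // must mirror write() field by field
    }
}

class WritableRoundTrip {
    // Serializes the object into a byte array, as would happen before a transfer.
    static byte[] toBytes(SketchWritable w) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            w.write(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Fills an already constructed instance from the serialized bytes.
    static void fromBytes(SketchWritable target, byte[] bytes) {
        try {
            target.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(new SketchInt(42));
        SketchInt copy = new SketchInt();
        fromBytes(copy, bytes);
        System.out.println(bytes.length + " bytes, value " + copy.get());
    }
}
```

The point to notice is the symmetry: readFields() has to consume the fields in the same order and with the same types that write() produced them, since the stream carries no field names or type tags.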
Because Hadoop also sorts keys during the shuffle-and-sort phase, keys must implement Comparable as well, so Hadoop defines WritableComparable, an interface that extends both Writable and Comparable.
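As a rough illustration of why the comparison matters, the sketch below sorts a few key objects by their compareTo() method, which is the ordering a sort phase relies on. It does not use Hadoop: SketchWritableComparable is a hypothetical local interface that combines the two Writable methods with Comparable, and YearKey and KeySortDemo are made-up names for this example.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in: the two Writable methods plus Comparable.
interface SketchWritableComparable<T> extends Comparable<T> {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical key type holding a single int.
class YearKey implements SketchWritableComparable<YearKey> {
    private int year;

    YearKey() { }
    YearKey(int year) { this.year = year; }

    int get() { return year; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(year);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        year = in.readInt();
    }

    // Defines the order in which keys are sorted.
    @Override
    public int compareTo(YearKey other) {
        return Integer.compare(year, other.year);
    }

    // equals() and hashCode() are kept consistent with compareTo().
    @Override
    public boolean equals(Object o) {
        return o instanceof YearKey && ((YearKey) o).year == year;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(year);
    }
}

class KeySortDemo {
    // Wraps the ints in keys, sorts the keys, and unwraps them again.
    static List<Integer> sortedYears(int... years) {
        List<YearKey> keys = new ArrayList<>();
        for (int y : years) {
            keys.add(new YearKey(y));
        }
        Collections.sort(keys);                    // uses YearKey.compareTo()
        List<Integer> result = new ArrayList<>();
        for (YearKey k : keys) {
            result.add(k.get());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sortedYears(2013, 1999, 2007));
    }
}
```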
If we need to emit a custom object that has no default wrapper, we need to create a class that implements Writable, or WritableComparable if the object is used as a key. In the mean example from an earlier post, we used the SumCount class, which implements WritableComparable (the source code is available on GitHub):

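import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.WritableComparable;

public class SumCount implements WritableComparable<SumCount> {

    DoubleWritable sum;
    IntWritable count;

    // No-argument constructor: the fields are filled in later by readFields()
    public SumCount() {
        set(new DoubleWritable(0), new IntWritable(0));
    }

    public SumCount(Double sum, Integer count) {
        set(new DoubleWritable(sum), new IntWritable(count));
    }

    public void set(DoubleWritable sum, IntWritable count) {
        this.sum = sum;
        this.count = count;
    }

    public DoubleWritable getSum() {
        return sum;
    }

    public IntWritable getCount() {
        return count;
    }

    // Merges another partial result into this one
    public void addSumCount(SumCount sumCount) {
        set(new DoubleWritable(this.sum.get() + sumCount.getSum().get()),
            new IntWritable(this.count.get() + sumCount.getCount().get()));
    }

    // Delegates serialization to the two wrapped Writable fields, in a fixed order
    @Override
    public void write(DataOutput dataOutput) throws IOException {
        sum.write(dataOutput);
        count.write(dataOutput);
    }

    // Reads the fields back in the same order they were written
    @Override
    public void readFields(DataInput dataInput) throws IOException {
        sum.readFields(dataInput);
        count // (the listing breaks off here)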
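Since the listing above breaks off before the class ends, the following is a self-contained sketch of the same idea rather than a reconstruction of the original. It uses plain double and int fields instead of DoubleWritable and IntWritable so that it runs without Hadoop, and SumCountSketch is a hypothetical name. The compareTo() ordering (by sum, then by count) is an assumption made for illustration; the original ordering is not visible in the listing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical plain-Java counterpart of SumCount (no Hadoop types).
class SumCountSketch implements Comparable<SumCountSketch> {
    private double sum;
    private int count;

    SumCountSketch() { this(0.0, 0); }

    SumCountSketch(double sum, int count) {
        this.sum = sum;
        this.count = count;
    }

    double getSum() { return sum; }
    int getCount() { return count; }

    // Merges another partial (sum, count) pair into this one.
    void add(SumCountSketch other) {
        sum += other.sum;
        count += other.count;
    }

    // Same shape as Writable.write(): fields are written in a fixed order.
    void write(DataOutput out) throws IOException {
        out.writeDouble(sum);
        out.writeInt(count);
    }

    // Same shape as Writable.readFields(): fields are read in the same order.
    void readFields(DataInput in) throws IOException {
        sum = in.readDouble();
        count = in.readInt();
    }

    // ASSUMPTION: order by sum first, then by count.
    @Override
    public int compareTo(SumCountSketch other) {
        int bySum = Double.compare(sum, other.sum);
        return bySum != 0 ? bySum : Integer.compare(count, other.count);
    }

    // Serializes the object and reads it back into a fresh instance.
    static SumCountSketch roundTrip(SumCountSketch original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            SumCountSketch copy = new SumCountSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SumCountSketch partial = new SumCountSketch(10.0, 2);
        partial.add(new SumCountSketch(5.0, 1));
        SumCountSketch copy = roundTrip(partial);
        System.out.println("mean = " + copy.getSum() / copy.getCount());
    }
}
```

Keeping the sum and the count together, rather than emitting a mean directly, is what lets partial results be merged with add() before the final division.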