1.Writable
- The key interface for serialization. Many Hadoop data types implement it; common examples include FloatWritable, DoubleWritable, IntWritable, LongWritable, MapWritable, and Text.
- Any structured object defined in Hadoop should implement the Writable interface, so that the object can be serialized into a byte stream, and the byte stream can be deserialized back into the structured object.
- A type used as a key or value in Hadoop must implement this interface; a type used as a key must additionally implement WritableComparable.
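A minimal sketch of the write()/readFields() pattern, using only java.io so it runs without a Hadoop dependency (MyPoint and its fields are hypothetical stand-ins for a structured object):

```java
import java.io.*;

// Hypothetical structured type following the Writable pattern:
// write() serializes fields to a byte stream, readFields() restores them.
public class MyPoint {
    private int x;
    private long y;

    public MyPoint() {}                       // no-arg constructor, as Writable requires
    public MyPoint(int x, long y) { this.x = x; this.y = y; }

    public void write(DataOutput out) throws IOException {
        out.writeInt(x);
        out.writeLong(y);
    }

    public void readFields(DataInput in) throws IOException {
        x = in.readInt();
        y = in.readLong();
    }

    public static void main(String[] args) throws IOException {
        // Round trip: serialize to bytes, then deserialize into a fresh object.
        MyPoint p = new MyPoint(3, 7L);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        p.write(new DataOutputStream(buf));

        MyPoint q = new MyPoint();
        q.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(q.x + " " + q.y);  // prints "3 7"
    }
}
```

Note that the field order in readFields() must mirror the order in write(), since the byte stream carries no field names.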
2.WritableComparable
- Every key type in MapReduce must implement this interface.
- Since the type is serializable, it must implement the two serialization/deserialization methods readFields() and write(); since it is also comparable, it must implement compareTo(), which defines the comparison and sort order. A MapReduce key is therefore both serializable and comparable/sortable.
public interface WritableComparable<T> extends Writable, Comparable<T>
A Writable which is also Comparable.
WritableComparables can be compared to each other, typically via Comparators. Any type which is to be used as a key in the Hadoop Map-Reduce framework should implement this interface.
Example:
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

public class MyWritableComparable implements WritableComparable<MyWritableComparable> {
// Some data
private int counter;
private long timestamp;
public void write(DataOutput out) throws IOException {
out.writeInt(counter);
out.writeLong(timestamp);
}
public void readFields(DataInput in) throws IOException {
counter = in.readInt();
timestamp = in.readLong();
}
public int compareTo(MyWritableComparable w) {
int thisValue = this.counter;
int thatValue = w.counter;
return (thisValue < thatValue ? -1 : (thisValue==thatValue ? 0 : 1));
}
}
Ps:
Suppose my1 and my2 are both MyWritableComparable objects. When comparing:
my1.compareTo(my2)
from the line:
return (thisValue < thatValue ? -1 : (thisValue==thatValue ? 0 : 1));
we can see that:
when my1 is less than my2, it returns -1;
when my1 equals my2, it returns 0;
when my1 is greater than my2, it returns 1.
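The three cases above can be checked with a plain-Java sketch that reuses the same compareTo() expression via java.lang.Comparable, so it runs without a Hadoop dependency (Rec is a hypothetical stand-in for MyWritableComparable):

```java
// Stand-in class using the same compareTo logic as the example above.
public class Rec implements Comparable<Rec> {
    private final int counter;

    public Rec(int counter) { this.counter = counter; }

    @Override
    public int compareTo(Rec w) {
        int thisValue = this.counter;
        int thatValue = w.counter;
        return (thisValue < thatValue ? -1 : (thisValue == thatValue ? 0 : 1));
    }

    public static void main(String[] args) {
        System.out.println(new Rec(1).compareTo(new Rec(2)));  // -1: my1 < my2
        System.out.println(new Rec(2).compareTo(new Rec(2)));  //  0: my1 == my2
        System.out.println(new Rec(3).compareTo(new Rec(2)));  //  1: my1 > my2
    }
}
```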