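I keep getting the error java.lang.NoSuchMethodException: org.apache.hadoop.io.ArrayWritable.<init>() even though I have created a TextArrayWritable subclass
I have a dataset in which each entry has the form ((a,b,c),(d,e,f,g)). The entries are tuples in PySpark, and they are saved to a sequence file like this: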
output.saveAsSequenceFile(
    path=os.path.join(output_path, 'date=%s' % date),
    compressionCodecClass='org.apache.hadoop.io.compress.SnappyCodec'
)
Now I want to load them in Java:
JavaPairRDD<TextArrayWritable, TextArrayWritable> distFile = sc.sequenceFile(s3inputPath.toString(), TextArrayWritable.class, TextArrayWritable.class);
where TextArrayWritable extends ArrayWritable, as shown below:
public static class TextArrayWritable extends ArrayWritable {
    // No-argument constructor, so the class can be instantiated reflectively.
    public TextArrayWritable() {
        super(Text.class);
    }

    // Convenience constructor: wraps each String in a Text.
    public TextArrayWritable(String[] strings) {
        super(Text.class);
        Text[] texts = new Text[strings.length];
        for (int i = 0; i < strings.length; i++) {
            texts[i] = new Text(strings[i]);
        }
        set(texts);
    }
}
Unfortunately, I still get the error java.lang.NoSuchMethodException: org.apache.hadoop.io.ArrayWritable.<init>()
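For context, here is a minimal sketch of how I understand this kind of exception to arise. `<init>()` is how the JVM names a no-argument constructor, and as far as I can tell ArrayWritable itself only declares constructors that take arguments, so creating it reflectively without arguments fails even though my subclass can be created. The classes `BaseArray` and `TextArray` below are hypothetical stand-ins (not Hadoop classes), and `tryNewInstance` is just an illustrative helper. My guess, which I have not confirmed, is that the reader instantiates the class recorded in the file (ArrayWritable) rather than the class I pass to sequenceFile.

```java
import java.lang.reflect.Constructor;

public class NoArgConstructorDemo {

    // Hypothetical stand-in for ArrayWritable: it only has a one-argument constructor.
    public static class BaseArray {
        private final Class<?> valueClass;

        public BaseArray(Class<?> valueClass) {
            this.valueClass = valueClass;
        }

        public Class<?> getValueClass() {
            return valueClass;
        }
    }

    // Hypothetical stand-in for TextArrayWritable: it adds a no-argument constructor.
    public static class TextArray extends BaseArray {
        public TextArray() {
            super(String.class);
        }
    }

    // Tries to create an instance through the no-argument constructor.
    // Returns null on success, or the exception message if no such constructor exists.
    public static String tryNewInstance(Class<?> type) {
        try {
            Constructor<?> ctor = type.getDeclaredConstructor();
            ctor.newInstance();
            return null;
        } catch (NoSuchMethodException e) {
            return e.getMessage();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The subclass has a no-argument constructor, so this succeeds.
        System.out.println("TextArray: " + tryNewInstance(TextArray.class));
        // The base class does not, so the lookup throws NoSuchMethodException.
        System.out.println("BaseArray: " + tryNewInstance(BaseArray.class));
    }
}
```

If that guess is right, the constructors on TextArrayWritable would never be consulted, because the lookup is done on the base class.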
Can anyone help me solve this?
Thanks!