java readfully_MapReduce Error: java.io.EOFException at java.io.DataInputStream.readFully(DataInputS...

认知能力训练

Article link: https://blog.csdn.net/weixin_33372672/article/details/115072608

13/07/23 22:53:05 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=

13/07/23 22:53:05 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.

13/07/23 22:53:05 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).

13/07/23 22:53:05 INFO input.FileInputFormat: Total input paths to process : 44

13/07/23 22:53:10 INFO mapred.JobClient: Running job: job_local_0001

13/07/23 22:53:10 INFO input.FileInputFormat: Total input paths to process : 44

13/07/23 22:53:10 INFO mapred.MapTask: io.sort.mb = 100

13/07/23 22:53:14 INFO mapred.JobClient: map 0% reduce 0%

13/07/23 22:53:14 INFO mapred.MapTask: data buffer = 79691776/99614720

13/07/23 22:53:14 INFO mapred.MapTask: record buffer = 262144/327680

13/07/23 22:53:14 INFO mapred.MapTask: Starting flush of map output

13/07/23 22:53:14 WARN mapred.LocalJobRunner: job_local_0001

java.io.EOFException
	at java.io.DataInputStream.readFully(DataInputStream.java:197)
	at org.apache.hadoop.io.Text.readFields(Text.java:265)
	at CoOccurrence$TextPair.readFields(CoOccurrence.java:74)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
	at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:113)
	at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
	at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1222)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1265)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1129)
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:549)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:623)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)

13/07/23 22:53:15 INFO mapred.JobClient: Job complete: job_local_0001

13/07/23 22:53:15 INFO mapred.JobClient: Counters: 0

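1. The program compiles without any warnings.

2. One Stack Overflow answer blamed a while loop, so I commented out the while loop in my program, but the same error still occurred.

3. Others on Stack Overflow reported that the error shows up on some runs and not on others.

There are many other suggestions online, but none of them solved the problem.

The cause I finally tracked down: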

@Override
public void readFields(DataInput in) throws IOException {
    first.readFields(in);
    second.readFields(in);
}

@Override
public void write(DataOutput out) throws IOException {
    first.write(out);
    out.write('\t'); // the cause of the error: readFields never consumes this byte
    second.write(out);
}

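The immediate cause is that DataInput's readFully method reached the end of the input and threw the exception.

TextPair's readFields method simply delegates to the readFields method of the Text class: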

public void readFields(DataInput in) throws IOException {
    int newLength = WritableUtils.readVInt(in);
    setCapacity(newLength, false);
    in.readFully(bytes, 0, newLength);
    length = newLength;
}

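This is the source of Text's readFields method. It involves two classes: Java's DataInput and Hadoop's WritableUtils. Roughly, when a Text is serialized, its byte length is written first as a variable-length integer (a single byte for short strings), followed by the bytes themselves. TextPair's write method writes the first field, then a tab character, then the second field, but readFields does nothing to consume that extra byte. So when the second field is read, the first byte it sees is not the field's length but the '\t' (0x09), which is decoded as a length of 9. If that bogus length is no longer than the data that remains, no exception is thrown (but the result is wrong); if it is longer, readFully runs out of input and throws EOFException.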
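The mechanism described above can be reproduced without Hadoop. The following is a minimal sketch, not the TextPair code: StrayByteDemo and its helper methods are hypothetical names, and a single length byte stands in for the variable-length integer that Text writes (the two coincide only for lengths below 128, which is assumed here).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class StrayByteDemo {

    // Writes one field as <length byte><UTF-8 bytes>.
    // Assumption: every field is shorter than 128 bytes.
    static void writeField(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeByte(bytes.length);
        out.write(bytes);
    }

    // Reads one field: the first byte is trusted to be the length.
    static String readField(DataInputStream in) throws IOException {
        int length = in.readByte();
        byte[] bytes = new byte[length];
        in.readFully(bytes, 0, length); // throws EOFException if too few bytes remain
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Serializes two fields, optionally with a stray tab byte between them.
    static byte[] serialize(String first, String second, boolean strayTab) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        writeField(out, first);
        if (strayTab) {
            out.write('\t'); // written, but never consumed by deserialize
        }
        writeField(out, second);
        out.flush();
        return buffer.toByteArray();
    }

    // Reads the two fields back, with no handling for a separator.
    static String[] deserialize(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        String first = readField(in);
        String second = readField(in);
        return new String[] { first, second };
    }

    public static void main(String[] args) throws IOException {
        String[] ok = deserialize(serialize("ab", "cd", false));
        System.out.println("without tab: " + ok[0] + " / " + ok[1]);

        try {
            deserialize(serialize("ab", "cd", true));
        } catch (EOFException e) {
            // The tab (0x09) was taken as a length of 9, but only 3 bytes remained.
            System.out.println("short second field: EOFException");
        }

        // With a long second field there are enough bytes, so nothing is thrown,
        // but the value is wrong: the real length byte is swallowed as data.
        String[] wrong = deserialize(serialize("ab", "0123456789", true));
        System.out.println("long second field decoded to " + wrong[1].length() + " chars");
    }
}
```

In this sketch a short second field produces the EOFException, while a longer one is silently truncated and corrupted, which matches the two outcomes described in the explanation.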

Removing the out.write('\t') call fixes the problem.

Many thanks to my cs402 classmates!
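The underlying rule is that write and readFields must mirror each other byte for byte. Removing the separator is the fix used here; if a separator were actually wanted in the serialized form, the other option would be to consume it on the read side. The sketch below illustrates that alternative with plain java.io only. SymmetricPair is a hypothetical class, not the original TextPair, and writeUTF/readUTF stand in for Text's own serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class SymmetricPair {
    String first = "";
    String second = "";

    SymmetricPair() {
    }

    SymmetricPair(String first, String second) {
        this.first = first;
        this.second = second;
    }

    // Writes: first field, one tab byte, second field.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(first);
        out.writeByte('\t');
        out.writeUTF(second);
    }

    // Reads the same three items in the same order, including the tab byte.
    public void readFields(DataInput in) throws IOException {
        first = in.readUTF();
        byte separator = in.readByte();
        if (separator != '\t') {
            throw new IOException("expected tab separator, got byte " + separator);
        }
        second = in.readUTF();
    }

    // Serializes a pair to memory and reads it back into a fresh instance.
    static SymmetricPair roundTrip(SymmetricPair original) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buffer));
        SymmetricPair copy = new SymmetricPair();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        return copy;
    }

    public static void main(String[] args) throws IOException {
        SymmetricPair copy = roundTrip(new SymmetricPair("apple", "pie"));
        System.out.println(copy.first + " / " + copy.second);
    }
}
```

A round-trip check like roundTrip is a cheap way to catch this class of bug in a unit test, before the job runs and fails inside the combiner.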
