Fixing the "Expecting a line not the end of stream" exception when deploying Nutch 1.2 in MyEclipse on Windows 7


豹先生_MR-BAO


Article link: https://blog.csdn.net/a221133/article/details/6912573


Running the Nutch 1.2 source from MyEclipse on Windows 7 fails with the following exception:

2011-10-28 00:09:37,784 WARN mapred.LocalJobRunner (LocalJobRunner.java:run(256)) - job_local_0001
java.io.IOException: Expecting a line not the end of stream
at org.apache.hadoop.fs.DF.parseExecResult(DF.java:109)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:179)
at org.apache.hadoop.util.Shell.run(Shell.java:134)
at org.apache.hadoop.fs.DF.getAvailable(DF.java:73)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:329)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1129)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:359)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
2011-10-28 00:09:38,174 INFO mapred.JobClient (JobClient.java:monitorAndPrintJob(1288)) - map 0% reduce 0%
Exception in thread "main" java.io.IOException: Job failed!
2011-10-28 00:09:38,174 INFO mapred.JobClient (JobClient.java:monitorAndPrintJob(1343)) - Job complete: job_local_0001
2011-10-28 00:09:38,174 INFO mapred.JobClient (Counters.java:log(514)) - Counters: 0
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
at org.apache.nutch.crawl.Injector.inject(Injector.java:217)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:124)

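I searched online and found plenty of suggestions. Some said to change the Cygwin locale, others said it was a permissions problem, but none of them worked when I tried them, so I had to trace the problem myself. The first step was to find the method that throws the exception:

The parseExecResult method in the DF class: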

protected void parseExecResult(BufferedReader lines) throws IOException {
    lines.readLine(); // skip headings

    String line = lines.readLine();
    if (line == null) {
        throw new IOException("Expecting a line not the end of stream");
    }
    StringTokenizer tokens =
        new StringTokenizer(line, " \t\n\r\f%");

    this.filesystem = tokens.nextToken();
    if (!tokens.hasMoreTokens()) { // for long filesystem name
        line = lines.readLine();
        if (line == null) {
            throw new IOException("Expecting a line not the end of stream"); // this is line 105
        }
        tokens = new StringTokenizer(line, " \t\n\r\f%");
    }
    this.capacity = Long.parseLong(tokens.nextToken()) * 1024;
    this.used = Long.parseLong(tokens.nextToken()) * 1024;
    this.available = Long.parseLong(tokens.nextToken()) * 1024;
    this.percentUsed = Integer.parseInt(tokens.nextToken());
    this.mount = tokens.nextToken();
}

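Printing the value read at line 103 (numbering as in the comment above) shows that a value is returned, but it is garbled and the line breaks are misplaced: the data of the second line ends up in the first line, which leads to the error at line 105.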
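To make the failure mode concrete, here is a minimal sketch that copies the line handling of parseExecResult into a standalone class and feeds it hard-coded text. The class name DfParseDemo and both sample df outputs are hypothetical and only meant for illustration; the real output on the affected machine was not recorded in this post.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.StringTokenizer;

// Hypothetical demo class; it mirrors the line handling of DF.parseExecResult.
public class DfParseDemo {

    /** Parses df-style text and returns the "available" column in bytes. */
    public static long parseAvailable(String dfOutput) throws IOException {
        BufferedReader lines = new BufferedReader(new StringReader(dfOutput));
        lines.readLine(); // skip headings

        String line = lines.readLine();
        if (line == null) {
            throw new IOException("Expecting a line not the end of stream");
        }
        StringTokenizer tokens = new StringTokenizer(line, " \t\n\r\f%");
        tokens.nextToken(); // filesystem
        if (!tokens.hasMoreTokens()) { // for long filesystem name
            line = lines.readLine();
            if (line == null) {
                throw new IOException("Expecting a line not the end of stream");
            }
            tokens = new StringTokenizer(line, " \t\n\r\f%");
        }
        tokens.nextToken(); // capacity
        tokens.nextToken(); // used
        return Long.parseLong(tokens.nextToken()) * 1024;
    }

    public static void main(String[] args) throws IOException {
        // Assumed well-formed output: a heading line followed by a data line.
        String wellFormed = "Filesystem 1K-blocks Used Available Use% Mounted on\n"
                + "C: 100 40 60 40% /cygdrive/c\n";
        System.out.println("available bytes: " + parseAvailable(wellFormed));

        // Assumed damaged output: the data has been pulled up into the heading line,
        // so nothing is left to read after the headings are skipped.
        String merged = "Filesystem 1K-blocks Used Available Use% Mounted on "
                + "C: 100 40 60 40% /cygdrive/c\n";
        try {
            parseAvailable(merged);
        } catch (IOException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only that the parser depends strictly on line boundaries: once the data is no longer on the line where it is expected, a later readLine() returns null and the same message is thrown.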

Following the blog post at http://hi.baidu.com/amdkings/blog/item/b589a5f56c1ddae17609d78f.html, I changed the MyEclipse encoding settings, but that did not help either, so I kept tracing backwards. The previous frame in the stack trace is

at org.apache.hadoop.util.Shell.runCommand(Shell.java:179)

That is, the runCommand method of the Shell class makes this call:

parseExecResult(inReader); // parse the output

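In the same method, the inReader variable is declared and initialized as follows:

BufferedReader inReader = new BufferedReader(new InputStreamReader(process.getInputStream()));

The cause is now clear: inReader is created without a charset, so the output of the command is decoded with the platform default encoding. Setting the charset explicitly looks like this:

BufferedReader inReader = new BufferedReader(new InputStreamReader(process.getInputStream(), "utf-8"));

After this change the program ran again and got past this point correctly.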
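The effect of the charset argument can be illustrated without Hadoop or Cygwin. The following is a minimal sketch under stated assumptions: the class name CharsetReadDemo is hypothetical, the heading text is made-up sample data, and ISO-8859-1 merely stands in for "some platform default that is not UTF-8" (on a Chinese Windows installation the default would more likely be GBK).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: reads the first line of a byte stream with a given charset.
public class CharsetReadDemo {

    public static String firstLine(InputStream in, Charset charset) throws IOException {
        // Same construction as in Shell.runCommand, but with an explicit charset.
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset));
        return reader.readLine();
    }

    public static void main(String[] args) throws IOException {
        // Sample heading with four Chinese characters (written as escapes so the
        // source file encoding does not matter), encoded as UTF-8 bytes.
        String heading = "\u6587\u4ef6\u7cfb\u7edf 1K-blocks";
        byte[] utf8Bytes = (heading + "\n").getBytes(StandardCharsets.UTF_8);

        String decodedAsUtf8 = firstLine(new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8);
        String decodedWrongly = firstLine(new ByteArrayInputStream(utf8Bytes), StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 round trip intact: " + decodedAsUtf8.equals(heading));
        System.out.println("wrong charset intact:    " + decodedWrongly.equals(heading));
    }
}
```

Only the reader that is told the bytes are UTF-8 recovers the original text. The sketch does not reproduce the merged lines seen in MyEclipse; it only shows why leaving the charset to the platform default is fragile when the command emits non-ASCII output.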

The analysis above leads to the following conclusions.

Temporary workaround:

Set the encoding on the inReader variable in Shell.java, that is:

BufferedReader inReader = new BufferedReader(new InputStreamReader(process.getInputStream(), "utf-8"));

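A better approach:

Setting an English locale in Cygwin with export LANG="en.UTF-8" does make df print its output in English, but the setting has no effect inside MyEclipse, where the output is still garbled Chinese. It is worth looking for a way, by download or some other means, to switch Cygwin to an English environment that MyEclipse also picks up. The Nutch 1.2 source would then run without any modification.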
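One possible direction, offered only as an untested sketch, is to set the locale for the child process itself rather than in the Cygwin shell profile. The example uses the standard ProcessBuilder.environment() map; the class name EnglishLocaleDf, the plain "df -k" command line, and the assumption that Cygwin's df honors LANG and LC_ALL when launched this way are not taken from the Hadoop source or verified on the setup described here.

```java
import java.util.Map;

// Hypothetical helper: prepares (but does not start) a df process with a C locale.
public class EnglishLocaleDf {

    public static ProcessBuilder dfWithCLocale(String path) {
        // Assumed command line; the real Hadoop DF class builds its own.
        ProcessBuilder builder = new ProcessBuilder("df", "-k", path);

        // Override the locale only for this child process.
        Map<String, String> env = builder.environment();
        env.put("LANG", "C");
        env.put("LC_ALL", "C");

        builder.redirectErrorStream(true); // merge stderr into stdout
        return builder;
    }

    public static void main(String[] args) {
        ProcessBuilder builder = dfWithCLocale(".");
        System.out.println(builder.command() + " with LANG=" + builder.environment().get("LANG"));
    }
}
```

The same variables could presumably also be set for the JVM that MyEclipse launches, so that every child process inherits them. Whether that is enough to make df print English headings in this environment would still have to be confirmed by experiment.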
