Exception in thread "main" java.io.FileNotFoundException: hdfs:\192.168.73.16:8020\user\9003547\text.txt (文件名、目录名或卷标语法不正确。)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:146)
at org.chinaskin.hadoop.FileCopy.main(FileCopy.java:29)
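This error appeared when connecting to Hadoop from MyEclipse on Windows. Look closely at the printed message: the argument passed in was hdfs://192.xxxx, but the exception shows hdfs:\192xxxxx. (The Chinese text in parentheses is the Windows error "The filename, directory name, or volume label syntax is incorrect.") Printing the two arguments showed nothing unusual, and the file does exist on HDFS.
The code is as follows: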
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
public class aa {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // args[1] is the destination; its URI selects the FileSystem implementation
        FileSystem fs = FileSystem.get(URI.create(args[1]), conf);
        // Problem line: FileInputStream only understands local paths,
        // so an hdfs:// value in args[0] fails here
        InputStream in = new BufferedInputStream(new FileInputStream(args[0]));
        OutputStream out = fs.create(new Path(args[1]));
        // Copy with a 4096-byte buffer; true closes both streams afterwards
        IOUtils.copyBytes(in, out, 4096, true);
    }
}
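The mangled path in the exception can be reproduced without Hadoop. Below is a minimal sketch that uses only the JDK; the class name `LocalPathDemo` and the helper `asLocalPath` are made up for illustration. It prints how `java.io.File`, which `FileInputStream` relies on internally, stores an hdfs:// string once it is treated as a local path.

```java
import java.io.File;

final class LocalPathDemo {
    // Returns the string as java.io.File stores it after
    // platform-specific path normalization.
    static String asLocalPath(String location) {
        return new File(location).getPath();
    }

    public static void main(String[] args) {
        String hdfsUri = "hdfs://192.168.73.16:8020/user/9003547/text.txt";
        System.out.println("argument  : " + hdfsUri);
        System.out.println("local path: " + asLocalPath(hdfsUri));
    }
}
```

On Windows the separators come back as backslashes, which matches the hdfs:\ seen in the stack trace; on other platforms the double slash is collapsed instead. Either way the string no longer looks like a URI.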
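Solution
Because the Hadoop cluster is being accessed from Windows, the access can be treated as remote, so consider reading the file through a URL. Reference: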
http://www.tuicool.com/articles/aeAVJ3
Modified code:
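import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.net.URL;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
public class aa {
    static {
        // Static block: register the hdfs:// scheme with java.net.URL
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
    }
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        URL url = new URL(args[0]);
        InputStream in = url.openStream(); // read the source through the URL
        FileSystem fs = FileSystem.get(URI.create(args[1]), conf);
        OutputStream out = fs.create(new Path(args[1]));
        IOUtils.copyBytes(in, out, 4096, true);
    }
}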
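One caveat with the static block: the JDK allows `URL.setURLStreamHandlerFactory` to be called only once per JVM, and a second call throws `java.lang.Error`. The sketch below shows that behavior using only the JDK. The class name `HandlerFactoryDemo`, the helper `tryInstall`, and the stand-in factory are assumptions for illustration; in the real program the factory would be `FsUrlStreamHandlerFactory`.

```java
import java.net.URL;
import java.net.URLStreamHandlerFactory;

final class HandlerFactoryDemo {
    // Tries to register a factory; returns false if the JVM already has one.
    static boolean tryInstall(URLStreamHandlerFactory factory) {
        try {
            URL.setURLStreamHandlerFactory(factory);
            return true;
        } catch (Error alreadyDefined) {
            // The JDK reports a repeated registration with java.lang.Error.
            // Catching Error this broadly is acceptable only in a sketch.
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in factory (assumption): returning null defers to the
        // JDK's built-in protocol handlers.
        URLStreamHandlerFactory standIn = protocol -> null;
        System.out.println("first call : " + tryInstall(standIn));
        System.out.println("second call: " + tryInstall(standIn));
    }
}
```

If some other library in the same JVM has already registered a factory, the URL approach is not available and reading through FileSystem is the safer route.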
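Another solution: the file-not-found error actually occurs because the file is opened as a byte stream without going through Hadoop's FileSystem. FileInputStream treats its argument as a local path, and on Windows that path is normalized with backslashes, which is why hdfs:// shows up as hdfs:\ in the exception. The offending line is: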
```java
InputStream in = new BufferedInputStream(new FileInputStream(args[0]));
```
Change it to: InputStream in = new BufferedInputStream(fs.open(new Path(args[0])));
To read a file on HDFS, use fs.open(new Path(args[0])). Here fs must be the FileSystem of the cluster that holds args[0].
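When the same program may receive either a local path or an hdfs:// location, the choice between FileInputStream and fs.open can be made from the URI scheme. The following is a minimal sketch under that assumption, using only the JDK; `SchemeCheck`, `isHdfs`, and the sample path `C:\data\text.txt` are hypothetical names chosen for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class SchemeCheck {
    // True only when the location parses as a URI whose scheme is "hdfs".
    static boolean isHdfs(String location) {
        try {
            return "hdfs".equalsIgnoreCase(new URI(location).getScheme());
        } catch (URISyntaxException e) {
            // Not a valid URI, e.g. a Windows path with backslashes:
            // treat it as a local file.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "hdfs://192.168.73.16:8020/user/9003547/text.txt",
            "C:\\data\\text.txt" // hypothetical local path
        };
        for (String s : samples) {
            System.out.println(s + " -> " + (isHdfs(s) ? "fs.open" : "FileInputStream"));
        }
    }
}
```

The check only parses the string and never touches the file system, so it is safe to run before deciding which stream to open.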