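Downloading a file through the Java API
// Download a file: obtain a FileSystem instance. FileSystem is an abstract class; for an hdfs:// URI the object returned is actually a DistributedFileSystem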
FileSystem fs = FileSystem.get(new URI("hdfs://itcast01:9000"),new Configuration());
//Returns the FileSystem for this URI's scheme and authority.
// Get an input stream for the file with open()
InputStream in = fs.open(new Path("/jdk1.7"));
OutputStream out = new FileOutputStream("E://jdk1.7");
// Copy the byte stream with a 4096-byte buffer; true closes both streams once the copy finishes
IOUtils.copyBytes(in,out,4096,true);
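The English comment above says the FileSystem is chosen from the URI's scheme and authority. As a small illustration using only the JDK (the class name UriParts is made up for this sketch and no Hadoop call is involved), this is how the URI used above splits into those two parts:

```java
import java.net.URI;

// Illustration only: shows which parts of the URI identify the file system.
// UriParts is a hypothetical helper, not part of Hadoop.
final class UriParts {
    // Returns "scheme|authority" for the given URI string.
    static String describe(String uri) {
        URI u = URI.create(uri);
        return u.getScheme() + "|" + u.getAuthority();
    }

    public static void main(String[] args) {
        // prints hdfs|itcast01:9000
        System.out.println(describe("hdfs://itcast01:9000"));
    }
}
```

The scheme (hdfs) selects the file system implementation, and the authority (itcast01:9000) names the NameNode host and port to connect to.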
Uploading a file
// A user must be specified: in this setup only root can write to HDFS by default (the permissions can be changed)
FileSystem fs = FileSystem.get(new URI("hdfs://itcast01:9000"),new Configuration(),"root");
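// Open the file on the local file system and get an input stream
InputStream in = new FileInputStream("E://alive.mp4");
// Create a file on HDFS and get its output stream
OutputStream out = fs.create(new Path("/alive"));
// Copy from the input stream to the output stream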
IOUtils.copyBytes(in, out, 4096, true);
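Both the download and the upload snippet end with IOUtils.copyBytes(in, out, 4096, true). To make the buffer size and the close flag concrete, here is a rough JDK-only sketch of what such a call does. It is not Hadoop's implementation; CopySketch and roundTrip are names invented for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// A simplified stand-in for IOUtils.copyBytes(in, out, buffSize, close).
// Assumption: this only mimics the observable behavior, not Hadoop's code.
final class CopySketch {
    static void copyBytes(InputStream in, OutputStream out, int buffSize, boolean close)
            throws IOException {
        try {
            byte[] buf = new byte[buffSize];
            int n;
            // Read up to buffSize bytes at a time until end of stream.
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            out.flush();
        } finally {
            if (close) {
                // Close both streams, even if closing the first one fails.
                try {
                    out.close();
                } finally {
                    in.close();
                }
            }
        }
    }

    // Demonstration helper: pushes a byte array through the loop above.
    static byte[] roundTrip(byte[] data, int buffSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copyBytes(new ByteArrayInputStream(data), out, buffSize, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new byte[10_000], 4096).length);
    }
}
```

Passing true as the last argument is what lets the snippets above skip explicit close() calls on in and out.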
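Other operations
With the same kind of code, why does the download throw a NullPointerException?
// Delete a file
boolean flag1 = fs.delete(new Path("/alive"), false); // pass true to delete recursively
// Create a directory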
boolean flag2 = fs.mkdirs(new Path("/home"));
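The second argument of delete() only matters for directories: with false, a non-empty directory is refused. The sketch below shows the same idea on the local disk with java.nio.file, as an analogy only. It does not touch HDFS, and LocalDeleteSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Local-disk analogy of delete(path, recursive); not an HDFS call.
final class LocalDeleteSketch {
    // Returns true if the path existed and was removed.
    static boolean delete(Path p, boolean recursive) throws IOException {
        if (!Files.exists(p)) {
            return false;
        }
        if (recursive && Files.isDirectory(p)) {
            List<Path> paths;
            try (Stream<Path> walk = Files.walk(p)) {
                // Reverse order puts children before their parent directory.
                paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            }
            for (Path q : paths) {
                Files.delete(q);
            }
            return true;
        }
        // Throws an IOException if p is a non-empty directory.
        Files.delete(p);
        return true;
    }

    // Tries a non-recursive delete on a non-empty directory, then a recursive one.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("delete-sketch");
            Files.write(dir.resolve("alive"), new byte[] {1});
            String first;
            try {
                delete(dir, false);
                first = "deleted";
            } catch (IOException e) {
                first = "refused";
            }
            boolean second = delete(dir, true);
            return first + "," + second + "," + Files.exists(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```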
What I want to highlight here are the two convenience methods for uploading and downloading
// Upload a file
fs.copyFromLocalFile(new Path("E://alive.mp4"), new Path("/al"));
// Download a file
fs.copyToLocalFile(new Path("/jdk1.7"), new Path("f://jkd"));
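After reading the source, I found that copyFromLocalFile delegates through the overloads below.
In both the upload and the download chain, delSrc controls whether the source file is deleted after the copy.
So the problem comes down to useRawLocalFileSystem.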
/**
* The src file is on the local disk. Add it to FS at
* the given dst name and the source is kept intact afterwards
* @param src path
* @param dst path
*/
public void copyFromLocalFile(Path src, Path dst)
throws IOException {
copyFromLocalFile(false, src, dst);
}
/**
* The src file is on the local disk. Add it to FS at
* the given dst name.
* delSrc indicates if the source should be removed
* @param delSrc whether to delete the src
* @param src path
* @param dst path
*/
public void copyFromLocalFile(boolean delSrc, Path src, Path dst)
throws IOException {
copyFromLocalFile(delSrc, true, src, dst);
}
/**
* The src file is on the local disk. Add it to FS at
* the given dst name.
* delSrc indicates if the source should be removed
* @param delSrc whether to delete the src
* @param overwrite whether to overwrite an existing file
* @param src path
* @param dst path
*/
public void copyFromLocalFile(boolean delSrc, boolean overwrite,
Path src, Path dst)
throws IOException {
Configuration conf = getConf();
FileUtil.copy(getLocal(conf), src, this, dst, delSrc, overwrite, conf);
}
copyToLocalFile, in turn, calls the following functions
/**
* The src file is under FS, and the dst is on the local disk.
* Copy it from FS control to the local dst name.
* @param src path
* @param dst path
*/
public void copyToLocalFile(Path src, Path dst) throws IOException {
copyToLocalFile(false, src, dst);
}
/**
* The src file is under FS, and the dst is on the local disk.
* Copy it from FS control to the local dst name.
* delSrc indicates if the src will be removed or not.
* @param delSrc whether to delete the src
* @param src path
* @param dst path
*/
public void copyToLocalFile(boolean delSrc, Path src, Path dst)
throws IOException {
copyToLocalFile(delSrc, src, dst, false);
}
/**
* The src file is under FS, and the dst is on the local disk. Copy it from FS
* control to the local dst name. delSrc indicates if the src will be removed
* or not. useRawLocalFileSystem indicates whether to use RawLocalFileSystem
* as local file system or not. RawLocalFileSystem is non crc file system.So,
* It will not create any crc files at local.
*
* @param delSrc whether to delete the src
* @param src path
* @param dst path
* @param useRawLocalFileSystem
* whether to use RawLocalFileSystem as local file system or not.
*/
public void copyToLocalFile(boolean delSrc, Path src, Path dst,
boolean useRawLocalFileSystem) throws IOException {
Configuration conf = getConf();
FileSystem local = null;
if (useRawLocalFileSystem) {
local = getLocal(conf).getRawFileSystem();
} else {
local = getLocal(conf);
}
FileUtil.copy(this, src, local, dst, delSrc, conf);
}
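The only difference is the extra boolean: for upload it asks whether to overwrite an existing file,
while for download it decides whether to use RawLocalFileSystem as the local file system, and it defaults to false.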
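To see which defaults the short two-argument calls end up with, the following stand-in mirrors only the delegation pattern of the overloads quoted above. It is a hypothetical class that records the flags as text instead of copying anything, and it uses String where the real methods take Path.

```java
// Hypothetical stand-in that mirrors the quoted overload chains.
// It returns the effective flags instead of performing a copy.
final class OverloadDefaults {
    static String copyFromLocalFile(String src, String dst) {
        return copyFromLocalFile(false, src, dst);          // delSrc defaults to false
    }

    static String copyFromLocalFile(boolean delSrc, String src, String dst) {
        return copyFromLocalFile(delSrc, true, src, dst);   // overwrite defaults to true
    }

    static String copyFromLocalFile(boolean delSrc, boolean overwrite,
                                    String src, String dst) {
        return "delSrc=" + delSrc + ", overwrite=" + overwrite;
    }

    static String copyToLocalFile(String src, String dst) {
        return copyToLocalFile(false, src, dst);            // delSrc defaults to false
    }

    static String copyToLocalFile(boolean delSrc, String src, String dst) {
        return copyToLocalFile(delSrc, src, dst, false);    // raw file system off by default
    }

    static String copyToLocalFile(boolean delSrc, String src, String dst,
                                  boolean useRawLocalFileSystem) {
        return "delSrc=" + delSrc + ", useRawLocalFileSystem=" + useRawLocalFileSystem;
    }

    public static void main(String[] args) {
        // prints delSrc=false, overwrite=true
        System.out.println(copyFromLocalFile("E://alive.mp4", "/al"));
        // prints delSrc=false, useRawLocalFileSystem=false
        System.out.println(copyToLocalFile("/jdk1.7", "f://jkd"));
    }
}
```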
If the download is written as follows, it runs without the exception
fs.copyToLocalFile(false,new Path("/jdk1.7"), new Path("f://jkd"),true);
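In the download source above, the if/else on useRawLocalFileSystem shows that when the flag is true, getLocal(conf).getRawFileSystem() is called.
The API documentation says this method returns a local file system,
and the documentation for the RawLocalFileSystem class describes it as a local file system, with some further details.
My first guess was that a download could target several machines rather than only the local one,
so that downloading to the local machine needs useRawLocalFileSystem set to true to get a local file system back, while the default of false throws the NullPointerException.
The Javadoc quoted above does not support that guess, though: both branches return a local file system, and the flag only chooses between the checksummed LocalFileSystem and the raw, non-CRC RawLocalFileSystem. The NullPointerException with the default therefore most likely comes from the checksummed path. The local paths here are on Windows (E: and f:), where a commonly reported cause is a Hadoop client without winutils.exe and HADOOP_HOME configured, although that is not confirmed for this setup.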
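The Javadoc quoted above says RawLocalFileSystem "will not create any crc files at local". To picture what that means, here is a JDK-only sketch of a write with and without a checksum sidecar file. It illustrates the idea only: CrcSidecarSketch is a made-up class, and the sidecar content (a single CRC32 value) is an assumption for the demo, not Hadoop's actual .crc format.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;
import java.util.zip.CRC32;

// Illustration of "checksummed" versus "raw" local writes; not Hadoop code.
final class CrcSidecarSketch {
    // Hidden sidecar name next to the target, e.g. ".jdk1.7.crc" for "jdk1.7".
    static Path sidecar(Path target) {
        return target.resolveSibling("." + target.getFileName() + ".crc");
    }

    // Writes data to target; if checksummed, also writes a CRC32 sidecar file.
    static void write(Path target, byte[] data, boolean checksummed) {
        try {
            Files.write(target, data);
            if (checksummed) {
                CRC32 crc = new CRC32();
                crc.update(data);
                byte[] sum = ByteBuffer.allocate(Long.BYTES).putLong(crc.getValue()).array();
                Files.write(sidecar(target), sum);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns how many entries a fresh directory holds after one write.
    static int filesAfterWrite(boolean checksummed) {
        try {
            Path dir = Files.createTempDirectory("crc-sketch");
            write(dir.resolve("jdk1.7"), "demo".getBytes(StandardCharsets.UTF_8), checksummed);
            try (Stream<Path> entries = Files.list(dir)) {
                return (int) entries.count();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("raw: " + filesAfterWrite(false));
        System.out.println("checksummed: " + filesAfterWrite(true));
    }
}
```

In this sketch the raw write leaves one file in the directory and the checksummed write leaves two, which matches the difference the Javadoc describes between the two local file systems.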