I need to move multiple files in HDFS whose names match a given pattern, using a Java / Scala program. For example, I have to move all files named *.xml from folder a to folder b.
From the shell I can do this with:
bin/hdfs dfs -mv a/*.xml b/
I can move a single file through the Java API (the code below is Scala), using the rename method of the FileSystem class:
// Prepare initial configuration
val conf = new Configuration()
conf.set("fs.defaultFS", "hdfs://hdfs:9000/user/root")
val fs = FileSystem.get(conf)
// Move a single file
val ok = fs.rename(new Path("a/file.xml"), new Path("b/file.xml"));
As far as I know, the Path class represents a URI, so I can't use a wildcard like this:
val ok = fs.rename(new Path("a/*.xml"), new Path("b/"));
Is there a way to move a set of files in HDFS via the Java / Scala API?
Solution
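You can use fs.rename(new Path("a"), new Path("b")), but that renames (moves) the directory a as a whole.
If you only want the files matching *.xml, expand the pattern first with FileSystem.globStatus (which also accepts a PathFilter such as GlobFilter) and then handle each match: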
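// arg0[0] is the base path (the two strings are concatenated as-is, so it should end with "/")
// arg0[1] is a glob pattern, not a full regular expression, e.g. NYSE_201[2-3]
FileSystem fs = FileSystem.get(URI.create(arg0[0]), conf);
Path path = new Path(arg0[0] + arg0[1]);
// Expand the glob into the statuses of all matching paths
FileStatus[] status = fs.globStatus(path);
Path[] paths = FileUtil.stat2Paths(status);
for (Path p : paths) {
    // p is one matching path; move it with fs.rename(p, <destination path>)
}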
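For the a/*.xml case from the question, the loop could be filled in roughly as follows. This is only a minimal sketch: the class name GlobMove and the method moveMatching are made up for illustration, the fs.defaultFS value is copied from the question, and creating the destination directory up front is an assumption about what you want rather than something rename requires you to do this way.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GlobMove {

    /**
     * Hypothetical helper: moves every path matching srcGlob into destDir
     * and returns how many renames succeeded.
     */
    public static int moveMatching(FileSystem fs, Path srcGlob, Path destDir)
            throws IOException {
        // Assumption: the destination directory should be created if missing.
        fs.mkdirs(destDir);

        // globStatus returns null when the pattern has no glob characters
        // and the path does not exist, so guard against that.
        FileStatus[] matches = fs.globStatus(srcGlob);
        if (matches == null) {
            return 0;
        }

        int moved = 0;
        for (FileStatus status : matches) {
            Path src = status.getPath();
            // Keep the file name, change only the parent directory.
            Path dst = new Path(destDir, src.getName());
            if (fs.rename(src, dst)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Value taken from the question; adjust to your cluster.
        conf.set("fs.defaultFS", "hdfs://hdfs:9000/user/root");
        FileSystem fs = FileSystem.get(conf);

        int moved = moveMatching(fs, new Path("a/*.xml"), new Path("b"));
        System.out.println("Moved " + moved + " file(s)");
    }
}
```

Each file is renamed separately, so the move as a whole is not atomic. rename reports failure through its boolean return value (for example when the target already exists on HDFS), so compare the returned count with the number of files you expected to move.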