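Hadoop 3.1.2: pitfalls when connecting to HDFS from Java
Hadoop 3: "DistributedFileSystem class not found" error
Running code written from a Hadoop 2 example fails with the following error: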
org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "hdfs"
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3281)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3301)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479)
at xin.tomdonkey.HadoopDemo.initHDFS(HadoopDemo.java:53)
at xin.tomdonkey.HadoopDemo.main(HadoopDemo.java:63)
Exception in thread "main" java.lang.NullPointerException
at xin.tomdonkey.HadoopDemo.main(HadoopDemo.java:63)
If the following code is added:
Configuration conf = new Configuration();
conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
the error changes to:
Exception in thread "main" java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hdfs.DistributedFileSystem not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2595)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3269)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3301)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479)
at xin.tomdonkey.HadoopDemo.initHDFS(HadoopDemo.java:53)
at xin.tomdonkey.HadoopDemo.main(HadoopDemo.java:63)
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hdfs.DistributedFileSystem not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2499)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2593)
... 8 more
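The first error clearly comes from the missing hdfs scheme: FileSystem has no implementation registered for hdfs. The second error shows that the implementation class itself is not on the classpath.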
This is because the class org.apache.hadoop.hdfs.DistributedFileSystem
is in the hadoop-hdfs jar in Hadoop 2,
but in Hadoop 3 it was moved to hadoop-hdfs-client (here hadoop-hdfs-client-3.1.2.jar).
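Both errors can be checked directly against the classpath before touching any Hadoop API. The following is a minimal sketch using only the JDK; the class name HdfsClasspathCheck and its helper methods are hypothetical names made up for this illustration. It assumes that Hadoop discovers FileSystem implementations through the standard service file META-INF/services/org.apache.hadoop.fs.FileSystem, which the hadoop-hdfs-client jar is expected to ship.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper; not part of Hadoop.
class HdfsClasspathCheck {
    // Class named in the ClassNotFoundException above.
    static final String DFS_CLASS = "org.apache.hadoop.hdfs.DistributedFileSystem";
    // Assumed location of the FileSystem service registrations.
    static final String SERVICE_FILE = "META-INF/services/org.apache.hadoop.fs.FileSystem";

    /** Returns true if the class can be found on the classpath (it is not initialized). */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, HdfsClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Lists every copy of the named resource that is visible on the classpath. */
    static List<URL> findResources(String resourceName) {
        try {
            ClassLoader loader = HdfsClasspathCheck.class.getClassLoader();
            return Collections.list(loader.getResources(resourceName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(DFS_CLASS + " loadable: " + isLoadable(DFS_CLASS));
        List<URL> serviceFiles = findResources(SERVICE_FILE);
        if (serviceFiles.isEmpty()) {
            System.out.println("No FileSystem service file found on the classpath");
        }
        for (URL url : serviceFiles) {
            System.out.println("FileSystem service file: " + url);
        }
    }
}
```

Run it with the same classpath as the failing program. If the class is reported as not loadable, the jar that contains it is missing, which matches the ClassNotFoundException above.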
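Keeping the original Maven dependencies unchanged does not work; adding the following dependency fixes it (hadoop-client pulls in hadoop-hdfs-client transitively):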
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>3.1.2</version>
</dependency>
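To confirm that the dependency change took effect, it can help to print which jar a class is actually loaded from. This is a small sketch using only the JDK; JarLocator and locate are hypothetical names chosen for this example, and the expectation that the class resolves to a hadoop-hdfs-client jar is an assumption based on the explanation above, not something the code enforces.

```java
import java.net.URL;
import java.security.CodeSource;
import java.util.Optional;

// Hypothetical helper; not part of Hadoop.
class JarLocator {
    /**
     * Returns the location (jar or directory) a class was loaded from.
     * Empty if the class is missing, or if it has no code source
     * (as is the case for JDK core classes).
     */
    static Optional<URL> locate(String className) {
        try {
            Class<?> cls = Class.forName(className, false, JarLocator.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            return source == null ? Optional.empty() : Optional.ofNullable(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Defaults to the class from the stack trace; another class name can be passed as an argument.
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.hdfs.DistributedFileSystem";
        String where = locate(name).map(URL::toString).orElse("not found on classpath");
        System.out.println(name + " -> " + where);
    }
}
```

Run with the project's runtime classpath: the printed location shows the jar that provides DistributedFileSystem, or "not found on classpath" if the dependency is still missing.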