In an environment running hadoop 2.5.2 + hbase 0.98, I ran into a problem while exporting data from HBase: the MapReduce-based tools that ship with HBase all failed with an error.
For example, these two tools:
./hbase org.apache.hadoop.hbase.mapreduce.Driver rowcounter bs2_file   # MapReduce job that counts the rows in table bs2_file
./hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot test_bs2_file_11_07 -copy-to file:///tmp/db-dump -mappers 16   # MapReduce job that exports the snapshot test_bs2_file_11_07 to the local directory /tmp/db-dump
The error reported is that the enum constant org.apache.hadoop.mapreduce.JobCounter.MB_MILLIS_MAPS cannot be found:
2016-11-07 11:35:55,619 ERROR [main] snapshot.ExportSnapshot: Snapshot export failed
java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.mapreduce.JobCounter.MB_MILLIS_MAPS
at java.lang.Enum.valueOf(Enum.java:238)
at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.valueOf(FrameworkCounterGroup.java:148)
at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.findCounter(FrameworkCounterGroup.java:182)
at org.apache.hadoop.mapreduce.counters.AbstractCounters.findCounter(AbstractCounters.java:154)
at org.apache.hadoop.mapreduce.TypeConverter.fromYarn(TypeConverter.java:240)
at org.apache.hadoop.mapred.ClientServiceDelegate.getJobCounters(ClientServiceDelegate.java:370)
at org.apache.hadoop.mapred.YARNRunner.getJobCounters(YARNRunner.java:511)
at org.apache.hadoop.mapreduce.Job$7.run(Job.java:756)
at org.apache.hadoop.mapreduce.Job$7.run(Job.java:753)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapreduce.Job.getCounters(Job.java:753)
at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1361)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1289)
at org.apache.hadoop.hbase.snapshot.ExportSnapshot.runCopyJob(ExportSnapshot.java:816)
at org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:995)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.snapshot.ExportSnapshot.innerMain(ExportSnapshot.java:1064)
at org.apache.hadoop.hbase.snapshot.ExportSnapshot.main(ExportSnapshot.java:1068)
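On inspection, this enum constant is defined in the source file below. It is present in hadoop 2.5.2 but missing from the older hadoop client that hbase loads: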
hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobCounter.java
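A quick way to confirm which jar lacks the constant is to search the compiled JobCounter class inside it. The sketch below is only an illustration: `has_constant` is a hypothetical helper name, and the jar path in the usage comment is an assumption about where the bundled client jar lives. It relies on the fact that enum constant names are stored as plain strings in a .class file.

```shell
# Succeeds (exit status 0) if the bytes read from stdin contain the given name.
# -a treats binary input as text, -F matches the name literally, -q prints nothing.
has_constant() {
  grep -aqF -- "$1"
}

# Hypothetical usage (the jar path is an assumption, adjust it to your install):
#   unzip -p ~/hbase/lib/hadoop-mapreduce-client-core-2.2.0.jar \
#       org/apache/hadoop/mapreduce/JobCounter.class \
#     | has_constant MB_MILLIS_MAPS && echo present || echo missing
```

Running the same check against the cluster's own hadoop-mapreduce-client-core jar shows whether the two sides disagree.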
The hadoop version on my cluster:
hadoop@hadoop-svr1:~/hadoop/bin$ ./hadoop version
Hadoop 2.5.2
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r cc72e9b000545b86b75a61f4835eb86d57bfafc0
Compiled by jenkins on 2014-11-14T23:45Z
Compiled with protoc 2.5.0
From source with checksum df7537a4faa4658983d397abf4514320
This command was run using /home/hadoop/hadoop/share/hadoop/common/hadoop-common-2.5.2.jar
The hbase version on my cluster:
hadoop@hadoop-svr1:~/hbase/bin$ ./hbase version
2016-11-07 17:12:15,997 INFO [main] util.VersionInfo: HBase 0.98.17-hadoop2
2016-11-07 17:12:15,997 INFO [main] util.VersionInfo: Source code repository git://aspire/home/apurtell/src/hbase revision=d5f8300c082a75ce8edbbe08b66f077e7d663a4a
2016-11-07 17:12:15,997 INFO [main] util.VersionInfo: Compiled by apurtell on Fri Jan 15 22:46:43 PST 2016
2016-11-07 17:12:15,997 INFO [main] util.VersionInfo: From source with checksum 6e40b5bc9a3782b583c36af66806049d
HBase 0.98 ships with hadoop client version 2.2.0 by default, which you can see by running ./hbase classpath.
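The classpath output is one long colon-separated line, so it helps to split it and keep only the hadoop jars. This is a minimal sketch; `list_hadoop_jars` is a hypothetical helper name, not part of HBase.

```shell
# Read a colon-separated classpath from stdin and print only the entries
# whose file name starts with "hadoop-" and ends with ".jar".
list_hadoop_jars() {
  tr ':' '\n' | grep -E '(^|/)hadoop-[^/]*\.jar$'
}

# Usage, from the hbase/bin directory:
#   ./hbase classpath | list_hadoop_jars
```

The version number in each printed jar name shows which hadoop client hbase will load. Classpath entries that are directories or wildcards are not listed by this filter.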
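The error is caused by a mismatch between the hadoop-client version used by hbase 0.98 and the hadoop version running on the cluster. There are a few possible fixes:
1. Upgrade hbase 0.98 to hbase 1.0+.
2. Make hbase use the hadoop-2.5.2 client jars. I chose this second option, because upgrading hbase is more work. The steps are:
1) Go to the hbase/lib directory and delete (or move away) every jar whose name starts with hadoop-.
2) Run ./hbase classpath to check that hbase's runtime classpath already includes the hadoop library paths. In my setup this presumably comes from the following setting in hbase-env.sh:
# Extra Java CLASSPATH elements. Optional.
export HBASE_CLASSPATH=/home/hadoop/hadoop/etc/hadoop
So in practice only step 1) is needed: delete or move away the hadoop-2.2.0 client jars in the hbase 0.98 lib directory.
After that, the MapReduce jobs were submitted successfully.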
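Step 1) can be sketched as a small shell function that moves the bundled jars into a backup directory instead of deleting them, so the change is easy to undo. The function name `move_hadoop_jars` and the directory paths in the usage comment are assumptions for illustration.

```shell
# Move every hadoop-*.jar from the directory given as $1 into the
# backup directory given as $2, printing the name of each file moved.
move_hadoop_jars() {
  lib_dir=$1
  backup_dir=$2
  mkdir -p "$backup_dir" || return 1
  for jar in "$lib_dir"/hadoop-*.jar; do
    [ -e "$jar" ] || continue          # the glob matched nothing
    mv -- "$jar" "$backup_dir"/ && echo "moved: ${jar##*/}"
  done
}

# Hypothetical usage (paths are assumptions based on the layout in this post):
#   move_hadoop_jars /home/hadoop/hbase/lib /home/hadoop/hbase/lib-hadoop-2.2.0-backup
```

Only files named hadoop-*.jar are touched, so jars such as hbase-client-0.98.17-hadoop2.jar stay in place. To roll back, move the files from the backup directory into hbase/lib again.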