Some people say the jars under HBase should be replaced with the ones from Hadoop. I thought that was plausible, but when I skimmed the official documentation earlier, it didn't seem to explicitly require this, so I had left it alone.
Going back through the docs, there is indeed such a statement. Excerpt:
Because HBase depends on Hadoop, it bundles an instance of the Hadoop jar under its lib directory. The bundled jar is ONLY for use in standalone mode. In distributed mode, it is critical that the version of Hadoop that is out on your cluster match what is under HBase. Replace the hadoop jar found in the HBase lib directory with the hadoop jar you are running on your cluster to avoid version mismatch issues. Make sure you replace the jar in HBase everywhere on your cluster. Hadoop version mismatch issues have various manifestations but often all looks like its hung up.
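Before touching anything, it's worth confirming the mismatch is real. A minimal check (standard commands; paths match this setup):
hadoop version                          # the version running on the cluster (2.6.0 here)
ls ~/hbase-1.1.1/lib/ | grep '^hadoop-' # the jars HBase bundles (2.5.1 here)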
I went through the jars under HBase/lib and located each one's counterpart in the Hadoop distribution (locations below are relative to share/hadoop/):
hadoop-annotations-2.5.1.jar common/lib/
hadoop-auth-2.5.1.jar common/lib/
hadoop-client-2.5.1.jar not in the Hadoop tarball; found via Google and downloaded from Maven (diffing against the 2.6.0 jar, only the version number in its configuration differs; the rest is identical)
hadoop-common-2.5.1.jar common/
hadoop-hdfs-2.5.1.jar hdfs/
hadoop-mapreduce-client-app-2.5.1.jar mapreduce/
hadoop-mapreduce-client-common-2.5.1.jar mapreduce/
hadoop-mapreduce-client-core-2.5.1.jar mapreduce/
hadoop-mapreduce-client-jobclient-2.5.1.jar mapreduce/
hadoop-mapreduce-client-shuffle-2.5.1.jar mapreduce/
hadoop-yarn-api-2.5.1.jar yarn/
hadoop-yarn-client-2.5.1.jar yarn/
hadoop-yarn-common-2.5.1.jar yarn/
hadoop-yarn-server-common-2.5.1.jar yarn/
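Before deleting anything, a quick sketch to confirm that every bundled 2.5.1 jar really has a 2.6.0 counterpart in the locations above (hadoop-client is the known exception and comes from Maven):
for j in ~/hbase-1.1.1/lib/hadoop-*-2.5.1.jar; do
  name=$(basename "$j" -2.5.1.jar)
  find ~/hadoop-2.6.0/share/hadoop -name "${name}-2.6.0.jar"
done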
[hadoop@bd110 lib]$ cd ~/hbase-1.1.1/lib/
[hadoop@bd110 lib]$ ls | wc -l
112
[hadoop@bd110 lib]$ ls | grep 2.5.1| wc -l
14
[hadoop@bd110 lib]$ rm -rf *2.5.1*
[hadoop@bd110 lib]$ ls | wc -l
98
Copy the corresponding 2.6.0 jars over to HBase:
[hadoop@bd110 hadoop]$ cd ~/hadoop-2.6.0/share/hadoop/
[hadoop@bd110 hadoop]$ ls
common hdfs httpfs kms mapreduce tools yarn zookeeper.out
[hadoop@bd110 hadoop]$ cd common/
[hadoop@bd110 common]$ cp hadoop-common-2.6.0.jar ~/hbase-1.1.1/lib/
[hadoop@bd110 common]$ cd lib/
[hadoop@bd110 lib]$ cp hadoop-annotations-2.6.0.jar hadoop-auth-2.6.0.jar ~/hbase-1.1.1/lib/
[hadoop@bd110 lib]$ cd ../../hdfs/
[hadoop@bd110 hdfs]$ cp hadoop-hdfs-2.6.0.jar ~/hbase-1.1.1/lib/
[hadoop@bd110 hdfs]$ cd ../mapreduce/
[hadoop@bd110 mapreduce]$ cp hadoop-mapreduce-client-app-2.6.0.jar hadoop-mapreduce-client-common-2.6.0.jar hadoop-mapreduce-client-core-2.6.0.jar hadoop-mapreduce-client-jobclient-2.6.0.jar hadoop-mapreduce-client-shuffle-2.6.0.jar ~/hbase-1.1.1/lib/
[hadoop@bd110 mapreduce]$ cd ../yarn/
[hadoop@bd110 yarn]$ cp hadoop-yarn-api-2.6.0.jar hadoop-yarn-client-2.6.0.jar hadoop-yarn-common-2.6.0.jar hadoop-yarn-server-common-2.6.0.jar ~/hbase-1.1.1/lib/
That leaves hadoop-client-2.6.0.jar, which again has to be downloaded from Maven.
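A hedged way to fetch it from Maven Central; the URL follows the standard repository layout for org.apache.hadoop:hadoop-client:2.6.0 and is my reconstruction, not taken from the original notes:
wget -P ~/hbase-1.1.1/lib/ https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-client/2.6.0/hadoop-client-2.6.0.jar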
Copy the jars to the other machines:
cd ~/hbase-1.1.1/lib/
scp *2.6.0* bd107:/home/hadoop/hbase-1.1.1/lib/
scp *2.6.0* bd108:/home/hadoop/hbase-1.1.1/lib/
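With more machines, the same distribution reads better as a loop (hostnames as above):
for h in bd107 bd108; do
  scp *2.6.0* "$h":/home/hadoop/hbase-1.1.1/lib/
done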
Testing in the hbase shell threw an error; after a restart and another try, a different error appeared.
I suspected that the historical data under the previously created hbase.rootdir had become unusable after the jar upgrade.
Changed it and tried again: still errors. Emptied ZooKeeper's hbase.zookeeper.property.dataDir directory and retried: still no good.
Finally I checked the hbase-hadoop-master-bd110.log file and found some exceptions:
2015-07-27 11:48:53,918 WARN [main-SendThread(bd110:2181)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
This one was odd, and I couldn't find a fix at the time.
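One thing that can at least be checked is whether anything is listening on that endpoint at all; ZooKeeper answers the built-in ruok four-letter command (bd110:2181 taken from the log above):
echo ruok | nc bd110 2181    # a healthy ZooKeeper replies "imok"; connection refused means nothing is bound
netstat -tln | grep 2181     # run on bd110 itself to see whether the port is open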
Another exception:
2015-07-27 13:25:11,972 ERROR [PriorityRpcServer.handler=5,queue=1,port=16000] master.MasterRpcServices: Region server bd108,16020,1437974687971 reported a fatal error:
ABORTING region server bd108,16020,1437974687971: Unhandled: Region server startup failed
Cause:
java.io.IOException: Region server startup failed
at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:2929)
at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1370)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:898)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: org/htrace/Trace
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:214)
at com.sun.proxy.$Proxy19.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy20.getFileInfo(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
at com.sun.proxy.$Proxy21.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1988)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1118)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)
at org.apache.hadoop.hbase.regionserver.HRegionServer.setupWALAndReplication(HRegionServer.java:1596)
at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1352)
... 2 more
Caused by: java.lang.ClassNotFoundException: org.htrace.Trace
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 27 more
A ClassNotFoundException, of all things. Searching for org.htrace.Trace showed that the class lives in share/hadoop/common/lib/htrace-core-3.0.4.jar, so I copied that jar into the HBase lib directory on every machine.
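For the record, a sketch of how to hunt a class down across the Hadoop jars and then push the hit out to each node (paths and hostnames as used throughout this post):
cd ~/hadoop-2.6.0/share/hadoop/common/lib/
for j in *.jar; do
  unzip -l "$j" | grep -q 'org/htrace/Trace.class' && echo "$j"
done
cp htrace-core-3.0.4.jar ~/hbase-1.1.1/lib/
scp htrace-core-3.0.4.jar bd107:/home/hadoop/hbase-1.1.1/lib/
scp htrace-core-3.0.4.jar bd108:/home/hadoop/hbase-1.1.1/lib/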
Restarted and tested: HBase was back to normal.
Switched hbase.rootdir back to the original directory, restarted, and tested again: also fine. The jar replacement problem was solved.
Flush with success, I immediately checked whether the "Broken pipe" problem had gone away too. I had hoped it would be fixed along the way, but the error was exactly the same as before.
Was the jar replacement still incomplete?
Word is that environments compiled from source are less prone to this kind of trouble; maybe it's time to rebuild from source and see.
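For reference, roughly what such a build might look like. This is a sketch from memory, not a verified recipe: it assumes the HBase 1.1.1 pom exposes a hadoop-two.version property, and ~/hbase-1.1.1-src is a hypothetical source checkout path.
cd ~/hbase-1.1.1-src    # hypothetical source checkout location
mvn clean install -DskipTests -Dhadoop-two.version=2.6.0
# if a binary tarball is needed, it should land under hbase-assembly/target/
mvn package assembly:single -DskipTests -Dhadoop-two.version=2.6.0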