搭建配置环境
导入所需要的三个配置文件
检查包导出的路径
hbase api简单实用
整合mapreduce
在使用bin/yarn jar jar包的时候,hadoop会自动加载所需依赖的jar包。但是在整合hbase时,不会自动导入hbase所需要的依赖jar包,在运行hbase整合mapreduce的时候,要手动添加。命令如下:
$ bin/hbase mapredcp -》这个指令代表与mapreduce集成的时候需要的jar包
2015-10-12 08:23:04,194 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
/opt/modules/hbase-0.98.6-hadoop2/lib/hbase-common-0.98.6-hadoop2.jar:/opt/modules/hbase-0.98.6-hadoop2/lib/protobuf-java-2.5.0.jar:/opt/modules/hbase-0.98.6-hadoop2/lib/hbase-client-0.98.6-hadoop2.jar:/opt/modules/hbase-0.98.6-hadoop2/lib/hbase-hadoop-compat-0.98.6-hadoop2.jar:/opt/modules/hbase-0.98.6-hadoop2/lib/hbase-server-0.98.6-hadoop2.jar:/opt/modules/hbase-0.98.6-hadoop2/lib/zookeeper-3.4.6.jar:/opt/modules/hbase-0.98.6-hadoop2/lib/hbase-protocol-0.98.6-hadoop2.jar:/opt/modules/hbase-0.98.6-hadoop2/lib/high-scale-lib-1.1.1.jar:/opt/modules/hbase-0.98.6-hadoop2/lib/guava-12.0.1.jar:/opt/modules/hbase-0.98.6-hadoop2/lib/htrace-core-2.04.jar:/opt/modules/hbase-0.98.6-hadoop2/lib/netty-3.6.6.Final.jar
可以发现执行这个命令,会把所有hbase所依赖的jar包路径找到
在运行mapreduce的时候,把hbase依赖的jar包路径添加进去即可运行
hbase自带的mapreduce程序
在hbase的lib目录下有个hbase-server-0.98.6-hadoop2.jar包,
里面自带了许多mapreduce程序,我们这里运行下。
$ /opt/cdh-5.3.6/hadoop-2.5.0-cdh5.3.6/bin/hadoop jar lib/hbase-server-0.98.6-hadoop2.jar
这里发现报错了是因为我没有导入hbase所依赖的环境,要自己设置下。
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/filter/Filter
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2570)
at java.lang.Class.getMethod0(Class.java:2813)
at java.lang.Class.getMethod(Class.java:1663)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.<init>(ProgramDriver.java:60)
at org.apache.hadoop.util.ProgramDriver.addClass(ProgramDriver.java:104)
at org.apache.hadoop.hbase.mapreduce.Driver.main(Driver.java:39)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.filter.Filter
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)`
导入环境变量
export HBASE_HOME=/opt/modules/hbase-0.98.6-hadoop2
export HADOOP_HOME=/opt/cdh-5.3.6/hadoop-2.5.0-cdh5.3.6
执行hbase lib包内的mapreduce程序
HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase mapredcp` $HADOOP_HOME/bin/yarn jar $HBASE_HOME/lib/hbase-server-0.98.6-hadoop2.jar
CellCounter: Count cells in HBase table
completebulkload: Complete a bulk data load.
copytable: Export a table from local cluster to peer cluster
export: Write table data to HDFS.
import: Import data written by Export.
importtsv: Import data in TSV format.
rowcounter: Count rows in HBase table
verifyrep: Compare the data from tables in two different clusters.
运行一个rowcount程序,统计czk表所含有的记录数
`HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase mapredcp` $HADOOP_HOME/bin/yarn jar $HBASE_HOME/lib/hbase-server-0.98.6-hadoop2.jar rowcounter czk`
结果如下,czk表记录数为1
org.apache.hadoop.hbase.mapreduce.RowCounter\\$RowCounterMapper\\$Counters
ROWS=1