(1) On the old cluster, run: ./hbase org.apache.hadoop.hbase.mapreduce.Export hbasetable hdfs://<new cluster ip>:8020/user/dirkzhang
The export can be restricted to a number of versions or a timestamp range; Export builds its Scan from the command-line arguments like this:
Scan s = new Scan();
// Optional arguments.
// Number of versions to export (third argument, default 1).
int versions = args.length > 2 ? Integer.parseInt(args[2]) : 1;
s.setMaxVersions(versions);
// Time range to export (fourth and fifth arguments).
long startTime = args.length > 3 ? Long.parseLong(args[3]) : 0L;
long endTime = args.length > 4 ? Long.parseLong(args[4]) : Long.MAX_VALUE;
s.setTimeRange(startTime, endTime);
The third argument is the number of versions.
The fourth argument is the start timestamp.
The fifth argument is the end timestamp.
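The argument handling above can be exercised without an HBase cluster. The sketch below (class name and sample argument values are hypothetical, not from the original) mirrors Export's parsing, and also illustrates that Scan.setTimeRange uses a half-open range: the start timestamp is inclusive and the end timestamp is exclusive.

```java
// Standalone sketch of Export's optional-argument parsing; no HBase dependency.
public class ExportArgsSketch {
    // Third argument: number of versions, default 1.
    static int parseVersions(String[] args) {
        return args.length > 2 ? Integer.parseInt(args[2]) : 1;
    }

    // Fourth/fifth arguments: start and end timestamps, defaulting to the full range.
    static long[] parseTimeRange(String[] args) {
        long start = args.length > 3 ? Long.parseLong(args[3]) : 0L;
        long end = args.length > 4 ? Long.parseLong(args[4]) : Long.MAX_VALUE;
        return new long[] { start, end };
    }

    // Scan.setTimeRange(start, end) keeps cells whose timestamp ts satisfies
    // start <= ts < end (start inclusive, end exclusive).
    static boolean inRange(long ts, long start, long end) {
        return ts >= start && ts < end;
    }

    public static void main(String[] args) {
        // Hypothetical invocation: Export hbasetable <outputdir> 3 1000 2000
        String[] argv = { "hbasetable", "hdfs://out", "3", "1000", "2000" };
        long[] range = parseTimeRange(argv);
        System.out.println(parseVersions(argv));               // 3
        System.out.println(inRange(1000, range[0], range[1])); // true: start is inclusive
        System.out.println(inRange(2000, range[0], range[1])); // false: end is exclusive
    }
}
```

A practical consequence of the half-open range: to export exactly one day, pass that day's midnight as the start and the next day's midnight as the end, with no off-by-one adjustment.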
(2) On the new cluster, run: ./hbase org.apache.hadoop.hbase.mapreduce.Import hbasetable hdfs://<new cluster ip>:8020/user/dirkzhang
To import via bulkload instead, have Import convert the exported files at hdfs://<new cluster ip>:8020/user/dirkzhang into HFiles under /user/dirkzhang/hfiledir:
bin/hbase org.apache.hadoop.hbase.mapreduce.Import -D mapreduce.job.queuename=rtb -D mapreduce.job.maxtaskfailures.per.tracker=100 -D import.bulk.output=/user/dirkzhang/hfiledir hbasetable hdfs://<new cluster ip>:8020/user/dirkzhang > report.log 2>&1 &
Then use completebulkload to load the generated HFiles into HBase:
hadoop jar /home/q/hadoop/hbase-0.98.0-hadoop2/lib/hbase-server-0.98.0-hadoop2.jar completebulkload -Dhbase.zookeeper.quorum=192.168.xx.xx -Dhbase.zookeeper.property.clientPort=2181 hdfs://192.168.xxx.xxx:8020/user/hadoop/wfdata/testoutput xxx_table