Versions:
cdh5.0.0+hadoop2.3.0+hbase0.96.1.1+Spoon5.0.1
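I. HBase data import
HBase data can be imported with org.apache.hadoop.hbase.mapreduce.ImportTsv in two ways: writing the rows directly into the table, or first converting the data to HFiles and then loading those files into the table.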
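As a rough sketch of how the two routes differ on the command line: the direct route lets the MapReduce job write to the table itself, while the HFile route adds the `-Dimporttsv.bulk.output` option and then hands the generated files to `LoadIncrementalHFiles`. The snippet below only prints the commands (a dry run). The table name `my_table`, the column spec, and the paths `/input/data.csv` and `/tmp/hfiles` are placeholders assumed for illustration, not values from this article.

```shell
# Dry run: print the commands for both ImportTsv routes without executing them.
# TABLE, COLS, INPUT and HFILE_DIR are placeholder values (assumptions).
TABLE="my_table"
COLS="HBASE_ROW_KEY,col:x1,col:y"
INPUT="/input/data.csv"
HFILE_DIR="/tmp/hfiles"
IMPORTTSV="org.apache.hadoop.hbase.mapreduce.ImportTsv"

# Route 1: the job writes rows straight into the table.
direct_import_cmd() {
  echo "hbase $IMPORTTSV -Dimporttsv.separator=, -Dimporttsv.columns=$COLS $TABLE $INPUT"
}

# Route 2: the job writes HFiles to HFILE_DIR, then a second step loads them.
hfile_import_cmds() {
  echo "hbase $IMPORTTSV -Dimporttsv.separator=, -Dimporttsv.columns=$COLS -Dimporttsv.bulk.output=$HFILE_DIR $TABLE $INPUT"
  echo "hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles $HFILE_DIR $TABLE"
}

direct_import_cmd
hfile_import_cmds
```

The only difference in the first step is the extra `-Dimporttsv.bulk.output` option; the second step of the HFile route is what actually makes the data visible in the table.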
1. The data on HDFS (partial listing):
[root@node33 data]# hadoop fs -ls /input
Found 1 items
-rwxrwxrwx 1 hdfs supergroup 13245467 2014-05-01 17:09 /input/hbase-data.csv
[root@node33 data]# hadoop fs -cat /input/* | head -n 3
1,1.52101,13.64,4.49,1.1,71.78,0.06,8.75,0,0,1
2,1.51761,13.89,3.6,1.36,72.73,0.48,7.83,0,0,1
3,1.51618,13.53,3.55,1.54,72.99,0.39,7.78,0,0,1
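Each row above has 11 comma-separated fields: a row id followed by ten values. Since ImportTsv maps fields to columns by position, it can help to check the field count before importing. The following is a minimal sketch; the helper name `count_bad_rows` is made up for this example.

```shell
# Count rows whose number of comma-separated fields differs from the expected count.
# $1 = expected field count; the CSV is read from stdin.
count_bad_rows() {
  awk -F',' -v n="$1" 'NF != n { bad++ } END { print bad + 0 }'
}

# The three sample rows shown above: all have 11 fields, so this prints 0.
printf '%s\n' \
  '1,1.52101,13.64,4.49,1.1,71.78,0.06,8.75,0,0,1' \
  '2,1.51761,13.89,3.6,1.36,72.73,0.48,7.83,0,0,1' \
  '3,1.51618,13.53,3.55,1.54,72.99,0.39,7.78,0,0,1' | count_bad_rows 11
```

Against the real file, the same check could be run as `hadoop fs -cat /input/hbase-data.csv | count_bad_rows 11`.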
2. Importing directly
a. Create the table hbase-employees-1: start hbase shell and, inside the shell, run: create 'hbase-employees-1','col'
b. Go to the HBase bin directory (with a default CDH installation this is usually /usr/lib/hbase/bin) and run:
./hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator="," -Dimporttsv.columns=HBASE_ROW_KEY,col:x1,col:x2,col:x3,col:x4,col:x5,col:x6,col:x7,col:x8,col:x9,col:y hbase-employees-1 hdfs://node33:8020/input/hbase-data.csv
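The value of `-Dimporttsv.columns` in the command above must list one entry per CSV field, in order: `HBASE_ROW_KEY` for the first field, then `col:x1` to `col:x9`, then `col:y`. As a small sketch, the list can be generated instead of typed by hand; the helper name `build_columns` is an assumption made for this example.

```shell
# Build the importtsv.columns value: the row key first, then x1..xN and y.
# $1 = column family, $2 = number of x columns
build_columns() {
  spec="HBASE_ROW_KEY"
  i=1
  while [ "$i" -le "$2" ]; do
    spec="${spec},$1:x${i}"
    i=$((i + 1))
  done
  printf '%s,%s:y\n' "$spec" "$1"
}

COLUMNS_SPEC="$(build_columns col 9)"
echo "$COLUMNS_SPEC"
# prints: HBASE_ROW_KEY,col:x1,col:x2,col:x3,col:x4,col:x5,col:x6,col:x7,col:x8,col:x9,col:y
```

The result can then be passed as `-Dimporttsv.columns="$COLUMNS_SPEC"`. The variable is deliberately not called `COLUMNS`, because interactive shells use that name for the terminal width.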
The log of the ImportTsv job is as follows:
2014-05-02 13:15:07,716 INFO [main] mapreduce.JobSubmitter: Submitting tokens for job: job_1398958404577_0018
2014-05-02 13:15:08,674 INFO [main] impl.YarnClientImpl: Submitted application application_1398958404577_0018
2014-05-02 13:15:09,101 INFO [main] mapreduce.Job: The url to track the job: http://node33:8088/proxy/application_1398958404577_0018/
2014-05-02 13:15:09,103 INFO [main] mapreduce.Job: Running job: job_1398958404577_0018
2014-05-02 13:15:34,169 INFO [main] mapreduce.Job: Job job_1398958404577_0018 running in uber mode : false
2014-05-02 13:15:34,207 INFO [main] mapreduce.Job: map 0% reduce 0%
2014-05-02 13:16:32,789 INFO [main] mapreduce.Job: map 1% reduce 0%
2014-05-02 13:16:53,477 INFO [main] mapreduce.Job: map 5% reduce 0%
2014-05-02 13:16:56,701 INFO [main] mapreduce.Job: map 9% reduce 0%
2014-05-02 13:16:59,928 INFO [main] mapreduce.Job: map 13% reduce 0%
2014-05-02 13:17:02,970 INFO [main] mapreduce.Job: map 16% reduce 0%
2014-05-02 13:17:07,260 INFO [main] mapreduce.Job: map 22% reduce 0%
2014-05-02 13:17:10,472 INFO [main] mapreduce.Job: map 29% reduce 0%
2014-05-02 13:17:12,879 INFO [main] mapreduce.Job: map 36% reduce 0%
2014-05-02 13:17:16,555 INFO [main] mapreduce.Job: map 45% reduce 0%
2014-05-02 13:17:43,452 INFO [main] mapreduce.Job: map 48% reduce 0%
2014-05-02 13:17:45,629 INFO [main] mapreduce.Job: map 63% reduce 0%
2014-05-02 13:17:52,845 INFO [main] mapreduce.Job: map 79% reduce 0%
2014-05-02 13