Data Transfer: From HBase to Hive

Latest recommended article published 2024-06-28 12:43:09

我的猪仔队友

Article link: https://blog.csdn.net/qq_34100655/article/details/81070216

This post walks through importing data from HBase into Hive. The procedure follows these references:
https://blog.csdn.net/wuxintdrh/article/details/78935597
https://blog.csdn.net/dominic_tiger/article/details/70237542

1. Enter HBase:

[root@name01-test ~]# su hdfs
[hdfs@name01-test root]$ hbase shell
Usage: hbase [<options>] <command> [<args>]
Options:
  --config DIR    Configuration direction to use. Default: ./conf
  --hosts HOSTS   Override the list in 'regionservers' file
  --auth-as-server Authenticate to ZooKeeper using servers configuration

Commands:
Some commands take arguments. Pass no args or -h for usage.
  shell           Run the HBase shell
  hbck            Run the hbase 'fsck' tool
  snapshot        Create a new snapshot of a table
  snapshotinfo    Tool for dumping snapshot information
  wal             Write-ahead-log analyzer
  hfile           Store file analyzer
  zkcli           Run the ZooKeeper shell
  upgrade         Upgrade hbase
  master          Run an HBase HMaster node
  regionserver    Run an HBase HRegionServer node
  zookeeper       Run a Zookeeper server
  rest            Run an HBase REST server
  thrift          Run the HBase Thrift server
  thrift2         Run the HBase Thrift2 server
  clean           Run the HBase clean up script
  classpath       Dump hbase CLASSPATH
  mapredcp        Dump CLASSPATH entries required by mapreduce
  pe              Run PerformanceEvaluation
  ltt             Run LoadTestTool
  version         Print the version
  CLASSNAME       Run the class named CLASSNAME
[hdfs@name01-test root]$

Only two commands are needed: su hdfs, then hbase shell. The usage listing above is the help text that the hbase launcher prints when it is run without a command.

2. Create the HBase table
Reference: https://www.cnblogs.com/tony-tang/p/6473393.html

create 'userinfo', 'info'


hbase(main):020:0> put 'userinfo', '1', 'info:age', '23'
0 row(s) in 0.0120 seconds

hbase(main):030:0> put 'userinfo', '2', 'info:name', 'wsx'
0 row(s) in 0.0150 seconds

hbase(main):031:0> put 'userinfo', '3', 'info:name', 'chengbao'
0 row(s) in 0.0100 seconds

hbase(main):032:0> scan 'userinfo'
ROW                                                   COLUMN+CELL                                                                                                                                                
 1                                                    column=info:age, timestamp=1531745740707, value=23                                                                                                         
 1                                                    column=info:name, timestamp=1531745798480, value=chb1                                                                                                      
 1                                                    column=info:sex, timestamp=1531745814210, value=male                                                                                                       
 2                                                    column=info:name, timestamp=1531745886703, value=wsx                                                                                                       
 3                                                    column=info:name, timestamp=1531745906320, value=chengbao                                                                                                  
3 row(s) in 0.0130 seconds

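This creates a new table in HBase with one column family, info. It is the table that will be brought from HBase into Hive. (The scan also lists info:name and info:sex for row 1; the put commands for those two cells are not shown above.)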

3. Create the Hive mapping table
Create a Hive table that is mapped onto the HBase table:

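-- key is the HBase row key; the other fields map to HBase column qualifiers.
-- STORED BY names the storage handler class, hbase.columns.mapping pairs the Hive
-- columns with HBase columns in order, and hbase.table.name is the HBase table being mapped.
CREATE EXTERNAL TABLE hbase_table_1(key STRING, name STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name")
TBLPROPERTIES ("hbase.table.name" = "userinfo");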
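
The Hive column list and the hbase.columns.mapping string are matched by position: the first Hive column pairs with :key, the second with info:name. The Python sketch below illustrates that pairing by assembling the same statement as text. The helper name build_mapping_ddl and its parameters are hypothetical, introduced only for illustration; nothing here contacts Hive or HBase.

```python
def build_mapping_ddl(hive_table, hbase_table, columns):
    """Assemble a CREATE EXTERNAL TABLE statement for an HBase-backed Hive table.

    columns: list of (hive_column, hive_type, hbase_column) tuples, where
    hbase_column is ":key" for the row key or "family:qualifier" otherwise.
    Hypothetical helper: it only builds text.
    """
    hive_cols = ", ".join(f"{name} {hive_type}" for name, hive_type, _ in columns)
    # Entries are joined with a bare comma: whitespace inside the mapping
    # string would be read as part of a column name.
    mapping = ",".join(hbase_col for _, _, hbase_col in columns)
    return (
        f"CREATE EXTERNAL TABLE {hive_table}({hive_cols})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f'WITH SERDEPROPERTIES ("hbase.columns.mapping" = "{mapping}")\n'
        f'TBLPROPERTIES ("hbase.table.name" = "{hbase_table}");'
    )


ddl = build_mapping_ddl(
    "hbase_table_1",
    "userinfo",
    [("key", "STRING", ":key"), ("name", "STRING", "info:name")],
)
print(ddl)
```

Adding another tuple, for example one that maps a Hive column to info:age, would extend both the column list and the mapping string in step, which is the property the statement depends on.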

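4. Check the values in the Hive table
Querying the Hive table now returns the row keys and info:name values stored in HBase. Only the mapped column appears; info:age and info:sex are not part of the Hive table.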

hive> select * from hbase_table_1; 
OK
1       chb1
2       wsx
3       chengbao
Time taken: 0.085 seconds, Fetched: 3 row(s)
hive>
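
To make the relationship between the scan output and the query result concrete, here is a small Python model of the projection, assuming the cells from the scan in step 2. The function project_rows is hypothetical and runs purely in memory. How Hive treats a row that lacks a mapped qualifier is deliberately not modeled; the sketch raises KeyError in that case.

```python
def project_rows(cells, mapping):
    """Project HBase cells onto Hive-style rows.

    cells: list of (rowkey, "family:qualifier", value) tuples, as listed by scan.
    mapping: the HBase column behind each Hive column, e.g. [":key", "info:name"].
    """
    by_row = {}
    for rowkey, column, value in cells:
        by_row.setdefault(rowkey, {})[column] = value
    rows = []
    # HBase keeps rows sorted by row key; sorted() matches that for these ASCII keys.
    for rowkey in sorted(by_row):
        row = tuple(
            rowkey if col == ":key" else by_row[rowkey][col] for col in mapping
        )
        rows.append(row)
    return rows


scan_output = [
    ("1", "info:age", "23"),
    ("1", "info:name", "chb1"),
    ("1", "info:sex", "male"),
    ("2", "info:name", "wsx"),
    ("3", "info:name", "chengbao"),
]
hive_rows = project_rows(scan_output, [":key", "info:name"])
print(hive_rows)
```

Each HBase row key becomes one row, and only the columns named in the mapping survive, which matches the three rows returned by the query above.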

5. After updating HBase, query Hive again: the new row appears without any reload

hbase(main):001:0> put 'userinfo', '4', 'info:name', 'mike'
0 row(s) in 0.3210 seconds

hbase(main):002:0> scan 'userinfo'
ROW                                                   COLUMN+CELL                                                                                                                                                
 1                                                    column=info:age, timestamp=1531783708749, value=23                                                                                                         
 1                                                    column=info:name, timestamp=1531783863243, value=chb1                                                                                                      
 1                                                    column=info:sex, timestamp=1531783905927, value=male                                                                                                       
 2                                                    column=info:name, timestamp=1531783929350, value=wsx                                                                                                       
 3                                                    column=info:name, timestamp=1531783948542, value=chengbao                                                                                                  
 4                                                    column=info:name, timestamp=1531784664541, value=mike                                                                                                      
4 row(s) in 0.0600 seconds

hive> select * from hbase_table_1;
OK
1   chb1
2   wsx
3   chengbao
4   mike
Time taken: 2.764 seconds, Fetched: 4 row(s)

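This makes the HBase data available in Hive. The Hive table is a mapping onto HBase rather than a copy, which is why the update in step 5 appears without any reload.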
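
Where a copy stored by Hive itself is wanted, one common approach is a CREATE TABLE ... AS SELECT over the mapped table. The Python sketch below only assembles such a statement as text. The helper build_copy_statement, the target name userinfo_copy, and the ORC file format are all assumptions made for illustration and do not come from the original procedure.

```python
def build_copy_statement(target_table, source_table, file_format="ORC"):
    """Return a HiveQL CREATE TABLE ... AS SELECT statement as text.

    target_table and file_format are illustrative choices (assumptions).
    """
    return (
        f"CREATE TABLE {target_table} STORED AS {file_format} AS\n"
        f"SELECT * FROM {source_table};"
    )


statement = build_copy_statement("userinfo_copy", "hbase_table_1")
print(statement)
```

Unlike the mapped table, a table created this way is a snapshot: later puts into HBase would not show up in it until the copy is repeated.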

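In summary, HBase shell commands differ from MySQL statements in the following ways:
1. HBase has tables but no databases in the MySQL sense; tables are grouped by namespace instead, and the examples here use the default one.
2. Table names must be quoted in commands, whereas MySQL and Hive do not require quotes.
3. The shell is strictly case-sensitive: commands such as create, put, and scan are lowercase, and table and column names must match their case exactly.
4. Commands do not need a trailing semicolon.
5. A table must be disabled before it can be dropped.
6. When a command is mistyped in the HBase shell, delete characters with Ctrl+Backspace.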
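
As an illustration of the quoting and disable-before-drop rules, the following Python sketch generates the HBase shell commands for removing a table. The helper names quote_name and drop_table_commands are hypothetical; the sketch only produces text to be typed or piped into the shell.

```python
def quote_name(name):
    """Wrap an HBase table name in single quotes, as the shell expects."""
    if "'" in name:
        raise ValueError("table name must not contain a single quote")
    return f"'{name}'"


def drop_table_commands(table):
    """Return the HBase shell commands that remove a table, in order."""
    # The table has to be disabled before it can be dropped.
    # Commands are lowercase and carry no trailing semicolon.
    return [f"disable {quote_name(table)}", f"drop {quote_name(table)}"]


commands = drop_table_commands("userinfo")
print("\n".join(commands))
```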
