Phoenix视图映射

Phoenix定位为OLTP和操作型分析(operational analytics),大多用于在线业务,稳定性要求第一位。Phoenix的功能很强大,也很灵活,Phoenix SQL基于SQL-92标准,但是还是有很多方言,使用时需要特别注意。

DataX是阿里巴巴集团内被广泛使用的离线数据同步工具/平台,实现包括 MySQL、Oracle、SqlServer、Postgre、HDFS、Hive、ADS、HBase、TableStore(OTS)、MaxCompute(ODPS)、DRDS 等各种异构数据源之间高效的数据同步功能。

这里主要用到oraclereaderhbase11xwriter

 

基础软件版本:

Apache Hadoop:hadoop-2.8.5

Apache HBase:hbase-1.4.10

Apache Phoenix:phoenix-4.14.3-HBase-1.4-bin

 

Phoenix版本支持:

Phoenix Current release 4.15.0 can run on Apache HBase 1.3, 1.4 and 1.5. CDH HBase 5.11, 5.12, 5.13 and 5.14 is supported by 4.14.0. Apache HBase 2.0 is supported by 5.0.0.

 

hbase客户端

hbase shell

phoenix客户端

phoenix-4.14.3-HBase-1.4-bin/bin/sqlline.py

 

# 进入hbase shell客户端$ hbase shell

SLF4J: Class path contains multiple SLF4J bindings.SLF4J: Found binding in [jar:file:/home/admin/hbase-1.4.10/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J: Found binding in [jar:file:/home/admin/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]HBase ShellUse "help" to get list of supported commands.Use "exit" to quit this interactive shell.Version 1.4.10, r76ab087819fe82ccf6f531096e18ad1bed079651, Wed Jun  5 16:48:11 PDT 2019

# 创建命名空间ZLXX

hbase(main):006:0> create_namespace 'ZLXX'0 row(s) in 1.1150 seconds

# 创建hbase表,使用命名空间​​​​​​​

hbase(main):001:0> create 'ZLXX:WT_TRADE_REFUND', 'INFO'0 row(s) in 1.7510 seconds => Hbase::Table - ZLXX:WT_TRADE_REFUND

查看表描述信息(desc)​​​​​​​

hbase(main):009:0> desc 'ZLXX:WT_TRADE_REFUND'Table ZLXX:WT_TRADE_REFUND is ENABLED                                                                                                                                                      ZLXX:WT_TRADE_REFUND                                                                                                                                                                       COLUMN FAMILIES DESCRIPTION                                                                                                                                                                {NAME => 'INFO', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}                                                                                                       1 row(s) in 0.0730 seconds

# 使用DataX往hbase表里面添加测试数据(存量批量数据采集)

数据源来自Oracle,数据目标系统是HBase原生创建的表。

使用oraclereader和hbase11xwriter。

具体配置zlxx_test.json​​​​​​​

$ vi zlxx_test.json{  "job": {    "content": [      {        "reader": {          "name": "oraclereader",          "parameter": {            "column": [],            "connection": [              {                "fetchSize": "1024",                "jdbcUrl": [                  "$DW_ORCL_ZLXX_JDBCURL"                ],                "querySql": [                  "select id, trade_id, amount, status, TO_CHAR(modify_time,'YYYYMMDDhh24miss') AS modify_time from zlxx.trade_refund"                ]              }            ],            "password": "$DW_ORCL_ZLXX_PASSWORD",            "username": "$DW_ORCL_ZLXX_USERNAME"          }        },        "writer": {          "name": "hbase11xwriter",          "parameter": {            "hbaseConfig": {              "hbase.zookeeper.quorum": "192.168.10.211:12181,192.168.10.212:12181,192.168.10.213:12181"            },            "table": "ZLXX:WT_TRADE_REFUND",            "mode": "normal",            "nullMode": "empty",            "rowkeyColumn": [              {                "index":0,                "type":"string"              }            ],            "column": [              {                "index":1,                "name": "INFO:TRADE_ID",                "type": "string"              },              {                "index":2,                "name": "INFO:AMOUNT",                "type": "string"              },              {                "index":3,                "name": "INFO:STATUS",                "type": "string"              },              {                "index":4,                "name": "INFO:MODIFY_TIME",                "type": "string"              }            ],            "encoding": "utf-8"          }        }      }    ],    "setting": {      "speed": {        "channel": "10"      }    }  }}

执行DataX的shell脚本

数据库连接和用户名密码等配置在服务器的环境变量中​​​​​​​

$ vi zlxx_test.shsource ~/.bash_profilepython datax/bin/datax.py -j "-Xms1g -Xmx1g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=datax/log" -p"-DDW_ORCL_ZLXX_JDBCURL=$DW_ORCL_ZLXX_JDBCURL -DDW_ORCL_ZLXX_PASSWORD=$DW_ORCL_ZLXX_PASSWORD -DDW_ORCL_ZLXX_USERNAME=$DW_ORCL_ZLXX_USERNAME"

执行shell脚本​​​​​​​

$ sh zlxx_test.sh......任务总计耗时                    :                 10s任务平均流量                    :            1.77KB/s记录写入速度                    :             52rec/s读出记录总数                    :                 528读写失败总数                    :                   0

查看一下数据​​​​​​​

hbase(main):012:0* scan 'ZLXX:WT_TRADE_REFUND', {LIMIT=>1}ROW                                             COLUMN+CELL                                                                                                                                 1005528                                        column=INFO:AMOUNT, timestamp=1594783514554, value=100                                                                                      1005528                                        column=INFO:MODIFY_TIME, timestamp=1594783514554, value=20200228185803                                                                      1005528                                        column=INFO:STATUS, timestamp=1594783514554, value=01                                                                                       1005528                                        column=INFO:TRADE_ID, timestamp=1594783514554, value=T12059592                                                                             1 row(s) in 0.1220 seconds

Tips:Phoenix不管你输入的是大写还是小写都默认把它转成大写的,如果要小写的话必须加上引号;所以上面从hbase shell创建的表和列均使用了大写,便于Phoenix自然映射。

# Phoenix使用as select创建视图映射到HBase表

先创建Phoenix schema:​​​​​​​

0: jdbc:phoenix:> create schema ZLXX;No rows affected (0.012 seconds)

创建视图,由于Phoenix大小写最终都是大写,所以这里可以使用小写​​​​​​​

create view ZLXX.WT_TRADE_REFUND (    id varchar primary key,    info.trade_id varchar,    info.amount  varchar,    info.status varchar,    info.modify_time varchar) as select * from zlxx.wt_trade_refund;

视图创建效果:​​​​​​​

0: jdbc:phoenix:> create view ZLXX.WT_TRADE_REFUND (. . . . . . . . >     id varchar primary key,. . . . . . . . >     info.trade_id varchar,. . . . . . . . >     info.amount  varchar,. . . . . . . . >     info.status varchar,. . . . . . . . >     info.modify_time varchar. . . . . . . . > ) as select * from zlxx.wt_trade_refund;No rows affected (5.956 seconds)

# 走一个查询试试看(完美)​​​​​​​

0: jdbc:phoenix:> select * from zlxx.wt_trade_refund limit 5;+----------+------------+---------+---------+-----------------+|    ID    |  TRADE_ID  | AMOUNT  | STATUS  |   MODIFY_TIME   |+----------+------------+---------+---------+-----------------+| 1005528  | T12059592  | 100     | 01      | 20200228185803  || 1005529  | T12059591  | 10      | 03      | 20200228185846  || 1005530  | T12059606  | 10      | 01      | 20200228195442  || 1005531  | T12059605  | 10      | 01      | 20200228195442  || 1005532  | noTradeId  | 100     | 03      | 20200229224015  |+----------+------------+---------+---------+-----------------+5 rows selected (0.062 seconds)

# Phoenix使用column_encoded_bytes=0

创建视图映射到HBase表(测试下多视图)

由于需要名字相同,先把上面的view ZLXX.WT_TRADE_REFUND删除,稍后再使用as select创建不同名称的视图。

删除视图:

drop view ZLXX.WT_TRADE_REFUND;

创建视图:​​​​​​​

create view ZLXX.WT_TRADE_REFUND (    id varchar primary key,    info.trade_id varchar,    info.amount  varchar,    info.status varchar,    info.modify_time varchar) column_encoded_bytes=0;

也是正常的,能够看到有数据查询出来​​​​​​​

0: jdbc:phoenix:> drop view ZLXX.WT_TRADE_REFUND;No rows affected (0.046 seconds)0: jdbc:phoenix:> create view ZLXX.WT_TRADE_REFUND (. . . . . . . . >     id varchar primary key,. . . . . . . . >     info.trade_id varchar,. . . . . . . . >     info.amount  varchar,. . . . . . . . >     info.status varchar,. . . . . . . . >     info.modify_time varchar. . . . . . . . > ) column_encoded_bytes=0;No rows affected (0.041 seconds)

limit查询数据:​​​​​​​

0: jdbc:phoenix:> select * from zlxx.wt_trade_refund limit 5;+----------+------------+---------+---------+-----------------+|    ID    |  TRADE_ID  | AMOUNT  | STATUS  |   MODIFY_TIME   |+----------+------------+---------+---------+-----------------+| 1005528  | T12059592  | 100     | 01      | 20200228185803  || 1005529  | T12059591  | 10      | 03      | 20200228185846  || 1005530  | T12059606  | 10      | 01      | 20200228195442  || 1005531  | T12059605  | 10      | 01      | 20200228195442  || 1005532  | noTradeId  | 100     | 03      | 20200229224015  |+----------+------------+---------+---------+-----------------+5 rows selected (0.056 seconds)

下面使用as select创建名称不同的视图,假设不需要status字段​​​​​​​

create view ZLXX.WT_TRADE_REFUND2 (    id varchar primary key,    info.trade_id varchar,    info.amount  varchar,    info.modify_time varchar) as select INFO.TRADE_ID, INFO.AMOUNT, INFO.MODIFY_TIME from zlxx.wt_trade_refund;

会报错​​​​​​​

0: jdbc:phoenix:> create view ZLXX.WT_TRADE_REFUND2 (. . . . . . . . >     id varchar primary key,. . . . . . . . >     info.trade_id varchar,. . . . . . . . >     info.amount  varchar,. . . . . . . . >     info.modify_time varchar. . . . . . . . > ) as select INFO.TRADE_ID, INFO.AMOUNT, INFO.MODIFY_TIME from zlxx.wt_trade_refund;Error: ERROR 604 (42P00): Syntax error. Mismatched input. Expecting "ASTERISK", got "INFO" at line 6, column 13. (state=42P00,code=604)org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): Syntax error. Mismatched input. Expecting "ASTERISK", got "INFO" at line 6, column 13.

同一张表创建名称不同的视图: ​​​​​​​

create view ZLXX.WT_TRADE_REFUND2 (    id varchar primary key,    info.trade_id varchar,    info.amount  varchar,    info.status varchar,    info.modify_time varchar) as select * from zlxx.wt_trade_refund; 

则是可以的​​​​​​​

0: jdbc:phoenix:> select * from zlxx.wt_trade_refund2 limit 5;+----------+------------+---------+---------+-----------------+|    ID    |  TRADE_ID  | AMOUNT  | STATUS  |   MODIFY_TIME   |+----------+------------+---------+---------+-----------------+| 1005528  | T12059592  | 100     | 01      | 20200228185803  || 1005529  | T12059591  | 10      | 03      | 20200228185846  || 1005530  | T12059606  | 10      | 01      | 20200228195442  || 1005531  | T12059605  | 10      | 01      | 20200228195442  || 1005532  | noTradeId  | 100     | 03      | 20200229224015  |+----------+------------+---------+---------+-----------------+5 rows selected (0.093 seconds)

简单查看一下Phoenix视图的描述信息(列信息)​​​​​​​

0: jdbc:phoenix:> !describe zlxx.wt_trade_refund+------------+--------------+------------------+--------------+------------+------------+--------------+----------------+-----------------+-----------------+-----------+----------+------+| TABLE_CAT  | TABLE_SCHEM  |    TABLE_NAME    | COLUMN_NAME  | DATA_TYPE  | TYPE_NAME  | COLUMN_SIZE  | BUFFER_LENGTH  | DECIMAL_DIGITS  | NUM_PREC_RADIX  | NULLABLE  | REMARKS  | COLU |+------------+--------------+------------------+--------------+------------+------------+--------------+----------------+-----------------+-----------------+-----------+----------+------+|            | ZLXX         | WT_TRADE_REFUND  | ID           | 12         | VARCHAR    | null         | null           | null            | null            | 0         |          |      ||            | ZLXX         | WT_TRADE_REFUND  | TRADE_ID     | 12         | VARCHAR    | null         | null           | null            | null            | 1         |          |      ||            | ZLXX         | WT_TRADE_REFUND  | AMOUNT       | 12         | VARCHAR    | null         | null           | null            | null            | 1         |          |      ||            | ZLXX         | WT_TRADE_REFUND  | STATUS       | 12         | VARCHAR    | null         | null           | null            | null            | 1         |          |      ||            | ZLXX         | WT_TRADE_REFUND  | MODIFY_TIME  | 12         | VARCHAR    | null         | null           | null            | null            | 1         |          |      |+------------+--------------+------------------+--------------+------------+------------+--------------+----------------+-----------------+-----------------+-----------+----------+------+

简单查看一下Phoenix表元数据信息(TABLE_TYPE为VIEW)​​​​​​​

0: jdbc:phoenix:> !tables+------------+--------------+-------------------+---------------+----------+------------+----------------------------+-----------------+--------------+-----------------+---------------+-+| TABLE_CAT  | TABLE_SCHEM  |    TABLE_NAME     |  TABLE_TYPE   | REMARKS  | TYPE_NAME  | SELF_REFERENCING_COL_NAME  | REF_GENERATION  | INDEX_STATE  | IMMUTABLE_ROWS  | SALT_BUCKETS  | |+------------+--------------+-------------------+---------------+----------+------------+----------------------------+-----------------+--------------+-----------------+---------------+-+|            | SYSTEM       | CATALOG           | SYSTEM TABLE  |          |            |                            |                 |              | false           | null          | ||            | SYSTEM       | FUNCTION          | SYSTEM TABLE  |          |            |                            |                 |              | false           | null          | ||            | SYSTEM       | LOG               | SYSTEM TABLE  |          |            |                            |                 |              | true            | 32            | ||            | SYSTEM       | SEQUENCE          | SYSTEM TABLE  |          |            |                            |                 |              | false           | null          | ||            | SYSTEM       | STATS             | SYSTEM TABLE  |          |            |                            |                 |              | false           | null          | ||            | ZLXX         | WT_TRADE_REFUND   | VIEW          |          |            |                            |                 |              | false           | null          | ||            | ZLXX         | WT_TRADE_REFUND2  | VIEW          |          |            |                            |                 |              | false           | null          | |+------------+--------------+-------------------+---------------+----------+------------+----------------------------+-----------------+--------------+-----------------+---------------+-+

 

# 在HBase里插入点数据看看Phoenix的视图能否实时更新?答案是YES!​​​​​​​

put 'ZLXX:WT_TRADE_REFUND', '2009999', 'INFO:TRADE_ID', 'ZXR3099999'put 'ZLXX:WT_TRADE_REFUND', '2009999', 'INFO:AMOUNT', '100'put 'ZLXX:WT_TRADE_REFUND', '2009999', 'INFO:STATUS', '00'put 'ZLXX:WT_TRADE_REFUND', '2009999', 'INFO:MODIFY_TIME', '20200715124015'

hbase shell端put数据

Phoenix端查询数据

总结:

1.hbase shell这边创建表时,命名空间namespace和表名称都要大写,这样方便Phoenix这边小写。

2.在hbase shell这边,命名空间与表名称之间需要注意是双引号,DataX里的全表名称也是如此。

3.Phoenix这边创建相同命名空间名称的schema。

4.Phoenix这边创建view视图映射有两种方式:as select和column_encoded_bytes=0。

附录hbase整合Phoenix的基本配置:​​​​​​​

hbase-site.xml<property>  <name>phoenix.schema.isNamespaceMappingEnabled</name>  <value>true</value></property><property>  <name>phoenix.schema.mapSystemTablesToNamespace</name>  <value>true</value></property>

【Phoenix视图映射和DataX数据同步到HBase、END】

往期推荐:

Kafka消息送达语义说明

Kafka基础知识总结

Hadoop YARN:ApplicationMaster向ResourceManager注册AM源码调试

Apache Hadoop YARN:Client<-->ResourceManager源码解析

Apache Hadoop YARN:Client<-->ResourceManager源码DEBUG

Hadoop YARN:ApplicationMaster与ResourceManager交互源码解析

Hive企业级调优

HiveQL查询连续三天有销售记录的店铺

HiveQL实战蚂蚁森林低碳用户排名分析:解法一

HiveQL实战蚂蚁森林低碳用户排名分析:解法二

HiveQL实战蚂蚁森林植物申领统计分析

Hive-函数

Hive-查询

Hive-DML(Data Manipulation Language)数据操作语言

Hive-DDL(Data Definition Language)数据定义

Hive优化(整理版)

Spark Core之Shuffle解析

数据仓库开发规范

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值