2. Paimon-Hive-Flink Integration

Overview

Earlier posts in this Paimon series covered using the default catalog, which is filesystem-based.

Paimon currently supports two kinds of catalogs:

  • Filesystem catalog: the default; metadata is stored alongside the data on the filesystem (see the sketch below)
  • Hive catalog: metadata is kept in the Hive Metastore (ultimately in MySQL), so the tables can be accessed directly from Hive
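
For contrast, the default filesystem catalog from the earlier post needs nothing but a warehouse path. A minimal sketch (the path here is illustrative):

-- Filesystem catalog: metadata and data both live under 'warehouse'
CREATE CATALOG paimon_fs WITH (
    'type' = 'paimon',
    'warehouse' = 'hdfs:///data/paimon/warehouse'
);
USE CATALOG paimon_fs;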

This post focuses on using the Hive catalog.
For other Paimon topics, please head over to the
Paimon official documentation

Integrating Paimon with Hive

Follow the official documentation for the setup.
Note:
1. When using the Hive catalog, database names, table names, and field names must be lowercase.
2. Add the Flink Hive connector jar to Flink's lib directory (see the sketch after this note).
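
A sketch of placing the jars; the file names below are illustrative and must match your Flink, Hive, and Paimon versions:

# Copy the Paimon Flink bundle and the Flink Hive connector into Flink's lib
# directory (jar names are examples -- check your own versions)
cp paimon-flink-1.17-0.5.0-incubating.jar /data/soft/flink/lib/
cp flink-sql-connector-hive-3.1.3_2.12-1.17.1.jar /data/soft/flink/lib/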

Start the Hive metastore service

[root@hadoop01 apache-hive-3.1.3-bin]# nohup bin/hive --service metastore &
[1] 18297
[root@hadoop01 apache-hive-3.1.3-bin]# nohup: ignoring input and appending output to 'nohup.out'

# Check that the metastore started and is listening on its default port 9083
[root@hadoop01 apache-hive-3.1.3-bin]# netstat -nlp | grep :9083
tcp6       0      0 :::9083                 :::*                    LISTEN      18297/java
[root@hadoop01 apache-hive-3.1.3-bin]#

Create the Hive catalog
CREATE CATALOG paimon_hive WITH (
    'type' = 'paimon',
    'metastore' = 'hive',
    'uri' = 'thrift://10.32.36.142:9083',
    'warehouse' = 'hdfs:///data/hive/warehouse/paimon/hive',
    'default-database' = 'test'
);
Switch to the newly created catalog paimon_hive and create tables in it
USE CATALOG paimon_hive;
Insert some data, then query it back. The full Flink SQL session follows:


Flink SQL> CREATE CATALOG paimon_hive WITH (
>     'type' = 'paimon',
>     'metastore' = 'hive',
>     'uri' = 'thrift://10.32.36.142:9083',
>     'warehouse' = 'hdfs:///data/hive/warehouse/paimon/hive',
>     'default-database'='test'
> );
[INFO] Execute statement succeed.

Flink SQL> USE CATALOG paimon_hive;
[INFO] Execute statement succeed.

Flink SQL> show databases;
+---------------+
| database name |
+---------------+
|       default |
|          test |
+---------------+
2 rows in set

Flink SQL> CREATE TABLE test_table (
>   a int,
>   b string
> );
[INFO] Execute statement succeed.

Flink SQL> INSERT INTO test_table VALUES (1, 'Table'), (2, 'Store');
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.flink.api.java.ClosureCleaner (file:/data/soft/flink/lib/flink-dist-1.17.1.jar) to field java.lang.Class.ANNOTATION
WARNING: Please consider reporting this to the maintainers of org.apache.flink.api.java.ClosureCleaner
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2023-10-21 16:35:43,712 WARN  org.apache.flink.yarn.configuration.YarnLogConfigUtil        [] - The configuration directory ('/data/soft/flink/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2023-10-21 16:35:43,750 INFO  org.apache.hadoop.yarn.client.RMProxy                        [] - Connecting to ResourceManager at hadoop01/10.32.36.142:8032
2023-10-21 16:35:43,842 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2023-10-21 16:35:43,843 WARN  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set.The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2023-10-21 16:35:43,869 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Found Web Interface hadoop02:42563 of application 'application_1697598809136_0008'.

[INFO] Submitting SQL update statement to the cluster...
[INFO] SQL update statement has been successfully submitted to the cluster:
Job ID: 7177023eb237173633fd2efd69d86e82


Flink SQL> SELECT * FROM test_table;
2023-10-21 16:35:51,215 WARN  org.apache.flink.yarn.configuration.YarnLogConfigUtil        [] - The configuration directory ('/data/soft/flink/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2023-10-21 16:35:51,237 INFO  org.apache.hadoop.yarn.client.RMProxy                        [] - Connecting to ResourceManager at hadoop01/10.32.36.142:8032
2023-10-21 16:35:51,238 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2023-10-21 16:35:51,238 WARN  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set.The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2023-10-21 16:35:51,240 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Found Web Interface hadoop02:42563 of application 'application_1697598809136_0008'.

[INFO] Result retrieval cancelled.

Flink SQL> SET 'sql-client.execution.result-mode' = 'tableau';
[INFO] Execute statement succeed.

Flink SQL> SELECT * FROM test_table;
2023-10-21 16:36:47,923 WARN  org.apache.flink.yarn.configuration.YarnLogConfigUtil        [] - The configuration directory ('/data/soft/flink/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2023-10-21 16:36:47,945 INFO  org.apache.hadoop.yarn.client.RMProxy                        [] - Connecting to ResourceManager at hadoop01/10.32.36.142:8032
2023-10-21 16:36:47,945 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2023-10-21 16:36:47,945 WARN  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set.The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2023-10-21 16:36:47,948 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - Found Web Interface hadoop02:42563 of application 'application_1697598809136_0008'.

+----+-------------+--------------------------------+
| op |           a |                              b |
+----+-------------+--------------------------------+
| +I |           1 |                          Table |
| +I |           2 |                          Store |
^CQuery terminated, received a total of 2 rows
Flink SQL> use test;
[INFO] Execute statement succeed.

Flink SQL> show tables;
+------------+
| table name |
+------------+
| test_table |
+------------+
1 row in set

Flink SQL> 
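
Because the metadata now lives in the Hive Metastore, the table should also be visible from the Hive side. A sketch of verifying this from the Hive CLI, assuming a paimon-hive-connector jar matching your versions is on Hive's classpath (the jar name is illustrative):

-- Run in the Hive CLI, not in Flink SQL; requires e.g.
-- paimon-hive-connector-3.1-0.5.0-incubating.jar in Hive's auxlib
USE test;
SHOW TABLES;              -- should list test_table
SELECT * FROM test_table; -- should return (1, 'Table') and (2, 'Store')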

Caveats

Note: when using the Hive catalog, altering a table in a way that changes a column to an incompatible type requires setting hive.metastore.disable.incompatible.col.type.changes=false.
If you are running Hive 3, you also need to disable Hive ACID:
hive.strict.managed.tables=false
hive.create.as.insert.only=false
metastore.create.as.acid=false
Edit hive-site.xml accordingly:

<!-- Append to the end of hive-site.xml; if a property already exists, change its value in place -->
<property>
  <name>hive.strict.managed.tables</name>
  <value>false</value>
</property>
<property>
  <name>hive.create.as.insert.only</name>
  <value>false</value>
</property>
<property>
  <name>metastore.create.as.acid</name>
  <value>false</value>
</property>
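
If you also need the incompatible-column-type setting mentioned above, the same pattern applies (a sketch; the property name comes from the note above):

<property>
  <name>hive.metastore.disable.incompatible.col.type.changes</name>
  <value>false</value>
</property>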

Restart Hive after changing the configuration; these settings will be needed later.
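
A sketch of the restart, following the session above (18297 is the metastore pid from the earlier nohup output):

# Stop the running metastore, then start it with the updated hive-site.xml
kill 18297
nohup bin/hive --service metastore &
# Confirm it is listening again
netstat -nlp | grep :9083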

Conclusion

With that, Paimon, Hive, and Flink are integrated.
