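Overview
Versions: Paimon 0.5 and Flink 1.17.1. The goal is to integrate Flink with Paimon and get a test environment up and running.
Related links
Before reading on, you may want to take a look at the following.
The link below points to the official Paimon documentation, for anyone interested.
Official quick start link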
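Jar packages
The integration requires adding a few jar packages.
Upload
Upload them to the Linux machine.
After the upload, the result looks like the figure below.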
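As a quick sanity check after the upload, the sketch below reports whether a jar matching a given name pattern is present in a directory. The helper name `check_jars` is hypothetical. The example call in the comment assumes that the jars go into Flink's lib directory, that Flink lives under /data/soft/flink (inferred from the log output later in this post), and that the Paimon jar name matches `paimon-flink-*.jar`; adjust these to your own setup.

```shell
#!/usr/bin/env bash
# Sketch only: check_jars is a hypothetical helper, not part of Flink or Paimon.
# Usage: check_jars LIB_DIR PATTERN...
# Example (directory and pattern are assumptions):
#   check_jars /data/soft/flink/lib 'paimon-flink-*.jar'
# Prints "found: <file>" or "missing: <pattern>" for each pattern and
# returns 1 if at least one pattern had no match, 0 otherwise.
check_jars() {
  local lib_dir="$1"
  shift
  local status=0 pattern candidate match
  for pattern in "$@"; do
    match=""
    # The unquoted pattern is expanded by the shell; without a match the
    # literal pattern is kept, which the -e test below rejects.
    for candidate in "$lib_dir"/$pattern; do
      if [ -e "$candidate" ]; then
        match="$candidate"
        break
      fi
    done
    if [ -n "$match" ]; then
      echo "found: $match"
    else
      echo "missing: $pattern"
      status=1
    fi
  done
  return "$status"
}
```

A non-zero return value means at least one expected jar was not found in the directory you passed in.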
Restart the yarn session. If you do not have a yarn session environment yet, refer to the link below.
yarn session environment setup
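One way the restart could be scripted is sketched below. It assumes the old session is stopped with `yarn application -kill` and a new one is started in detached mode with Flink's `bin/yarn-session.sh -d`. The function name `restart_yarn_session` and its arguments are hypothetical, and any extra options your session needs (memory, slots, and so on) are left out.

```shell
#!/usr/bin/env bash
# Sketch only: restart_yarn_session is a hypothetical helper, not part of Flink.
# Usage: restart_yarn_session FLINK_HOME [OLD_APPLICATION_ID]
restart_yarn_session() {
  local flink_home="$1"
  local old_app_id="${2:-}"

  # Stop the previous session, if an application id was given.
  if [ -n "$old_app_id" ]; then
    yarn application -kill "$old_app_id" || return 1
  fi

  # Start a new session in detached mode so the shell is returned immediately.
  "$flink_home/bin/yarn-session.sh" -d
}
```

The first argument is the Flink installation directory; in this post's log output that directory appears as /data/soft/flink. The second argument is the application id of the session to stop and can be omitted when no session is running.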
Test
Test whether the Flink and Paimon integration environment works.
yarn session
Connect to the yarn session environment from the command line
# Connect to the yarn session environment
[root@hadoop01 flink]# bin/sql-client.sh
Flink SQL> CREATE CATALOG my_catalog WITH (
> 'type'='paimon',
> 'warehouse'='hdfs:///data/hive/warehouse/paimon'
> );
[INFO] Execute statement succeed.
Flink SQL> USE CATALOG my_catalog;
[INFO] Execute statement succeed.
Flink SQL> CREATE TABLE word_count (
> word STRING PRIMARY KEY NOT ENFORCED,
> cnt BIGINT
> );
[INFO] Execute statement succeed.
Flink SQL> CREATE TEMPORARY TABLE word_table (
> word STRING
> ) WITH (
> 'connector' = 'datagen',
> 'fields.word.length' = '1'
> );
[INFO] Execute statement succeed.
Flink SQL> SET 'execution.checkpointing.interval' = '10 s';
[INFO] Execute statement succeed.
Flink SQL> INSERT INTO word_count SELECT word, COUNT(*) FROM word_table GROUP BY word;
2023-10-19 09:44:34,955 WARN org.apache.flink.yarn.configuration.YarnLogConfigUtil [] - The configuration directory ('/data/soft/flink/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2023-10-19 09:44:34,992 INFO org.apache.hadoop.yarn.client.RMProxy [] - Connecting to ResourceManager at hadoop01/10.32.36.142:8032
2023-10-19 09:44:35,081 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2023-10-19 09:44:35,082 WARN org.apache.flink.yarn.YarnClusterDescriptor [] - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set.The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2023-10-19 09:44:35,106 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Found Web Interface hadoop02:42563 of application 'application_1697598809136_0008'.
[INFO] Submitting SQL update statement to the cluster...
[INFO] SQL update statement has been successfully submitted to the cluster:
Job ID: 13414ca4b36fd7adf20aa0b59e5fba0e
Flink SQL> SET 'sql-client.execution.result-mode' = 'tableau';
[INFO] Execute statement succeed.
Flink SQL> RESET 'execution.checkpointing.interval';
[INFO] Execute statement succeed.
Flink SQL> SET 'execution.runtime-mode' = 'batch';
[INFO] Execute statement succeed.
Flink SQL> SELECT * FROM word_count;
2023-10-19 09:45:04,314 WARN org.apache.flink.yarn.configuration.YarnLogConfigUtil [] - The configuration directory ('/data/soft/flink/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2023-10-19 09:45:04,357 INFO org.apache.hadoop.yarn.client.RMProxy [] - Connecting to ResourceManager at hadoop01/10.32.36.142:8032
2023-10-19 09:45:04,358 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2023-10-19 09:45:04,358 WARN org.apache.flink.yarn.YarnClusterDescriptor [] - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set.The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2023-10-19 09:45:04,361 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Found Web Interface hadoop02:42563 of application 'application_1697598809136_0008'.
+------+-------+
| word | cnt |
+------+-------+
| 0 | 13114 |
| 1 | 13090 |
| 2 | 13223 |
| 3 | 12934 |
| 4 | 13266 |
| 5 | 12952 |
| 6 | 13113 |
| 7 | 13051 |
| 8 | 13070 |
| 9 | 13193 |
| a | 13212 |
| b | 13032 |
| c | 13054 |
| d | 13277 |
| e | 13252 |
| f | 13167 |
+------+-------+
16 rows in set
Flink SQL> SET 'execution.runtime-mode' = 'streaming';
[INFO] Execute statement succeed.
Flink SQL> SELECT `interval`, COUNT(*) AS interval_cnt FROM
> (SELECT cnt / 10000 AS `interval` FROM word_count) GROUP BY `interval`;
2023-10-19 09:50:49,672 WARN org.apache.flink.yarn.configuration.YarnLogConfigUtil [] - The configuration directory ('/data/soft/flink/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2023-10-19 09:50:49,695 INFO org.apache.hadoop.yarn.client.RMProxy [] - Connecting to ResourceManager at hadoop01/10.32.36.142:8032
2023-10-19 09:50:49,696 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2023-10-19 09:50:49,696 WARN org.apache.flink.yarn.YarnClusterDescriptor [] - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set.The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2023-10-19 09:50:49,699 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Found Web Interface hadoop02:42563 of application 'application_1697598809136_0008'.
+----+----------------------+----------------------+
| op | interval | interval_cnt |
+----+----------------------+----------------------+
| +I | 22 | 1 |
| -U | 22 | 1 |
| +U | 22 | 2 |
| -U | 22 | 2 |
| +U | 22 | 3 |
| -U | 22 | 3 |
| +U | 22 | 4 |
| -U | 22 | 4 |
| +U | 22 | 5 |
| -U | 22 | 5 |
| +U | 22 | 6 |
| -U | 22 | 6 |
| +U | 22 | 7 |
| -U | 22 | 7 |
| +U | 22 | 8 |
| -U | 22 | 8 |
| +U | 22 | 9 |
| -U | 22 | 9 |
| +U | 22 | 10 |
| -U | 22 | 10 |
| +U | 22 | 11 |
| -U | 22 | 11 |
| +U | 22 | 12 |
| -U | 22 | 12 |
| +U | 22 | 13 |
| -U | 22 | 13 |
| +U | 22 | 14 |
| -U | 22 | 14 |
| +U | 22 | 15 |
| -U | 22 | 15 |
| +U | 22 | 16 |
| -U | 22 | 16 |
| +U | 22 | 15 |
| +I | 23 | 1 |
| -U | 22 | 15 |
| +U | 22 | 14 |
| -U | 23 | 1 |
| +U | 23 | 2 |
| -U | 22 | 14 |
| +U | 22 | 13 |
| -U | 23 | 2 |
| +U | 23 | 3 |
| -U | 22 | 13 |
| +U | 22 | 12 |
| -U | 23 | 3 |
| +U | 23 | 4 |
| -U | 22 | 12 |
| +U | 22 | 11 |
| -U | 23 | 4 |
| +U | 23 | 5 |
| -U | 22 | 11 |
| +U | 22 | 10 |
| -U | 23 | 5 |
| +U | 23 | 6 |
| -U | 22 | 10 |
| +U | 22 | 9 |
| -U | 23 | 6 |
| +U | 23 | 7 |
| -U | 22 | 9 |
| +U | 22 | 8 |
| -U | 23 | 7 |
| +U | 23 | 8 |
| -U | 22 | 8 |
| +U | 22 | 7 |
| -U | 23 | 8 |
| +U | 23 | 9 |
| -U | 22 | 7 |
| +U | 22 | 6 |
| -U | 23 | 9 |
| +U | 23 | 10 |
| -U | 22 | 6 |
| +U | 22 | 5 |
| -U | 23 | 10 |
| +U | 23 | 11 |
| -U | 22 | 5 |
| +U | 22 | 4 |
| -U | 23 | 11 |
| +U | 23 | 12 |
| -U | 22 | 4 |
| +U | 22 | 3 |
| -U | 23 | 12 |
| +U | 23 | 13 |
| -U | 22 | 3 |
| +U | 22 | 2 |
| -U | 23 | 13 |
| +U | 23 | 14 |
| -U | 22 | 2 |
| +U | 22 | 1 |
| -U | 23 | 14 |
| +U | 23 | 15 |
| -D | 22 | 1 |
| -U | 23 | 15 |
| +U | 23 | 16 |
^CQuery terminated, received a total of 93 rows
Flink SQL>
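The streaming query prints a changelog rather than a final table: `+I` inserts a row, `-U` retracts the old version of a row, `+U` emits its new version, and `-D` deletes a row. As a rough illustration of how such output folds into a final result, the sketch below replays tableau rows in order and keeps the latest value per key. The helper name `fold_changelog` is hypothetical, and the script assumes the three-column layout (op, key, value) shown above.

```shell
#!/usr/bin/env bash
# Sketch only: fold_changelog is a hypothetical helper, not part of Flink.
# Reads tableau changelog rows such as "| +U | 22 | 3 |" on stdin and prints
# the surviving "key value" pairs after applying every row in order.
fold_changelog() {
  awk -F'|' '
    {
      # Fields: $2 = op, $3 = key, $4 = value. For border, header, and log
      # lines the op field is empty or unknown, so no rule below fires.
      op = $2; key = $3; val = $4
      gsub(/ /, "", op); gsub(/ /, "", key); gsub(/ /, "", val)
    }
    op == "+I" || op == "+U" { state[key] = val }    # insert or new version
    op == "-U" || op == "-D" { delete state[key] }   # retraction or delete
    END { for (k in state) print k, state[k] }
  ' | sort
}
```

Applied to the 93 rows above, the fold leaves a single row: interval 23 with interval_cnt 16. In other words, by the time the query was interrupted, all 16 words had a count between 230000 and 239999.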
Result
Conclusion
The Paimon and Flink integration environment works.