Environment:
Flink 1.14.5
Iceberg 0.13.2
Hadoop 2.6.7
Download the Iceberg runtime jar for Flink 1.14 from https://iceberg.incubator.apache.org/releases/, place it in Flink's lib directory, then start the Flink cluster:
./bin/start-cluster.sh
Start the Flink SQL Client:
./bin/sql-client.sh embedded shell
Run the following statements:
Flink SQL> CREATE CATALOG hadoop_catalog WITH (
> 'type'='iceberg',
> 'catalog-type'='hadoop',
> 'cache-enabled'='true',
> 'warehouse'='hdfs://localhost:8020/flink-iceberg/iceberg-hadoop',
> 'property-version'='1'
> );
[INFO] Execute statement succeed.
Flink SQL> CREATE DATABASE hadoop_catalog.iceberg_db;
[INFO] Execute statement succeed.
Flink SQL> CREATE TABLE hadoop_catalog.iceberg_db.sample_test (
> id BIGINT COMMENT 'unique id',
> data STRING,
> PRIMARY KEY(id) NOT ENFORCED
> )
> WITH (
> 'format-version'= '2',
> 'write.format.default'='parquet',
> 'write.parquet.compression-codec'='gzip',
> 'write.upsert.enable'='true'
> );
[INFO] Execute statement succeed.
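To confirm that the catalog, database, and table were registered, a few standard Flink SQL statements can be run in the same session. This is only a sketch and was not part of the original session; it uses just the names created above.

```sql
-- Not part of the original session: a quick check of what was created above.
USE CATALOG hadoop_catalog;
USE iceberg_db;
-- List the tables in the current database; sample_test should be among them.
SHOW TABLES;
-- Show the column names and types of the new table.
DESCRIBE sample_test;
```

Switching the current catalog and database does not affect statements that use fully qualified names.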
Flink SQL> INSERT INTO hadoop_catalog.iceberg_db.sample_test VALUES
> (10, 'test10_U'), (11, 'test11'), (12, 'test12');
[INFO] Submitting SQL update statement to the cluster...
[INFO] SQL update statement has been successfully submitted to the cluster:
Job ID: ab22b4fbe1601bafc50f9c581648707f
Flink SQL> select * from hadoop_catalog.iceberg_db.sample_test;
[INFO] Result retrieval cancelled.
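The `Result retrieval cancelled` message is typically printed when the interactive result view is exited. As a minimal sketch, assuming the standard Flink 1.14 SQL Client options `sql-client.execution.result-mode` and `execution.runtime-mode` (neither is set in the original session), the same query can be re-run so that the rows are printed directly in the terminal:

```sql
-- Assumption: these session options were not set in the original session.
-- Print query results inline instead of opening the interactive table view.
SET 'sql-client.execution.result-mode' = 'tableau';
-- Read the current table snapshot as a bounded (batch) job.
SET 'execution.runtime-mode' = 'batch';

SELECT * FROM hadoop_catalog.iceberg_db.sample_test;
```

Both options apply only to the current SQL Client session.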
Query result: