Background
For a project we had previously built a file-storage stack on Hadoop + YARN + Flink + HDFS + Hive. When the commercial Hadoop distributions (CDH and HDP) became paid products, we started looking at how to build a data lake without the Hadoop ecosystem. After researching what is available, the approach here replaces HDFS with modern object storage (S3 or OSS) and builds a real-time compute stack from Kubernetes + Kafka + Flink + Iceberg + Trino. Most tutorials online have many problems, so this write-up records a working setup for reference.
Prerequisites
A Kubernetes cluster and MinIO, already installed (steps omitted).
Installation
1. Kafka: Strimzi is the recommended way to stand up Kafka quickly
---
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
  namespace: kafka
spec:
  entityOperator:
    topicOperator: {}
    userOperator: {}
  kafka:
    config:
      default.replication.factor: 1
      inter.broker.protocol.version: '3.3'
      min.insync.replicas: 1
      offsets.topic.replication.factor: 1
      transaction.state.log.min.isr: 1
      transaction.state.log.replication.factor: 1
    listeners:
      - configuration:
          bootstrap:
            nodePort: 32410
          brokers:
            - broker: 0
              nodePort: 32420
            - broker: 1
              nodePort: 32421
            - broker: 2
              nodePort: 32422
        name: external
        port: 9094
        tls: false
        type: nodeport
      - name: plain
        port: 9092
        tls: false
        type: internal
      - name: tls
        port: 9093
        tls: true
        type: internal
    replicas: 3
    storage:
      type: jbod
      volumes:
        - class: ceph-kafka
          deleteClaim: false
          id: 0
          size: 100Gi
          type: persistent-claim
    version: 3.3.1
  zookeeper:
    replicas: 3
    storage:
      class: ceph-kafka
      deleteClaim: false
      size: 100Gi
      type: persistent-claim
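With the external nodeport listener above, clients outside the cluster can use any Kubernetes node IP plus the bootstrap nodePort (32410) as bootstrap.servers; Strimzi then advertises each broker on its own nodePort (32420-32422). A minimal sketch of building that address string; the node IPs below are placeholders for your own environment:

```python
# nodePort assignments taken from the Kafka CR above
BOOTSTRAP_NODE_PORT = 32410
BROKER_NODE_PORTS = {0: 32420, 1: 32421, 2: 32422}

def bootstrap_servers(node_ips, port=BOOTSTRAP_NODE_PORT):
    """bootstrap.servers for an external client: any node IP plus the
    bootstrap nodePort works; the brokers themselves are advertised on
    their per-broker nodePorts (32420-32422)."""
    return ",".join(f"{ip}:{port}" for ip in node_ips)

# Hypothetical Kubernetes node addresses:
print(bootstrap_servers(["10.0.0.11", "10.0.0.12"]))
# -> 10.0.0.11:32410,10.0.0.12:32410
```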
2. Building a Flink image with S3 support
2.1 Jars to download in advance
- aws-java-sdk-bundle-1.11.375.jar
- commons-cli-1.5.0.jar
- flink-s3-fs-hadoop-1.14.6.jar
- flink-shaded-hadoop-3-uber-3.1.1.7.2.9.0-173-9.0.jar
- hadoop-aws-3.2.2.jar
- guava-27.0-jre.jar
2.2 core-site.xml
These are Hadoop Core settings. No Hadoop installation is needed, but the S3 connection still goes through Hadoop's filesystem layer. Here we use the s3a scheme; s3 and s3p also work. For the differences between s3, s3a, and s3p, see the Flink documentation: Amazon S3 | Apache Flink.
<configuration xmlns:xi="http://www.w3.org/2001/XInclude">
  <property>
    <name>fs.s3a.connection.ssl.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>fs.s3a.aws.credentials.provider</name>
    <value>
      com.amazonaws.auth.InstanceProfileCredentialsProvider,
      org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider,
      com.amazonaws.auth.EnvironmentVariableCredentialsProvider
    </value>
  </property>
  <property>
    <name>fs.s3a.endpoint</name>
    <value><your S3 endpoint></value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>xxxx</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>xxxx</value>
  </property>
  <property>
    <name>fs.s3a.path.style.access</name>
    <value>true</value>
  </property>
  <property>
    <name>fs.s3a.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  </property>
  <property>
    <name>fs.s3a.fast.upload</name>
    <value>true</value>
  </property>
</configuration>
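Before baking core-site.xml into the image, it can help to sanity-check that the required fs.s3a.* properties are present. A small stdlib-only sketch; the sample XML and its values are placeholders mirroring the config above:

```python
import xml.etree.ElementTree as ET

# Placeholder sample mirroring the core-site.xml above
SAMPLE = """<configuration>
  <property><name>fs.s3a.endpoint</name><value>http://minio:9000</value></property>
  <property><name>fs.s3a.access.key</name><value>xxxx</value></property>
  <property><name>fs.s3a.secret.key</name><value>xxxx</value></property>
  <property><name>fs.s3a.path.style.access</name><value>true</value></property>
</configuration>"""

def parse_hadoop_conf(xml_text):
    """Return {name: value} for every <property> in a Hadoop config file."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}

conf = parse_hadoop_conf(SAMPLE)
required = {"fs.s3a.endpoint", "fs.s3a.access.key", "fs.s3a.secret.key"}
missing = required - conf.keys()
assert not missing, f"missing S3A settings: {missing}"
print(conf["fs.s3a.endpoint"])  # -> http://minio:9000
```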
2.3 Dockerfile
FROM docker.io/flink:1.14.6-scala_2.11-java8
# Hadoop configuration
RUN mkdir -p /usr/local/hadoop/etc/hadoop/
ADD core-site.xml /usr/local/hadoop/etc/hadoop/
# MinIO credentials; an IAM identity token also works and is what the docs recommend
ENV AWS_ACCESS_KEY_ID xxxx
ENV AWS_SECRET_ACCESS_KEY xxxx
ENV AWS_DEFAULT_REGION cn-northwest-1
# These jars must be downloaded in advance (see section 2.1)
COPY flink/lib/aws-java-sdk-bundle-1.11.375.jar /opt/flink/lib/
COPY flink/lib/commons-cli-1.5.0.jar /opt/flink/lib/
COPY flink/lib/flink-s3-fs-hadoop-1.14.6.jar /opt/flink/lib/
COPY flink/lib/flink-shaded-hadoop-3-uber-3.1.1.7.2.9.0-173-9.0.jar /opt/flink/lib/
COPY flink/lib/guava-27.0-jre.jar /opt/flink/lib/
COPY flink/lib/hadoop-aws-3.2.2.jar /opt/flink/lib/
RUN chown -R flink:flink /opt/flink/lib/*.jar
# For the s3a scheme use the flink-s3-fs-hadoop plugin; for s3p use flink-s3-fs-presto
RUN cd /opt/flink && \
    mkdir ./plugins/s3-fs-hadoop && \
    cp ./opt/flink-s3-fs-hadoop-1.14.6.jar ./plugins/s3-fs-hadoop/
3. Starting Flink
3.1 Flink Session mode via Native Kubernetes (not recommended)
3.2 Recommended: install StreamX and start the Flink cluster from it
3.2.1 Installation
(dinky is a similar tool with a different feature focus.) Official docs: Apache StreamPark (incubating). StreamX is a Flink stream-processing management tool; a single-node install is enough, since actual execution still runs on the Flink cluster. Installation steps: omitted.
3.2.2 Configuring Flink
(1) Configure the Flink home
Note: also download a copy of Flink on the host where StreamX is installed, e.g. under /opt; this setting is mainly used for standalone mode.
cd /opt
wget https://archive.apache.org/dist/flink/flink-1.14.6/flink-1.14.6-bin-scala_2.11.tgz
tar -zxvf flink-1.14.6-bin-scala_2.11.tgz
(2) Configure and start the flink-cluster (Session mode), replacing section 3.1
Create the workspace:
kubectl create ns flink
kubectl create serviceaccount flink -n flink
kubectl create clusterrolebinding flink-role-bind --clusterrole=edit --serviceaccount=flink:flink
Start in Kubernetes Session mode; production systems can use Application mode instead.
The Dynamic Options below correspond to entries in flink-conf.yaml:
-Dkubernetes.flink.conf.dir=/opt/flink/conf
-Dfs.allowed-fallback-filesystems=s3
-Ds3a.access-key=填你的s3的accesskey
-Ds3a.secret-key=填你的s3的secretkey
-Ds3a.endpoint=填你的s3的endpoint
-Dstate.backend=filesystem
-Dstate.checkpoints.dir=s3a://flink/checkpoints/
-Dstate.backend.fs.checkpointdir=s3a://flink/checkpoints/
-Dstate.savepoints.dir=s3a://flink/savepoints/
-Dstate.backend.fs.savepoints=s3a://flink/savepoints/
-Dkubernetes.jobmanager.cpu=0.2
-Djobmanager.memory.process.size=1024m
-Dresourcemanager.taskmanager-timeout=3600000
-Dkubernetes.taskmanager.cpu=0.2
-Dtaskmanager.memory.process.size=1024m
-Ds3a.connection.ssl.enabled=false
-Ds3.aws.credentials.provider=com.amazonaws.auth.EnvironmentVariableCredentialsProvider
-Dfs.hdfs.hadoopconf=/usr/local/hadoop/etc/hadoop/
-Dfs.s3.impl=org.apache.hadoop.fs.s3a.S3AFileSystem
-Ds3a.fast.upload=true
-Ds3a.path.style.access=true
-Dexecution.checkpointing.interval=5000
-Dexecution.checkpointing.mode=EXACTLY_ONCE
-Dexecution.checkpointing.timeout=600000
-Dexecution.checkpointing.min-pause=5000
-Dexecution.checkpointing.max-concurrent-checkpoints=1
-Dstate.checkpoints.num-retained=3
-Dexecution.checkpointing.externalized-checkpoint-retention=RETAIN_ON_CANCELLATION
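A long -D option list like the one above is easy to mistype; one way to keep it maintainable is to render it from a dict. A minimal sketch, using only a few of the options above:

```python
def dynamic_options(conf):
    """Render a Flink config dict as CLI dynamic options (-Dkey=value)."""
    return [f"-D{k}={v}" for k, v in conf.items()]

# A subset of the options listed above
conf = {
    "state.backend": "filesystem",
    "state.checkpoints.dir": "s3a://flink/checkpoints/",
    "execution.checkpointing.interval": 5000,
    "execution.checkpointing.mode": "EXACTLY_ONCE",
}
print(" ".join(dynamic_options(conf)))
```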
4. Installing Trino on Kubernetes
4.1 Metadata store
For building a custom metastore image, see:
https://github.com/joshuarobinson/trino-on-k8s (there are also Chinese walkthroughs of iceberg + trino on k8s covering the same setup)
4.1.1 Create the metadata-store PVC
maria_pvc.yaml (the PVC must live in the same namespace as the MariaDB Deployment that mounts it):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: maria-pv-claim
  namespace: trino
spec:
  storageClassName: trino-storage
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
4.1.2 MariaDB for metadata storage
maria_deployment.yaml
apiVersion: v1
kind: Service
metadata:
  name: metastore-db
  namespace: trino
spec:
  ports:
    - port: 13306
      targetPort: 3306
  selector:
    app: mysql
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
  namespace: trino
spec:
  selector:
    matchLabels:
      app: mysql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mariadb
          image: "mariadb/server:latest"
          imagePullPolicy: IfNotPresent
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: "123456"  # must be a quoted string, not a bare YAML number
          ports:
            - containerPort: 3306
              name: mysql
          volumeMounts:
            - name: mariadb-for-hive
              mountPath: /var/lib/mysql
          resources:
            requests:
              memory: "1G"
              cpu: 0.5
      volumes:
        - name: mariadb-for-hive
          persistentVolumeClaim:
            claimName: maria-pv-claim
4.1.3 Metadata-store service
hive-initschema.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: hive-initschema
  namespace: trino
spec:
  template:
    spec:
      containers:
        - name: hivemeta
          image: zaki297004707/metastore:v1.0.1
          command: ["/opt/hive-metastore/bin/schematool"]
          args: ["--verbose", "-initSchema", "-dbType", "mysql", "-userName", "root",
                 "-passWord", "123456", "-url",
                 "jdbc:mysql://metastore-db:13306/metastore_db?createDatabaseIfNotExist=true"]
      restartPolicy: Never
  backoffLimit: 4
metastore-cfg.yaml
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: metastore-cfg
  namespace: trino
data:
  core-site.xml: |-
    <configuration>
      <property>
        <name>fs.s3a.connection.ssl.enabled</name>
        <value>false</value>
      </property>
      <property>
        <name>fs.s3a.endpoint</name>
        <value>http://xxxx:9000</value>
      </property>
      <property>
        <name>hive.s3a.aws-access-key</name>
        <value>xxx</value>
      </property>
      <property>
        <name>hive.s3a.aws-secret-key</name>
        <value>xxxxxxxxxxxxxx</value>
      </property>
      <property>
        <name>fs.s3a.access.key</name>
        <value>xxx</value>
      </property>
      <property>
        <name>fs.s3a.secret.key</name>
        <value>xxxx</value>
      </property>
      <property>
        <name>fs.s3a.path.style.access</name>
        <value>true</value>
      </property>
      <property>
        <name>fs.s3a.impl</name>
        <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
      </property>
      <property>
        <name>fs.s3a.fast.upload</name>
        <value>true</value>
      </property>
    </configuration>
  metastore-site.xml: |-
    <configuration>
      <property>
        <name>metastore.task.threads.always</name>
        <value>org.apache.hadoop.hive.metastore.events.EventCleanerTask</value>
      </property>
      <property>
        <name>metastore.expression.proxy</name>
        <value>org.apache.hadoop.hive.metastore.DefaultPartitionExpressionProxy</value>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://metastore-db.trino.svc.cluster.local:13306/metastore_db</value>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
      </property>
      <property>
        <!-- must match MYSQL_ROOT_PASSWORD in maria_deployment.yaml -->
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
      </property>
      <property>
        <name>metastore.warehouse.dir</name>
        <value>s3a://trino/warehouse/</value>
      </property>
      <property>
        <name>metastore.thrift.port</name>
        <value>9083</value>
      </property>
    </configuration>
Create my-s3-keys (values under data must be base64-encoded):
apiVersion: v1
kind: Secret
metadata:
  name: my-s3-keys
  namespace: trino
type: Opaque
data:
  access-key: xxxxxxxxxx
  secret-key: xxxxxxxxxxxxxxxxxxx
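Values under data: in a Kubernetes Secret must be base64-encoded (alternatively, put plaintext under stringData:). A quick way to produce the encoded values; the sample key "minioadmin" is only an illustration:

```python
import base64

def to_secret_value(plaintext):
    """Encode a string the way `kubectl create secret` would for `data:`."""
    return base64.b64encode(plaintext.encode()).decode()

print(to_secret_value("minioadmin"))  # -> bWluaW9hZG1pbg==
```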
Create metastore.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: metastore
  namespace: trino
spec:
  ports:
    - port: 9083
  selector:
    app: metastore
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metastore
  namespace: trino
spec:
  selector:
    matchLabels:
      app: metastore
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: metastore
    spec:
      containers:
        - name: metastore
          image: zaki297004707/metastore:v1.0.1
          env:
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: my-s3-keys
                  key: access-key
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: my-s3-keys
                  key: secret-key
          ports:
            - containerPort: 9083
          volumeMounts:
            - name: metastore-cfg-vol
              mountPath: /opt/hive-metastore/conf/metastore-site.xml
              subPath: metastore-site.xml
            - name: metastore-cfg-vol
              mountPath: /opt/hadoop/etc/hadoop/core-site.xml
              subPath: core-site.xml
          command: ["/opt/hive-metastore/bin/start-metastore"]
          args: ["-p", "9083"]
          resources:
            requests:
              memory: "1G"
              cpu: 0.5
          imagePullPolicy: Always
      volumes:
        - name: metastore-cfg-vol
          configMap:
            name: metastore-cfg
4.2 Trino
(1) Create the configuration
trino-cfgs.yaml
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: trino-configs
  namespace: trino
data:
  jvm.config: |-
    -server
    -Xmx2G
    -XX:-UseBiasedLocking
    -XX:+UseG1GC
    -XX:G1HeapRegionSize=32M
    -XX:+ExplicitGCInvokesConcurrent
    -XX:+ExitOnOutOfMemoryError
    -XX:+UseGCOverheadLimit
    -XX:+HeapDumpOnOutOfMemoryError
    -XX:ReservedCodeCacheSize=512M
    -Djdk.attach.allowAttachSelf=true
    -Djdk.nio.maxCachedBufferSize=2000000
  config.properties.coordinator: |-
    coordinator=true
    node-scheduler.include-coordinator=false
    http-server.http.port=8080
    query.max-memory=200GB
    query.max-memory-per-node=0.2GB
    query.max-total-memory-per-node=0.6GB
    query.max-stage-count=200
    task.writer-count=4
    discovery-server.enabled=true
    discovery.uri=http://trino-coordinator:8080
  config.properties.worker: |-
    coordinator=false
    http-server.http.port=8080
    query.max-memory=200GB
    query.max-memory-per-node=0.2GB
    query.max-total-memory-per-node=0.6GB
    query.max-stage-count=200
    task.writer-count=4
    discovery.uri=http://trino-coordinator:8080
  node.properties: |-
    node.environment=test
    spiller-spill-path=/tmp
    max-spill-per-node=4TB
    query-max-spill-per-node=1TB
  hive.properties: |-
    connector.name=hive-hadoop2
    hive.metastore.uri=thrift://metastore:9083
    hive.allow-drop-table=true
    hive.max-partitions-per-scan=1000000
    hive.s3.endpoint=10.233.41.1:9001
    hive.s3.path-style-access=true
    hive.s3.ssl.enabled=false
    hive.s3.max-connections=100
  iceberg.properties: |-
    connector.name=iceberg
    hive.metastore.uri=thrift://metastore:9083
    hive.max-partitions-per-scan=1000000
    hive.s3.endpoint=10.233.41.1:9001
    hive.s3.path-style-access=true
    hive.s3.ssl.enabled=false
    hive.s3.max-connections=100
  mysql.properties: |-
    connector.name=mysql
    connection-url=jdbc:mysql://metastore-db.trino.svc.cluster.local:13306
    connection-user=root
    connection-password=123456
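One sanity check worth doing on the settings above: the per-node query memory limits must fit inside the JVM heap (-Xmx2G in jvm.config). A minimal sketch of a Trino-style data-size parser, assuming only the unit suffixes that appear in this config:

```python
# Unit suffixes used in Trino data-size properties
SUFFIXES = {"B": 1, "kB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

def to_bytes(size):
    """Parse a Trino data size such as '0.2GB' or '200GB' into bytes."""
    # Try longer suffixes first so 'GB' is not mistaken for 'B'
    for suffix, factor in sorted(SUFFIXES.items(), key=lambda s: -len(s[0])):
        if size.endswith(suffix):
            return int(float(size[: -len(suffix)]) * factor)
    raise ValueError(f"unrecognised data size: {size}")

heap = to_bytes("2GB")              # -Xmx2G in jvm.config
per_node = to_bytes("0.2GB")        # query.max-memory-per-node
total_per_node = to_bytes("0.6GB")  # query.max-total-memory-per-node
assert per_node <= total_per_node <= heap
```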
(2) Create the services
trino.yaml (the original snippet omitted the trino-coordinator Service and the Deployment header; they are required because discovery.uri points at http://trino-coordinator:8080):
---
apiVersion: v1
kind: Service
metadata:
  name: trino-coordinator
  namespace: trino
spec:
  ports:
    - port: 8080
  selector:
    app: trino-coordinator
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: trino-coordinator
  namespace: trino
spec:
  selector:
    matchLabels:
      app: trino-coordinator
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: trino-coordinator
    spec:
      containers:
        - name: trino
          image: trinodb/trino:361
          ports:
            - containerPort: 8080
          env:
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: my-s3-keys
                  key: access-key
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: my-s3-keys
                  key: secret-key
          volumeMounts:
            - name: trino-cfg-vol
              mountPath: /etc/trino/jvm.config
              subPath: jvm.config
            - name: trino-cfg-vol
              mountPath: /etc/trino/config.properties
              subPath: config.properties.coordinator
            - name: trino-cfg-vol
              mountPath: /etc/trino/node.properties
              subPath: node.properties
            - name: trino-cfg-vol
              mountPath: /etc/trino/catalog/hive.properties
              subPath: hive.properties
            - name: trino-cfg-vol
              mountPath: /etc/trino/catalog/iceberg.properties
              subPath: iceberg.properties
            - name: trino-cfg-vol
              mountPath: /etc/trino/catalog/mysql.properties
              subPath: mysql.properties
          resources:
            requests:
              memory: "1G"
              cpu: 0.5
          imagePullPolicy: Always
      volumes:
        - name: trino-cfg-vol
          configMap:
            name: trino-configs
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: trino-worker
  namespace: trino
spec:
  serviceName: trino-worker
  replicas: 1
  selector:
    matchLabels:
      app: trino-worker
  template:
    metadata:
      labels:
        app: trino-worker
    spec:
      securityContext:
        fsGroup: 1000
      containers:
        - name: trino
          image: trinodb/trino:361
          ports:
            - containerPort: 8080
          env:
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: my-s3-keys
                  key: access-key
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: my-s3-keys
                  key: secret-key
          volumeMounts:
            - name: trino-cfg-vol
              mountPath: /etc/trino/jvm.config
              subPath: jvm.config
            - name: trino-cfg-vol
              mountPath: /etc/trino/config.properties
              subPath: config.properties.worker
            - name: trino-cfg-vol
              mountPath: /etc/trino/node.properties
              subPath: node.properties
            - name: trino-cfg-vol
              mountPath: /etc/trino/catalog/hive.properties
              subPath: hive.properties
            - name: trino-cfg-vol
              mountPath: /etc/trino/catalog/iceberg.properties
              subPath: iceberg.properties
            - name: trino-cfg-vol
              mountPath: /etc/trino/catalog/mysql.properties
              subPath: mysql.properties
            - name: trino-tmp-data
              mountPath: /tmp
          resources:
            requests:
              memory: "1G"
              cpu: 0.5
          imagePullPolicy: Always
      volumes:
        - name: trino-cfg-vol
          configMap:
            name: trino-configs
  volumeClaimTemplates:
    - metadata:
        name: trino-tmp-data
      spec:
        storageClassName: trino-storage
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 40Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: trino-cli
  namespace: trino
spec:
  containers:
    - name: trino-cli
      image: trinodb/trino:361
      command: ["tail", "-f", "/dev/null"]
      imagePullPolicy: Always
  restartPolicy: Always
4.3 Running on Kubernetes
Apply the manifests above with kubectl. Make sure the access-key and secret-key are configured (via the Secret above); otherwise the bucket permissions may be insufficient.
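As an alternative to the trino-cli pod, the coordinator can be smoke-tested over Trino's HTTP protocol: an initial POST to /v1/statement with an X-Trino-User header, then following nextUri links. A stdlib-only sketch; the host and user are assumptions for your environment (e.g. after port-forwarding the coordinator pod):

```python
import json
import urllib.request

def statement_request(host, sql, user="admin"):
    """Build the initial request of Trino's REST protocol."""
    return urllib.request.Request(
        f"http://{host}/v1/statement",
        data=sql.encode(),
        headers={"X-Trino-User": user},
        method="POST",
    )

def run_query(host, sql, user="admin"):
    """Submit sql and follow nextUri links until the query finishes."""
    rows = []
    resp = json.load(urllib.request.urlopen(statement_request(host, sql, user)))
    while True:
        rows.extend(resp.get("data", []))
        if "nextUri" not in resp:
            return rows
        nxt = urllib.request.Request(resp["nextUri"], headers={"X-Trino-User": user})
        resp = json.load(urllib.request.urlopen(nxt))

# Example (requires a reachable coordinator):
#   kubectl -n trino port-forward pod/<coordinator-pod> 8080:8080
#   run_query("localhost:8080", "SELECT 1")
```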
5. Example
5.1 Real-time stream ingestion with Flink SQL via StreamX
(1) Flink SQL
CREATE TABLE IF NOT EXISTS ods_log (
  `log` STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'test-log',
  'properties.bootstrap.servers' = '<your Kafka bootstrap address>',
  'properties.group.id' = 'test',
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'raw'
);

CREATE CATALOG iceberg WITH (
  'type' = 'iceberg',
  'warehouse' = 's3a://<>/warehouse/',
  'catalog-type' = 'hive',
  'uri' = 'thrift://xxxx:xxx'
);

CREATE DATABASE IF NOT EXISTS iceberg.test1;

CREATE TABLE IF NOT EXISTS iceberg.test1.ods_filebeat_log (
  `log` STRING
) WITH ('write.format.default' = 'ORC');

INSERT INTO iceberg.test1.ods_filebeat_log SELECT * FROM ods_log;
(2) Reference jars: omitted.
(3) Runtime UI screenshots: omitted.