Hudi on Flink: Hive sync fails on the cluster with missing HiveConf

Hudi on Flink Hive sync in practice

Code structure

def getHudiTableDDL(importConfig: ImportConfig): String = {
    s"""
       |    CREATE TABLE ${importConfig.sinkDatabase}.${importConfig.sinkTableName} ( ${importConfig.sinkFields} )
       |      ${if (importConfig.isPartition) "PARTITIONED BY (`partition`)" else ""}
       |      WITH (
       |      'connector' = 'hudi',
       |      'path' = '${ClusterConstant.HudiMeta.DIR}.${importConfig.sinkDatabase}.${importConfig.sinkTableName}',
       |      'table.type' = 'COPY_ON_WRITE',
       |      'read.streaming.enabled' = 'true',  -- enable streaming read
       |      'read.streaming.check-interval' = '10', -- interval (seconds) for polling new source commits, default 60
       |      'hive_sync.enable' = 'true',            -- required, enable Hive sync
       |      'hive_sync.table' = '${importConfig.sinkTableName}',   -- required, name of the Hive table to create
       |      'hive_sync.db' = '${importConfig.sinkDatabase}',       -- required, name of the Hive database to create
       |      'hive_sync.mode' = 'hms',               -- required, sync through the Hive metastore (default is jdbc)
       |      'hive_sync.metastore.uris' = '${ClusterConstant.HiveMeta.META_URL}' -- required, metastore URI
       |       )
       |""".stripMargin
  }
       
def getSqlserverTableDDL(importConfig: ImportConfig): String = {
    s"""
       |    CREATE TABLE IF NOT EXISTS ${importConfig.sourceDatabase}.${importConfig.sourceTableName} (
       |       ${importConfig.sourceFields}
       |      ) WITH (
       |    'connector' = 'sqlserver-cdc',
       |    'hostname' = '${importConfig.sourceUrl}',
       |    'port' = '${importConfig.sourcePort}',
       |    'username' = '${importConfig.sourceName}',
       |    'password' = '${importConfig.sourcePw}',
       |    'database-name' = '${importConfig.sourceDatabase}',
       |    'schema-name' = '${importConfig.sourceSchema}',
       |    'table-name' = '${importConfig.sourceTableName}',
       |    'server-time-zone' = 'Asia/Shanghai'
       |)
       |""".stripMargin
  }
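
These two builders come together in the Flink job itself. The sketch below shows the assumed wiring: register the SQL Server CDC source table and the Hudi sink table, then submit a continuous INSERT between them. The Flink API calls are standard; how `ImportConfig` is loaded and the `SELECT *` column mapping are assumptions.

```scala
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment
import org.apache.flink.table.api.bridge.scala.StreamTableEnvironment

object SqlServerToHudiJob {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Hudi commits data on Flink checkpoints, so checkpointing must be on.
    env.enableCheckpointing(60000L)
    val tableEnv = StreamTableEnvironment.create(env)

    val importConfig: ImportConfig = loadConfig() // hypothetical config loader

    // Register the SQL Server CDC source and the Hudi sink tables.
    tableEnv.executeSql(getSqlserverTableDDL(importConfig))
    tableEnv.executeSql(getHudiTableDDL(importConfig))

    // Continuously copy the CDC changelog into the Hudi table.
    tableEnv.executeSql(
      s"""INSERT INTO ${importConfig.sinkDatabase}.${importConfig.sinkTableName}
         |SELECT * FROM ${importConfig.sourceDatabase}.${importConfig.sourceTableName}
         |""".stripMargin)
  }
}
```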
       

The problem when running on a cluster

Running on the cluster fails with missing HiveConf or other Hive classes. This is because the Hudi 0.10.1 Flink bundle does not ship the Hive dependencies by default. The fix is to add the Hive dependencies in the Hudi source (the `flink-bundle-shade-hive2` profile below), enable shading, and rebuild the bundle:

<!-- mvn install -DskipTests -Drat.skip=true -Pflink-bundle-shade-hive2 -->
    <profile>
      <id>flink-bundle-shade-hive2</id>
      <properties>
        <hive.version>2.3.1</hive.version>
        <flink.bundle.hive.scope>compile</flink.bundle.hive.scope>
      </properties>
      <dependencies>
        <dependency>
          <groupId>${hive.groupid}</groupId>
          <artifactId>hive-service-rpc</artifactId>
          <version>${hive.version}</version>
          <scope>${flink.bundle.hive.scope}</scope>
        </dependency>
        <dependency>
          <groupId>${hive.groupid}</groupId>
          <artifactId>hive-exec</artifactId>
          <version>${hive.version}</version>
          <scope>${flink.bundle.hive.scope}</scope>
        </dependency>
      </dependencies>
    </profile>
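
With the profile above in `packaging/hudi-flink-bundle/pom.xml`, rebuilding and deploying looks roughly like this. The jar name and the `$FLINK_HOME` layout are assumptions for a typical Hudi 0.10.1 / Scala 2.11 install; adjust them to your build.

```shell
# Rebuild the Flink bundle with Hive 2.x shaded in (run from the Hudi source root).
mvn clean install -DskipTests -Drat.skip=true -Pflink-bundle-shade-hive2

# Replace the old bundle on every Flink node ($FLINK_HOME is an assumption).
cp packaging/hudi-flink-bundle/target/hudi-flink-bundle_2.11-0.10.1.jar "$FLINK_HOME/lib/"

# Restart the cluster so the new jar is picked up.
"$FLINK_HOME/bin/stop-cluster.sh" && "$FLINK_HOME/bin/start-cluster.sh"
```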