[Hudi Data Lake in Practice] Fixing the Hudi Sync NoSuchMethodError (Types$PrimitiveBuilder.as)

1. Encountering the NoSuchMethodError

While developing a Hudi application, I ran into a new NoSuchMethodError: org.apache.parquet.schema.Types$PrimitiveBuilder.as(Lorg/apache/parquet/schema/LogicalTypeAnnotation;)Lorg/apache/parquet/schema/Types$Builder;
The full stack trace is as follows:

Caused by: java.lang.NoSuchMethodError: org.apache.parquet.schema.Types$PrimitiveBuilder.as(Lorg/apache/parquet/schema/LogicalTypeAnnotation;)Lorg/apache/parquet/schema/Types$Builder;
	at org.apache.hudi.io.storage.row.parquet.ParquetSchemaConverter.convertToParquetType(ParquetSchemaConverter.java:608) ~[hudi-flink1.14-bundle_2.12-0.11.1.jar:0.11.1]
	at org.apache.hudi.io.storage.row.parquet.ParquetSchemaConverter.convertToParquetType(ParquetSchemaConverter.java:549) ~[hudi-flink1.14-bundle_2.12-0.11.1.jar:0.11.1]
	at org.apache.hudi.io.storage.row.parquet.ParquetSchemaConverter.convertToParquetMessageType(ParquetSchemaConverter.java:543) ~[hudi-flink1.14-bundle_2.12-0.11.1.jar:0.11.1]
	at org.apache.hudi.io.storage.row.RowDataParquetWriteSupport.<init>(RowDataParquetWriteSupport.java:45) ~[hudi-flink1.14-bundle_2.12-0.11.1.jar:0.11.1]
	at org.apache.hudi.io.storage.row.HoodieRowDataParquetWriteSupport.<init>(HoodieRowDataParquetWriteSupport.java:47) ~[hudi-flink1.14-bundle_2.12-0.11.1.jar:0.11.1]
	at org.apache.hudi.io.storage.row.HoodieRowDataFileWriterFactory.newParquetInternalRowFileWriter(HoodieRowDataFileWriterFactory.java:68) ~[hudi-flink1.14-bundle_2.12-0.11.1.jar:0.11.1]
	at org.apache.hudi.io.storage.row.HoodieRowDataFileWriterFactory.getRowDataFileWriter(HoodieRowDataFileWriterFactory.java:54) ~[hudi-flink1.14-bundle_2.12-0.11.1.jar:0.11.1]
	at org.apache.hudi.io.storage.row.HoodieRowDataCreateHandle.createNewFileWriter(HoodieRowDataCreateHandle.java:204) ~[hudi-flink1.14-bundle_2.12-0.11.1.jar:0.11.1]
	at org.apache.hudi.io.storage.row.HoodieRowDataCreateHandle.<init>(HoodieRowDataCreateHandle.java:101) ~[hudi-flink1.14-bundle_2.12-0.11.1.jar:0.11.1]
	at org.apache.hudi.sink.bucket.BucketBulkInsertWriterHelper.getRowCreateHandle(BucketBulkInsertWriterHelper.java:83) ~[hudi-flink1.14-bundle_2.12-0.11.1.jar:0.11.1]
	at org.apache.hudi.sink.bucket.BucketBulkInsertWriterHelper.write(BucketBulkInsertWriterHelper.java:67) ~[hudi-flink1.14-bundle_2.12-0.11.1.jar:0.11.1]
	at org.apache.hudi.sink.bulk.BulkInsertWriteFunction.processElement(BulkInsertWriteFunction.java:124) ~[hudi-flink1.14-bundle_2.12-0.11.1.jar:0.11.1]
	at org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.streaming.runtime.tasks.ChainingOutput.pushToOperator(ChainingOutput.java:99) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.streaming.runtime.tasks.ChainingOutput.collect(ChainingOutput.java:80) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.streaming.runtime.tasks.ChainingOutput.collect(ChainingOutput.java:39) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:56) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:29) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.table.runtime.util.StreamRecordCollector.collect(StreamRecordCollector.java:44) ~[flink-table-runtime_2.12-1.14.4.jar:1.14.4]
	at org.apache.hudi.sink.bulk.sort.SortOperator.endInput(SortOperator.java:113) ~[hudi-flink1.14-bundle_2.12-0.11.1.jar:0.11.1]
	at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.endOperatorInput(StreamOperatorWrapper.java:91) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.lambda$finish$0(StreamOperatorWrapper.java:127) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.finish(StreamOperatorWrapper.java:127) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.finish(StreamOperatorWrapper.java:134) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.finishOperators(RegularOperatorChain.java:117) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.endData(StreamTask.java:549) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:508) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:203) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:809) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:761) ~[flink-streaming-java_2.12-1.14.4.jar:1.14.4]
	at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958) ~[flink-runtime-1.14.4.jar:1.14.4]
	at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:937) ~[flink-runtime-1.14.4.jar:1.14.4]
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766) ~[flink-runtime-1.14.4.jar:1.14.4]
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575) ~[flink-runtime-1.14.4.jar:1.14.4]
	... 1 more

2. Stack Trace Analysis

Following the stack trace to org.apache.hudi.io.storage.row.parquet.ParquetSchemaConverter.convertToParquetType, the failing code converts a timestamp type into the corresponding Parquet type. The error says exactly that the method Types$PrimitiveBuilder.as(LogicalTypeAnnotation) cannot be resolved at runtime.

  private static Type convertToParquetType(
      String name, LogicalType type, Type.Repetition repetition) {
    switch (type.getTypeRoot()) {
      case TIMESTAMP_WITHOUT_TIME_ZONE:
        TimestampType timestampType = (TimestampType) type;
        if (timestampType.getPrecision() == 3) {
          // This is the call that fails: the PrimitiveBuilder resolved at
          // runtime has no as(LogicalTypeAnnotation) overload.
          return Types.primitive(PrimitiveType.PrimitiveTypeName.INT64, repetition)
              .as(LogicalTypeAnnotation.timestampType(true, TimeUnit.MILLIS))
              .named(name);
        } else {
          return Types.primitive(PrimitiveType.PrimitiveTypeName.INT96, repetition)
              .named(name);
        }
      // ... other cases elided ...
    }
  }

This is clearly another dependency-conflict problem. Searching the project for the Types$PrimitiveBuilder class turns up two copies: one from org.apache.hudi:hudi-flink1.14-bundle_2.12, which is the one we want, and one from parquet-hadoop-bundle. Opening the latter confirms that its PrimitiveBuilder indeed has no as(LogicalTypeAnnotation) method, so that jar has to be excluded.

  public static class PrimitiveBuilder<P> extends Types.BasePrimitiveBuilder<P, Types.PrimitiveBuilder<P>> {
    private PrimitiveBuilder(P parent, PrimitiveTypeName type) {
      super((Object)parent, type, null);
    }

    private PrimitiveBuilder(Class<P> returnType, PrimitiveTypeName type) {
      super((Class)returnType, type, null);
    }

    protected Types.PrimitiveBuilder<P> self() {
      return this;
    }
  }
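When two jars ship the same class, a quick way to confirm which copy actually wins at runtime is to print the class's code source. The sketch below is a hypothetical diagnostic helper (ClassOriginCheck is not part of Hudi); on a real job classpath you would pass "org.apache.parquet.schema.Types$PrimitiveBuilder" instead of the demo class used here:

```java
import java.security.CodeSource;

// Diagnostic sketch: report which jar (code source) a class was loaded from,
// to tell the Hudi-bundle copy apart from the parquet-hadoop-bundle copy.
public class ClassOriginCheck {
    static String originOf(Class<?> clazz) {
        CodeSource src = clazz.getProtectionDomain().getCodeSource();
        // JDK core classes loaded by the bootstrap loader have no code source.
        return src != null ? src.getLocation().toString() : "bootstrap classloader";
    }

    public static void main(String[] args) throws Exception {
        // In a real investigation:
        //   originOf(Class.forName("org.apache.parquet.schema.Types$PrimitiveBuilder"))
        // Here we demo with a class that is always present:
        System.out.println(originOf(String.class));
    }
}
```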

Generating the dependency tree confirms that parquet-hadoop-bundle is introduced by hive-metastore. The fix is simply to exclude parquet-hadoop-bundle from every dependency in the project that pulls it in.

+--- org.apache.hive:hive-metastore:3.0.0
|    +--- org.apache.hive:hive-serde:3.0.0
|    |    +--- org.apache.hive:hive-common:3.0.0 (*)
|    |    +--- org.apache.hive:hive-service-rpc:3.0.0
|    |    +--- org.apache.parquet:parquet-hadoop-bundle:1.9.0
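Assuming the project uses Gradle (the tree above matches the output of Gradle's `dependencies` task), the exclusion could be applied globally like this (a sketch, not the exact build file of this project):

```groovy
// Exclude the conflicting parquet-hadoop-bundle from all configurations,
// so the Types$PrimitiveBuilder from the Hudi Flink bundle wins.
configurations.all {
    exclude group: 'org.apache.parquet', module: 'parquet-hadoop-bundle'
}
```

Alternatively, the exclude can be scoped to the hive-metastore dependency declaration only, which is narrower and avoids accidentally dropping the jar from configurations that might legitimately need it.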