Scenario: writing data to Parquet files with Flink
Flink version: 1.15.3
Writing data to Parquet files with Flink's FileSink fails with the following error:
Caused by: java.lang.NoSuchMethodError: org.apache.parquet.io.OutputFile.getPath()Ljava/lang/String;
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:284)
at org.apache.parquet.hadoop.ParquetWriter$Builder.build(ParquetWriter.java:653)
at org.apache.flink.formats.parquet.avro.AvroParquetWriters.createAvroParquetWriter(AvroParquetWriters.java:91)
at org.apache.flink.formats.parquet.avro.AvroParquetWriters.lambda$forSpecificRecord$824091b3$1(AvroParquetWriters.java:53)
at org.apache.flink.formats.parquet.ParquetWriterFactory.create(ParquetWriterFactory.java:56)
at org.apache.flink.streaming.api.functions.sink.filesystem.BulkBucketWriter.openNew(BulkBucketWriter.java:76)
at org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter$OutputStreamBasedBucketWriter.openNewInProgressFile(OutputStreamBasedPartFileWriter.java:124)
at org.apache.flink.streaming.api.functions.sink.filesystem.BulkBucketWriter.openNewInProgressFile(BulkBucketWriter.java:36)
at org.apache.flink.connector.file.sink.writer.FileWriterBucket.rollPartFile(FileWriterBucket.java:261)
at org.apache.flink.connector.file.sink.writer.FileWriterBucket.write(FileWriterBucket.java:188)
at org.apache.flink.connector.file.sink.writer.FileWriter.write(FileWriter.java:198)
at org.apache.flink.streaming.runtime.operators.sink.SinkWriterOperator.processElement(SinkWriterOperator.java:160)
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:233)
at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:134)
at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:105)
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:519)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:203)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:807)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:756)
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563)
at java.lang.Thread.run(Thread.java:750)
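The key line in the error is java.lang.NoSuchMethodError: org.apache.parquet.io.OutputFile.getPath()Ljava/lang/String;. ParquetWriter calls OutputFile.getPath(), but the OutputFile class loaded at runtime does not have that method, which means Parquet classes from different versions are mixed on the classpath. This is a dependency conflict: the parquet-avro version declared in the project does not match the Parquet version that flink-parquet uses.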
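To confirm this kind of conflict, it can help to print which jar each Parquet class is actually loaded from and whether OutputFile exposes getPath(). The following is only a minimal diagnostic sketch using the standard Java reflection API; the class name ClasspathProbe and its helper methods are hypothetical and not part of Flink or Parquet. In a real Flink job it would probably need to run inside the job itself (for example in an operator) so that it sees the same classloader as the sink.

```java
import java.security.CodeSource;

public class ClasspathProbe {

    /** Returns the jar (or directory) a class was loaded from, or a short note if it cannot be determined. */
    static String locationOf(String className) {
        try {
            // Load without running static initializers; we only need the class metadata.
            Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap / unknown source)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    /** True if the class can be loaded and has a public no-argument method with the given name. */
    static boolean hasNoArgMethod(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            cls.getMethod(methodName);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The two classes that appear in the stack trace.
        String[] classes = {
            "org.apache.parquet.io.OutputFile",
            "org.apache.parquet.hadoop.ParquetWriter",
        };
        for (String c : classes) {
            System.out.println(c + " <- " + locationOf(c));
        }
        System.out.println("OutputFile.getPath() present: "
                + hasNoArgMethod("org.apache.parquet.io.OutputFile", "getPath"));
    }
}
```

If the two classes resolve to jars with different version numbers in their file names, or getPath() is reported as missing, that is consistent with the conflict described above.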
Here is the relevant part of the original pom file:
<!-- File connector -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-avro</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-files</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-parquet</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.parquet</groupId>
    <artifactId>parquet-avro</artifactId>
    <version>1.10.0</version>
</dependency>
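The Flink version in this file is 1.15.3 and the declared parquet-avro version is 1.10.0. That version conflicts with the Parquet version used by flink-parquet, which causes the error.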
To fix this, first look up in Maven which version of org.apache.parquet:parquet-hadoop flink-parquet depends on. For flink-parquet 1.15.3 it is 1.12.2. Then change the parquet-avro version in the pom file to 1.12.2 as well:
<dependency>
    <groupId>org.apache.parquet</groupId>
    <artifactId>parquet-avro</artifactId>
    <version>1.12.2</version>
</dependency>
Reload the Maven project, and the problem is resolved.
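As an optional check after changing the version, the Parquet artifacts on the classpath can be asked for their versions. The sketch below assumes the Parquet jars still carry the pom.properties file that Maven writes into META-INF/maven/<groupId>/<artifactId>/ by default; if the jars were shaded or repackaged, that file may be missing and the result will be null. The class name ParquetVersionCheck and the method mavenVersionOf are hypothetical helpers for illustration, not an existing API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class ParquetVersionCheck {

    /** Reads the "version" entry from a Maven-built jar's pom.properties, or returns null if it is absent. */
    static String mavenVersionOf(String groupId, String artifactId) {
        String resource = "META-INF/maven/" + groupId + "/" + artifactId + "/pom.properties";
        ClassLoader loader = ParquetVersionCheck.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream(resource)) {
            if (in == null) {
                return null; // jar not on the classpath, or it carries no Maven metadata
            }
            Properties props = new Properties();
            props.load(in);
            return props.getProperty("version");
        } catch (IOException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String group = "org.apache.parquet";
        String[] artifacts = {"parquet-avro", "parquet-hadoop", "parquet-column", "parquet-common"};
        // After the fix, all artifacts that report a version are expected to report the same one.
        for (String artifact : artifacts) {
            System.out.println(artifact + " = " + mavenVersionOf(group, artifact));
        }
    }
}
```

If the printed versions differ from each other, some Parquet module is still being pulled in at an older version and the pom file needs another look.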