Preface
I have recently been learning how Flink reads from and writes to Kafka. I ran into a few errors while using exactly-once semantics, so I am writing them up here for future reference.
Environment: Flink 1.15, Kafka 2.7
pom
The relevant dependencies in pom.xml:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-java</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-clients</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-runtime-web</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-test-utils</artifactId>
    <version>${flink.version}</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka</artifactId>
    <version>1.15.0</version>
</dependency>
Producer side
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KafkaSinkDemo {
    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // With EXACTLY_ONCE, Kafka transactions are only committed on checkpoints,
        // so checkpointing must be enabled or no data ever becomes visible to
        // read_committed consumers
        env.enableCheckpointing(5000);
        DataStreamSource<String> stream = env.socketTextStream("59.110.32.152", 1234);
        // Set the transaction timeout, otherwise the job fails on startup
        Properties properties = new Properties();
        properties.setProperty("transaction.timeout.ms", "10000");
        KafkaSink<String> sink = KafkaSink.<String>builder()
                .setBootstrapServers("59.110.32.152:9092")
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("test")
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build()
                )
                // Prefix for the transactional ids
                .setTransactionalIdPrefix("ts")
                // Exactly-once delivery
                .setDeliverGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
                .setKafkaProducerConfig(properties)
                .build();
        stream.sinkTo(sink);
        env.execute();
    }
}
Note: to use exactly-once semantics, you must set the
transaction.timeout.ms
property, otherwise the job fails with: The transaction timeout is larger than the maximum value allowed by the broker.
See the official documentation: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/kafka/#consumer-offset-committing
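The error happens because Flink's Kafka sink defaults the producer's transaction.timeout.ms to 1 hour, while Kafka brokers reject any transaction timeout above their transaction.max.timeout.ms, which defaults to 15 minutes. A minimal plain-Java sketch of that check (the class and method names here are illustrative, not part of either API):

```java
import java.util.Properties;

public class TransactionTimeoutCheck {
    // Kafka brokers reject producer transaction timeouts above
    // transaction.max.timeout.ms (broker default: 900000 ms = 15 min)
    static final long BROKER_MAX_TIMEOUT_MS = 900_000L;

    // Mirrors the broker-side validation that causes the startup error
    static boolean isAccepted(long producerTimeoutMs) {
        return producerTimeoutMs <= BROKER_MAX_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("transaction.timeout.ms", "10000");
        long timeout = Long.parseLong(props.getProperty("transaction.timeout.ms"));
        System.out.println(isAccepted(timeout));     // our 10 s setting is within the limit
        System.out.println(isAccepted(3_600_000L));  // Flink's 1 h default is rejected
    }
}
```

So any value at or below the broker's maximum works; just make sure it is comfortably larger than the checkpoint interval, or in-flight transactions may be aborted before they are committed.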
Consumer side
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.connector.kafka.source.reader.deserializer.KafkaRecordDeserializationSchema;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.kafka.common.serialization.StringDeserializer;

public class KafkaSourceDemo {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("59.110.32.152:9092")
                .setTopics("test")
                .setGroupId("my-group")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setDeserializer(KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class))
                // Dynamic partition discovery
                .setProperty("partition.discovery.interval.ms", "10000")
                .build();
        DataStreamSource<String> kafkaSource = env.fromSource(source, WatermarkStrategy.noWatermarks(), "Kafka Source");
        kafkaSource.print();
        env.execute();
    }
}
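One more point on the consumer side: Kafka consumers default to isolation.level read_uncommitted, so the source above can also see records from transactions that are later aborted. For exactly-once reads end to end, restrict the source to committed records via the standard Kafka consumer property. A sketch of the builder fragment, assuming the same broker and topic as above:

```java
KafkaSource<String> source = KafkaSource.<String>builder()
        .setBootstrapServers("59.110.32.152:9092")
        .setTopics("test")
        .setGroupId("my-group")
        .setStartingOffsets(OffsetsInitializer.earliest())
        .setDeserializer(KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class))
        // Only read records from committed transactions
        .setProperty("isolation.level", "read_committed")
        .build();
```

With read_committed, records produced by the exactly-once sink only become visible after the corresponding Flink checkpoint completes, which is why the checkpoint interval directly affects end-to-end latency.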