<!-- Option 1: the Kafka 0.11 connector -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka-0.11_2.12</artifactId>
    <version>1.10.2</version>
</dependency>
<!-- Option 2: the universal Kafka connector -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka_2.12</artifactId>
    <version>1.10.2</version>
</dependency>
Why start with the pom? Because these two dependencies expose two different serialization interfaces, and therefore two different ways of implementing the producer. Which one you use is largely a matter of personal preference.
Let's look at the first one.
// Imports assume the plain Jackson ObjectNode; swap in Flink's shaded variant if that is what your job uses.
import java.nio.charset.StandardCharsets;

import com.fasterxml.jackson.databind.node.ObjectNode;
import org.apache.flink.api.common.serialization.SerializationSchema;
import org.apache.flink.streaming.connectors.kafka.KafkaContextAware;

public class CustomProducerSchema implements SerializationSchema<ObjectNode>, KafkaContextAware<ObjectNode> {

    private final String topic;
    private int[] partitions;

    public CustomProducerSchema(String topic, int[] partitions) {
        this.topic = topic;
        this.partitions = partitions;
    }

    /**
     * Returns the topic that the presented element should be sent to. This is not used for setting
     * the topic (that is done via the {@link ProducerRecord} returned from
     * {@link KafkaSerializationSchema#serialize(Object, Long)}); it is only used for getting the
     * available partitions that are presented to {@link #setPartitions(int[])}.
     */
    @Override
    public String getTargetTopic(ObjectNode element) {
        // Strip everything but the digits from the serialized record and treat the result
        // as a number: odd values are routed to the "odd" topic, everything else stays on
        // the default topic. Returning a value keeps the schema stateless across records.
        int numeric = Integer.parseInt(element.toString().replaceAll("[^\\d]+", ""));
        return numeric % 2 > 0 ? "odd" : topic;
    }

    /**
     * Sets the available partitions for the topic returned from {@link #getTargetTopic(Object)}.
     */
    @Override
    public void setPartitions(int[] partitions) {
        this.partitions = partitions;
    }

    @Override
    public byte[] serialize(ObjectNode element) {
        // asText() yields the raw string; toString() would keep the surrounding JSON quotes.
        String key = element.get("key").asText();
        return key.getBytes(StandardCharsets.UTF_8);
    }
}
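To see what the routing does, here is a quick sanity check. This is a sketch: the "even" default topic and the RoutingCheck class name are assumptions for the demo, not part of the connector.

import com.fasterxml.jackson.databind.node.JsonNodeFactory;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class RoutingCheck {
    public static void main(String[] args) {
        // "even" as the default topic is an assumption for this demo.
        CustomProducerSchema schema = new CustomProducerSchema("even", new int[0]);

        ObjectNode n = JsonNodeFactory.instance.objectNode();
        n.put("key", 3);
        // the digits of {"key":3} form 3, which is odd -> routed to "odd"
        System.out.println(schema.getTargetTopic(n)); // odd

        n.put("key", 4);
        // the digits form 4, which is even -> stays on the default topic
        System.out.println(schema.getTargetTopic(n)); // even
    }
}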
When building the Flink sink, it is invoked like this:
FlinkKafkaProducer011<ObjectNode> producer = new FlinkKafkaProducer011<>(
        topic,                                                    // default target topic
        new KeyedSerializationSchemaWrapper<>(new CustomProducerSchema(topic, new int[0])),
        producerConfig,                                           // java.util.Properties with bootstrap.servers etc.
        Optional.of(new CustomProducerPartitioner()),             // custom partitioner; a sketch follows below
        FlinkKafkaProducer011.Semantic.EXACTLY_ONCE,
        9);                                                       // size of the internal producer pool
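The call above references a CustomProducerPartitioner that this section never defines. A minimal sketch of what such a FlinkKafkaPartitioner could look like follows; the key-hash routing is an assumption, not the original implementation.

import java.util.Arrays;

import com.fasterxml.jackson.databind.node.ObjectNode;
import org.apache.flink.streaming.connectors.kafka.partitioner.FlinkKafkaPartitioner;

public class CustomProducerPartitioner extends FlinkKafkaPartitioner<ObjectNode> {

    @Override
    public int partition(ObjectNode record, byte[] key, byte[] value, String targetTopic, int[] partitions) {
        // partitions holds the partition ids of targetTopic, supplied by Flink.
        // Hashing the serialized key keeps records with the same key on one partition;
        // masking with Integer.MAX_VALUE avoids a negative index.
        return partitions[(Arrays.hashCode(key) & Integer.MAX_VALUE) % partitions.length];
    }
}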
Now the approach for the second dependency:
import java.nio.charset.StandardCharsets;

import javax.annotation.Nullable;

import com.fasterxml.jackson.databind.node.ObjectNode;
import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CustomProducerKafkaSchema implements KafkaSerializationSchema<ObjectNode> {

    private final String topic;

    public CustomProducerKafkaSchema(String topic) {
        this.topic = topic;
    }

    /**
     * Serializes the given element and returns it as a {@link ProducerRecord}.
     *
     * @param element   element to be serialized
     * @param timestamp timestamp (can be null)
     * @return Kafka {@link ProducerRecord}
     */
    @Override
    public ProducerRecord<byte[], byte[]> serialize(ObjectNode element, @Nullable Long timestamp) {
        // With this interface the topic, partition, timestamp, key and value
        // are all decided here, record by record.
        int partition = element.get("partition").asInt();
        long timeMillis = System.currentTimeMillis();
        // asText() yields the raw string; toString() would keep the surrounding JSON quotes.
        String key = element.get("key").asText();
        String value = element.get("value").asText();
        return new ProducerRecord<>(
                topic,
                partition,
                timeMillis,
                key.getBytes(StandardCharsets.UTF_8),
                value.getBytes(StandardCharsets.UTF_8));
    }
}
And the sink call:
FlinkKafkaProducer<ObjectNode> producer = new FlinkKafkaProducer<>(
        topic,                                    // default topic
        new CustomProducerKafkaSchema(topic),
        producerConfig,
        FlinkKafkaProducer.Semantic.EXACTLY_ONCE);
Both approaches work; pick whichever matches the connector dependency you are already using.
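For completeness, here is a minimal sketch of how the second producer might be wired into a job. The broker address, topic name, and the demo records are placeholders; note that EXACTLY_ONCE uses Kafka transactions, so the producer-side transaction timeout must fit within the broker's transaction.max.timeout.ms.

import java.util.Properties;

import com.fasterxml.jackson.databind.node.JsonNodeFactory;
import com.fasterxml.jackson.databind.node.ObjectNode;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

public class KafkaSinkDemo {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties producerConfig = new Properties();
        producerConfig.setProperty("bootstrap.servers", "localhost:9092"); // placeholder broker
        // Must not exceed the broker's transaction.max.timeout.ms (15 minutes by default).
        producerConfig.setProperty("transaction.timeout.ms", "900000");

        // Build a few demo records shaped like the ones the schema expects.
        DataStream<ObjectNode> stream = env
                .fromElements(1, 2, 3)
                .map(i -> {
                    ObjectNode node = JsonNodeFactory.instance.objectNode();
                    node.put("key", "k" + i);
                    node.put("value", "v" + i);
                    node.put("partition", 0);
                    return node;
                })
                .returns(ObjectNode.class);

        FlinkKafkaProducer<ObjectNode> producer = new FlinkKafkaProducer<>(
                "demo-topic",                                 // placeholder topic
                new CustomProducerKafkaSchema("demo-topic"),
                producerConfig,
                FlinkKafkaProducer.Semantic.EXACTLY_ONCE);

        stream.addSink(producer);
        env.execute("kafka-producer-demo");
    }
}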