Java Kafka partitioning: setting a partitioning strategy in a Kafka connector

Latest recommended article published on 2023-10-18 10:33:20

弗雷德里克·雷蒙德


Article link: https://blog.csdn.net/weixin_29532173/article/details/114876609


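A source connector can control which partition each source record is written to through the partition field of SourceRecord. If the connector is your own, this is the most direct approach.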

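If, however, you want to change how an existing source connector partitions each record, you can use a Single Message Transform (SMT) to overwrite the partition field of the source record. You will probably have to write a custom SMT by implementing org.apache.kafka.connect.transforms.Transformation with your own partitioning logic, but this is actually somewhat easier than writing a custom Kafka partitioner.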

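For example, here is a notional custom transformation that shows how to use configuration properties and how to create a new SourceRecord instance with the desired partition number. The sample is incomplete, since it contains no real partitioning logic, but it should be a good starting point.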

package io.acme.example;

import org.apache.kafka.common.config.AbstractConfig;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigDef.Importance;
import org.apache.kafka.common.config.ConfigDef.Type;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.transforms.Transformation;

import java.util.Map;

public class CustomPartitioner implements Transformation<SourceRecord> {

    private static final String MAX_PARTITIONS_CONFIG = "max.partitions";
    private static final String MAX_PARTITIONS_DOC = "The maximum number of partitions";
    private static final int MAX_PARTITIONS_DEFAULT = 1;

    /**
     * The definition of the configurations. We just define a single configuration property here,
     * but you can chain multiple "define" methods together. Complex configurations may warrant
     * pulling all the config-related things into a separate class that extends {@link AbstractConfig}
     * and adds helper methods (e.g., "getMaxPartitions()"), and you'd use this class to parse the
     * parameters in {@link #configure(Map)} rather than {@link AbstractConfig}.
     */
    private static final ConfigDef CONFIG_DEF = new ConfigDef()
            .define(MAX_PARTITIONS_CONFIG, Type.INT, MAX_PARTITIONS_DEFAULT, Importance.HIGH, MAX_PARTITIONS_DOC);

    private int maxPartitions;

    @Override
    public void configure(Map<String, ?> configs) {
        // Parse the supplied properties against CONFIG_DEF and store them as fields
        AbstractConfig config = new AbstractConfig(CONFIG_DEF, configs);
        maxPartitions = config.getInt(MAX_PARTITIONS_CONFIG);
    }

    @Override
    public SourceRecord apply(SourceRecord record) {
        // Compute the desired partition here. kafkaPartition() returns an Integer
        // and may be null if the connector did not assign a partition.
        Integer actualPartition = record.kafkaPartition();
        int desiredPartition = ...

        // Then create the new record with all of the existing fields except with the new partition
        return record.newRecord(record.topic(), desiredPartition,
                record.keySchema(), record.key(),
                record.valueSchema(), record.value(),
                record.timestamp());
    }

    @Override
    public ConfigDef config() {
        return CONFIG_DEF;
    }

    @Override
    public void close() {
        // do nothing
    }
}
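
The sample leaves the partition computation open. As one possible way to fill it in, the following minimal sketch maps the record key to a partition by hashing it and taking the remainder. The class KeyHashPartitionSketch and its choosePartition method are hypothetical names introduced only for illustration; they are not part of Kafka Connect. The sketch relies on the key's hashCode(), which is not how Kafka's default producer partitioner works (that one hashes the serialized key bytes) and which is not stable for array keys such as byte[], so treat it as a starting point under those assumptions.

```java
import java.util.Objects;

/** Hypothetical helper for illustration; not part of Kafka Connect. */
final class KeyHashPartitionSketch {

    private KeyHashPartitionSketch() {
    }

    /**
     * Maps a record key to a partition number in the range [0, maxPartitions).
     * A null key ends up in partition 0, which is an arbitrary choice for this sketch.
     */
    static int choosePartition(Object key, int maxPartitions) {
        if (maxPartitions < 1) {
            throw new IllegalArgumentException("maxPartitions must be at least 1");
        }
        // Objects.hashCode returns 0 for null. Math.floorMod keeps the result
        // non-negative even when the hash code is negative.
        return Math.floorMod(Objects.hashCode(key), maxPartitions);
    }

    public static void main(String[] args) {
        // The same key always maps to the same partition within one JVM run.
        System.out.println(choosePartition("order-42", 4));
        System.out.println(choosePartition(null, 4));
    }
}
```

Inside apply, the placeholder could then be computed from record.key() and the configured maxPartitions, for example by calling choosePartition(record.key(), maxPartitions). The configured maximum should not exceed the number of partitions the target topic actually has.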

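The ConfigDef and AbstractConfig facilities are quite useful and can do much more, including custom validators and recommenders, as well as configuration properties that depend on other properties. If you want to learn more, take a look at some of the existing Kafka Connect connectors, which use the same framework.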

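One last thing. When you run a Kafka Connect standalone or distributed worker, make sure to set the CLASSPATH environment variable so that it points to the JAR file containing your custom SMT, plus any JAR files the SMT depends on (other than those Kafka provides). The connect-standalone.sh and connect-distributed.sh commands automatically add the Kafka JARs to the classpath.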
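
Having the JAR on the classpath is not enough on its own: the connector configuration also has to reference the transform. The sketch below builds the relevant settings as a java.util.Properties object and prints them in properties-file form. The keys follow Kafka Connect's transforms convention (a transforms list of aliases, then transforms.<alias>.type and per-transform settings). The alias repartition, the value 4, and the class SmtConnectorConfigSketch are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.util.Properties;

/** Hypothetical illustration of the connector settings that enable the custom SMT. */
final class SmtConnectorConfigSketch {

    private SmtConnectorConfigSketch() {
    }

    static Properties transformSettings() {
        Properties props = new Properties();
        // "repartition" is an arbitrary alias; any name works as long as the keys below use it.
        props.setProperty("transforms", "repartition");
        // Fully qualified class name of the SMT from the sample above.
        props.setProperty("transforms.repartition.type", "io.acme.example.CustomPartitioner");
        // Passed to CustomPartitioner.configure(...) as "max.partitions"; 4 is an example value.
        props.setProperty("transforms.repartition.max.partitions", "4");
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Prints the settings so they can be merged into a connector properties file.
        transformSettings().store(System.out, "SMT settings to add to the connector configuration");
    }
}
```

In standalone mode these entries would go into the connector's properties file. In distributed mode the same keys would be part of the JSON configuration submitted to the Connect REST API.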
