3) Kafka API in Practice and Producer Interceptors

念达

Category: 大数据之Kafka (Big Data: Kafka)

Article link: https://blog.csdn.net/weixin_44757575/article/details/102397274

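Environment Setup

Start the ZooKeeper and Kafka clusters, then open a console consumer on the Kafka cluster:
bin/kafka-console-consumer.sh --zookeeper hd101:2181 --topic first
Add the following dependencies to pom.xml: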

<dependencies>
    <!-- https://mvnrepository.com/artifact/org.apache.kafka/kafka-clients -->
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-clients</artifactId>
        <version>0.11.0.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.kafka/kafka -->
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.12</artifactId>
        <version>0.11.0.0</version>
    </dependency>
</dependencies>

Kafka Producer Java API

Creating a producer (deprecated API)

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class OldProducer {

    @SuppressWarnings("deprecation")
    public static void main(String[] args) {
        Properties properties = new Properties();
        properties.put("metadata.broker.list", "hadoop102:9092");
        properties.put("request.required.acks", "1");
        properties.put("serializer.class", "kafka.serializer.StringEncoder");

        Producer<Integer, String> producer = new Producer<Integer, String>(new ProducerConfig(properties));

        KeyedMessage<Integer, String> message = new KeyedMessage<Integer, String>("first", "hello world");
        producer.send(message);
        // Release the producer's resources once the message has been sent
        producer.close();
    }
}

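Creating a producer (new API)

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class NewProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        // Host name and port of the Kafka broker
        props.put("bootstrap.servers", "hd101:9092");
        // Wait for acknowledgement from all in-sync replicas
        props.put("acks", "all");
        // Number of retries after a failed send
        props.put("retries", 0);
        // Batch size in bytes
        props.put("batch.size", 16384);
        // How long to wait for more records before sending a batch (ms)
        props.put("linger.ms", 1);
        // Total memory for buffering records that are waiting to be sent (bytes)
        props.put("buffer.memory", 33554432);
        // Key serializer
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Value serializer
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        Producer<String, String> producer = new KafkaProducer<>(props);
        for (int i = 0; i < 50; i++) {
            producer.send(new ProducerRecord<String, String>("first", Integer.toString(i), "hello world-" + i));
        }
        producer.close();
    }
}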
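
The producer examples in this post repeat the same settings. As a rough sketch, the shared configuration could be collected in one place using only java.util.Properties. The class name ProducerProps and the method base are hypothetical helpers for illustration, not part of the Kafka client; the values are the ones used in this post.

```java
import java.util.Properties;

// Hypothetical helper (not part of Kafka): collects the producer settings used in this post.
final class ProducerProps {

    private ProducerProps() {
    }

    // Builds the settings for the given broker list, e.g. "hd101:9092".
    static Properties base(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("acks", "all");              // wait for all in-sync replicas
        props.put("retries", 0);               // do not retry failed sends
        props.put("batch.size", 16384);        // batch size in bytes
        props.put("linger.ms", 1);             // wait up to 1 ms to fill a batch
        props.put("buffer.memory", 33554432);  // 32 MB send buffer
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting settings so they can be checked by eye.
        base("hd101:9092").list(System.out);
    }
}
```

Each example would then start from ProducerProps.base("hd101:9092") and add only what differs, such as partitioner.class or the interceptor list.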

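Creating a producer with a callback (new API)

import java.util.Properties;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class CallBackProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        // Host name and port of the Kafka broker
        props.put("bootstrap.servers", "hd101:9092");
        // Wait for acknowledgement from all in-sync replicas
        props.put("acks", "all");
        // Number of retries after a failed send
        props.put("retries", 0);
        // Batch size in bytes
        props.put("batch.size", 16384);
        // How long to wait for more records before sending a batch (ms)
        props.put("linger.ms", 1);
        // Total memory for buffering records that are waiting to be sent (bytes)
        props.put("buffer.memory", 33554432);
        // Key serializer
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Value serializer
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        Producer<String, String> producer = new KafkaProducer<>(props);
        for (int i = 0; i < 50; i++) {
            producer.send(new ProducerRecord<String, String>("first", "hello" + i), new Callback() {
                @Override
                public void onCompletion(RecordMetadata metadata, Exception exception) {
                    if (exception != null) {
                        // The send failed
                        exception.printStackTrace();
                    } else {
                        System.err.println(metadata.partition() + "---" + metadata.offset());
                    }
                }
            });
        }
        // Close the producer only after all messages have been sent
        producer.close();
    }
}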

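Custom partitioner producer
Requirement: store all data in partition 0 of the topic.
1) Define a class that implements the Partitioner interface and override its method (deprecated API)

import kafka.producer.Partitioner;

public class CustomPartitioner implements Partitioner {

    public CustomPartitioner() {
        super();
    }

    @Override
    public int partition(Object key, int numPartitions) {
        // Always route to partition 0
        return 0;
    }
}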

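2) Custom partitioner (new API)

// The package matches the partitioner.class value used by the producer in step 3
package com.atguigu.kafka;

import java.util.Map;

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;

public class CustomPartitioner implements Partitioner {

    @Override
    public void configure(Map<String, ?> configs) {

    }

    @Override
    public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
        // Always route to partition 0
        return 0;
    }

    @Override
    public void close() {

    }
}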
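
The partitioner above always returns 0. A more common rule sends records with the same key to the same partition. The following is a minimal plain-Java sketch of that mapping, not Kafka's own algorithm: KeyHashPartitioning and partitionFor are hypothetical names, and Arrays.hashCode stands in for the murmur2 hash that Kafka's default partitioner applies to the key bytes, so the partition numbers it produces will differ from Kafka's.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Plain-Java illustration of key-based partitioning; it does not use the Kafka client library.
final class KeyHashPartitioning {

    private KeyHashPartitioning() {
    }

    // Maps a serialized key to a partition in [0, numPartitions).
    static int partitionFor(byte[] keyBytes, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        if (keyBytes == null) {
            // Assumption for this sketch: records without a key go to partition 0.
            return 0;
        }
        // Stand-in hash (assumption); Kafka's default partitioner uses murmur2 instead.
        int hash = Arrays.hashCode(keyBytes);
        // Clear the sign bit so the modulo result is never negative.
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        for (String key : new String[] {"a", "b", "a"}) {
            byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
            System.out.println(key + " -> partition " + partitionFor(bytes, 3));
        }
    }
}
```

Inside the new-API partition(...) method, the partition count would come from cluster.partitionsForTopic(topic).size() rather than from a parameter.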

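3) Use the partitioner in a producer

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PartitionerProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        // Host name and port of the Kafka broker
        props.put("bootstrap.servers", "hadoop103:9092");
        // Wait for acknowledgement from all in-sync replicas
        props.put("acks", "all");
        // Number of retries after a failed send
        props.put("retries", 0);
        // Batch size in bytes
        props.put("batch.size", 16384);
        // How long to wait for more records before sending a batch (ms)
        props.put("linger.ms", 1);
        // Total memory for buffering records that are waiting to be sent (bytes)
        props.put("buffer.memory", 33554432);
        // Key serializer
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Value serializer
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Custom partitioner
        props.put("partitioner.class", "com.atguigu.kafka.CustomPartitioner");

        Producer<String, String> producer = new KafkaProducer<>(props);
        producer.send(new ProducerRecord<String, String>("first", "1", "atguigu"));
        producer.close();
    }
}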

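4) Test:
On hd101, watch the log files of the three partitions of topic first under /opt/module/kafka/logs/. Because the partitioner always returns 0, only the first-0 log should grow:
[zy@hd101 first-0]$ tail -f 00000000000000000000.log
[zy@hd101 first-1]$ tail -f 00000000000000000000.log
[zy@hd101 first-2]$ tail -f 00000000000000000000.log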

Kafka Consumer Java API

High-level API
0) Start a console producer:
[zy@hd103 kafka]$ bin/kafka-console-producer.sh --broker-list hd101:9092 --topic first
>hello world
1) Create a consumer (deprecated API)

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class CustomConsumer {

    @SuppressWarnings("deprecation")
    public static void main(String[] args) {
        Properties properties = new Properties();

        properties.put("zookeeper.connect", "hd101:2181");
        properties.put("group.id", "g1");
        properties.put("zookeeper.session.timeout.ms", "500");
        properties.put("zookeeper.sync.time.ms", "250");
        properties.put("auto.commit.interval.ms", "1000");

        // Create the consumer connector
        ConsumerConnector consumer = Consumer.createJavaConsumerConnector(new ConsumerConfig(properties));

        // Request one stream (thread) for topic "first"
        HashMap<String, Integer> topicCount = new HashMap<>();
        topicCount.put("first", 1);

        Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer.createMessageStreams(topicCount);

        KafkaStream<byte[], byte[]> stream = consumerMap.get("first").get(0);

        ConsumerIterator<byte[], byte[]> it = stream.iterator();

        // Blocks while waiting for new messages
        while (it.hasNext()) {
            System.out.println(new String(it.next().message()));
        }
    }
}

2) Official example with automatic offset management (new API)

import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class CustomNewConsumer {

    public static void main(String[] args) {

        Properties props = new Properties();
        // Address of the Kafka service; not every broker has to be listed
        props.put("bootstrap.servers", "hadoop102:9092");
        // Consumer group
        props.put("group.id", "test");
        // Commit offsets automatically
        props.put("enable.auto.commit", "true");
        // Interval between automatic offset commits (ms)
        props.put("auto.commit.interval.ms", "1000");
        // Key deserializer
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Value deserializer
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Create the consumer
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

        // Topics to subscribe to; several can be subscribed at once
        consumer.subscribe(Arrays.asList("first", "second", "third"));

        while (true) {
            // Fetch records, waiting at most 100 ms
            ConsumerRecords<String, String> records = consumer.poll(100);

            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset = %d, key = %s, value = %s%n", record.offset(), record.key(), record.value());
            }
        }
    }
}

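Kafka Producer Interceptors

How interceptors work
Producer interceptors were introduced in Kafka 0.10 and are mainly used to implement custom control logic on the client side.
For the producer, an interceptor gives the user a chance to customize a message before it is sent and before the producer's callback logic runs, for example by modifying the message. The producer also allows several interceptors to be applied to the same message in order, forming an interceptor chain. An interceptor implements the interface org.apache.kafka.clients.producer.ProducerInterceptor, which defines the following methods:
- configure(configs):
  Called when the interceptor reads its configuration and initializes its data.
- onSend(ProducerRecord):
  Called from within KafkaProducer.send, so it runs on the user's calling thread. The producer guarantees that it is called before the message is serialized and before its partition is computed. The method may do anything to the message, but it is best not to change the message's topic or partition, because that affects the computation of the target partition.
- onAcknowledgement(RecordMetadata, Exception):
  Called when the message is acknowledged or when sending fails, normally before the producer's callback is triggered. It runs on the producer's I/O thread, so avoid heavy logic here; otherwise it slows down message sending.
- close():
  Closes the interceptor; mainly used for resource cleanup.
As noted above, the methods of an interceptor may run on different threads, so an implementation must ensure its own thread safety. In addition, when several interceptors are configured, the producer calls them in the configured order and only catches any exception an interceptor throws and writes it to the error log; it does not propagate the exception upward. Keep this in mind when using interceptors.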
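
The chain behavior described above (ordered calls, exceptions logged rather than propagated) can be modeled in plain Java. This is only a sketch that does not use the Kafka client: InterceptorChainSketch is a hypothetical class, message values are plain strings, and each interceptor is reduced to a UnaryOperator<String> that plays the role of onSend.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Plain-Java model of an interceptor chain; it does not use the Kafka client library.
final class InterceptorChainSketch {

    private final List<UnaryOperator<String>> interceptors;
    private final List<String> errorLog = new ArrayList<>();

    InterceptorChainSketch(List<UnaryOperator<String>> interceptors) {
        this.interceptors = new ArrayList<>(interceptors);
    }

    // Applies each interceptor in the configured order. A failing interceptor is
    // logged and skipped; the next one receives the last successfully produced value.
    String onSend(String value) {
        String current = value;
        for (UnaryOperator<String> interceptor : interceptors) {
            try {
                current = interceptor.apply(current);
            } catch (RuntimeException e) {
                errorLog.add(e.getMessage());
            }
        }
        return current;
    }

    List<String> errors() {
        return errorLog;
    }

    public static void main(String[] args) {
        UnaryOperator<String> prefix = v -> "ts," + v;
        UnaryOperator<String> broken = v -> {
            throw new IllegalStateException("boom");
        };
        UnaryOperator<String> upper = String::toUpperCase;

        InterceptorChainSketch chain = new InterceptorChainSketch(Arrays.asList(prefix, broken, upper));
        System.out.println(chain.onSend("hello")); // prints TS,HELLO
        System.out.println(chain.errors());        // prints [boom]
    }
}
```

The failing interceptor in the middle does not stop the send: the third interceptor still receives the output of the first one, which is why a silently failing interceptor can be easy to miss.
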
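Interceptor example
1) Requirement: implement a simple interceptor chain made of two interceptors. The first prepends a timestamp to the message value before the message is sent; the second updates the count of successfully sent or failed messages after each send.

2) Implementation:
① Timestamp interceptor:

package com.atguigu.kafka.interceptor;

import java.util.Map;

import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class TimeInterceptor implements ProducerInterceptor<String, String> {

    @Override
    public void configure(Map<String, ?> configs) {

    }

    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> record) {
        // Create a new record with the current timestamp prepended to the value
        return new ProducerRecord<>(record.topic(), record.partition(), record.timestamp(), record.key(),
                System.currentTimeMillis() + "," + record.value());
    }

    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) {

    }

    @Override
    public void close() {

    }
}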

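② Counter interceptor: counts successfully sent and failed messages and prints both counters when the producer is closed

package com.atguigu.kafka.interceptor;

import java.util.Map;

import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class CounterInterceptor implements ProducerInterceptor<String, String> {
    private int errorCounter = 0;
    private int successCounter = 0;

    @Override
    public void configure(Map<String, ?> configs) {

    }

    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> record) {
        // Pass the record through unchanged
        return record;
    }

    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) {
        // Count successes and failures
        if (exception == null) {
            successCounter++;
        } else {
            errorCounter++;
        }
    }

    @Override
    public void close() {
        // Report the results
        System.out.println("Successful sent: " + successCounter);
        System.out.println("Failed sent: " + errorCounter);
    }
}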
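
The counter interceptor above keeps its counts in plain int fields, while onAcknowledgement runs on the producer's I/O thread and close() is called from another thread. One possible way to avoid depending on that timing is to use atomic counters. The sketch below shows only the counting logic in plain Java; AckCounter and its method names are hypothetical, and in a real interceptor record(...) would be called from onAcknowledgement.

```java
import java.util.concurrent.atomic.AtomicLong;

// Thread-safe success/failure counting; a plain-Java sketch without the Kafka client library.
final class AckCounter {

    private final AtomicLong success = new AtomicLong();
    private final AtomicLong error = new AtomicLong();

    // Mirrors onAcknowledgement: a null exception means the send succeeded.
    void record(Exception exception) {
        if (exception == null) {
            success.incrementAndGet();
        } else {
            error.incrementAndGet();
        }
    }

    long successCount() {
        return success.get();
    }

    long errorCount() {
        return error.get();
    }

    public static void main(String[] args) throws InterruptedException {
        AckCounter counter = new AckCounter();
        // Each worker reports 1000 results, of which every tenth is a failure.
        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) {
                counter.record(i % 10 == 0 ? new RuntimeException("failed") : null);
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("Successful sent: " + counter.successCount()); // prints 1800
        System.out.println("Failed sent: " + counter.errorCount());       // prints 200
    }
}
```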

③ Producer main program

import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

public class InterceptorProducer {

    public static void main(String[] args) throws Exception {
        // 1 Configure the producer
        Properties props = new Properties();
        props.put("bootstrap.servers", "hd101:9092");
        props.put("acks", "all");
        props.put("retries", 0);
        props.put("batch.size", 16384);
        props.put("linger.ms", 1);
        props.put("buffer.memory", 33554432);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // 2 Build the interceptor chain; the interceptors run in this order
        List<String> interceptors = new ArrayList<>();
        interceptors.add("com.atguigu.kafka.interceptor.TimeInterceptor");
        interceptors.add("com.atguigu.kafka.interceptor.CounterInterceptor");
        props.put(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG, interceptors);

        String topic = "first";
        Producer<String, String> producer = new KafkaProducer<>(props);

        // 3 Send the messages
        for (int i = 0; i < 10; i++) {
            ProducerRecord<String, String> record = new ProducerRecord<>(topic, "message" + i);
            producer.send(record);
        }

        // 4 The producer must be closed; only then are the interceptors' close methods called
        producer.close();
    }
}

④ Test: start a consumer on Kafka, then run the Java client program.
