Kafka Learning -- Advanced 01

Reference: 深入理解Kafka:核心设计与实践原理 (Understanding Kafka: Core Design and Practical Principles)

1. Basic Concepts

Producer: produces messages and pushes them to Kafka.
Consumer: connects to Kafka, pulls messages, and processes them.
Broker: a service proxy node. A single machine (or VM) can be regarded as one Kafka broker; one or more brokers form a Kafka cluster.
Topic: a logical concept; Kafka messages are categorized by topic.
Partition: a topic is subdivided into multiple partitions, and each partition belongs to exactly one topic. Different partitions of the same topic contain different messages. The producer appends each message to a partition's log file, and the message is assigned a specific offset. The offset identifies the message's position within the partition, and Kafka uses it to guarantee ordering inside a partition. Because offsets are scoped to a single partition, Kafka guarantees ordering within a partition, but not across a whole topic.
Kafka introduces a multi-replica mechanism for partitions to improve the cluster's fault tolerance. Different replicas of the same partition hold the same messages (at a given moment the replicas may not be fully identical, because each replica synchronizes at a different speed). Replicas follow a leader-follower relationship: only the leader replica serves read and write requests, while follower replicas only synchronize messages from the leader. Replicas are placed on different brokers, so when one broker fails, a new leader can be elected from the follower replicas and the partition keeps serving requests without losing data.
AR (Assigned Replicas): all replicas of a partition, collectively.
ISR (In-Sync Replicas): all replicas that are sufficiently in sync with the leader (including the leader itself).
OSR (Out-of-Sync Replicas): replicas that lag too far behind the leader (excluding the leader).
Relationship among the three: AR = ISR + OSR. The sketch after this list shows how to inspect a partition's leader, AR, and ISR programmatically.
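
To make AR/ISR concrete, here is a minimal sketch (my addition, not from the referenced book) that uses the Java AdminClient shipped with kafka-clients to create a replicated topic and print each partition's leader, AR, and ISR. The broker addresses and topic name reuse the placeholders that appear later in this article.

package com.paojiaojiang.admin;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartitionInfo;

import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutionException;

public class ReplicaInspector {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "spark:9092,spark1:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Create a topic with 3 partitions, each replicated on 2 brokers
            // (requires at least 2 brokers in the cluster).
            try {
                admin.createTopics(Collections.singleton(new NewTopic("user_info", 3, (short) 2)))
                     .all().get();
            } catch (ExecutionException e) {
                // Most likely the topic already exists; fine for this demo.
            }

            // Each partition reports its leader, assigned replicas (AR), and ISR.
            TopicDescription desc = admin.describeTopics(Collections.singleton("user_info"))
                                         .all().get().get("user_info");
            for (TopicPartitionInfo p : desc.partitions()) {
                System.out.println("partition=" + p.partition()
                        + "  leader=" + p.leader()
                        + "  AR=" + p.replicas()
                        + "  ISR=" + p.isr());
            }
        }
    }
}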

HW (High Watermark): a specific offset; consumers can only pull messages before this offset.

LEO (Log End Offset): the offset of the next message to be written to the log file. For example, if a partition's log holds messages at offsets 0 through 4, its LEO is 5; if its HW is 3, consumers can fetch only offsets 0 through 2.

2. Producer Code
package com.paojiaojiang.producer;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

/**
 * @Author: jja
 * @Description:
 * @Date: 2019/4/1 23:35
 */
public class KafkaProducerExample {

    public static final String brokerList = "spark:9092,spark1:9092";
    public static final String topic = "user_info";
    // public static final Integer partition = 2;  // used to send keyed messages explicitly to partition 2

    public static Properties initConfig() {
        Properties props = new Properties();
        props.put("bootstrap.servers", brokerList);
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("client.id", "producer.client.id.demo");
        return props;
    }

    public static void main(String[] args) {
        Properties props = initConfig();
        KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(props);
        ProducerRecord<String, String> producerRecord = new ProducerRecord<>(topic, "hello, kafka");
        // Several producers can run at the same time; giving records a key routes
        // all messages with the same key to the same partition, e.g.:
        // ProducerRecord<String, String> keyedRecord = new ProducerRecord<>(topic, "key", "hello, kafka");
        try {
            kafkaProducer.send(producerRecord);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            kafkaProducer.close();  // flush buffered records and release resources
        }
    }
}

The source of Kafka's message class, ProducerRecord:

public class ProducerRecord<K, V> {
    private final String topic;       // topic the record is sent to
    private final Integer partition;  // target partition (may be null)
    private final Headers headers;    // record headers
    private final K key;              // record key
    private final V value;            // record value (the message body)
    private final Long timestamp;     // record timestamp
    // ... constructors and accessors omitted
}

Parameter notes:

bootstrap.servers: the broker addresses the producer client uses to connect to the Kafka cluster. You do not need to list every broker here, because the producer discovers the remaining brokers from the ones given.

For high availability, however, you should configure at least two broker addresses, so that messages can still be sent if one of the listed brokers goes down.

key.serializer, value.serializer: the broker only accepts messages as byte arrays (byte[]), so keys and values must be serialized before they are transmitted over the network.

KafkaProducer is thread-safe, so a single KafkaProducer instance can be shared across multiple threads; a minimal sketch of this pattern follows. From the Javadoc:

// The producer is thread safe and sharing a single producer instance across threads will generally be faster than having multiple instances.
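
To illustrate the thread-safety claim, here is a minimal sketch (my addition) in which several threads share one KafkaProducer instance. The broker list and topic reuse the placeholders from the example above.

package com.paojiaojiang.producer;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedProducerExample {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "spark:9092,spark1:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // One producer instance, shared by every worker thread.
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 4; i++) {
            final int id = i;
            pool.submit(() -> producer.send(
                    new ProducerRecord<>("user_info", "thread-" + id, "hello from thread " + id)));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        producer.close();  // close only after all threads are done sending
    }
}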

3. Kafka's ProducerRecord Constructors, from the Source:
/**
 * Creates a record with a specified timestamp to be sent to a specified topic and partition
 * @param topic The topic the record will be appended to
 * @param partition The partition to which the record should be sent
 * @param timestamp The timestamp of the record
 * @param key The key that will be included in the record
 * @param value The record contents
 */
public ProducerRecord(String topic, Integer partition, Long timestamp, K key, V value) {
    this(topic, partition, timestamp, key, value, null);
}

/**
 * Creates a record to be sent to a specified topic and partition
 * @param topic The topic the record will be appended to
 * @param partition The partition to which the record should be sent
 * @param key The key that will be included in the record
 * @param value The record contents
 * @param headers The headers that will be included in the record
 */
public ProducerRecord(String topic, Integer partition, K key, V value,  Iterable<Header> headers) {
    this(topic, partition, null, key, value, headers);
}

/**
 * Creates a record to be sent to a specified topic and partition (without headers)
 * @param topic The topic the record will be appended to
 * @param partition The partition to which the record should be sent
 * @param key The key that will be included in the record
 * @param value The record contents
 */
public ProducerRecord(String topic, Integer partition, K key, V value) {
    this(topic, partition, null, key, value, null);
}

/**
 * Create a record to be sent to Kafka
 * (with a key: records that share a key are routed to the same partition, and each partition is ordered)
 * @param topic The topic the record will be appended to
 * @param key The key that will be included in the record
 * @param value The record contents
 */
public ProducerRecord(String topic, K key, V value) {
    this(topic, null, null, key, value, null);
}

/**
 * Create a record with no key
 * @param topic The topic this record should be sent to
 * @param value The record contents
 */
public ProducerRecord(String topic, V value) {
    this(topic, null, null, null, value, null);
}
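
The constructors above differ only in how much routing information you pin down. A few usage examples (the topic name "user_info", the key values, and the partition number are placeholders reused from this article):

// Value only: the default partitioner spreads records across partitions (round-robin/sticky).
ProducerRecord<String, String> plain = new ProducerRecord<>("user_info", "hello, kafka");

// Key + value: records with the same key always hash to the same partition,
// so per-key ordering is preserved.
ProducerRecord<String, String> keyed = new ProducerRecord<>("user_info", "user-42", "hello, kafka");

// Explicit partition: bypass the partitioner and write straight to partition 2.
ProducerRecord<String, String> pinned = new ProducerRecord<>("user_info", 2, "user-42", "hello, kafka");

// Explicit partition plus timestamp (milliseconds since the epoch).
ProducerRecord<String, String> timed =
        new ProducerRecord<>("user_info", 2, System.currentTimeMillis(), "user-42", "hello, kafka");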

4. Source of the Kafka Producer's send Method

The one-argument send:
 /**
 * Asynchronously send a record to a topic.
 */
@Override
public Future<RecordMetadata> send(ProducerRecord<K, V> record) {
    return send(record, null);
}

The two-argument overload of send:

/**
 * Asynchronously send a record to a topic and invoke the provided callback when the send
 * has been acknowledged. The send is asynchronous and this method will return immediately
 * once the record has been stored in the buffer of records waiting to be sent. This allows
 * sending many records in parallel without blocking to wait for the response after each one.
 */
@Override
public Future<RecordMetadata> send(ProducerRecord<K, V> record, Callback callback) {
    // intercept the record, which can be potentially modified; this method does not throw exceptions
    ProducerRecord<K, V> interceptedRecord = this.interceptors == null ? record : this.interceptors.onSend(record);
    return doSend(interceptedRecord, callback);
}

Example from the official Javadoc:

ProducerRecord<byte[], byte[]> record = new ProducerRecord<>("the-topic", key, value);
producer.send(record,
        new Callback() {
            public void onCompletion(RecordMetadata metadata, Exception e) {
                if (e != null) {          // the send failed
                    e.printStackTrace();
                } else {
                    System.out.println("The offset of the record we just sent is: " + metadata.offset());
                }
            }
        });
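
Since send() returns a Future<RecordMetadata>, a send can also be made effectively synchronous by blocking on the future. A minimal fragment, assuming the producer and record from the example above:

try {
    // Block until the broker acknowledges the record (or the send fails).
    RecordMetadata metadata = producer.send(record).get();
    System.out.println("stored at partition " + metadata.partition()
            + ", offset " + metadata.offset());
} catch (InterruptedException | ExecutionException e) {
    e.printStackTrace();
}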
5. Custom Serialization

The entity class to be serialized:

package com.paojiaojiang.serialize;

/**
 * @Author: jja
 * @Description:
 * @Date: 2019/4/2 0:33
 */
public class PersonInfo {

    private String name;
    private Integer age;

    // getters, setters, no-args constructor, and all-args constructor omitted
}

The custom serializer for the entity:

package com.paojiaojiang.serialize;

import org.apache.kafka.common.serialization.Serializer;

import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;
import java.util.Map;

/**
 * @Author: jja
 * @Description: custom serializer for the PersonInfo entity
 * @Date: 2019/4/2 0:32
 */
public class PersonSerialize implements Serializer<PersonInfo> {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
    }

    @Override
    public byte[] serialize(String topic, PersonInfo data) {
        try {
            if (data == null) {
                return null;
            }
            byte[] name = new byte[0], age = new byte[0];
            if (data.getName() != null) {
                name = data.getName().getBytes("UTF-8");  // assign the encoded bytes
            }
            if (data.getAge() != null) {
                age = data.getAge().toString().getBytes("UTF-8");
            }
            // Layout: [name length][name bytes][age length][age bytes]
            ByteBuffer buffer = ByteBuffer.allocate(4 + 4 + name.length + age.length);
            buffer.putInt(name.length);
            buffer.put(name);
            buffer.putInt(age.length);
            buffer.put(age);
            return buffer.array();  // return the encoded bytes, not an empty array
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
        return new byte[0];
    }

    @Override
    public void close() {
    }
}
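
The consumer side needs a matching Deserializer that reads back the same [length][bytes] layout. The following sketch is my addition (not in the original article) and assumes PersonInfo exposes a no-args constructor and setters, as noted in the class above:

package com.paojiaojiang.serialize;

import org.apache.kafka.common.serialization.Deserializer;

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Map;

public class PersonDeserialize implements Deserializer<PersonInfo> {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
    }

    @Override
    public PersonInfo deserialize(String topic, byte[] data) {
        if (data == null) {
            return null;
        }
        ByteBuffer buffer = ByteBuffer.wrap(data);
        // Read back the [name length][name bytes][age length][age bytes] layout.
        byte[] name = new byte[buffer.getInt()];
        buffer.get(name);
        byte[] age = new byte[buffer.getInt()];
        buffer.get(age);

        PersonInfo person = new PersonInfo();
        person.setName(new String(name, StandardCharsets.UTF_8));
        if (age.length > 0) {
            person.setAge(Integer.valueOf(new String(age, StandardCharsets.UTF_8)));
        }
        return person;
    }

    @Override
    public void close() {
    }
}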

The producer that uses the custom serializer:

package com.paojiaojiang.serialize;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;
import java.util.concurrent.ExecutionException;

/**
 * @Author: jja
 * @Description:
 * @Date: 2019/4/2 23:52
 */
public class PersonInfoProducer {

    private static final String brokerList = "spark:9092,spark1:9092";
    private static final String topic = "personInfo";

    public static Properties initConfig() {
        Properties props = new Properties();
        props.put("bootstrap.servers", brokerList);
        props.put("client.id", "producer.client.id.personInfo");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // The value serializer must be the custom serializer class, not the producer class.
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, PersonSerialize.class.getName());
        return props;
    }

    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Properties props = initConfig();
        // Usually we send String messages; here the value is a PersonInfo entity.
        KafkaProducer<String, PersonInfo> producer = new KafkaProducer<>(props);
        PersonInfo personInfo = new PersonInfo("paojiaojiang", 12);
        ProducerRecord<String, PersonInfo> record = new ProducerRecord<>(topic, personInfo);
        producer.send(record).get();  // block until the send is acknowledged
        producer.close();
    }
}
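
To close the loop, here is a hedged consumer sketch (my addition) that wires in the PersonDeserialize from above; the group id is a placeholder:

package com.paojiaojiang.serialize;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class PersonInfoConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "spark:9092,spark1:9092");
        props.put("group.id", "personInfo-group");  // placeholder group id
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", PersonDeserialize.class.getName());

        try (KafkaConsumer<String, PersonInfo> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("personInfo"));
            while (true) {
                // Poll for new records and print the deserialized entities.
                ConsumerRecords<String, PersonInfo> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, PersonInfo> r : records) {
                    System.out.println(r.value().getName() + " / " + r.value().getAge());
                }
            }
        }
    }
}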


