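Author | Wu Xie (吴邪), four years of experience in big data; currently works at an internet company in Guangzhou, responsible for in-house development of the big data platform and for research on offline and real-time computing
Editor | auroral-L
About 6,000 characters in total; estimated reading time: 30 minutes.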
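Apache Kafka is a high-performance, open-source distributed messaging middleware. The previous article, 「浅谈Kafka」 ("A Brief Look at Kafka"), gave a short introduction to Kafka and a rough picture of its architecture, how it works, and its strengths. Starting with this article, we dig into the source code behind Kafka's core features to gain a deeper understanding of its underlying principles.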
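From the previous article we know that a Kafka message queue has three main parts: the producer, the consumer, and the broker, so our source-code walkthrough starts with the producer. Our company currently runs Kafka 2.1.1 (the kafka_2.11-2.1.1 distribution, where 2.11 is the Scala version). For reading the source you can pick any version from 0.10 onward, because the 2.x architecture does not differ much from 0.10, whereas versions before 0.10 differ greatly in both architecture and features. This article takes Kafka 2.x as its subject. Note that the listings quoted below still contain deprecated options such as block.on.buffer.full, so they appear to come from a 0.10.x source tree; individual lines may differ from what you see in a 2.x checkout.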
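The producer's message-sending flow
The core of the Kafka source code lies in the clients module and the core module. Before diving into the source, here is a diagram that roughly outlines how a producer sends a message.
Figure 1: The producer's message-sending flow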
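The message is wrapped in a ProducerRecord object.
The Serializer serializes the message's key and value.
The Partitioner assigns the message to a partition, which requires fetching the cluster metadata first.
The RecordAccumulator holds one message queue per partition; each queue contains many batches, and each batch consists of multiple messages.
The Sender pulls ready batches from the RecordAccumulator, wraps them in requests, and sends them.
The requests are sent over the network to the Kafka cluster.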
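The batching structure described above (one queue per partition, many batches per queue, many messages per batch) can be pictured with a small JDK-only sketch. This is a simplified model, not Kafka's RecordAccumulator: the class name ToyAccumulator, the count-based batch limit, and the string partition names are assumptions made purely for illustration, and the real class additionally sizes batches in bytes and deals with memory pooling, compression, and thread safety.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of per-partition batching; not Kafka's RecordAccumulator.
final class ToyAccumulator {
    /** Outcome of an append: the two flags a caller could use to decide whether to wake a sender. */
    static final class AppendResult {
        final boolean batchIsFull;
        final boolean newBatchCreated;

        AppendResult(boolean batchIsFull, boolean newBatchCreated) {
            this.batchIsFull = batchIsFull;
            this.newBatchCreated = newBatchCreated;
        }
    }

    private final int maxRecordsPerBatch;
    // One queue of batches per partition; each batch holds several serialized records.
    private final Map<String, Deque<List<byte[]>>> batches = new HashMap<>();

    ToyAccumulator(int maxRecordsPerBatch) {
        if (maxRecordsPerBatch < 1) {
            throw new IllegalArgumentException("maxRecordsPerBatch must be at least 1");
        }
        this.maxRecordsPerBatch = maxRecordsPerBatch;
    }

    /** Appends to the newest batch of the partition, opening a new batch when that one is full. */
    AppendResult append(String topicPartition, byte[] record) {
        Deque<List<byte[]>> queue = batches.computeIfAbsent(topicPartition, k -> new ArrayDeque<>());
        List<byte[]> last = queue.peekLast();
        boolean created = false;
        if (last == null || last.size() >= maxRecordsPerBatch) {
            last = new ArrayList<>();
            queue.addLast(last);
            created = true;
        }
        last.add(record);
        // "Full" here means: an older batch is already waiting, or the newest one has reached its limit.
        boolean full = queue.size() > 1 || last.size() >= maxRecordsPerBatch;
        return new AppendResult(full, created);
    }

    /** What a sender would do: take the oldest batch of a partition, or null if nothing is queued. */
    List<byte[]> drainOldest(String topicPartition) {
        Deque<List<byte[]>> queue = batches.get(topicPartition);
        return queue == null ? null : queue.pollFirst();
    }

    public static void main(String[] args) {
        ToyAccumulator accumulator = new ToyAccumulator(2);
        for (int i = 0; i < 3; i++) {
            AppendResult r = accumulator.append("demo-0", new byte[] {(byte) i});
            System.out.println("append " + i + ": full=" + r.batchIsFull + ", newBatch=" + r.newBatchCreated);
        }
        System.out.println("oldest batch holds " + accumulator.drainOldest("demo-0").size() + " records");
    }
}
```

Keeping the batches in a per-partition queue lets a sender always take the oldest batch first, which preserves the order in which records were appended to that partition.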
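If you are not sure where to start, open the Producer class under the examples directory of the Kafka source and work inward layer by layer. You will notice that Producer extends Thread, which is interesting, so its run() method is one we must read.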
public class Producer extends Thread {
    // The KafkaProducer instance
    private final KafkaProducer<Integer, String> producer;
    // The topic to send to
    private final String topic;
    // Send mode: synchronous or asynchronous
    private final Boolean isAsync;
    ......
Looking at the structure of Producer, it consists mainly of a constructor, the run() method, and the callback class DemoCallBack.
/**
 * Initializes the producer.
 * @param topic   the topic to send to
 * @param isAsync true to send asynchronously, false to send synchronously
 */
public Producer(String topic, Boolean isAsync) {
    Properties props = new Properties();
    // Host name and port of the Kafka broker(s), used to fetch metadata from the cluster
    props.put("bootstrap.servers", "localhost:9092");
    // Client ID
    props.put("client.id", "DemoProducer");
    // IntegerSerializer serializes the message key into a byte array
    props.put("key.serializer", "org.apache.kafka.common.serialization.IntegerSerializer");
    // StringSerializer serializes a String into a byte array
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    // Create the KafkaProducer, the core class of the producer
    producer = new KafkaProducer<>(props);
    this.topic = topic;
    this.isAsync = isAsync;
}
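Beyond the four settings used in the demo, a few other producer options that come up in this article (acks, linger.ms, batch.size, buffer.memory) are often set in the same place. The sketch below only builds a java.util.Properties object with the JDK. The helper name ProducerProps.build, the broker address, and the chosen values are placeholders for illustration, while the keys themselves are standard producer config names.

```java
import java.util.Properties;

// Hypothetical helper that collects producer settings; it does not create a KafkaProducer.
final class ProducerProps {
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("client.id", "DemoProducer");
        props.put("key.serializer", "org.apache.kafka.common.serialization.IntegerSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Wait for the full commit of each record: slowest but most durable.
        props.put("acks", "all");
        // Placeholder: let records wait up to 5 ms so that more of them can share a batch.
        props.put("linger.ms", "5");
        // Batch size in bytes (16 KB) and total buffer memory in bytes (32 MB).
        props.put("batch.size", "16384");
        props.put("buffer.memory", "33554432");
        return props;
    }

    public static void main(String[] args) {
        build("localhost:9092").forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The resulting object can be handed to the KafkaProducer(Properties) constructor in the same way the demo does.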
Initializing KafkaProducer
/**
* A producer is instantiated by providing a set of key-value pairs as configuration. Valid configuration strings
* are documented <a href="http://kafka.apache.org/documentation.html#producerconfigs">here</a>.
* @param properties The producer configs
*/
// Initialize KafkaProducer from the producer's configuration properties
public KafkaProducer(Properties properties) {
this(new ProducerConfig(properties), null, null);
}
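KafkaProducer core functionality in detail
Producer creates the core class KafkaProducer, and messages are sent by calling its send(xxx) method. If you are interested, read the Javadoc of the KafkaProducer class; it is fairly long, so here is only a brief summary. It states that KafkaProducer is the client that publishes messages, that it is thread safe and can be shared across threads under high concurrency, that it holds a buffer for records waiting to be transmitted, that send() is asynchronous, and that the acks setting controls when a request counts as complete; it also documents the producer configs. The key parts are excerpted below for reference.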
/**
* A Kafka client that publishes records to the Kafka cluster.
* <P>
* The producer is <i>thread safe</i> and sharing a single producer instance across threads will generally be faster than
* having multiple instances.
* <p>
* The producer consists of a pool of buffer space that holds records that haven't yet been transmitted to the server
* as well as a background I/O thread that is responsible for turning these records into requests and transmitting them
* to the cluster. Failure to close the producer after use will leak these resources.
* <p>
* The {@link #send(ProducerRecord) send()} method is asynchronous. When called it adds the record to a buffer of pending record sends
* and immediately returns. This allows the producer to batch together individual records for efficiency.
* <p>
* The <code>acks</code> config controls the criteria under which requests are considered complete. The "all" setting
* we have specified will result in blocking on the full commit of the record, the slowest but most durable setting.
*/
private static final Logger log = LoggerFactory.getLogger(KafkaProducer.class);
// Used to generate the client ID
private static final AtomicInteger PRODUCER_CLIENT_ID_SEQUENCE = new AtomicInteger(1);
private static final String JMX_PREFIX = "kafka.producer";
private String clientId;
// The partitioner
private final Partitioner partitioner;
// Maximum size of a request (max.request.size)
private final int maxRequestSize;
// Size of the buffer (buffer.memory)
private final long totalMemorySize;
// The object that manages the cluster metadata
private final Metadata metadata;
// Collects and buffers records until the sender thread drains them
private final RecordAccumulator accumulator;
// The send task; it implements Runnable and runs on ioThread
private final Sender sender;
private final Metrics metrics;
// The thread that runs the sender task and sends the messages
private final Thread ioThread;
// Compression type: none, gzip, snappy, or lz4; applied to the records in the RecordAccumulator
private final CompressionType compressionType;
private final Sensor errors;
private final Time time;
// Key serializer
private final Serializer<K> keySerializer;
// Value serializer
private final Serializer<V> valueSerializer;
// The configuration object
private final ProducerConfig producerConfig;
// Maximum time send() may block, for example while waiting for cluster metadata
private final long maxBlockTimeMs;
// Request timeout
private final int requestTimeoutMs;
// Interceptors, which can inspect or modify a record before it is sent
private final ProducerInterceptors<K, V> interceptors;
With this overview of KafkaProducer's main responsibilities in hand, let's return to its initialization and look at the KafkaProducer(xxx) constructor.
@SuppressWarnings({"unchecked", "deprecation"})
private KafkaProducer(ProducerConfig config, Serializer<K> keySerializer, Serializer<V> valueSerializer) {
try {
log.trace("Starting the Kafka producer");
// The configuration supplied by the user
Map<String, Object> userProvidedConfigs = config.originals();
this.producerConfig = config;
this.time = new SystemTime();
// Read the clientId, generating one if none was configured
clientId = config.getString(ProducerConfig.CLIENT_ID_CONFIG);
if (clientId.length() <= 0)
clientId = "producer-" + PRODUCER_CLIENT_ID_SEQUENCE.getAndIncrement();
Map<String, String> metricTags = new LinkedHashMap<String, String>();
metricTags.put("client-id", clientId);
MetricConfig metricConfig = new MetricConfig().samples(config.getInt(ProducerConfig.METRICS_NUM_SAMPLES_CONFIG)) .timeWindow(config.getLong(ProducerConfig.METRICS_SAMPLE_WINDOW_MS_CONFIG), TimeUnit.MILLISECONDS).tags(metricTags);
List<MetricsReporter> reporters = config.getConfiguredInstances(ProducerConfig.METRIC_REPORTER_CLASSES_CONFIG,
MetricsReporter.class);
reporters.add(new JmxReporter(JMX_PREFIX));
this.metrics = new Metrics(metricConfig, reporters, time);
// Set the partitioner; a custom partitioner can be plugged in
this.partitioner = config.getConfiguredInstance(ProducerConfig.PARTITIONER_CLASS_CONFIG, Partitioner.class);
// Retry backoff
// RETRY_BACKOFF_MS_CONFIG = retry.backoff.ms, default 100 ms
long retryBackoffMs = config.getLong(ProducerConfig.RETRY_BACKOFF_MS_CONFIG);
// Set the key serializer
if (keySerializer == null) {
this.keySerializer = config.getConfiguredInstance(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
Serializer.class);
this.keySerializer.configure(config.originals(), true);
} else {
config.ignore(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG);
this.keySerializer = keySerializer;
}
// Set the value serializer
if (valueSerializer == null) {
this.valueSerializer = config.getConfiguredInstance(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
Serializer.class);
this.valueSerializer.configure(config.originals(), false);
} else {
config.ignore(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG);
this.valueSerializer = valueSerializer;
}
// load interceptors and make sure they get clientId
userProvidedConfigs.put(ProducerConfig.CLIENT_ID_CONFIG, clientId);
// Set up the interceptors; usually not needed
List<ProducerInterceptor<K, V>> interceptorList = (List) (new ProducerConfig(userProvidedConfigs)).getConfiguredInstances(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG,
ProducerInterceptor.class);
this.interceptors = interceptorList.isEmpty() ? null : new ProducerInterceptors<>(interceptorList);
ClusterResourceListeners clusterResourceListeners = configureClusterResourceListeners(keySerializer, valueSerializer, interceptorList, reporters);
// METADATA_MAX_AGE_CONFIG = metadata.max.age.ms, default 5 minutes
// The producer refreshes the cluster metadata at this interval.
this.metadata = new Metadata(retryBackoffMs, config.getLong(ProducerConfig.METADATA_MAX_AGE_CONFIG), true, clusterResourceListeners);
// MAX_REQUEST_SIZE_CONFIG = max.request.size, default 1 MB
// The maximum size of a single request; tune it to your own workload
this.maxRequestSize = config.getInt(ProducerConfig.MAX_REQUEST_SIZE_CONFIG);
// Size of the buffer that holds records waiting to be sent, default 32 MB
this.totalMemorySize = config.getLong(ProducerConfig.BUFFER_MEMORY_CONFIG);
// Setting a compression type can raise throughput, at the cost of more CPU
this.compressionType = CompressionType.forName(config.getString(ProducerConfig.COMPRESSION_TYPE_CONFIG));
/* check for user defined settings.
* If the BLOCK_ON_BUFFER_FULL is set to true,we do not honor METADATA_FETCH_TIMEOUT_CONFIG.
* This should be removed with release 0.9 when the deprecated configs are removed.
*/
if (userProvidedConfigs.containsKey(ProducerConfig.BLOCK_ON_BUFFER_FULL_CONFIG)) {
log.warn(ProducerConfig.BLOCK_ON_BUFFER_FULL_CONFIG + " config is deprecated and will be removed soon. " +
"Please use " + ProducerConfig.MAX_BLOCK_MS_CONFIG);
boolean blockOnBufferFull = config.getBoolean(ProducerConfig.BLOCK_ON_BUFFER_FULL_CONFIG);
if (blockOnBufferFull) {
this.maxBlockTimeMs = Long.MAX_VALUE;
} else if (userProvidedConfigs.containsKey(ProducerConfig.METADATA_FETCH_TIMEOUT_CONFIG)) {
log.warn(ProducerConfig.METADATA_FETCH_TIMEOUT_CONFIG + " config is deprecated and will be removed soon. " +
"Please use " + ProducerConfig.MAX_BLOCK_MS_CONFIG);
this.maxBlockTimeMs = config.getLong(ProducerConfig.METADATA_FETCH_TIMEOUT_CONFIG);
} else {
this.maxBlockTimeMs = config.getLong(ProducerConfig.MAX_BLOCK_MS_CONFIG);
}
} else if (userProvidedConfigs.containsKey(ProducerConfig.METADATA_FETCH_TIMEOUT_CONFIG)) {
log.warn(ProducerConfig.METADATA_FETCH_TIMEOUT_CONFIG + " config is deprecated and will be removed soon. " +
"Please use " + ProducerConfig.MAX_BLOCK_MS_CONFIG);
this.maxBlockTimeMs = config.getLong(ProducerConfig.METADATA_FETCH_TIMEOUT_CONFIG);
} else {
this.maxBlockTimeMs = config.getLong(ProducerConfig.MAX_BLOCK_MS_CONFIG);
}
/* check for user defined settings.
* If the TIME_OUT config is set use that for request timeout.
* This should be removed with release 0.9
*/
if (userProvidedConfigs.containsKey(ProducerConfig.TIMEOUT_CONFIG)) {
log.warn(ProducerConfig.TIMEOUT_CONFIG + " config is deprecated and will be removed soon. Please use " +
ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG);
this.requestTimeoutMs = config.getInt(ProducerConfig.TIMEOUT_CONFIG);
} else {
this.requestTimeoutMs = config.getInt(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG);
}
// Create the RecordAccumulator, an important object
this.accumulator = new RecordAccumulator(config.getInt(ProducerConfig.BATCH_SIZE_CONFIG),
this.totalMemorySize,
this.compressionType,
config.getLong(ProducerConfig.LINGER_MS_CONFIG),
retryBackoffMs,
metrics,
time);
List<InetSocketAddress> addresses = ClientUtils.parseAndValidateAddresses(config.getList(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG));
// Seed the metadata with the bootstrap addresses; the actual cluster metadata is fetched later
this.metadata.update(Cluster.bootstrap(addresses), time.milliseconds());
ChannelBuilder channelBuilder = ClientUtils.createChannelBuilder(config.values());
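// Initialize the network component, which accepts ClientRequests and sends them to the Kafka cluster through the Selector
// CONNECTIONS_MAX_IDLE_MS_CONFIG = connections.max.idle.ms: default 9 minutes; the longest a connection may stay idle before it is closed
// MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION = max.in.flight.requests.per.connection: default 5; caps the number of unacknowledged requests the client may have outstanding on a single connection
// RECONNECT_BACKOFF_MS_CONFIG = reconnect.backoff.ms: how long to wait before trying to reconnect to a given host
// SEND_BUFFER_CONFIG = send.buffer.bytes: size of the socket send buffer, default 128 KB
// RECEIVE_BUFFER_CONFIG = receive.buffer.bytes: size of the socket receive buffer, default 32 KB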
NetworkClient client = new NetworkClient(
new Selector(config.getLong(ProducerConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG), this.metrics, time, "producer", channelBuilder),
this.metadata,
clientId, config.getInt(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION), config.getLong(ProducerConfig.RECONNECT_BACKOFF_MS_CONFIG),
config.getInt(ProducerConfig.SEND_BUFFER_CONFIG),
config.getInt(ProducerConfig.RECEIVE_BUFFER_CONFIG),
this.requestTimeoutMs, time);
// The Sender is a Runnable; it is executed by the I/O thread created below
this.sender = new Sender(client,
this.metadata,
this.accumulator, config.getInt(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION) == 1, config.getInt(ProducerConfig.MAX_REQUEST_SIZE_CONFIG), (short)parseAcks(config.getString(ProducerConfig.ACKS_CONFIG)),
config.getInt(ProducerConfig.RETRIES_CONFIG),
this.metrics,
new SystemTime(),
clientId,
this.requestTimeoutMs);
String ioThreadName = "kafka-producer-network-thread" + (clientId.length() > 0 ? " | " + clientId : "");
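// Create a KafkaThread and pass in the sender. If you look inside, KafkaThread only takes the thread name and marks the thread as a daemon, with no other logic; the benefit is that thread control is decoupled from business logic.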
this.ioThread = new KafkaThread(ioThreadName, this.sender, true);
// Start the thread.
this.ioThread.start();
this.errors = this.metrics.sensor("errors");
config.logUnused();
AppInfoParser.registerAppInfo(JMX_PREFIX, clientId);
log.debug("Kafka producer started");
} catch (Throwable t) {
// call close methods if internal objects are already constructed
// this is to prevent resource leak. see KAFKA-2121
close(0, TimeUnit.MILLISECONDS, true);
// now propagate the exception
throw new KafkaException("Failed to construct kafka producer", t);
}
}
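The decoupling mentioned in the comment on KafkaThread can be shown with a JDK-only sketch: the work lives in a Runnable, and a thin Thread subclass only handles the name and the daemon flag. The class names NamedDaemonThread, CountingTask, and DaemonThreadDemo are hypothetical stand-ins, not Kafka classes, and the task here merely counts how often it ran.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the idea behind KafkaThread: naming and daemon flag only, no work logic.
final class NamedDaemonThread extends Thread {
    NamedDaemonThread(String name, Runnable task, boolean daemon) {
        super(task, name);
        setDaemon(daemon);
    }
}

// Hypothetical task playing the role of the Sender: it knows nothing about threads.
final class CountingTask implements Runnable {
    final AtomicInteger runs = new AtomicInteger();
    final CountDownLatch done = new CountDownLatch(1);

    @Override
    public void run() {
        runs.incrementAndGet();
        done.countDown();
    }
}

final class DaemonThreadDemo {
    /** Starts the task on a named daemon thread and waits up to one second for it to finish. */
    static boolean runAndWait(String threadName, CountingTask task) {
        Thread io = new NamedDaemonThread(threadName, task, true);
        io.start();
        try {
            return task.done.await(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        CountingTask task = new CountingTask();
        boolean finished = runAndWait("demo-network-thread", task);
        System.out.println("finished=" + finished + ", runs=" + task.runs.get());
    }
}
```

Because the thread is a daemon, it does not keep the JVM alive on its own, which is why a background I/O thread of this kind still needs an explicit close path to flush pending work.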
That completes the analysis of KafkaProducer's initialization. Next comes the message-sending flow, so let's focus on the run() method of the example Producer.
public void run() {
// The message key
int messageNo = 1;
while (true) {
// The message value
String messageStr = "Message_" + messageNo;
long startTime = System.currentTimeMillis();
// isAsync: how the data is sent; true means asynchronous, false means synchronous
if (isAsync) { // Send asynchronously
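// ProducerRecord wraps the topic, key, and value of the message. A Callback object is passed along with it; when the producer receives the acknowledgement from Kafka, the callback's onCompletion() method is invoked
// Asynchronous send: keep sending and let the callback handle the results; it performs well and is what producers use in the vast majority of cases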
producer.send(new ProducerRecord<>(topic,
messageNo,
messageStr), new DemoCallBack(startTime, messageNo, messageStr));
} else { // Send synchronously
try {
// Synchronous send: each message must be acknowledged before the next one is sent; throughput is poor, so this mode is rarely used
producer.send(new ProducerRecord<>(topic,
messageNo,
messageStr)).get();
System.out.println("Sent message: (" + messageNo + ", " + messageStr + ")");
} catch (InterruptedException | ExecutionException e) {
e.printStackTrace();
}
}
// Increment the message key
++messageNo;
}
}
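The difference between the two branches of run() comes down to whether the caller blocks on the returned Future. The JDK-only sketch below imitates that contract with a CompletableFuture. ToySender is a hypothetical stand-in for a producer, its "offset" is simply the key echoed back, and the callback shape (result, error) is an assumption modeled loosely on onCompletion().

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.BiConsumer;

// Hypothetical stand-in for a producer: "sending" echoes the key back as a fake offset
// from a background thread, so both calling styles can be tried without a broker.
final class ToySender implements AutoCloseable {
    private final ExecutorService io = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "toy-io");
        t.setDaemon(true);
        return t;
    });

    /**
     * Returns immediately. If a callback is given, the returned future completes
     * only after the callback has run with (offset, error), one of which is null.
     */
    CompletableFuture<Long> send(int key, BiConsumer<Long, Throwable> callback) {
        CompletableFuture<Long> ack = CompletableFuture.supplyAsync(() -> (long) key, io);
        return callback == null ? ack : ack.whenComplete(callback);
    }

    @Override
    public void close() {
        io.shutdown();
    }

    public static void main(String[] args) throws ExecutionException, InterruptedException {
        try (ToySender sender = new ToySender()) {
            // Synchronous style: block on the future before sending the next record.
            long offset = sender.send(1, null).get();
            System.out.println("sync send acknowledged, offset " + offset);

            // Asynchronous style: hand over a callback and carry on.
            CompletableFuture<Long> pending = sender.send(2, (off, err) ->
                    System.out.println(err == null ? "async send acknowledged, offset " + off
                                                   : "send failed: " + err));
            pending.join(); // only so the demo does not exit before the callback prints
        }
    }
}
```

In the synchronous style each call pays a full round trip before the next one starts, whereas the asynchronous style lets many sends be in flight at once.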
Let's first look at the Callback object, then come back and dig into the producer's send(xxx) method.
class DemoCallBack implements Callback {
private final long startTime;
private final int key;
private final String message;
public DemoCallBack(long startTime, int key, String message) {
this.startTime = startTime;
this.key = key;
this.message = message;
}
/**
* A callback method the user can implement to provide asynchronous handling of request completion. This method will
* be called when the record sent to the server has been acknowledged. Exactly one of the arguments will be
* non-null.
*
* @param metadata The metadata for the record that was sent (i.e. the partition and offset). Null if an error
* occurred.
* @param exception The exception thrown during processing of this record. Null if no error occurred.
*/
public void onCompletion(RecordMetadata metadata, Exception exception) {
long elapsedTime = System.currentTimeMillis() - startTime;
// RecordMetadata carries the partition, the offset, and related information
if (metadata != null) {
System.out.println(
"message(" + key + ", " + message + ") sent to partition(" + metadata.partition() +
"), " +
"offset(" + metadata.offset() + ") in " + elapsedTime + " ms");
} else {
exception.printStackTrace();
}
}
}
Now let's turn to the send(xxx) method. A sequence diagram of send() is attached here to help with reading.
@Override
public Future<RecordMetadata> send(ProducerRecord<K, V> record, Callback callback) {
// intercept the record, which can be potentially modified; this method does not throw exceptions
ProducerRecord<K, V> interceptedRecord = this.interceptors == null ? record : this.interceptors.onSend(record);
// This is the key call
return doSend(interceptedRecord, callback);
}
/**
* Implementation of asynchronously send a record to a topic.
*/
private Future<RecordMetadata> doSend(ProducerRecord<K, V> record, Callback callback) {
TopicPartition tp = null;
try {
ClusterAndWaitTime clusterAndWaitTime = waitOnMetadata(record.topic(), record.partition(), maxBlockTimeMs);
// clusterAndWaitTime.waitedOnMetadataMs is the time spent waiting for the metadata.
long remainingWaitMs = Math.max(0, maxBlockTimeMs - clusterAndWaitTime.waitedOnMetadataMs);
// The cluster metadata obtained above
Cluster cluster = clusterAndWaitTime.cluster;
// Serialize the message key.
byte[] serializedKey;
try {
serializedKey = keySerializer.serialize(record.topic(), record.key());
} catch (ClassCastException cce) {
throw new SerializationException("Can't convert key of class " + record.key().getClass().getName() +
" to class " + producerConfig.getClass(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG).getName() +
" specified in key.serializer");
}
// Serialize the message value.
byte[] serializedValue;
try {
serializedValue = valueSerializer.serialize(record.topic(), record.value());
} catch (ClassCastException cce) {
throw new SerializationException("Can't convert value of class " + record.value().getClass().getName() +
" to class " + producerConfig.getClass(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG).getName() +
" specified in value.serializer");
}
// Use the partitioner and the cluster metadata to assign the message to a partition.
int partition = partition(record, serializedKey, serializedValue, cluster);
int serializedSize = Records.LOG_OVERHEAD + Record.recordSize(serializedKey, serializedValue);
// Check that the serialized record is not too large; the max.request.size limit defaults to 1 MB
ensureValidRecordSize(serializedSize);
// Build the TopicPartition object from the topic and the chosen partition
tp = new TopicPartition(record.topic(), partition);
long timestamp = record.timestamp() == null ? time.milliseconds() : record.timestamp();
log.trace("Sending record {} with callback {} to topic {} partition {}", record, callback, record.topic(), partition);
// producer callback will make sure to call both 'callback' and interceptor callback
// Bind the callback to the message
Callback interceptCallback = this.interceptors == null ? callback : new InterceptorCallback<>(callback, this.interceptors, tp);
// Append the message to the accumulator; its buffer defaults to 32 MB
RecordAccumulator.RecordAppendResult result = accumulator.append(tp, timestamp, serializedKey, serializedValue, interceptCallback, remainingWaitMs);
// If the batch is full or a new batch has just been created
if (result.batchIsFull || result.newBatchCreated) {
log.trace("Waking up the sender since topic {} partition {} is either full or getting a new batch", record.topic(), partition);
// Wake up the sender thread; the sender is the thread that actually sends the data.
this.sender.wakeup();
}
return result.future;
// handling exceptions and record the errors;
// for API exceptions return them in the future,
// for other exceptions throw directly
} catch (ApiException e) {
log.debug("Exception occurred during message send:", e);
if (callback != null)
callback.onCompletion(null, e);
this.errors.record();
if (this.interceptors != null)
this.interceptors.onSendError(record, tp, e);
return new FutureFailure(e);
} catch (InterruptedException e) {
this.errors.record();
if (this.interceptors != null)
this.interceptors.onSendError(record, tp, e);
throw new InterruptException(e);
} catch (BufferExhaustedException e) {
this.errors.record();
this.metrics.sensor("buffer-exhausted-records").record();
if (this.interceptors != null)
this.interceptors.onSendError(record, tp, e);
throw e;
} catch (KafkaException e) {
this.errors.record();
if (this.interceptors != null)
this.interceptors.onSendError(record, tp, e);
throw e;
} catch (Exception e) {
// we notify interceptor about all exceptions, since onSend is called before anything else in this method
if (this.interceptors != null)
this.interceptors.onSendError(record, tp, e);
throw e;
}
}
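The partition(...) step in doSend() chooses where a record goes. A minimal JDK-only sketch of one common strategy is given below: honor an explicitly requested partition, hash the serialized key when there is one, and rotate through the partitions otherwise. ToyPartitioner is a hypothetical class, and Arrays.hashCode is used only to keep the sketch dependency-free, so the partition numbers it yields will not match what a real producer computes.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified partition chooser; the hash function is a stand-in chosen for
// illustration and will not reproduce the assignments of a real Kafka producer.
final class ToyPartitioner {
    private final AtomicInteger counter = new AtomicInteger();

    int partition(Integer explicitPartition, byte[] keyBytes, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        if (explicitPartition != null) {
            // The record names its partition: validate it and use it as is.
            if (explicitPartition < 0 || explicitPartition >= numPartitions) {
                throw new IllegalArgumentException("Invalid partition: " + explicitPartition);
            }
            return explicitPartition;
        }
        if (keyBytes == null) {
            // No key: spread records over the partitions in rotation.
            return Math.floorMod(counter.getAndIncrement(), numPartitions);
        }
        // Same key bytes always map to the same partition, which keeps per-key ordering.
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    public static void main(String[] args) {
        ToyPartitioner partitioner = new ToyPartitioner();
        byte[] key = {0, 0, 0, 42};
        System.out.println("keyed record    -> partition " + partitioner.partition(null, key, 3));
        System.out.println("keyless record  -> partition " + partitioner.partition(null, null, 3));
        System.out.println("explicit record -> partition " + partitioner.partition(2, null, 3));
    }
}
```

Math.floorMod is used instead of % so that a negative hash code can never produce a negative partition number.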
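Once the sender thread is woken up, it starts sending messages to the Kafka cluster. That concludes the overall analysis of how a producer sends messages. This article works at a fairly coarse granularity: it covers wrapping the message, serialization, partitioning, batching, and the sender thread sending requests to the Kafka cluster through NetworkClient, along with some common configuration parameters. The finer details are not expanded here.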
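Summary
At this point many readers will probably wonder whether something as important as the producer's send path can really be covered this briefly. It cannot: the source code and implementation behind it are fairly complex. To make the material easier to read and the whole process easier to follow, I have split it into two articles, one on the overall flow and one on the details. The detailed article covers a lot of ground: management of the cluster metadata, the structure of the metadata and how it is fetched and updated, the partitioner, the core RecordAccumulator class, how batches are built and sent, the sending mechanism, how read/write safety and high performance are achieved, the design and use of the memory pool, and the use of the network component NetworkClient, and more may come up along the way. The articles in this series are closely related, so if you are interested, keep following the Kafka source-code analysis series in the big data source-code series of the WeChat official account 数据与智能. Comments, discussion, and corrections are all welcome.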