In the previous post I walked through the producer-side design points I care about (how partitioning works, how reliable delivery is guaranteed via acks, the in-memory data structures on the producer side), following the flow of a message send through the code. A few questions, such as ordering guarantees and the protocol and network layers, are deliberately set aside for now. After that walkthrough my head was still a little fuzzy, so today I opened the clients source package, picked out the APIs that showed up in the last post plus a few I had not looked at yet, and went through them one by one.
First, a quick look at the package structure.
1. BufferPool
BufferPool hands out fixed-size blocks of memory for cyclic reuse, avoiding the overhead of constantly allocating and freeing buffers. At the same time it keeps allocation fair to producers: the thread that has been waiting the longest is the first to get usable memory.
Let's start with the constructor.
/**
* Create a new buffer pool
*
* @param memory The maximum amount of memory that this buffer pool can allocate
* @param poolableSize The buffer size to cache in the free list rather than deallocating
* @param metrics instance of Metrics
* @param time time instance
* @param metricGrpName logical group name for metrics
*/
public BufferPool(long memory, int poolableSize, Metrics metrics, Time time, String metricGrpName) {
this.poolableSize = poolableSize;
this.lock = new ReentrantLock();
this.free = new ArrayDeque<ByteBuffer>();
this.waiters = new ArrayDeque<Condition>();
this.totalMemory = memory;
this.availableMemory = memory;
this.metrics = metrics;
this.time = time;
this.waitTime = this.metrics.sensor("bufferpool-wait-time");
MetricName metricName = metrics.metricName("bufferpool-wait-ratio",
metricGrpName,
"The fraction of time an appender waits for space allocation.");
this.waitTime.add(metricName, new Rate(TimeUnit.NANOSECONDS));
}
The two parameters to note are memory and poolableSize: memory is the maximum amount of memory this pool may hand out in total, and poolableSize is the size of the buffers that are cached in the free list instead of being deallocated.
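For context (as far as I can tell from KafkaProducer and RecordAccumulator, so treat this as my reading of the version at hand), these two values are wired in from the producer configuration: buffer.memory becomes the pool's total memory and batch.size becomes poolableSize. A configuration sketch with illustrative values:
properties.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432L); // 32 MB, becomes the pool's memory argument
properties.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);        // 16 KB, becomes the pool's poolableSize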
Next, the allocate method.
/**
* Allocate a buffer of the given size. This method blocks if there is not enough memory and the buffer pool
* is configured with blocking mode.
*
* @param size The buffer size to allocate in bytes
* @param maxTimeToBlockMs The maximum time in milliseconds to block for buffer memory to be available
* @return The buffer
* @throws InterruptedException If the thread is interrupted while blocked
* @throws IllegalArgumentException if size is larger than the total memory controlled by the pool (and hence we would block
* forever)
*/
public ByteBuffer allocate(int size, long maxTimeToBlockMs) throws InterruptedException {
if (size > this.totalMemory)
throw new IllegalArgumentException("Attempt to allocate " + size
+ " bytes, but there is a hard limit of "
+ this.totalMemory
+ " on memory allocations.");
this.lock.lock();
try {
// check if we have a free buffer of the right size pooled
if (size == poolableSize && !this.free.isEmpty())
return this.free.pollFirst();
// now check if the request is immediately satisfiable with the
// memory on hand or if we need to block
int freeListSize = this.free.size() * this.poolableSize;
if (this.availableMemory + freeListSize >= size) {
// we have enough unallocated or pooled memory to immediately
// satisfy the request
freeUp(size);
this.availableMemory -= size;
lock.unlock();
return ByteBuffer.allocate(size);
} else {
// we are out of memory and will have to block
int accumulated = 0;
ByteBuffer buffer = null;
Condition moreMemory = this.lock.newCondition();
long remainingTimeToBlockNs = TimeUnit.MILLISECONDS.toNanos(maxTimeToBlockMs);
this.waiters.addLast(moreMemory);
// loop over and over until we have a buffer or have reserved
// enough memory to allocate one
while (accumulated < size) {
long startWaitNs = time.nanoseconds();
long timeNs;
boolean waitingTimeElapsed;
try {
waitingTimeElapsed = !moreMemory.await(remainingTimeToBlockNs, TimeUnit.NANOSECONDS);
} catch (InterruptedException e) {
this.waiters.remove(moreMemory);
throw e;
} finally {
long endWaitNs = time.nanoseconds();
timeNs = Math.max(0L, endWaitNs - startWaitNs);
this.waitTime.record(timeNs, time.milliseconds());
}
if (waitingTimeElapsed) {
this.waiters.remove(moreMemory);
throw new TimeoutException("Failed to allocate memory within the configured max blocking time " + maxTimeToBlockMs + " ms.");
}
remainingTimeToBlockNs -= timeNs;
// check if we can satisfy this request from the free list,
// otherwise allocate memory
if (accumulated == 0 && size == this.poolableSize && !this.free.isEmpty()) {
// just grab a buffer from the free list
buffer = this.free.pollFirst();
accumulated = size;
} else {
// we'll need to allocate memory, but we may only get
// part of what we need on this iteration
freeUp(size - accumulated);
int got = (int) Math.min(size - accumulated, this.availableMemory);
this.availableMemory -= got;
accumulated += got;
}
}
// remove the condition for this thread to let the next thread
// in line start getting memory
Condition removed = this.waiters.removeFirst();
if (removed != moreMemory)
throw new IllegalStateException("Wrong condition: this shouldn't happen.");
// signal any additional waiters if there is more memory left
// over for them
if (this.availableMemory > 0 || !this.free.isEmpty()) {
if (!this.waiters.isEmpty())
this.waiters.peekFirst().signal();
}
// unlock and return the buffer
lock.unlock();
if (buffer == null)
return ByteBuffer.allocate(size);
else
return buffer;
}
} finally {
if (lock.isHeldByCurrentThread())
lock.unlock();
}
}
The comments are clear enough, but to restate it for myself: this allocates a buffer of the given size, and if there is not enough memory the calling thread blocks. The problem is amplified when the rate at which the producer can send messages falls behind the rate at which the application produces them, and a blocked thread hurts the availability of the application, so keep this in mind when designing around and sizing the buffer cache.
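For reference, in the client version I am reading the maxTimeToBlockMs passed in here comes from the producer's max.block.ms setting, so that is the knob to tune alongside buffer.memory. A one-line sketch with an illustrative value:
properties.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 5000); // after blocking 5s for memory, allocation fails with the TimeoutException shown above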
Stepping into the code:
if (size == poolableSize && !this.free.isEmpty())
return this.free.pollFirst();
If the free list happens to hold a cached buffer of exactly the requested size, that ByteBuffer is simply polled off the deque and returned. Otherwise a new buffer has to be allocated from the JVM heap:
if (this.availableMemory + freeListSize >= size) {
// we have enough unallocated or pooled memory to immediately
// satisfy the request
freeUp(size);
this.availableMemory -= size;
lock.unlock();
return ByteBuffer.allocate(size);
}
If there is not enough memory, the current thread has to block.
Condition moreMemory = this.lock.newCondition();
long remainingTimeToBlockNs = TimeUnit.MILLISECONDS.toNanos(maxTimeToBlockMs);
this.waiters.addLast(moreMemory);
waiters is a deque, and this double-ended queue is how BufferPool keeps allocation "fair" across waiting threads.
The blocking call:
waitingTimeElapsed = !moreMemory.await(remainingTimeToBlockNs, TimeUnit.NANOSECONDS);
The allocation can also be satisfied piece by piece, accumulating memory across iterations:
// we'll need to allocate memory, but we may only get
// part of what we need on this iteration
freeUp(size - accumulated);
int got = (int) Math.min(size - accumulated, this.availableMemory);
this.availableMemory -= got;
accumulated += got;
Once its own allocation is complete, the thread removes itself from waiters and signals the next one in line, which is quite "polite":
// remove the condition for this thread to let the next thread
// in line start getting memory
Condition removed = this.waiters.removeFirst();
if (removed != moreMemory)
throw new IllegalStateException("Wrong condition: this shouldn't happen.");
// signal any additional waiters if there is more memory left
// over for them
if (this.availableMemory > 0 || !this.free.isEmpty()) {
if (!this.waiters.isEmpty())
this.waiters.peekFirst().signal();
}
Finally the lock is released, and the allocation is done.
Next, the deallocate method, which returns memory to the pool.
/**
* Return buffers to the pool. If they are of the poolable size add them to the free list, otherwise just mark the
* memory as free.
*
* @param buffer The buffer to return
* @param size The size of the buffer to mark as deallocated, note that this maybe smaller than buffer.capacity
* since the buffer may re-allocate itself during in-place compression
*/
public void deallocate(ByteBuffer buffer, int size) {
lock.lock();
try {
if (size == this.poolableSize && size == buffer.capacity()) {
buffer.clear();
this.free.add(buffer);
} else {
this.availableMemory += size;
}
Condition moreMem = this.waiters.peekFirst();
if (moreMem != null)
moreMem.signal();
} finally {
lock.unlock();
}
}
The code is simple: if the returned buffer is exactly the pooled size cached by the free list, it is cleared and put back on the queue; otherwise availableMemory is simply increased by that amount. Finally the first "classmate" waiting in the queue is signalled to go grab its memory. That is the whole deallocation path.
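To make the allocate/deallocate cycle and the free-list reuse concrete, here is a minimal throwaway sketch. BufferPool lives in org.apache.kafka.clients.producer.internals, so this is test-style usage only, and the Metrics/SystemTime arguments are just there to satisfy the constructor shown above:
import java.nio.ByteBuffer;
import org.apache.kafka.clients.producer.internals.BufferPool;
import org.apache.kafka.common.metrics.Metrics;
import org.apache.kafka.common.utils.SystemTime;

public class BufferPoolDemo {
    public static void main(String[] args) throws InterruptedException {
        // 1 MB of total memory, 16 KB poolable buffer size
        BufferPool pool = new BufferPool(1024 * 1024L, 16 * 1024, new Metrics(), new SystemTime(), "demo");

        ByteBuffer first = pool.allocate(16 * 1024, 1000);   // carved out of availableMemory
        pool.deallocate(first, first.capacity());             // poolable size, so it is cleared and cached in the free list
        ByteBuffer second = pool.allocate(16 * 1024, 1000);   // served straight from the free list

        System.out.println(first == second);                   // true: the same ByteBuffer is handed out again
    }
}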
That wraps up BufferPool. Its core job is to manage the allocation and release of the memory backing the Kafka producer's asynchronous send buffer, and how well it is designed directly affects the producer's availability. Both the caching of poolableSize-sized ByteBuffers and the use of a deque to guarantee fairness are designs worth borrowing.
2. ProducerInterceptors && ProducerInterceptor
public ProducerInterceptors(List<ProducerInterceptor<K, V>> interceptors) {
this.interceptors = interceptors;
}
ProducerInterceptors is really just a wrapper around ProducerInterceptor: the constructor takes a list of ProducerInterceptor instances, which are then invoked one after another. The interface to focus on here is ProducerInterceptor itself.
Let's look at the code first.
public ProducerRecord<K, V> onSend(ProducerRecord<K, V> record);
public void onAcknowledgement(RecordMetadata metadata, Exception exception);
public void close();
There are only three APIs. onSend is invoked from KafkaProducer#send, before the key and value are serialized and before the partition is assigned; note that onSend returns a ProducerRecord, which is what makes interceptor chaining possible and lets an interceptor modify the record. onAcknowledgement is called when the record sent to the server has been acknowledged, or when sending the record fails before it reaches the server. close is called when the interceptor is shut down.
An interceptor is enabled when configuring the KafkaProducer, for example:
properties.put(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG,
"org.zh.kafka_study.producerpart.KafkaProducerSendInterceptor");
I really like this feature; it is especially handy after customizing the client or when integrating with Spring. You will appreciate the benefits once you use it in practice.
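For illustration, here is a rough sketch of what a minimal interceptor like the KafkaProducerSendInterceptor registered above might look like (my guess, not the actual class): it tags every value in onSend and logs failures in onAcknowledgement.
import java.util.Map;
import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class KafkaProducerSendInterceptor implements ProducerInterceptor<String, String> {

    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> record) {
        // runs before serialization and partition assignment; returning a new
        // record is how an interceptor rewrites the message
        return new ProducerRecord<>(record.topic(), record.partition(), record.key(),
                "[intercepted] " + record.value());
    }

    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) {
        // called once the broker has acknowledged the record, or with an
        // exception if the send failed before reaching the server
        if (exception != null) {
            System.err.println("send failed: " + exception.getMessage());
        }
    }

    @Override
    public void close() {
        // release any resources held by the interceptor
    }

    @Override
    public void configure(Map<String, ?> configs) {
        // receives the producer configs at startup (from Configurable)
    }
}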
3. Partitioner && DefaultPartitioner
Partitioner is the core interface through which the producer spreads messages across partitions. Each time send is called, KafkaProducer first calls its internal partition method: if a partition was specified on the record, that partition is used; otherwise the Partitioner is asked to compute the partition for the message. That is the producer's partitioning strategy; see the code:
/**
* computes partition for given record.
* if the record has partition returns the value otherwise
* calls configured partitioner class to compute the partition.
*/
private int partition(ProducerRecord<K, V> record, byte[] serializedKey , byte[] serializedValue, Cluster cluster) {
Integer partition = record.partition();
if (partition != null) {
List<PartitionInfo> partitions = cluster.partitionsForTopic(record.topic());
int lastPartition = partitions.size() - 1;
// they have given us a partition, use it
if (partition < 0 || partition > lastPartition) {
throw new IllegalArgumentException(String.format("Invalid partition given with record: %d is not in the range [0...%d].", partition, lastPartition));
}
return partition;
}
return this.partitioner.partition(record.topic(), record.key(), serializedKey, record.value(), serializedValue,
cluster);
}
This was already covered in the previous post, so I won't repeat it here. What we care about now is the Partitioner interface and the default implementation shipped with the client.
First, the Partitioner interface:
/**
* Partitioner Interface
*/
public interface Partitioner extends Configurable {
/**
* Compute the partition for the given record.
*
* @param topic The topic name
* @param key The key to partition on (or null if no key)
* @param keyBytes The serialized key to partition on( or null if no key)
* @param value The value to partition on or null
* @param valueBytes The serialized value to partition on or null
* @param cluster The current cluster metadata
*/
public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster);
/**
* This is called when partitioner is closed.
*/
public void close();
}
The core is the partition method: given the topic, the key and value, and the current cluster metadata, it returns the computed partition number.
Next, the default implementation:
/**
* Compute the partition for the given record.
*
* @param topic The topic name
* @param key The key to partition on (or null if no key)
* @param keyBytes serialized key to partition on (or null if no key)
* @param value The value to partition on or null
* @param valueBytes serialized value to partition on or null
* @param cluster The current cluster metadata
*/
public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
int numPartitions = partitions.size();
if (keyBytes == null) {
int nextValue = counter.getAndIncrement();
List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
if (availablePartitions.size() > 0) {
int part = DefaultPartitioner.toPositive(nextValue) % availablePartitions.size();
return availablePartitions.get(part).partition();
} else {
// no partitions are available, give a non-available partition
return DefaultPartitioner.toPositive(nextValue) % numPartitions;
}
} else {
// hash the keyBytes to choose a partition
return DefaultPartitioner.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
}
}
The code is simple: fetch the partitions for the topic, then check whether the record carries a key. If there is a key, its serialized bytes are hashed (murmur2) and the hash is taken modulo the number of partitions. If there is no key, the partitioner round-robins: over the available partitions if any exist, otherwise over the full partition list.
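Swapping in your own strategy only requires implementing Partitioner and registering it via partitioner.class. Below is a hedged sketch (the class name and the policy are made up for illustration): records without a key all land on partition 0, while keyed records keep the same hash-mod rule as the default.
import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.utils.Utils;

public class KeyOrZeroPartitioner implements Partitioner {

    @Override
    public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
        int numPartitions = cluster.partitionsForTopic(topic).size();
        if (keyBytes == null) {
            // keyless records all go to partition 0
            return 0;
        }
        // same idea as DefaultPartitioner: murmur2 hash, masked to be non-negative, modulo the partition count
        return (Utils.murmur2(keyBytes) & 0x7fffffff) % numPartitions;
    }

    @Override
    public void close() {
        // nothing to clean up
    }

    @Override
    public void configure(Map<String, ?> configs) {
        // receives the producer configs at startup
    }
}
It is enabled the same way as the interceptor above:
properties.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, KeyOrZeroPartitioner.class.getName());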
That's all for now; I'll continue in a later post.