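Preface
This article looks at partition assignment in Kafka: how a key maps to a partition, how partitions are assigned to consumers, and how partition replicas are assigned to brokers/machines.
1. Key-to-partition mapping
In Kafka 0.8, the mapping is implemented as follows: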
kafka-clients-0.8.2.2-sources.jar!/org/apache/kafka/clients/producer/internals/Partitioner.java
/**
 * The default partitioning strategy:
 * <ul>
 * <li>If a partition is specified in the record, use it
 * <li>If no partition is specified but a key is present choose a partition based on a hash of the key
 * <li>If no partition or key is present choose a partition in a round-robin fashion
 * </ul>
 */
public class Partitioner {
    private final AtomicInteger counter = new AtomicInteger(new Random().nextInt());
    /**
     * Compute the partition for the given record.
     *
     * @param record The record being sent
     * @param cluster The current cluster metadata
     */
    public int partition(ProducerRecord<byte[], byte[]> record, Cluster cluster) {
        List<PartitionInfo> partitions = cluster.partitionsForTopic(record.topic());
        int numPartitions = partitions.size();
        if (record.partition() != null) {
            // they have given us a partition, use it
            if (record.partition() < 0 || record.partition() >= numPartitions)
                throw new IllegalArgumentException("Invalid partition given with record: " + record.partition()
                                                   + " is not in the range [0..."
                                                   + numPartitions
                                                   + "].");
            return record.partition();
        } else if (record.key() == null) {
            int nextValue = counter.getAndIncrement();
            List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(record.topic());
            if (availablePartitions.size() > 0) {
                int part = Utils.abs(nextValue) % availablePartitions.size();
                return availablePartitions.get(part).partition();
            } else {
                // no partitions are available, give a non-available partition
                return Utils.abs(nextValue) % numPartitions;
            }
        } else {
            // hash the key to choose a partition
            return Utils.abs(Utils.murmur2(record.key())) % numPartitions;
        }
    }
}
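The three branches above can be sketched in a few self-contained lines. This is only an illustration, not Kafka's code: the class and method names are hypothetical, `Arrays.hashCode` stands in for `Utils.murmur2` (so the partitions it picks will differ from what a real producer picks), the counter starts at 0 instead of a random value, and the filtering by available partitions is left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the decision order: explicit partition, then key hash, then round-robin.
class PartitionChoiceSketch {
    // Round-robin counter; starts at 0 here so the behaviour is easy to follow.
    private final AtomicInteger counter = new AtomicInteger(0);

    // Clear the sign bit so the value can be used with % without going negative.
    static int toNonNegative(int n) {
        return n & 0x7fffffff;
    }

    // Stand-in hash for illustration only; Kafka hashes the key bytes with murmur2.
    static int standInHash(byte[] key) {
        return Arrays.hashCode(key);
    }

    int choose(Integer partition, byte[] key, int numPartitions) {
        if (partition != null) {
            // An explicit partition wins, as long as it is in range.
            if (partition < 0 || partition >= numPartitions)
                throw new IllegalArgumentException("Invalid partition: " + partition);
            return partition;
        }
        if (key == null) {
            // No partition and no key: rotate through the partitions.
            return toNonNegative(counter.getAndIncrement()) % numPartitions;
        }
        // No partition but a key: the same key always lands on the same partition.
        return toNonNegative(standInHash(key)) % numPartitions;
    }

    public static void main(String[] args) {
        PartitionChoiceSketch sketch = new PartitionChoiceSketch();
        byte[] key = "user-42".getBytes(StandardCharsets.UTF_8);
        System.out.println("explicit:    " + sketch.choose(1, key, 4));
        System.out.println("keyed:       " + sketch.choose(null, key, 4));
        System.out.println("round-robin: " + sketch.choose(null, null, 4));
    }
}
```

One consequence worth noting: because the keyed branch takes the hash modulo `numPartitions`, changing the number of partitions of a topic changes where existing keys land.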
Kafka 0.9+
Versions 0.9 and later support custom partitioners, configured through the partitioner.class property. The original Partitioner class became an interface:
kafka-clients-0.9.0.1-sources.jar!/org/apache/kafka/clients/producer/Partitioner.java
public interface Partitioner extends Configurable {
    /**
     * Compute the partition for the given record.
     *
     * @param topic The topic name
     * @param key The key to partition on (or null if no key)
     * @param keyBytes The serialized key to partition on (or null if no key)
     * @param value The value to partition on or null
     * @param valueBytes The serialized value to partition on or null
     * @param cluster The current cluster metadata
     */
    public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster);
    /**
     * This is called when partitioner is closed.
     */
    public void close();
}
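A custom implementation of this interface is selected by class name through the partitioner.class property. The following is a minimal sketch of the producer configuration only, using plain `java.util.Properties`. The class name `com.example.MyPartitioner`, the helper `producerProps`, and the broker address are hypothetical placeholders, and a real producer would also need its serializer settings.

```java
import java.util.Properties;

// Hypothetical helper that builds producer settings with a custom partitioner.
class ProducerConfigSketch {
    static Properties producerProps(String partitionerClassName) {
        Properties props = new Properties();
        // Placeholder broker address; replace with the real cluster.
        props.put("bootstrap.servers", "localhost:9092");
        // Fully qualified name of a class implementing the Partitioner interface.
        props.put("partitioner.class", partitionerClassName);
        return props;
    }

    public static void main(String[] args) {
        // com.example.MyPartitioner is an assumed name, not a class shipped with Kafka.
        Properties props = producerProps("com.example.MyPartitioner");
        System.out.println(props.getProperty("partitioner.class"));
    }
}
```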
The previous implementation then became the default implementation:
kafka-clients-0.9.0.1-sources.jar!/org/apache/kafka/clients/producer/internals/DefaultPartitioner.java
/**
 * The default partitioning strategy:
 * <ul>
 * <li>If a partition is specified in the record, use it
 * <li>If no partition is specified but a key is present choose a partition based on a hash of the key
 * <li>If no partition or key is present choose a partition in a round-robin fashion
 * </ul>
 */
public class DefaultPartitioner implements Partitioner {
    private final AtomicInteger counter = new AtomicInteger(new Random().nextInt());
    /**
     * A cheap way to deterministically convert a number to a positive value. When the input is
     * positive, the original value is returned. When the input number is negative, the returned
     * positive value is the original value bit AND against 0x7fffffff which is not its absolutely
     * value.
     *
     * Note: changing this method in the future will possibly cause partition selection not to be
     * compatible with the existing messages already placed on a partition.
     *
     * @param number a given number
     * @return a positive number.
     */
    private static int toPositive(int number) {
        return number & 0x7fffffff;
    }
    public void configure(Map<String, ?> configs) {}
    /**
     * Compute the partition for the given record.
     *
     * @param topic The topic name
     * @param key The key to partition on (or null if no key)
     * @param keyBytes serialized key to partition on (or null if no key)
     * @param value The value to partition on or null
     * @param valueBytes serialized value to partition on or null
     * @param cluster The current cluster metadata
     */
    public int p