Our project recently started using Kafka for messaging, driven through the Java API. A colleague downloaded a generic consumer implementation from the web: the topic is passed into the consumer's constructor, one thread is created per partition, and each thread reads messages from that topic. Straight to the code:
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class KafkaMessageConsumer {
    private final String topic;
    private final int partitionsNum;
    private final MessageExecutor executor;
    private ConsumerConfig config;
    private ConsumerConnector connector;
    private ExecutorService threadPool;

    public KafkaMessageConsumer(String topic, int partitionsNum, MessageExecutor executor) throws Exception {
        Properties properties = new Properties();
        // Load consumer.properties from the classpath; close the stream when done
        try (InputStream fis = KafkaMessageConsumer.class.getClassLoader()
                .getResourceAsStream("consumer.properties")) {
            properties.load(fis);
        }
        config = new ConsumerConfig(properties);
        this.topic = topic;
        this.partitionsNum = partitionsNum;
        this.executor = executor;
    }

    public void kafkaStart() throws Exception {
        connector = Consumer.createJavaConsumerConnector(config);
        Map<String, Integer> topics = new HashMap<String, Integer>();
        topics.put(topic, partitionsNum);
        Map<String, List<KafkaStream<byte[], byte[]>>> streams = connector.createMessageStreams(topics);
        List<KafkaStream<byte[], byte[]>> partitions = streams.get(topic);
        threadPool = Executors.newFixedThreadPool(partitionsNum);
        for (KafkaStream<byte[], byte[]> partition : partitions) {
            threadPool.execute(new MessageRunner(partition));
        }
    }

    class MessageRunner implements Runnable {
        private KafkaStream<byte[], byte[]> partition;

        MessageRunner(KafkaStream<byte[], byte[]> partition) {
            this.partition = partition;
        }

        public void run() {
            ConsumerIterator<byte[], byte[]> it = partition.iterator();
            while (it.hasNext()) {
                MessageAndMetadata<byte[], byte[]> item = it.next();
                System.out.println("partition: " + item.partition());
                System.out.println("offset: " + item.offset());
                // Decode explicitly as UTF-8 instead of the platform default charset
                executor.execute(new String(item.message(), StandardCharsets.UTF_8));
            }
        }
    }

    interface MessageExecutor {
        void execute(String message);
    }
}
Then a small caller class:
public class Test {
    // log and kafkaMessageDispatcher come from the surrounding project (not shown here)
    public static void start() {
        KafkaMessageConsumer consumer = null;
        log.info("kafka Message Queue listen is running...");
        try {
            MessageExecutor executor = new MessageExecutor() {
                public void execute(String message) {
                    log.info(message);
                    try {
                        kafkaMessageDispatcher.processMessage(message);
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            };
            consumer = new KafkaMessageConsumer(Constants.MY_TOPIC, 3, executor);
            consumer.kafkaStart();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
In the test class we call KafkaMessageConsumer directly with our own topic and partition count; the executor is the instance whose logic actually runs once a message is received.

This works well enough, but it leans on several inner classes, which makes for tiring reading. I had also recently been studying plugin development, where simply extending a class or implementing an interface is enough to get your code loaded and executed, and I liked that idea.

So my rewrite goes roughly like this:
1. Define an interface with a single method; each implementing class puts its own message-consumption business logic in that method.
2. Each implementing class also declares, via an annotation on its constructor, the topic it cares about and the corresponding partition count.
3. Rewrite the Kafka consumer class to discover every implementation of the interface via reflection, instantiate each one, read the annotation on its constructor, and use that information to build per-instance threads that listen on the declared topic.
The interface definition:
public interface IKafkaConsumer {
    void consumeMessage(MessageAndMetadata<byte[], byte[]> binaryMsg);
}
The annotation definition:
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.CONSTRUCTOR) // applies at the constructor level
public @interface KafkaTopicSetup {
    String topic();
    int partition();
}
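As a quick sanity check, a runtime-retained constructor annotation like this can be read back with plain reflection. A self-contained sketch with stand-in names (TopicSetup, Worker, and the values are illustrative, not the post's actual classes):

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;

public class AnnotationDemo {
    // Hypothetical stand-in for KafkaTopicSetup, redefined so the demo is self-contained
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.CONSTRUCTOR)
    @interface TopicSetup {
        String topic();
        int partition();
    }

    static class Worker {
        @TopicSetup(topic = "demo-topic", partition = 3)
        public Worker() {}
    }

    public static void main(String[] args) throws Exception {
        // Look up the no-arg constructor and read the annotation off it
        Constructor<Worker> ctor = Worker.class.getDeclaredConstructor();
        TopicSetup setup = ctor.getAnnotation(TopicSetup.class);
        System.out.println(setup.topic() + ":" + setup.partition()); // demo-topic:3
    }
}
```

The key point is RetentionPolicy.RUNTIME: without it the annotation is discarded by the compiler or class loader and getAnnotation() returns null.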
The rewritten consumer class:
public class KafkaMsgConsumer {
    private static final Log log = LogFactory.getLog(KafkaMsgConsumer.class);
    private ConsumerConfig config;
    private ConsumerConnector connector;

    public KafkaMsgConsumer() throws Exception {
        Properties properties = new Properties();
        // Load consumer.properties from the classpath; close the stream when done
        try (InputStream fis = KafkaMsgConsumer.class.getClassLoader()
                .getResourceAsStream("consumer.properties")) {
            properties.load(fis);
        }
        config = new ConsumerConfig(properties);
        connector = Consumer.createJavaConsumerConnector(config);
    }

    public void kafkaStart() throws Exception {
        log.info("Come to kafkaStart method for openam!!");
        // getTopics() (not shown) gathers topic -> partition count from the annotated
        // constructors, since createMessageStreams needs the complete map up front
        Map<String, List<KafkaStream<byte[], byte[]>>> streams = connector.createMessageStreams(getTopics());
        Reflections reflections = new Reflections("com.cpinec.*");
        Set<Class<? extends IKafkaConsumer>> subTypes = reflections.getSubTypesOf(IKafkaConsumer.class);
        for (Class<? extends IKafkaConsumer> elem : subTypes) {
            try {
                Constructor<? extends IKafkaConsumer> constructor = elem.getDeclaredConstructor();
                IKafkaConsumer consumer = constructor.newInstance();
                KafkaTopicSetup topicStuff = constructor.getAnnotation(KafkaTopicSetup.class);
                String topicName = topicStuff.topic();
                int partitionNum = topicStuff.partition();
                if (topicName.trim().length() == 0 || partitionNum <= 0) {
                    throw new Exception("Topic name is not set or partition number is not positive.");
                }
                List<KafkaStream<byte[], byte[]>> partitions = streams.get(topicName);
                // One pool per implementation class; a local variable, so the pool
                // created for a previous iteration is not overwritten and lost
                ExecutorService threadPool = Executors.newFixedThreadPool(partitionNum);
                for (KafkaStream<byte[], byte[]> partition : partitions) {
                    threadPool.execute(new MessageRunner(partition, consumer));
                }
            } catch (Exception e) {
                log.error("Failed to wire consumer " + elem.getName(), e);
            }
        }
    }

    class MessageRunner implements Runnable {
        private KafkaStream<byte[], byte[]> partition;
        private IKafkaConsumer consumer;

        MessageRunner(KafkaStream<byte[], byte[]> partition, IKafkaConsumer consumer) {
            this.partition = partition;
            this.consumer = consumer;
        }

        public void run() {
            ConsumerIterator<byte[], byte[]> it = partition.iterator();
            while (it.hasNext()) {
                consumer.consumeMessage(it.next());
            }
        }
    }
}
The main change is in kafkaStart(). Reflection first discovers the implementing classes; the for loop instantiates each one, reads the annotation from its constructor, and builds a thread pool for it; MessageRunner.run() then listens on the topic that implementation declared and, whenever a message arrives, invokes the interface method.
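The listing above calls a getTopics() helper that the post never shows; presumably it performs the same annotation scan up front, because createMessageStreams needs the complete topic-to-stream-count map before any consumer runs. A self-contained sketch of what it might do, with the Kafka and Reflections types replaced by local stand-ins:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class TopicScan {
    // Stand-ins for KafkaTopicSetup and IKafkaConsumer, so the sketch compiles alone
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.CONSTRUCTOR)
    @interface KafkaTopicSetup {
        String topic();
        int partition();
    }

    interface IKafkaConsumer { }

    static class FrontendConsumer implements IKafkaConsumer {
        @KafkaTopicSetup(topic = "frontend", partition = 3)
        public FrontendConsumer() {}
    }

    // What getTopics() might look like: map each annotated implementation's
    // topic to its declared partition (i.e. stream) count
    static Map<String, Integer> getTopics(Set<Class<? extends IKafkaConsumer>> impls) throws Exception {
        Map<String, Integer> topics = new HashMap<String, Integer>();
        for (Class<? extends IKafkaConsumer> impl : impls) {
            KafkaTopicSetup setup = impl.getDeclaredConstructor().getAnnotation(KafkaTopicSetup.class);
            if (setup != null) {
                topics.put(setup.topic(), setup.partition());
            }
        }
        return topics;
    }

    public static void main(String[] args) throws Exception {
        Set<Class<? extends IKafkaConsumer>> impls = new LinkedHashSet<Class<? extends IKafkaConsumer>>();
        impls.add(FrontendConsumer.class);
        System.out.println(getTopics(impls)); // {frontend=3}
    }
}
```

In the real class, the Set would come from the same reflections.getSubTypesOf(IKafkaConsumer.class) call that the loop uses, which means the scan effectively runs twice; caching its result would avoid that.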
Finally, an implementing class:
public class OpenAMServlet implements IKafkaConsumer {
    private static final Log log = LogFactory.getLog(OpenAMServlet.class);

    @KafkaTopicSetup(topic = KAFKA_FRONTEND_TOPIC, partition = 3)
    public OpenAMServlet() {}

    @Override
    public void consumeMessage(MessageAndMetadata<byte[], byte[]> binaryMsg) {
        // ... message-handling business logic ...
    }
}
Two things to note: 1. the class implements the IKafkaConsumer interface; 2. its constructor must take no arguments and must carry the annotation naming the topic and partition count the class consumes.
The last step is to register this consumer class as a Spring bean, with init-method set to kafkaStart():
<bean class="com.cpinec.member.common.kafka.KafkaMsgConsumer" init-method="kafkaStart" scope="singleton"/>
The online example passes the topic and partition count straight into the KafkaConsumer constructor. Isn't that quicker? Perhaps, but every time an instance is created the constructor runs again, and re-reading consumer.properties on every construction is an overhead of its own. In my approach that read happens exactly once, when Spring instantiates the bean.
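For completeness: even if one kept the constructor-based design from the online example, the repeated file read could be avoided by loading the Properties once into a static field. A minimal sketch (class name hypothetical; it tolerates a missing file purely for illustration):

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class CachedConfig {
    // Loaded once per JVM when the class initializes, instead of once per
    // constructor call; a hypothetical variation, not the code from the post
    private static final Properties PROPS = load();

    private static Properties load() {
        Properties p = new Properties();
        try (InputStream in = CachedConfig.class.getClassLoader()
                .getResourceAsStream("consumer.properties")) {
            if (in != null) {
                p.load(in);
            }
        } catch (IOException e) {
            throw new RuntimeException("cannot read consumer.properties", e);
        }
        return p;
    }

    public static Properties get() {
        return PROPS;
    }
}
```

Every KafkaMessageConsumer constructor would then call CachedConfig.get() and pay the file-read cost only on the first class initialization.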