Spring and Kafka Integration in Practice

Latest recommended article published on 2024-08-22 22:26:16

钟艾伶


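This article uses a single-machine environment to demonstrate how to integrate Kafka with Spring.
A single-machine environment is the easiest to set up: it runs on your own PC and needs little hardware, which makes it convenient for learning. Besides, the goal here is not to build a ZooKeeper cluster but to focus on using Kafka with Spring.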

The software environment is as follows:

OS: CentOS 6.4
ZooKeeper: zookeeper-3.4.6
Kafka: kafka_2.9.1-0.8.2-beta
Java: JDK 1.7.0_45-b18
Spring: 4.0.6

The example runs correctly in my environment, and the full code can be downloaded from GitHub.


Every operating-system user in this article is root. In a real deployment, security standards may call for dedicated users such as zookeeper and kafka.

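Installing ZooKeeper

First download and extract ZooKeeper, choosing a suitable mirror to speed up the download.

We can register ZooKeeper as a system service by adding an /etc/init.d/zookeeper file.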

cd /opt
wget  http://apache.fayea.com/apache-mirror/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
tar zxvf zookeeper-3.4.6.tar.gz
vi /etc/init.d/zookeeper

Copy the contents of https://raw.githubusercontent.com/apache/zookeeper/trunk/src/packages/rpm/init.d/zookeeper into this file, then change the user that runs ZooKeeper and the ZooKeeper directory location.

......
start() {
  echo -n $"Starting $desc (zookeeper): "
  daemon --user root /opt/zookeeper-3.4.6/zkServer.sh start
  RETVAL=$?
  echo
  [ $RETVAL -eq 0 ] && touch /var/lock/subsys/zookeeper
  return $RETVAL
}
stop() {
  echo -n $"Stopping $desc (zookeeper): "
  daemon --user root /opt/zookeeper-3.4.6/zkServer.sh stop
  RETVAL=$?
  sleep 5
  echo
  [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/zookeeper $PIDFILE
}
......

chmod 755 /etc/init.d/zookeeper
service zookeeper start

If you would rather not register it as a service, you can also run ZooKeeper directly:

/opt/zookeeper-3.4.6/zkServer.sh start

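Installing Kafka

Download the latest Kafka from a suitable mirror and extract it, then start the broker and create a topic. The broker runs in the foreground, so run the topic command from a second terminal.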

wget http://apache.01link.hk/kafka/0.8.2-beta/kafka_2.9.1-0.8.2-beta.tgz
tar zxvf kafka_2.9.1-0.8.2-beta.tgz
cd kafka_2.9.1-0.8.2-beta
bin/kafka-server-start.sh config/server.properties
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

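For a fuller introduction, see the article I translated and compiled:

Getting started with Kafka: introduction, use cases, design principles, main configuration, and cluster setup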

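Creating a Spring project

With the environment ready, let's create a project.
I previously wrote a short introduction: Integrating Spring with Kafka.
I won't introduce the official spring-integration-kafka framework again here; we mainly use it to do the integration.

First, let's look at examples that send and receive messages with Kafka's own Producer/Consumer API.

Sending messages to Kafka with the Producer API

Here is an example that sends messages with Kafka's own producer API: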

import java.util.Properties;
import java.util.Random;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;
public class NativeProducer {
        public static void main(String[] args) {
                String topic = "test";
                long events = 100;
                Random rand = new Random();

                // broker to bootstrap from, string serializer, and leader acknowledgement
                Properties props = new Properties();
                props.put("metadata.broker.list", "localhost:9092");
                props.put("serializer.class", "kafka.serializer.StringEncoder");
                props.put("request.required.acks", "1");

                ProducerConfig config = new ProducerConfig(props);

                Producer<String, String> producer = new Producer<String, String>(config);

                // the loop counter is the message key, a random number makes up the payload
                for (long nEvents = 0; nEvents < events; nEvents++) {
                        String msg = "NativeMessage-" + rand.nextInt();
                        KeyedMessage<String, String> data = new KeyedMessage<String, String>(topic, nEvents + "", msg);
                        producer.send(data);
                }
                producer.close();
        }
}

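This example first initializes a Producer object with the broker and serializer to use, then sends 100 string messages to Kafka.

Run mvn package to build the code, then run it and check the result: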

java -cp target/lib/*:target/spring-kafka-demo-0.2.0-SNAPSHOT.jar com.colobu.spring_kafka_demo.NativeProducer

Receiving messages with the Kafka High Level API

Receive messages with the High Level Consumer API:

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
public class NativeConsumer {
        private final ConsumerConnector consumer;
        private final String topic;
        private ExecutorService executor;
        public NativeConsumer(String a_zookeeper, String a_groupId, String a_topic) {
                consumer = kafka.consumer.Consumer.createJavaConsumerConnector(createConsumerConfig(a_zookeeper, a_groupId));
                this.topic = a_topic;
        }
        public void shutdown() {
                if (consumer != null)
                        consumer.shutdown();
                if (executor != null)
                        executor.shutdown();
        }
        public void run(int a_numThreads) {
                Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
                topicCountMap.put(topic, new Integer(a_numThreads));
                Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer.createMessageStreams(topicCountMap);
                List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topic);
                // now launch all the threads
                //
                executor = Executors.newFixedThreadPool(a_numThreads);
                // now create an object to consume the messages
                //
                int threadNumber = 0;
                for (final KafkaStream<byte[], byte[]> stream : streams) {
                        executor.submit(new ConsumerTest(stream, threadNumber));
                        threadNumber++;
                }
        }
        private static ConsumerConfig createConsumerConfig(String a_zookeeper, String a_groupId) {
                Properties props = new Properties();
                props.put("zookeeper.connect", a_zookeeper);
                props.put("group.id", a_groupId);
                props.put("zookeeper.session.timeout.ms", "400");
                props.put("zookeeper.sync.time.ms", "200");
                props.put("auto.commit.interval.ms", "1000");
                return new ConsumerConfig(props);
        }
        public static void main(String[] args) {
                String zooKeeper = "localhost:2181";
                String groupId = "mygroup";
                String topic = "test";
                int threads = 1;
                NativeConsumer example = new NativeConsumer(zooKeeper, groupId, topic);
                example.run(threads);
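                try {
                        Thread.sleep(10000);
                } catch (InterruptedException ie) {
                        // restore the interrupt flag instead of swallowing the exception
                        Thread.currentThread().interrupt();
                }
                // shutdown is left disabled so the consumer keeps running and printing messages
                //example.shutdown();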
        }
}
class ConsumerTest implements Runnable {
    private final KafkaStream<byte[], byte[]> m_stream;
    private final int m_threadNumber;
 
    public ConsumerTest(KafkaStream<byte[], byte[]> a_stream, int a_threadNumber) {
        m_threadNumber = a_threadNumber;
        m_stream = a_stream;
    }
 
    public void run() {
        // hasNext() blocks until a message arrives and returns false once the connector is shut down
        ConsumerIterator<byte[], byte[]> it = m_stream.iterator();
        while (it.hasNext())
            System.out.println("Thread " + m_threadNumber + ": " + new String(it.next().message()));
        System.out.println("Shutting down Thread: " + m_threadNumber);
    }
}

Type a few messages into a producer console and the console running this example prints them.
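The NativeConsumer above has a shutdown() method, but main leaves the call commented out. If you do want the worker pool to stop cleanly, a minimal sketch of the usual ExecutorService pattern looks like this. It uses only the JDK; GracefulShutdown and shutdownAndAwait are hypothetical names for illustration, and in the real consumer you would still call consumer.shutdown() first so the stream iterators stop blocking.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the demo project.
class GracefulShutdown {
    /**
     * Stops accepting new tasks, waits up to timeoutMs for running ones,
     * then interrupts whatever is left. Returns true if the pool terminated.
     */
    static boolean shutdownAndAwait(ExecutorService executor, long timeoutMs) {
        executor.shutdown();
        try {
            if (!executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
                // tasks did not finish in time: interrupt them and wait once more
                executor.shutdownNow();
                return executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
            }
            return true;
        } catch (InterruptedException ie) {
            // the waiting thread itself was interrupted: force shutdown and keep the flag
            executor.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.submit(new Runnable() {
            public void run() {
                System.out.println("worker finished");
            }
        });
        System.out.println("terminated: " + shutdownAndAwait(pool, 5000));
    }
}
```

Waiting with a timeout before falling back to shutdownNow() gives in-flight messages a chance to be printed instead of cutting the threads off immediately.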

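The tutorial code also includes an example that receives messages with the Simple Consumer API. Because spring-integration-kafka does not support that API, the comparison code is not listed here.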

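Sending messages with spring-integration-kafka

The Outbound Channel Adapter sends messages to Kafka. It reads them from a Spring Integration channel, which you can declare in the Spring application context.
Once the channel is configured, you can use it to send messages to Kafka. Spring Integration messages are handed to the adapter, which converts them into Kafka messages internally before sending. The current version requires you to supply the message key and topic as headers and the message itself as the payload.
For example: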

import java.util.Random;
import org.springframework.context.support.ClassPathXmlApplicationContext;
import org.springframework.integration.support.MessageBuilder;
import org.springframework.messaging.MessageChannel;
public class Producer {
        private static final String CONFIG = "/context.xml";
        private static Random rand = new Random();
        public static void main(String[] args) {
                final ClassPathXmlApplicationContext ctx = new ClassPathXmlApplicationContext(CONFIG, Producer.class);
                ctx.start();
                final MessageChannel channel = ctx.getBean("inputToKafka", MessageChannel.class);
                for (int i = 0; i < 100; i++) {
                        channel.send(MessageBuilder.withPayload("Message-" + rand.nextInt()).setHeader("messageKey", String.valueOf(i)).setHeader("topic", "test").build());
                }
                try {
                        Thread.sleep(100000);
                } catch (InterruptedException e) {
                        e.printStackTrace();
                }
                ctx.close();
        }
}

Spring configuration file:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:int="http://www.springframework.org/schema/integration"
       xmlns:int-kafka="http://www.springframework.org/schema/integration/kafka"
       xmlns:task="http://www.springframework.org/schema/task"
       xsi:schemaLocation="http://www.springframework.org/schema/integration/kafka http://www.springframework.org/schema/integration/kafka/spring-integration-kafka.xsd
                http://www.springframework.org/schema/integration http://www.springframework.org/schema/integration/spring-integration.xsd
                http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
                http://www.springframework.org/schema/task http://www.springframework.org/schema/task/spring-task.xsd">
    <int:channel id="inputToKafka">
        <int:queue/>
    </int:channel>
    <int-kafka:outbound-channel-adapter id="kafkaOutboundChannelAdapter"
                                        kafka-producer-context-ref="kafkaProducerContext"
                                        auto-startup="false"
                                        channel="inputToKafka"
                                        order="3"
            >
        <int:poller fixed-delay="1000" time-unit="MILLISECONDS" receive-timeout="0" task-executor="taskExecutor"/>
    </int-kafka:outbound-channel-adapter>
    <task:executor id="taskExecutor" pool-size="5" keep-alive="120" queue-capacity="500"/>
        <bean id="producerProperties"
                class="org.springframework.beans.factory.config.PropertiesFactoryBean">
                <property name="properties">
                        <props>
                                <prop key="topic.metadata.refresh.interval.ms">3600000</prop>
                                <prop key="message.send.max.retries">5</prop>
                                <prop key="serializer.class">kafka.serializer.StringEncoder</prop>
                                <prop key="request.required.acks">1</prop>
                        </props>
                </property>
        </bean>
        <int-kafka:producer-context id="kafkaProducerContext"
                producer-properties="producerProperties">
                <int-kafka:producer-configurations>
                        <int-kafka:producer-configuration broker-list="localhost:9092"
                       topic="test"
                       compression-codec="default"/>
                </int-kafka:producer-configurations>
        </int-kafka:producer-context>
</beans>

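int:channel configures the Spring Integration channel; this channel is queue-based.
int-kafka:outbound-channel-adapter is the outbound channel adapter, which uses a thread pool internally to process messages. The key attribute is kafka-producer-context-ref.
int-kafka:producer-context configures the list of producers and the topics they handle; these producers are ultimately converted into Kafka producers.

The producer configuration parameters are as follows: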

broker-list             List of comma separated brokers that this producer connects to
topic                   Topic name or Java regex pattern of topic name
compression-codec       Compression method to be used. Default is no compression. Supported compression codec are gzip and snappy.
                        Anything else would result in no compression
value-encoder           Serializer to be used for encoding messages.
key-encoder             Serializer to be used for encoding the partition key
key-class-type          Type of the key class. This will be ignored if no key-encoder is provided
value-class-type        Type of the value class. This will be ignored if no value-encoder is provided.
partitioner             Custom implementation of a Kafka Partitioner interface.
async                   True/False - default is false. Setting this to true would make the Kafka producer to use
                        an async producer
batch-num-messages      Number of messages to batch at the producer. If async is false, then this has no effect.
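The topic attribute in the table accepts either a literal name or a Java regex. Here is a minimal sketch of what that matching could look like, assuming a full-string match against each configured pattern in order. TopicMatcher and firstMatch are hypothetical names for illustration, not part of spring-integration-kafka, and the library's actual lookup may differ.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of literal-or-regex topic matching.
class TopicMatcher {
    /** Returns the first configured pattern that the whole topic name matches, or null. */
    static String firstMatch(List<String> configuredTopics, String topic) {
        for (String pattern : configuredTopics) {
            // String.matches requires the entire topic name to match the pattern
            if (topic.matches(pattern)) {
                return pattern;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // "test" is a literal name, "orders\..*" is a regex for any topic starting with "orders."
        List<String> configured = Arrays.asList("test", "orders\\..*");
        System.out.println(firstMatch(configured, "test"));
        System.out.println(firstMatch(configured, "orders.eu"));
        System.out.println(firstMatch(configured, "unknown"));
    }
}
```

Because a plain name such as test is also a valid regex that matches only itself, one attribute can serve both purposes; topic names containing regex metacharacters such as a dot need escaping if you want a literal match.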

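Spring Integration Kafka also provides an Avro-based encoder. Avro is another Apache project and a serialization framework commonly used in big-data processing.
If no encoder is specified, Kafka's default encoder is used (kafka.serializer.DefaultEncoder, byte[] -> same byte[]).

producerProperties can be used to set configuration properties for tuning. For the list of properties, see http://kafka.apache.org/documentation.html#producerconfigs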

Receiving messages with spring-integration-kafka

A consumer is implemented along the same lines:

package com.colobu.spring_kafka_demo;
import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import org.slf4j.LoggerFactory;
import org.springframework.context.support.ClassPathXmlApplicationContext;
import org.springframework.integration.channel.QueueChannel;
import org.springframework.messaging.Message;
import ch.qos.logback.classic.Level;
public class Consumer {
        private static final String CONFIG = "/consumer_context.xml";
        private static Random rand = new Random();
        @SuppressWarnings({ "unchecked", "rawtypes" })
        public static void main(String[] args) {
                ch.qos.logback.classic.Logger rootLogger = (ch.qos.logback.classic.Logger)LoggerFactory.getLogger(ch.qos.logback.classic.Logger.ROOT_LOGGER_NAME);
                rootLogger.setLevel(Level.toLevel("info"));
                
                final ClassPathXmlApplicationContext ctx = new ClassPathXmlApplicationContext(CONFIG, Consumer.class);
                ctx.start();
                final QueueChannel channel = ctx.getBean("inputFromKafka", QueueChannel.class);
                Message msg;                
                while((msg = channel.receive()) != null) {
                        HashMap map = (HashMap)msg.getPayload();
                        Set<Map.Entry> set = map.entrySet();
                        for (Map.Entry entry : set) {
                                String topic = (String)entry.getKey();
                                System.out.println("Topic:" + topic);
                                ConcurrentHashMap<Integer,List<byte[]>> messages = (ConcurrentHashMap<Integer,List<byte[]>>)entry.getValue();
                                Collection<List<byte[]>> values = messages.values();
                                for (Iterator<List<byte[]>> iterator = values.iterator(); iterator.hasNext();) {
                                        List<byte[]> list = iterator.next();
                                        for (byte[] object : list) {
                                                String message = new String(object);
                                                System.out.println("\tMessage: " + message);
                                        }
                                        
                                }
                        
                        }
                        
                }
                
                try {
                        Thread.sleep(100000);
                } catch (InterruptedException e) {
                        e.printStackTrace();
                }
                ctx.close();
        }
}
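The consumer above casts its way through a nested payload: topic name, then partition number, then a list of raw byte arrays. The same traversal can be written with full generic types and an explicit charset, which avoids the raw-type warnings. This is a minimal sketch that assumes the payload has exactly that shape and that the messages are UTF-8 strings; PayloadFlattener and flatten are hypothetical names, not part of the demo project.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for the topic -> partition -> messages payload.
class PayloadFlattener {
    /** Decodes every byte[] as UTF-8 and prefixes it with its topic and partition. */
    static List<String> flatten(Map<String, Map<Integer, List<byte[]>>> payload) {
        List<String> lines = new ArrayList<String>();
        for (Map.Entry<String, Map<Integer, List<byte[]>>> topicEntry : payload.entrySet()) {
            for (Map.Entry<Integer, List<byte[]>> partitionEntry : topicEntry.getValue().entrySet()) {
                for (byte[] raw : partitionEntry.getValue()) {
                    lines.add(topicEntry.getKey() + "[" + partitionEntry.getKey() + "]: "
                            + new String(raw, StandardCharsets.UTF_8));
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // build a sample payload by hand: one topic, one partition, two messages
        Map<Integer, List<byte[]>> partitions = new TreeMap<Integer, List<byte[]>>();
        partitions.put(0, Arrays.asList(
                "Message-1".getBytes(StandardCharsets.UTF_8),
                "Message-2".getBytes(StandardCharsets.UTF_8)));
        Map<String, Map<Integer, List<byte[]>>> payload =
                new HashMap<String, Map<Integer, List<byte[]>>>();
        payload.put("test", partitions);

        for (String line : flatten(payload)) {
            System.out.println(line);
        }
    }
}
```

Passing the charset explicitly matters because new String(bytes) without one falls back to the platform default, which may not match the encoding the producer used.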

Reposted from: http://www.aboutyun.com/forum.php?mod=viewthread&tid=10321

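spring-integration-kafka is an official Spring extension to the Spring Integration framework that gives Spring applications integration with Kafka.
At present spring-integration-kafka only supports Kafka 0.8; earlier Kafka versions are not supported.

A newer article covers hands-on code: Kafka and Spring Integration in Practice

spring-integration-kafka supports only two components, corresponding to the Producer and the High Level Consumer. They are: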