This article implements communication between Kafka and Spark Streaming: the producer on the Kafka side is written in Java, and the consumer on the Spark Streaming side is written in Python.
First, install Kafka and Spark Streaming. For a basic Kafka connectivity test, see the previous article; all experiments in this article run on a local single-node setup.
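Before running the producer below, the target topic should exist (unless topic auto-creation is enabled on the broker). A minimal setup sketch for a local single-node broker, assuming a recent Kafka distribution where kafka-topics.sh accepts --bootstrap-server (older releases use --zookeeper instead):

```shell
# Start ZooKeeper and the Kafka broker (run from the Kafka distribution directory).
bin/zookeeper-server-start.sh config/zookeeper.properties &
bin/kafka-server-start.sh config/server.properties &

# Create the topic used by the producer below; one partition and one replica
# are enough for a local single-node experiment.
bin/kafka-topics.sh --create --topic data-message \
  --bootstrap-server 127.0.0.1:9092 \
  --partitions 1 --replication-factor 1

# Verify the topic was created.
bin/kafka-topics.sh --list --bootstrap-server 127.0.0.1:9092
```

These commands require a running broker, so treat them as a setup recipe rather than a standalone script.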
Kafka producer (Java)
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class producer {
    private final static String TOPIC = "data-message";
    private final static String BOOTSTRAP_SERVER = "127.0.0.1:9092";

    // Build a producer that serializes both keys and values as strings.
    public static Producer<String, String> createProducer() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, BOOTSTRAP_SERVER);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        return new KafkaProducer<>(props);
    }
    // Implement a custom partitioner
public